CN114611541A - Invoice image recognition method, device, equipment and storage medium - Google Patents

Info

Publication number
CN114611541A
Authority
CN
China
Prior art keywords
field
result
image
identification result
invoice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210241320.0A
Other languages
Chinese (zh)
Inventor
施伟斌
刘鹏
庞烨
刘玉宇
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202210241320.0A
Publication of CN114611541A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404 Methods for optical code recognition
    • G06K7/1408 Methods for optical code recognition the method being specifically adapted for the type of code
    • G06K7/1417 2D bar codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Abstract

The invention relates to the technical field of image processing, and in particular to an invoice image identification method, device, equipment and storage medium. The method comprises the following steps: acquiring an invoice image to be identified; extracting a field image from the invoice image based on a preset first algorithm and identifying the field image to obtain a field identification result; detecting a two-dimensional code image from the invoice image based on a preset second algorithm, and parsing the two-dimensional code image based on a preset two-dimensional code parsing toolkit; and, when the two-dimensional code image is parsed successfully, obtaining a two-dimensional code identification result, performing verification according to the two-dimensional code identification result and the field identification result, and obtaining an identification result of the invoice image according to the verification result. In this way, the accuracy and reliability of the overall identification of the invoice image can be improved.

Description

Invoice image identification method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of image processing, in particular to an invoice image identification method, device, equipment and storage medium.
Background
At present, for automatic accounting in a financial system, a paper invoice must be scanned into an electronic image, and the content of the invoice image is then identified using OCR (optical character recognition) technology. However, in bill OCR, and especially in value-added tax invoice recognition, the quality of the invoice image largely determines the recognition effect. Quality problems such as blurring, occlusion, reflection and deformation degrade recognition, and if an OCR algorithm alone is used, the final recognition result may contain wrongly recognized key fields, missing fields and similar problems. In addition, for purposes such as reimbursement, a user may forge an invoice or tamper with key field information. An OCR algorithm can only recognize the character information printed on the face of the invoice and does not evaluate the reliability of the fields, so the accuracy and reliability of the overall recognition of the invoice image are low.
Disclosure of Invention
The invention provides an invoice image identification method, device, equipment and storage medium, which can improve the accuracy and reliability of overall identification of an invoice image and solve the problem of low identification accuracy caused by the fact that the reliability of field identification is not evaluated by the existing invoice image verification.
In order to solve the technical problems, the invention adopts a technical scheme that: an invoice image recognition method is provided, which comprises the following steps:
acquiring an invoice image to be identified;
extracting a field image from the invoice image based on a preset first algorithm and identifying the field image to obtain a field identification result;
detecting a two-dimensional code image from the invoice image based on a preset second algorithm, and analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package;
and when the two-dimension code image is successfully analyzed, obtaining a two-dimension code identification result, checking according to the two-dimension code identification result and the field identification result, and obtaining an identification result of the invoice image according to a checking result.
According to an embodiment of the present invention, the extracting a field image from the invoice image based on a preset first algorithm and identifying the field image, and obtaining a field identification result includes:
extracting field images from the invoice images and carrying out character detection on the field images to obtain position information and a first confidence coefficient of each field;
identifying the text content of each field to obtain the text content and a second confidence coefficient corresponding to the field;
carrying out structuring processing on each field and the corresponding text content to obtain structured information and a third confidence coefficient;
and calculating a comprehensive confidence degree according to the first confidence degree, the second confidence degree and the third confidence degree, and taking the comprehensive confidence degree and the structured information as the field recognition result.
According to an embodiment of the present invention, the verifying according to the two-dimensional code recognition result and the field recognition result, and obtaining the recognition result of the invoice image according to the verification result includes:
judging whether the field identification result is empty or not;
if the field identification result is not empty, judging whether the field identification result is complete, checking according to the two-dimensional code identification result and the field identification result based on the judgment result, and obtaining the identification result of the invoice image according to the checking result;
and if the field identification result is empty, taking the two-dimensional code identification result as the identification result of the invoice image.
According to an embodiment of the present invention, if the field identification result is not empty, determining whether the field identification result is complete, performing verification according to the two-dimensional code identification result and the field identification result based on the determination result, and obtaining the identification result of the invoice image according to the verification result further includes:
and if the field identification result is incomplete, correcting the field identification result by using the two-dimensional code identification result, and taking the corrected field identification result as the identification result of the invoice image.
According to an embodiment of the present invention, if the field identification result is not empty, determining whether the field identification result is complete, performing verification according to the two-dimensional code identification result and the field identification result based on the determination result, and obtaining the identification result of the invoice image according to the verification result includes:
if the field identification result is complete, calculating the similarity between the two-dimensional code identification result and the text content corresponding to the same field in the field identification result;
and generating an identification result of the invoice image according to the comprehensive confidence and the similarity of the same field.
According to an embodiment of the present invention, the generating the recognition result of the invoice image according to the comprehensive confidence and the similarity of the same field comprises:
comparing the comprehensive confidence of the same field with a preset confidence threshold;
if the comprehensive confidence of the field is greater than a preset confidence threshold, comparing the similarity with a preset similarity threshold, and generating an identification result of the invoice image according to a similarity comparison result;
and if the comprehensive confidence of the fields is smaller than a preset confidence threshold, correcting the field recognition result by using the two-dimensional code recognition result, and taking the corrected field recognition result as the recognition result of the invoice image.
According to an embodiment of the present invention, the comparing the similarity with a preset similarity threshold, and the generating the identification result of the invoice image according to the similarity comparison result includes:
if the similarity of the fields is larger than a preset similarity threshold value, taking the two-dimension code recognition result as the recognition result of the invoice image;
and if the similarity of the fields is smaller than a preset similarity threshold value, taking the field identification result as the identification result of the invoice image, and sending early warning information.
In order to solve the technical problem, the invention adopts another technical scheme that: provided is an invoice image recognition device, including:
the acquiring module is used for acquiring an invoice image to be identified;
the identification module is used for extracting a field image from the invoice image based on a preset first algorithm and identifying the field image to obtain a field identification result;
the analysis module is used for detecting a two-dimensional code image from the invoice image based on a preset second algorithm and analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package;
and the execution module is used for obtaining a two-dimensional code identification result when the two-dimensional code image is successfully analyzed, checking according to the two-dimensional code identification result and the field identification result, and obtaining the identification result of the invoice image according to the checking result.
In order to solve the technical problems, the invention adopts another technical scheme that: there is provided a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the invoice image recognition method described above when executing the computer program.
In order to solve the technical problems, the invention adopts another technical scheme that: there is provided a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the invoice image recognition method described above.
The invention has the beneficial effects that: when the two-dimensional code image is successfully analyzed, field supplementing and checking are carried out on the basis of the field identification result and the two-dimensional code identification result to obtain the identification result of the invoice image, the overall identification accuracy and reliability of the invoice image can be improved, and the problem that the identification accuracy is low due to the fact that the reliability of field identification is not evaluated in the existing invoice image verification is solved.
Drawings
FIG. 1 is a flow chart illustrating an invoice image recognition method according to an embodiment of the invention;
FIG. 2 is a flowchart illustrating step S102 of the invoice image recognition method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating step S104 of the invoice image recognition method according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the step S302 of the invoice image recognition method according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating step S403 of the invoice image recognition method according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an invoice image recognition device according to an embodiment of the invention;
FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a computer storage medium according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second" and "third" in the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or to implicitly indicate the number of technical features indicated. Thus, a feature defined as "first," "second," or "third" may explicitly or implicitly include at least one of the feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise. All directional indicators (such as up, down, left, right, front, and rear … …) in the embodiments of the present invention are only used to explain the relative positional relationship between the components, the movement, and the like in a specific posture (as shown in the drawings), and if the specific posture is changed, the directional indicator is changed accordingly. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may alternatively include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Fig. 1 is a flowchart illustrating an invoice image recognition method according to an embodiment of the present invention. It should be noted that the method of the present invention is not limited to the flow sequence shown in fig. 1 if the results are substantially the same. As shown in fig. 1, the method comprises the steps of:
Step S101: acquiring an invoice image to be identified.
In step S101, the invoice image to be identified may be a photograph or scan of a real invoice, and the type of the invoice includes, but is not limited to, a medical invoice. The invoice image may contain only the invoice, or it may also contain content other than the invoice. In the latter case, after the invoice image to be identified is acquired, the region where the invoice is located is detected in the invoice image based on a target detection algorithm; this region includes the two-dimensional code image and the field image. The invoice information contained in the two-dimensional code image and the field image includes, but is not limited to, the invoice code, invoice number, total amount, invoicing date, invoice check code and the like.
Step S102: extracting a field image from the invoice image based on a preset first algorithm and identifying the field image to obtain a field identification result.
In step S102, the first algorithm may be an OCR algorithm. OCR (Optical Character Recognition) technology extracts the outlines of the characters in an image by optical means, compares those outlines with the characters in a standard character library, and finally outputs the recognized characters. The OCR algorithm can recognize field information and calculate a confidence for each recognized field, where the confidence represents the credibility of the OCR recognition result. The field identification result of this embodiment may be empty or non-empty. If there is no field image in the invoice image, no field image can be extracted and the field identification result is empty; if there is a field image in the invoice image and it can be extracted and identified, the field identification result is not empty. When the field identification result is not empty, the quality of the field image directly affects it, so the field identification result may be either a complete result or a partial result.
As an embodiment, if the invoice information included in the field image includes five types of invoice codes, invoice numbers, total amount, invoicing date and invoice check code, the OCR algorithm is used to identify the five types of invoice information, and the confidence corresponding to each type of invoice information is further calculated.
In an implementation example, the OCR algorithm includes a text detection module, a text recognition module, and a field structuring module, wherein the text detection module can detect the positions of all the fields on the field image, output the position information of each field (for example, the coordinate information of the area where the field is located), and calculate the confidence corresponding to each position information, the text recognition module can recognize the text content of each area field and calculate the confidence corresponding to each text content, and the field structuring module associates the fields and the text content one-to-one to form structured information, and calculates the confidence corresponding to each structured information. As an embodiment, for example, the text detection module detects the field "invoice code", and the text recognition module recognizes that the text content of the field "invoice code" is "0444816156112", then the field structuring module forms the structured information as "invoice code: 0444816156112".
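The output of the three modules described above can be pictured as one record per field. The following is only a minimal Python sketch: the class names `DetectedField` and `StructuredField`, the helper `structure`, the bounding-box layout and the confidence values are all hypothetical and are not part of the original disclosure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class DetectedField:
    """Output of the (hypothetical) text detection module for one field."""
    name: str                        # field key, e.g. "invoice code"
    box: Tuple[int, int, int, int]   # assumed layout: (x, y, width, height)
    detect_conf: float               # confidence of the position information


@dataclass
class StructuredField:
    """Key/value pair produced by the field structuring module."""
    name: str
    text: str
    detect_conf: float   # first confidence (detection)
    recog_conf: float    # second confidence (text recognition)
    struct_conf: float   # third confidence (structuring)

    def as_line(self) -> str:
        # Render the structured information as "key: value".
        return f"{self.name}: {self.text}"


def structure(field: DetectedField, text: str,
              recog_conf: float, struct_conf: float) -> StructuredField:
    """Associate a detected field with its recognized text content."""
    return StructuredField(field.name, text, field.detect_conf,
                           recog_conf, struct_conf)
```

With the example from the text, `structure(DetectedField("invoice code", (10, 20, 200, 30), 0.98), "0444816156112", 0.95, 0.97).as_line()` yields the structured information "invoice code: 0444816156112".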
Step S103: detecting a two-dimensional code image from the invoice image based on a preset second algorithm, and parsing the two-dimensional code image based on a preset two-dimensional code parsing toolkit.
In step S103, the second algorithm may be an object detection algorithm, more specifically a deep-learning-based image object detection algorithm. The two-dimensional code parsing toolkit may be the ZXing toolkit. Parsing the two-dimensional code image with ZXing yields the content information recorded in the two-dimensional code image, such as the invoice code, invoice number, total amount, invoicing date, invoice check code and the like.
Step S104: when the two-dimensional code image is parsed successfully, obtaining a two-dimensional code identification result, performing verification according to the two-dimensional code identification result and the field identification result, and obtaining an identification result of the invoice image according to the verification result.
In step S104, the two-dimensional code identification result includes fields and the text content corresponding to each field. Successful parsing of the two-dimensional code image means that the information recorded in the two-dimensional code image can be obtained accurately; in this case, the two-dimensional code identification result is not empty.
Generally, the text content recorded in the same field in the two-dimensional code recognition result and the field recognition result should be the same, but there are some factors (such as image definition and recognition accuracy) that may cause the two-dimensional code recognition result and the field recognition result to be inconsistent, so that it is necessary to check according to the two-dimensional code recognition result and the field recognition result, and integrate the two-dimensional code recognition result and the field recognition result, thereby obtaining a reliable invoice image recognition result.
In one embodiment, when the two-dimensional code image is not parsed successfully, the field identification result is used as the identification result of the invoice image.
If the two-dimensional code is missing, no network connection is available, or the like, the two-dimensional code image cannot be parsed successfully, that is, the information recorded in the two-dimensional code image cannot be obtained. In this case the two-dimensional code identification result is empty, and the field identification result is used as the identification result of the invoice image.
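The top-level branching of step S104 and its fallback can be sketched as follows. This is an illustrative Python outline only: the function name `recognize_invoice`, the use of dictionaries mapping field names to text, and the `verify` callback are assumptions made for the example, not names from the original disclosure.

```python
from typing import Callable, Dict, Optional

Result = Dict[str, str]  # assumed representation: field name -> text content


def recognize_invoice(field_result: Result,
                      qr_result: Optional[Result],
                      verify: Callable[[Result, Result], Result]) -> Result:
    """Pick the final result depending on whether QR parsing succeeded."""
    if not qr_result:
        # QR code missing or not parsed: the QR result is empty,
        # so fall back to the OCR field result.
        return field_result
    # QR code parsed: check the two results against each other.
    return verify(qr_result, field_result)
```

The `verify` callback stands in for the checking logic that the later embodiments (steps S301 to S503) describe in detail.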
According to the invoice image identification method provided by the embodiment of the invention, when the two-dimensional code image is successfully analyzed, field supplement and verification are carried out on the basis of the field identification result and the two-dimensional code identification result to obtain the identification result of the invoice image, so that the overall identification accuracy and reliability of the invoice image can be improved, and the problem of low identification accuracy caused by the fact that the reliability of field identification is not evaluated by the existing invoice image verification is solved.
In an embodiment, referring to fig. 2, step S102 further includes:
Step S201: extracting field images from the invoice image and performing character detection on the field images to obtain the position information and a first confidence of each field.
In this embodiment, character detection is performed on the field image based on the OCR algorithm to detect the position information of each field, and a first confidence is calculated for each piece of position information; the first confidence is the confidence of the position information identified for that field. Specifically, each kind of field information is recognized by OCR, and the field information is scanned several times during recognition. Each scan outputs a believe value (the identification degree), so that several believe values are obtained for each kind of field information. The believe value characterizes how well that scan recognized the field information; specifically, it represents the ratio of the number of characters not recognized by OCR to the total number of characters of the invoice information. The maximum value max(believe) is then selected from the believe values of the field information, and the confidence of the field information is calculated according to a preset formula, which may be: confidence = 100 - max(believe).
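The preset formula can be written out as a small Python function. This is a sketch under one stated assumption: the believe values are expressed as percentages in the range 0 to 100, which the text implies by subtracting from 100 but does not say explicitly. The function name `field_confidence` is hypothetical.

```python
from typing import Sequence


def field_confidence(believe_values: Sequence[float]) -> float:
    """confidence = 100 - max(believe), over the scans of one field.

    Assumption: each believe value is the percentage of characters
    that one scan failed to recognize (0 to 100).
    """
    if not believe_values:
        raise ValueError("at least one scan result is required")
    # The worst scan (largest unrecognized share) determines the confidence.
    return 100.0 - max(believe_values)
```

For example, three scans with believe values 2.0, 5.0 and 3.5 give `field_confidence([2.0, 5.0, 3.5]) == 95.0`. Taking the maximum makes the estimate conservative, since a single poor scan lowers the confidence of the field.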
Step S202: identifying the text content of each field to obtain the text content corresponding to the field and a second confidence.
In this embodiment, the text content of each field is recognized to obtain the text information of that field, and a second confidence is calculated for each piece of text information; the second confidence is the confidence of the recognition of that text information. The second confidence is calculated in a manner similar to the first confidence, which is not repeated here.
Step S203: associating each field with its corresponding text content to obtain structured information and a third confidence.
In this embodiment, the fields are associated with the corresponding text contents; that is, the correspondence between the key and the value of each field is found to form complete structured information. The third confidence is the confidence of the identification of each piece of structured information. The third confidence is calculated in a manner similar to the first confidence, which is not repeated here.
Step S204: calculating a comprehensive confidence according to the first confidence, the second confidence and the third confidence, and taking the comprehensive confidence and the structured information as the field identification result.
The comprehensive confidence of this embodiment is computed for a single field, not for all fields together. It may be a weighted combination of the first, second and third confidences of the same field, or the product of the three. The comprehensive confidence allows the reliability of field identification to be evaluated as a whole, thereby improving the accuracy of field identification.
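Both variants of the per-field combination mentioned above can be sketched in Python. The function names and the default equal weights are hypothetical; the source does not specify the weights or the scale of the three confidences. The product variant is only meaningful as a confidence if the inputs lie in [0, 1], which is assumed here.

```python
from typing import Tuple


def weighted_confidence(c1: float, c2: float, c3: float,
                        weights: Tuple[float, float, float] = (1/3, 1/3, 1/3)
                        ) -> float:
    """Weighted combination of detection, recognition and structuring
    confidences of one field. The weights are illustrative assumptions."""
    w1, w2, w3 = weights
    return w1 * c1 + w2 * c2 + w3 * c3


def product_confidence(c1: float, c2: float, c3: float) -> float:
    """Product of the three confidences, assuming each lies in [0, 1]."""
    return c1 * c2 * c3
```

The product is never larger than the smallest of its inputs, so it penalizes any weak stage more strongly than a weighted average does.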
In an embodiment, referring to fig. 3, step S104 further includes:
Step S301: judging whether the field identification result is empty.
Step S302: if the field identification result is not empty, judging whether the field identification result is complete, performing verification according to the two-dimensional code identification result and the field identification result based on the judgment result, and obtaining the identification result of the invoice image according to the verification result.
In this embodiment, if the field identification result is not empty, the number of fields included in the field identification result is counted, the counted result is compared with the number of standard fields, if the counted result is consistent with the number of standard fields, the field identification result is considered to be complete, and if the counted result is smaller than the number of standard fields, the field identification result is considered to be incomplete.
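The completeness check is a count comparison and can be sketched in Python as follows. The list `STANDARD_FIELDS` uses the five kinds of invoice information named earlier as an illustrative standard set; the constant, the function name `is_complete` and the dictionary representation are assumptions for the example.

```python
from typing import Dict, Sequence

# Illustrative standard field set (assumed for this example).
STANDARD_FIELDS = ("invoice code", "invoice number", "total amount",
                   "invoicing date", "invoice check code")


def is_complete(field_result: Dict[str, str],
                standard_fields: Sequence[str] = STANDARD_FIELDS) -> bool:
    """Compare the number of recognized fields with the standard number."""
    # Count only fields that actually carry text content.
    recognized = sum(1 for text in field_result.values() if text)
    # Fewer fields than the standard number means the result is incomplete.
    # The source does not cover the case of more fields than the standard
    # number; it is treated as complete here.
    return recognized >= len(standard_fields)
```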
Step S303: if the field identification result is empty, taking the two-dimensional code identification result as the identification result of the invoice image.
In an embodiment, referring to fig. 4, step S302 further includes:
Step S401: judging whether the field identification result is complete.
Step S402: if the field identification result is incomplete, correcting the field identification result by using the two-dimensional code identification result, and taking the corrected field identification result as the identification result of the invoice image.
If the field identification result is incomplete, the cause may be quality problems in the invoice image (such as blurring, field occlusion, reflection or deformation) that leave part of the field identification result missing. In this case, the two-dimensional code identification result is used to complement or replace the corresponding part of the field identification result.
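Complementing or replacing part of the field identification result can be sketched as a dictionary merge in Python. The function name `correct_with_qr` and the dictionary representation are assumptions; the sketch also assumes that whenever the two-dimensional code carries a field, its value takes precedence.

```python
from typing import Dict


def correct_with_qr(field_result: Dict[str, str],
                    qr_result: Dict[str, str]) -> Dict[str, str]:
    """Fill in missing fields and overwrite overlapping ones with QR values."""
    corrected = dict(field_result)   # leave the OCR result unmodified
    corrected.update(qr_result)      # QR values complement or replace
    return corrected
```

Fields that appear only in the OCR result are kept unchanged, because the two-dimensional code offers nothing to correct them with.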
Step S403: if the field identification result is complete, calculating the similarity between the text content corresponding to the same field in the two-dimensional code identification result and in the field identification result, and generating the identification result of the invoice image according to the comprehensive confidence and the similarity of the same field.
The similarity of this embodiment may be cosine similarity, specifically the cosine similarity between the text contents corresponding to the same field. In this embodiment, when the field identification result is complete, a double check is performed using both the comprehensive confidence and the similarity, which improves the accuracy and reliability of the identification result of the invoice image.
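A cosine similarity between two text values can be sketched in Python as follows. The source does not say how the text is turned into vectors; the sketch assumes character-count vectors, which is only one possible choice, and the function name is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two strings over character-count vectors.

    Assumption: each string is represented by how often each character
    occurs in it; the source does not specify the vectorization.
    """
    va, vb = Counter(a), Counter(b)
    dot = sum(count * vb[ch] for ch, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty string is treated as dissimilar to anything
    return dot / (norm_a * norm_b)
```

Character-count vectors ignore character order, so two values that are permutations of each other score as fully similar. An implementation that must detect transposed digits would need a different representation, such as character n-grams.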
In an embodiment, referring to fig. 5, step S403 further includes:
Step S501: comparing the comprehensive confidence of the same field with a preset confidence threshold.
Step S502: if the comprehensive confidence of the field is greater than the preset confidence threshold, comparing the similarity with a preset similarity threshold, and generating the identification result of the invoice image according to the similarity comparison result.
In this embodiment, if the comprehensive confidence of a field is greater than the preset confidence threshold, the field identification result is considered reliable and the similarity is then compared. If the similarity between the text content of the field in the two-dimensional code identification result and its text content in the field identification result is greater than the preset similarity threshold, the two results for that field are considered similar; otherwise, they are considered dissimilar.
Further, under the condition that the comprehensive confidence of the fields is greater than a preset confidence threshold, if the similarity of the fields is greater than a preset similarity threshold, taking the two-dimensional code recognition result as the recognition result of the invoice image; and if the similarity of the fields is smaller than a preset similarity threshold value, taking the field identification result as the identification result of the invoice image, and sending early warning information.
In this embodiment, if the comprehensive confidence of the field is greater than the preset confidence threshold and the similarity is less than the preset similarity threshold, the field identification result is highly reliable but differs substantially from the two-dimensional code identification result. This may indicate that the user has tampered with the field information in the invoice image, so that the information displayed on the image is inconsistent with the result actually obtained from the two-dimensional code. In this case, the field identification result is taken as the final result (reflecting the objective information on the invoice image), and forgery early-warning information is also output to alert the user to the risk.
Step S503: if the comprehensive confidence of the field is smaller than the preset confidence threshold, the field identification result is corrected using the two-dimensional code identification result, and the corrected field identification result is taken as the identification result of the invoice image.
In this embodiment, if the comprehensive confidence of a field is not greater than the preset confidence threshold, the field identification result is considered unreliable. This may be because quality problems in the invoice image (such as blurring, field occlusion, light reflection, or deformation) make the identification result of some fields unreliable. In this case, the two-dimensional code identification result is used to complement or replace part of the field identification result. For fields that are present in the field identification result but not in the two-dimensional code identification result, the field identification result is output, and the user is informed that the quality of the input invoice image is poor and is advised to adjust the scanning or photographing equipment to improve the image input quality.
The identification result of the invoice image in this embodiment is based primarily on the two-dimensional code identification result: when the two-dimensional code image is successfully analyzed, the two-dimensional code identification result is either used directly as the final identification result or used to replace or complement the field identification result to generate the final identification result.
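The per-field decision described in steps S501 to S503 can be sketched as follows. This is a minimal illustration, not the implementation: the names `FieldDecision` and `decide_field`, the default threshold values, and the warning strings are hypothetical. The text also does not say what happens when a value exactly equals its threshold, or when a high-confidence field is missing from the two-dimensional code result; the handling of those cases below is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldDecision:
    value: str                     # text chosen for the field
    source: str                    # "qr" or "field" (hypothetical labels)
    warning: Optional[str] = None  # early-warning or quality message


def decide_field(field_text: str,
                 qr_text: Optional[str],
                 confidence: float,
                 similarity: float,
                 conf_threshold: float = 0.8,   # assumed value
                 sim_threshold: float = 0.9     # assumed value
                 ) -> FieldDecision:
    """Choose the final value of one field from the two results."""
    # S501: compare the comprehensive confidence with its threshold.
    if confidence > conf_threshold:
        if qr_text is None:
            # Assumption: nothing to compare against, keep the field result.
            return FieldDecision(field_text, "field")
        # S502: the field result is reliable, so compare the similarity.
        if similarity > sim_threshold:
            return FieldDecision(qr_text, "qr")
        # Reliable but different: possible tampering, keep what the image shows.
        return FieldDecision(field_text, "field",
                             warning="possible forgery: image differs from QR data")
    # S503: the field result is unreliable, so correct it with the QR result.
    if qr_text is not None:
        return FieldDecision(qr_text, "qr")
    # Field is absent from the QR result: output it and flag image quality.
    return FieldDecision(field_text, "field",
                         warning="low image quality: consider rescanning")
```

For example, `decide_field("900.00", "100.00", confidence=0.95, similarity=0.5)` keeps the value shown on the image and attaches the forgery warning, which corresponds to the tampering case described above.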
Fig. 6 is a schematic structural diagram of an invoice image recognition device according to an embodiment of the present invention. As shown in fig. 6, the apparatus 60 includes an obtaining module 61, a recognition module 62, a parsing module 63, and an executing module 64.
The obtaining module 61 is configured to obtain an invoice image to be identified;
the identification module 62 is configured to extract a field image from the invoice image based on a preset first algorithm and identify the field image to obtain a field identification result;
the analysis module 63 is configured to detect a two-dimensional code image from the invoice image based on a preset second algorithm, and analyze the two-dimensional code image based on a preset two-dimensional code analysis toolkit;
the execution module 64 is configured to obtain a two-dimensional code recognition result when the two-dimensional code image is successfully analyzed, perform verification according to the two-dimensional code recognition result and the field recognition result, and obtain an invoice image recognition result according to the verification result.
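The division of the apparatus 60 into four modules can be pictured with the sketch below, in which each module is a pluggable function. The class name `InvoiceRecognitionDevice`, the attribute names, and the dictionary-based result type are hypothetical. The fallback to the field identification result when two-dimensional code analysis fails is an assumption, since this passage only describes the successful case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

FieldResult = Dict[str, str]  # field name -> recognized text (assumed shape)


@dataclass
class InvoiceRecognitionDevice:
    acquire: Callable[[str], bytes]                             # obtaining module 61
    recognize_fields: Callable[[bytes], FieldResult]            # identification module 62
    parse_qr: Callable[[bytes], Optional[FieldResult]]          # analysis module 63
    verify: Callable[[FieldResult, FieldResult], FieldResult]   # execution module 64

    def run(self, source: str) -> FieldResult:
        image = self.acquire(source)
        field_result = self.recognize_fields(image)
        qr_result = self.parse_qr(image)  # None means analysis failed
        if qr_result is None:
            # Assumption: without QR data, return the field result as-is.
            return field_result
        # Verify the two results against each other to get the final result.
        return self.verify(qr_result, field_result)
```

A usage example with stub modules: `InvoiceRecognitionDevice(lambda s: b"", lambda img: {"amount": "1O0"}, lambda img: {"amount": "100"}, lambda qr, f: {**f, **qr}).run("invoice.png")` returns the field values after the stub verification step has overlaid the two-dimensional code values.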
Referring to fig. 7, fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in fig. 7, the computer device 70 includes a processor 71 and a memory 72 coupled to the processor 71.
The memory 72 stores program instructions for implementing the invoice image recognition method described in any of the above embodiments.
The processor 71 is configured to execute the program instructions stored in the memory 72 to recognize an invoice image.
The processor 71 may also be referred to as a CPU (Central Processing Unit). The processor 71 may be an integrated circuit chip having signal processing capabilities. The processor 71 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor or any conventional processor.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a computer storage medium according to an embodiment of the present invention. The computer storage medium of this embodiment stores a program file 81 capable of implementing all of the methods described above. The program file 81 may be stored in the computer storage medium in the form of a software product and includes several instructions that enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned computer storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, or terminal devices such as a computer, a server, a mobile phone, or a tablet.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and an actual implementation may use a different division: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical, or in another form.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An invoice image recognition method is characterized by comprising the following steps:
acquiring an invoice image to be identified;
extracting a field image from the invoice image based on a preset first algorithm and identifying the field image to obtain a field identification result;
detecting a two-dimensional code image from the invoice image based on a preset second algorithm, and analyzing the two-dimensional code image based on a preset two-dimensional code analysis toolkit;
and when the two-dimensional code image is successfully analyzed, obtaining a two-dimensional code identification result, checking according to the two-dimensional code identification result and the field identification result, and obtaining an identification result of the invoice image according to a checking result.
2. The invoice image recognition method according to claim 1, wherein the extracting and recognizing the field image from the invoice image based on the preset first algorithm comprises:
extracting field images from the invoice images and carrying out character detection on the field images to obtain position information and a first confidence coefficient of each field;
identifying the text content of each field to obtain the text content and a second confidence coefficient corresponding to the field;
carrying out structuring processing on each field and the corresponding text content to obtain structured information and a third confidence coefficient;
and calculating a comprehensive confidence degree according to the first confidence degree, the second confidence degree and the third confidence degree, and taking the comprehensive confidence degree and the structured information as the field recognition result.
3. The invoice image recognition method according to claim 2, wherein the verifying according to the two-dimensional code recognition result and the field recognition result, and obtaining the recognition result of the invoice image according to the verification result comprises:
judging whether the field identification result is empty or not;
if the field identification result is not empty, judging whether the field identification result is complete, checking according to the two-dimensional code identification result and the field identification result based on the judgment result, and obtaining the identification result of the invoice image according to the checking result;
and if the field identification result is empty, taking the two-dimensional code identification result as the identification result of the invoice image.
4. The invoice image recognition method according to claim 3, wherein if the field recognition result is not empty, judging whether the field recognition result is complete, performing verification according to the two-dimensional code recognition result and the field recognition result based on the judgment result, and obtaining the recognition result of the invoice image according to the verification result further comprises:
and if the field identification result is incomplete, correcting the field identification result by using the two-dimensional code identification result, and taking the corrected field identification result as the identification result of the invoice image.
5. The invoice image recognition method according to claim 3, wherein if the field recognition result is not empty, judging whether the field recognition result is complete, checking according to the two-dimensional code recognition result and the field recognition result based on the judgment result, and obtaining the recognition result of the invoice image according to the checking result comprises:
if the field identification result is complete, calculating the similarity between the two-dimensional code identification result and the text content corresponding to the same field in the field identification result, and generating the identification result of the invoice image according to the comprehensive confidence and the similarity of the same field.
6. The invoice image recognition method of claim 5, wherein the generating a recognition result of the invoice image according to the comprehensive confidence and the similarity of the same field comprises:
comparing the comprehensive confidence of the same field with a preset confidence threshold;
if the comprehensive confidence of the field is greater than the preset confidence threshold, comparing the similarity with a preset similarity threshold, and generating an identification result of the invoice image according to a similarity comparison result;
and if the comprehensive confidence of the field is smaller than the preset confidence threshold, correcting the field recognition result by using the two-dimensional code recognition result, and taking the corrected field recognition result as the recognition result of the invoice image.
7. The invoice image recognition method of claim 6, wherein the comparing the similarity with a preset similarity threshold, and the generating the recognition result of the invoice image according to the similarity comparison result comprises:
if the similarity of the field is greater than the preset similarity threshold, taking the two-dimensional code recognition result as the recognition result of the invoice image;
and if the similarity of the field is smaller than the preset similarity threshold, taking the field identification result as the identification result of the invoice image, and sending early warning information.
8. An invoice image recognition device, comprising:
the acquiring module is used for acquiring an invoice image to be identified;
the identification module is used for extracting a field image from the invoice image based on a preset first algorithm and identifying the field image to obtain a field identification result;
the analysis module is used for detecting a two-dimensional code image from the invoice image based on a preset second algorithm and analyzing the two-dimensional code image based on a preset two-dimensional code analysis toolkit;
and the execution module is used for obtaining a two-dimensional code identification result when the two-dimensional code image is successfully analyzed, checking according to the two-dimensional code identification result and the field identification result, and obtaining the identification result of the invoice image according to the checking result.
9. A computer device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the invoice image recognition method according to any one of claims 1-7 when executing the computer program.
10. A computer storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the invoice image recognition method according to any one of claims 1-7.
CN202210241320.0A 2022-03-11 2022-03-11 Invoice image recognition method, device, equipment and storage medium Pending CN114611541A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210241320.0A CN114611541A (en) 2022-03-11 2022-03-11 Invoice image recognition method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210241320.0A CN114611541A (en) 2022-03-11 2022-03-11 Invoice image recognition method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114611541A true CN114611541A (en) 2022-06-10

Family

ID=81862250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210241320.0A Pending CN114611541A (en) 2022-03-11 2022-03-11 Invoice image recognition method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114611541A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140289107A1 (en) * 2011-11-10 2014-09-25 Gelliner Limited Invoice payment system and method
CN110647956A (en) * 2019-08-12 2020-01-03 深圳市华付信息技术有限公司 Invoice information extraction method combined with two-dimensional code recognition
CN112989990A (en) * 2021-03-09 2021-06-18 平安科技(深圳)有限公司 Medical bill identification method, device, equipment and storage medium
CN113657132A (en) * 2021-08-20 2021-11-16 平安科技(深圳)有限公司 Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition

Similar Documents

Publication Publication Date Title
CN110046529B (en) Two-dimensional code identification method, device and equipment
US11055524B2 (en) Data extraction pipeline
CN109658584B (en) Bill information identification method and device
EP2199945B1 (en) Biometric authentication device and method, computer-readable recording medium recorded with biometric authentication computer program, and computer system
US9171204B2 (en) Method of perspective correction for devanagari text
JP5591578B2 (en) Character string recognition apparatus and character string recognition method
US20200372248A1 (en) Certificate recognition method and apparatus, electronic device, and computer-readable storage medium
US7136526B2 (en) Character string recognition apparatus, character string recognizing method, and storage medium therefor
CN111144400A (en) Identification method and device for identity card information, terminal equipment and storage medium
CN110738236A (en) Image matching method and device, computer equipment and storage medium
CN114359553B (en) Signature positioning method and system based on Internet of things and storage medium
CN107622263A (en) The character identifying method and device of document image
CN111858977B (en) Bill information acquisition method, device, computer equipment and storage medium
RU2009124522A (en) INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD
CN111079480A (en) Identification method and device of identity card information and terminal equipment
CN112001200A (en) Identification code identification method, device, equipment, storage medium and system
CN108256608A (en) A kind of two dimensional image code and its recognition methods and equipment
CN112766275B (en) Seal character recognition method and device, computer equipment and storage medium
CN110909816B (en) Picture identification method and device
CN114611541A (en) Invoice image recognition method, device, equipment and storage medium
CN109934858B (en) Image registration method and device
CN112286780A (en) Method, device and equipment for testing recognition algorithm and storage medium
CN111081093A (en) Dictation content identification method and electronic equipment
CN111445616B (en) Invoice verification method and device, computer equipment and storage medium
CN108961531B (en) Method, device and equipment for identifying serial number of paper currency and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination