CN113657132A - Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition


Info

Publication number
CN113657132A
Authority
CN
China
Prior art keywords
image
dimensional code
invoice
preset
invoice image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110960480.6A
Other languages
Chinese (zh)
Inventor
余宪
黄琳钧
刘鹏
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110960480.6A
Publication of CN113657132A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404 Methods for optical code recognition
    • G06K7/1408 Methods for optical code recognition the method being specifically adapted for the type of code
    • G06K7/1417 2D bar codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404 Methods for optical code recognition
    • G06K7/146 Methods for optical code recognition the method including quality enhancement steps

Abstract

The invention relates to the field of artificial intelligence, and discloses an invoice image identification method based on two-dimensional code identification, which comprises the following steps: acquiring an invoice image; analyzing the invoice image based on a preset OCR algorithm engine to obtain an analysis result corresponding to the invoice image; analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package; when the two-dimensional code image is analyzed successfully, executing preset checking operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image to obtain an identification result of the invoice image; and when the two-dimensional code image is not successfully analyzed, executing a preset image transformation operation on the two-dimensional code image, and after the image transformation of the two-dimensional code image is completed, triggering and executing a step of analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package. Therefore, the two-dimensional code recognition success rate of the invoice image recognition method can be improved. The invention can be applied to digital medical systems.

Description

Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition
Technical Field
The invention relates to the field of artificial intelligence, in particular to an invoice image identification method and device based on two-dimensional code identification, computer equipment and a storage medium.
Background
At present, two-dimensional code recognition is mostly implemented on a video stream that contains the two-dimensional code. In practice, however, not every application scenario can directly acquire a complete video stream. For example, a system that recognizes invoice images, such as financial invoices or medical invoices, can only recognize the two-dimensional code on an invoice from the invoice image (a photograph or a scan) uploaded by the user. Invoice images such as photographs and scans differ from a video stream. In recognition based on a video stream, multiple frames of two-dimensional code images can be extracted from the stream, so the image quality of any individual frame does not greatly affect the recognition success rate for the stream as a whole. In recognition based on an invoice image, however, the number of invoice images uploaded by the user is usually limited (e.g., one or two), so the image quality of the invoice image greatly affects the recognition success rate of the two-dimensional code information in it; for example, blurring, deformation, or occlusion of the invoice image can greatly reduce that success rate. Therefore, the two-dimensional code recognition success rate of current invoice image recognition methods still has room for improvement.
Disclosure of Invention
The invention aims to solve the technical problem that the two-dimensional code recognition success rate of the current invoice image recognition method is low.
In order to solve the technical problem, the first aspect of the invention discloses an invoice image identification method based on two-dimensional code identification, which comprises the following steps:
acquiring an invoice image to be identified;
analyzing the invoice image based on a preset OCR algorithm engine to obtain an analysis result corresponding to the invoice image, wherein the analysis result comprises a two-dimensional code image extracted from the invoice image;
analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package;
when the two-dimensional code image is analyzed successfully, executing preset checking operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image to obtain an identification result of the invoice image;
and when the two-dimensional code image is not successfully analyzed, executing a preset image transformation operation on the two-dimensional code image, and after the image transformation of the two-dimensional code image is completed, triggering and executing the step of analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package.
The second aspect of the invention discloses an invoice image recognition device based on two-dimensional code recognition, which comprises:
the acquiring module is used for acquiring an invoice image to be identified;
the OCR analysis module is used for analyzing the invoice image based on a preset OCR algorithm engine to obtain an analysis result corresponding to the invoice image, wherein the analysis result comprises a two-dimensional code image extracted from the invoice image;
the two-dimensional code analysis module is used for analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package;
the determining module is used for executing preset checking operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image when the two-dimensional code image is analyzed successfully so as to obtain an identification result of the invoice image;
and the transformation module is used for executing a preset image transformation operation on the two-dimensional code image when the two-dimensional code image is not successfully analyzed, and triggering and executing the step of analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package after the image transformation of the two-dimensional code image is completed.
A third aspect of the present invention discloses a computer apparatus, comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor calls the executable program code stored in the memory to execute part or all of the steps of the invoice image identification method based on two-dimensional code identification disclosed by the first aspect of the invention.
The fourth aspect of the present invention discloses a computer storage medium, which stores computer instructions, and when the computer instructions are called, the computer instructions are used to execute part or all of the steps in the invoice image recognition method based on two-dimensional code recognition disclosed in the first aspect of the present invention.
In the embodiment of the invention, an invoice image to be identified is acquired, a two-dimensional code image in the invoice image is extracted based on an OCR algorithm engine, and the two-dimensional code image is analyzed based on a two-dimensional code analysis toolkit. If the two-dimensional code image is analyzed successfully, the identification result of the invoice image is determined according to the analysis information obtained. If it is not analyzed successfully, image transformation is performed on the two-dimensional code image, and the transformed image is analyzed again with the two-dimensional code analysis toolkit. In this way, each failed analysis yields a new two-dimensional code image to analyze, which simulates the effect of extracting multiple frames of two-dimensional code images in video-stream-based two-dimensional code recognition. The influence of image quality problems in the invoice image on the two-dimensional code recognition success rate can therefore be effectively reduced, and the two-dimensional code recognition success rate of the invoice image recognition method is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of an invoice image recognition method based on two-dimensional code recognition according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an invoice image recognition device based on two-dimensional code recognition according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a computer storage medium according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," and the like in the description and claims of the present invention and in the above-described drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to the listed steps or elements, but may optionally include other steps or elements that are not listed or that are inherent to such process, method, product, or device.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The invention discloses an invoice image recognition method, device, computer equipment and storage medium based on two-dimensional code recognition. An invoice image to be identified is acquired, a two-dimensional code image in the invoice image is extracted based on an OCR algorithm engine, and the two-dimensional code image is analyzed based on a two-dimensional code analysis toolkit. If the two-dimensional code image is analyzed successfully, the identification result of the invoice image is determined according to the analysis information obtained. If it is not analyzed successfully, image transformation is performed on the two-dimensional code image, and the transformed image is analyzed again with the two-dimensional code analysis toolkit. In this way, each failed analysis yields a new two-dimensional code image to analyze, which simulates the effect of extracting multiple frames of two-dimensional code images in video-stream-based two-dimensional code recognition. The influence of image quality problems in the invoice image on the two-dimensional code recognition success rate can therefore be effectively reduced, and the two-dimensional code recognition success rate of the invoice image recognition method is improved. Details are given below.
Example one
Referring to fig. 1, fig. 1 is a schematic flow chart of an invoice image recognition method based on two-dimensional code recognition according to an embodiment of the present invention. As shown in fig. 1, the invoice image recognition method based on two-dimensional code recognition may include the following operations:
101. Acquiring an invoice image to be identified.
In step 101, the invoice image to be identified may be uploaded to the invoice image identification system by the user. The invoice image may be a photograph or a scan of a real invoice, including but not limited to a medical invoice.
102. Analyzing the invoice image based on a preset OCR algorithm engine to obtain an analysis result corresponding to the invoice image, wherein the analysis result comprises a two-dimensional code image extracted from the invoice image.
In step 102, after the OCR algorithm engine analyzes the invoice image (e.g., area detection, text recognition, etc.), the key content information in the invoice image can be obtained (e.g., coordinate information of each field area in the invoice image, the recognition result of each field area, the confidence value of the recognition result of each field area, and coordinate information of the two-dimensional code image in the invoice image). The two-dimensional code image is then cropped from the invoice image according to the coordinate information obtained by the OCR algorithm engine, for subsequent two-dimensional code analysis. When the invoice image is a medical invoice image, the analysis result of the OCR algorithm engine may include information such as the two-dimensional code image extracted from the medical invoice image, the patient name on the medical invoice, and the patient's medical record on the medical invoice.
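The cropping described in step 102 can be illustrated with a minimal sketch. It assumes the image is a row-major list of grayscale pixel rows and that the OCR engine reports the two-dimensional code position as a `(left, top, right, bottom)` box with exclusive right and bottom edges; the name `crop_region` and both of these conventions are hypothetical and are not taken from the patent text.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale pixels, row-major (assumed representation)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive (assumed)


def crop_region(image: Image, box: Box) -> Image:
    """Cut the rectangle described by `box` out of `image`."""
    left, top, right, bottom = box
    height = len(image)
    width = len(image[0]) if height else 0
    # Clamp the box to the image so a slightly oversized OCR box still works.
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    if left >= right or top >= bottom:
        raise ValueError("bounding box does not overlap the image")
    return [row[left:right] for row in image[top:bottom]]


# Toy 4x6 "invoice" whose pixel value encodes its position (10 * row + column).
invoice = [[10 * r + c for c in range(6)] for r in range(4)]
qr_image = crop_region(invoice, (2, 1, 5, 3))
```

Clamping the box rather than rejecting it is a design choice of this sketch; a real system might instead pad the box by a margin so that the quiet zone around the code is preserved.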
103. Analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package.
In step 103, the two-dimensional code analysis toolkit may be the zxing toolkit, an open-source project from Google for recognizing barcodes in multiple formats, which can be used to analyze the two-dimensional code image. When the invoice image is a medical invoice image, the two-dimensional code image may record the patient's personal information, such as name, height, weight and age, as well as the patient's diagnosis at the visit; this information can be obtained by analyzing the two-dimensional code image with the zxing toolkit.
104. When the two-dimensional code image is analyzed successfully, executing a preset check operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image to obtain an identification result of the invoice image.
In step 104, when the zxing toolkit successfully analyzes the two-dimensional code image, the analysis information obtained may be used directly as the identification result of the invoice image. For example, if the analysis information obtained by analyzing the two-dimensional code image is "invoice amount: 100 yuan, billing company: XXX company", then "invoice amount: 100 yuan, billing company: XXX company" is used directly as the identification result of the invoice image. Alternatively, when the zxing toolkit successfully analyzes the two-dimensional code image, the analysis information obtained can be compared with the field information of the invoice image analyzed by the OCR algorithm engine to determine the final identification result of the invoice image, as described later. When the invoice image is a medical invoice image, after the identification result of the medical invoice image is obtained (such as the patient's personal information, e.g., name, height, weight and age, and the patient's diagnosis at the visit), the identification result can be uploaded to a medical digital platform and then applied to each process of medical digitization.
105. When the two-dimensional code image is not successfully analyzed, executing a preset image transformation operation on the two-dimensional code image, and after the image transformation of the two-dimensional code image is completed, triggering and executing the step of analyzing the two-dimensional code image based on a preset two-dimensional code analysis tool package.
In step 105, when the two-dimensional code image is not successfully parsed, a preset image transformation operation (e.g., an affine transformation, a perspective transformation, a rigid body transformation, etc.) may be performed on the two-dimensional code image to obtain a new two-dimensional code image, and the new image is then parsed with the zxing toolkit, which may succeed in parsing the two-dimensional code information from it. If the new two-dimensional code image is still not successfully parsed, image transformation can be performed again, and the zxing toolkit is again used to parse the newly transformed image. Until the two-dimensional code image is successfully parsed, this cycle of image transformation followed by parsing with the zxing toolkit can be repeated.
Because new two-dimensional code images are continuously generated and parsed until parsing succeeds (that is, several two-dimensional code images are in effect used to recognize one two-dimensional code), the effect of extracting multiple frames of two-dimensional code images in video-stream-based recognition is simulated. This effectively reduces the influence of image quality problems in the invoice image on the two-dimensional code recognition success rate and improves the two-dimensional code recognition success rate of the invoice image recognition method. The following table compares the recognition success rates of the zxing open source library and of the scheme of the embodiment of the invention for the two-dimensional codes in the invoice images, tested on 485 invoice images:
[Table not reproduced in this text; original image reference: Figure BDA0003221886640000071]
It can be seen that, by implementing the invoice image recognition method based on two-dimensional code recognition described in Fig. 1, an invoice image to be recognized is acquired, a two-dimensional code image in the invoice image is extracted based on an OCR algorithm engine, and the two-dimensional code image is parsed based on a two-dimensional code parsing toolkit. If the two-dimensional code image is successfully parsed, the recognition result of the invoice image is determined according to the parsed information. If it is not successfully parsed, image transformation is performed on the two-dimensional code image, and the transformed image is parsed again with the two-dimensional code parsing toolkit. In this way, each failed parse yields a new two-dimensional code image to parse, which simulates the effect of extracting multiple frames of two-dimensional code images in video-stream-based two-dimensional code recognition. The influence of image quality problems in the invoice image on the two-dimensional code recognition success rate can therefore be effectively reduced, and the two-dimensional code recognition success rate of the invoice image recognition method is improved.
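Step 105 names affine transformation as one possible image transformation operation. As a minimal illustration of what such an operation does, the sketch below applies a 2x3 affine matrix to an image using inverse mapping with nearest-neighbour sampling. The list-of-rows image representation, the function names `affine_transform` and `rotation_matrix`, and the white fill value are assumptions made for this example; the patent does not specify how the transformation is implemented, and a production system would more likely use an image library.

```python
import math
from typing import List, Sequence

Image = List[List[int]]  # grayscale pixels, row-major (assumed representation)


def affine_transform(image: Image, matrix: Sequence[Sequence[float]],
                     fill: int = 255) -> Image:
    """Map each source pixel (x, y) to (a*x + b*y + tx, c*x + d*y + ty).

    Implemented by inverse mapping: for every destination pixel, look up the
    nearest source pixel; destinations with no source get `fill`.
    """
    (a, b, tx), (c, d, ty) = matrix
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("affine matrix is not invertible")
    height = len(image)
    width = len(image[0]) if height else 0
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = x - tx, y - ty
            # Inverse of [[a, b], [c, d]] applied to (dx, dy).
            sx = round((d * dx - b * dy) / det)
            sy = round((-c * dx + a * dy) / det)
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out


def rotation_matrix(width: int, height: int, degrees: float):
    """Affine matrix that rotates about the image centre by `degrees`."""
    theta = math.radians(degrees)
    a, b = math.cos(theta), -math.sin(theta)
    c, d = math.sin(theta), math.cos(theta)
    cx, cy = (width - 1) / 2, (height - 1) / 2
    return [[a, b, cx - a * cx - b * cy], [c, d, cy - c * cx - d * cy]]


sample = [[1, 2, 3], [4, 5, 6]]
shifted = affine_transform(sample, [[1, 0, 1], [0, 1, 0]])  # shift right by one pixel
slightly_rotated = affine_transform(sample, rotation_matrix(3, 2, 5))
```

Small rotations, shears, or scale changes produced this way give the parsing toolkit a differently sampled view of the same code, which is the role the transformed images play in step 105.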
In an optional embodiment, after the two-dimensional code image is not successfully parsed and before the performing a preset image transformation operation on the two-dimensional code image, the method further includes:
judging whether the number of image transformation operations performed on the two-dimensional code image is greater than a preset number threshold;
when the number of image transformation operations performed on the two-dimensional code image is judged to be greater than the number threshold, outputting error information to a user, wherein the error information is used for prompting the user that the invoice image identification has failed;
and when the number of image transformation operations performed on the two-dimensional code image is judged to be not greater than the number threshold, triggering and executing the step of executing the preset image transformation operation on the two-dimensional code image.
In this optional embodiment, after the two-dimensional code image fails to be analyzed, performing image transformation on it and analyzing the new two-dimensional code image with the zxing toolkit may succeed in obtaining the analysis information. However, if the image quality of the original two-dimensional code image is too poor, analysis may still fail after many rounds of image transformation and analysis. If the cycle of image transformation and analysis simply continues in that case, it may become an endless loop, or the overall analysis time for the two-dimensional code image may become too long. Therefore, to ensure the identification efficiency of the invoice image, the number of image transformation operations performed on the two-dimensional code image can be recorded during the cycle. If that number is greater than a preset number threshold (for example, 10), the image quality of the original two-dimensional code image is considered too poor; the cycle is stopped and error information is output to the user, prompting that the invoice image identification has failed so that the user can handle it manually. If that number is not greater than the number threshold, the cycle of image transformation and analysis continues. This prevents the identification process of the whole invoice image from falling into an endless loop or taking too long, which helps ensure the identification efficiency of the invoice image.
It can be seen that, by implementing this optional embodiment, after the two-dimensional code image fails to be analyzed, it is determined whether the number of image transformation operations performed on the two-dimensional code image is greater than a preset number threshold. If it is greater than the number threshold, error information is output to the user; if it is not greater than the number threshold, the cycle of image transformation and analysis continues. This prevents the identification process of the whole invoice image from falling into an endless loop or taking too long, and helps ensure the identification efficiency of the invoice image.
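The bounded cycle of image transformation and analysis can be sketched as follows. The `decode` and `transform` callables are hypothetical stand-ins for the two-dimensional code analysis toolkit and the preset image transformation operation, and `QRDecodeError` and `decode_with_retries` are names invented for this example. The sketch also assumes that each transformation is applied to the most recent image; the text does not say whether transformations accumulate or restart from the original image.

```python
from typing import Callable, Optional, TypeVar

Image = TypeVar("Image")


class QRDecodeError(Exception):
    """Raised when the code cannot be parsed within the allowed transformations."""


def decode_with_retries(image: Image,
                        decode: Callable[[Image], Optional[str]],
                        transform: Callable[[Image, int], Image],
                        max_transforms: int = 10) -> str:
    """Parse `image`, transforming and retrying while parsing fails.

    `decode` returns the parsed payload or None on failure (assumed contract).
    `transform` receives the current image and the number of transformations
    performed so far, and returns a new image (assumed contract).
    """
    transforms_done = 0
    current = image
    while True:
        payload = decode(current)
        if payload is not None:
            return payload
        # Mirror the wording of the embodiment: stop only when the count is
        # strictly greater than the threshold, otherwise transform again.
        if transforms_done > max_transforms:
            raise QRDecodeError("invoice image identification failed")
        current = transform(current, transforms_done)
        transforms_done += 1


# Toy stand-ins: the "image" is an integer and parsing succeeds once it reaches 3.
result = decode_with_retries(0,
                             decode=lambda im: "payload" if im == 3 else None,
                             transform=lambda im, n: im + 1)
```

Because the comparison is "greater than", a threshold of `max_transforms` allows `max_transforms + 1` transformations before the error is raised; an implementation that wants exactly `max_transforms` attempts would compare with `>=` instead.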
In an optional embodiment, the analysis result corresponding to the invoice image further comprises field information identified from a field area of the invoice image and a confidence value corresponding to the field information;
and executing a preset check operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image to obtain an identification result of the invoice image, wherein the check operation comprises the following steps:
and executing preset checking operation on the basis of the field information, the confidence value corresponding to the field information and the analysis information obtained by analyzing the two-dimensional code image so as to obtain the identification result of the invoice image.
In this optional embodiment, the invoice image typically includes the following five more important fields: bill code, bill number, check code, billing date and total amount. Different field information is recorded in different areas (i.e., field areas) of the invoice image, and all of the field information can be obtained by analyzing the invoice image with the OCR algorithm engine. Because the OCR algorithm cannot guarantee high recognition accuracy, in practical applications the OCR algorithm engine also outputs a confidence value for each piece of field information, representing how accurate the OCR result for that field is. Finally, the identification result of the invoice image is determined by combining the analysis information of the two-dimensional code image, the field information of the invoice image, and the confidence values of the field information (the specific determination process is described in detail later), so that the identification accuracy of the invoice image can be improved.
Therefore, by implementing the optional embodiment, the analysis result of the OCR algorithm engine on the invoice image further includes the field information identified from the field area of the invoice image and the confidence value corresponding to the field information, and then the identification result of the invoice image is determined according to the confidence value corresponding to the field information, the field information and the analysis information obtained by analyzing the two-dimensional code image, so that the identification accuracy of the invoice image can be improved.
In an optional embodiment, the performing a preset verification operation based on the field information, the confidence value corresponding to the field information, and the parsed information obtained by parsing the two-dimensional code image to obtain the identification result of the invoice image includes:
judging whether the confidence value corresponding to the field information is larger than a confidence threshold value;
when the confidence value corresponding to the field information is judged to be larger than the confidence threshold value, taking the field information as the identification result of the invoice image;
when the confidence value corresponding to the field information is judged to be not larger than the confidence threshold value, judging whether the field information is matched with analysis information obtained by analyzing the two-dimensional code image;
and when the field information is judged to be matched with the analysis information, taking the field information as the identification result of the invoice image, and setting the confidence value corresponding to the field information as a preset target confidence value, wherein the target confidence value is greater than the confidence threshold value.
In this optional embodiment, the final recognition result of the invoice image is obtained by combining the OCR recognition result of the field area of the invoice image with the recognition result of the two-dimensional code image in the invoice image, and checking these two results against each other improves the accuracy of the final recognition result. Specifically, after the OCR algorithm engine has analyzed the field information in the invoice image and the confidence value corresponding to the field information, the confidence value is compared with a preset confidence threshold. If the confidence value is greater than the threshold (i.e., the OCR algorithm recognizes the field information with high accuracy), the field information may be used directly as the recognition result of the invoice image. If the confidence value is not greater than the threshold (i.e., the OCR algorithm recognizes the field information with low accuracy), it is determined whether the field information analyzed by the OCR algorithm engine matches the analysis information previously obtained from the two-dimensional code image (e.g., whether fields such as the bill code, the bill number, the check code, the billing date and the total amount are the same in both results). If they match, then although the confidence of the OCR result is low, the field information is identical to the analysis information obtained from the two-dimensional code image, so it can also be used directly as the recognition result of the invoice image. In this way, the final recognition result is obtained from both the OCR recognition result of the field area and the recognition result of the two-dimensional code image, and the identification accuracy of the invoice image is improved.
In addition, when the field information analyzed by the OCR algorithm engine matches the analysis information obtained by analyzing the two-dimensional code image, the confidence value corresponding to the field information is set to a preset target confidence value, which improves the verification pass rate of the recognition result. In existing invoice image recognition schemes, usually only an OCR algorithm engine is used to recognize the field information in the field area of the invoice image, and the field information is then verified by means of the confidence value produced by the OCR algorithm. Specifically, if the confidence value of the field information is greater than a preset threshold (the recognition result of the OCR algorithm is reliable), the verification passes; if the confidence value is not greater than the preset threshold (the recognition result of the OCR algorithm is unreliable), the verification fails and the corresponding invoice image is pushed to manual processing. In the embodiment of the invention, the OCR recognition result of the field area and the recognition result of the two-dimensional code image are combined for verification. When the confidence of the OCR recognition result is low, the OCR recognition result is compared with the recognition result of the two-dimensional code image; if the two are consistent, the OCR recognition result is still used as the recognition result of the invoice image and its confidence value is reset to the target confidence value, so that it passes the verification. The verification pass rate of the recognition result of the invoice image is thereby improved together with the recognition accuracy.
The following table compares the verification pass rate before and after the two-dimensional code check was introduced, as measured in a test on 485 invoice images:
[Table reproduced only as an image in the original publication: Figure BDA0003221886640000101]
It can be seen that, by implementing this optional embodiment, the field information is used directly as the identification result of the invoice image when its confidence value is greater than the confidence threshold. When the confidence value is not greater than the confidence threshold, it is judged whether the field information matches the analysis information obtained by analyzing the two-dimensional code image; when it does, the field information is used as the identification result of the invoice image and its confidence value is set to the target confidence value. The final identification result is thus obtained by combining the OCR identification result of the field area with the identification result of the two-dimensional code image, which improves the identification accuracy of the invoice image. Because the two results are also combined for verification, the verification pass rate of the identification result is improved, the automation rate of invoice image identification is increased, and manpower and material resources are saved.
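The checking operation described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `FieldResult` and `check_field`, the field keys, and the numeric values chosen for the confidence threshold and the target confidence value are all assumptions, since the text only states that both values are preset and that the target exceeds the threshold.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed values: the text only says these are "preset" and that
# the target confidence value is greater than the confidence threshold.
CONFIDENCE_THRESHOLD = 0.90
TARGET_CONFIDENCE = 0.99


@dataclass
class FieldResult:
    """One field recognized by the OCR engine (hypothetical structure)."""
    name: str
    value: str
    confidence: float


def check_field(field: FieldResult, qr_info: Dict[str, str]) -> Optional[FieldResult]:
    """Return the accepted field, or None when OCR and QR code disagree."""
    # High-confidence OCR output is accepted unchanged.
    if field.confidence > CONFIDENCE_THRESHOLD:
        return field
    # Low confidence: compare against the information decoded from the QR code.
    if qr_info.get(field.name) == field.value:
        # Consistent with the QR code, so raise the confidence to the target value.
        return FieldResult(field.name, field.value, TARGET_CONFIDENCE)
    return None


qr_info = {"bill_number": "12345678", "total_amount": "100.00"}
accepted = check_field(FieldResult("bill_number", "12345678", 0.62), qr_info)
rejected = check_field(FieldResult("total_amount", "108.00", 0.55), qr_info)
```

In this sketch `accepted` carries the target confidence value because the low-confidence OCR value agrees with the two-dimensional code, while `rejected` is `None` because the two sources disagree.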
In an optional embodiment, the method further comprises:
and when the field information is judged not to be matched with the analysis information, outputting error information to a user, and adding the invoice image into a preset error invoice image set, wherein the error invoice image set is used for optimizing the OCR algorithm engine, and the error information is used for prompting the user that the invoice image identification fails.
In this optional embodiment, when the field information and the analysis information do not match (that is, the OCR recognition result of the field area of the invoice image does not match the recognition result of the two-dimensional code image in the invoice image), the recognition result of the invoice image is not reliable. Error information may then be output to the user to indicate that recognition of the invoice image has failed, and the invoice image whose recognition failed is added to the error invoice image set. The error invoice image set may subsequently be used to optimize the OCR algorithm engine, which improves the recognition performance of the OCR algorithm engine on invoice images and thus the accuracy of the final recognition result.
Therefore, by implementing the optional embodiment, after the field information is judged to be not matched with the analysis information, the error information is output to the user, the invoice image is added into the preset error invoice image set, and then the OCR algorithm engine is optimized by using the error invoice image set, so that the recognition performance of the OCR algorithm engine on the invoice image can be improved, and the accuracy of the final recognition result of the invoice image is improved.
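The mismatch handling can be illustrated with a short sketch. It assumes that the fields passed in are those whose confidence value did not exceed the threshold, and it models the error invoice image set as a plain in-memory list; the function name `handle_low_confidence_fields`, the image identifiers and the message wording are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the preset "error invoice image set";
# a real system would more likely persist these images for retraining.
error_invoice_images: List[str] = []


def handle_low_confidence_fields(
    image_id: str,
    ocr_fields: Dict[str, str],
    qr_info: Dict[str, str],
) -> Tuple[bool, str]:
    """Compare low-confidence OCR fields with the decoded QR code information."""
    mismatched = sorted(
        name for name, value in ocr_fields.items() if qr_info.get(name) != value
    )
    if mismatched:
        # Keep the failing image so it can later be used to optimize the OCR engine.
        error_invoice_images.append(image_id)
        return False, "Invoice image recognition failed: " + ", ".join(mismatched)
    return True, "ok"
```

A caller would show the returned message to the user when the first element is `False`, and periodically hand the collected images to whatever process retrains or tunes the OCR engine.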
In an optional embodiment, the two-dimensional code parsing kit is a zxing kit supporting two-dimensional code image recognition.
In this optional embodiment, the zxing toolkit is an open-source project, stated to have been released by Google in November 2019, for recognizing barcodes in various formats; it supports many barcode types and a variety of mobile devices. Because this scheme only needs to recognize two-dimensional code images, the configuration parameters of the zxing toolkit can be simplified so that it only supports recognition of two-dimensional code images. This simplifies the functions of the zxing toolkit, improves its efficiency in analyzing two-dimensional code images, and thus improves the recognition efficiency of the invoice image.
Therefore, by implementing the optional embodiment, the two-dimensional code analysis toolkit is configured into a zxing toolkit supporting two-dimensional code image identification, so that the function of the two-dimensional code analysis toolkit can be simplified, the analysis efficiency of the two-dimensional code analysis toolkit on the two-dimensional code image is improved, and the identification efficiency of the invoice image is improved.
In an alternative embodiment, the image transformation operation is one of an image affine transformation operation, an image perspective transformation operation, an image rigid body transformation operation, and an image similarity transformation operation.
In this alternative embodiment, a new two-dimensional code image can be obtained by performing one of transformations such as image affine transformation, image perspective transformation, image rigid body transformation, and image similarity transformation on the two-dimensional code image, and then performing subsequent analysis. Alternatively, the two-dimensional code image may be subjected to image affine transformation, image perspective transformation, image rigid body transformation, and image similarity transformation sequentially in this order, or may be converted by randomly selecting one image transformation method each time the two-dimensional code image is transformed.
Therefore, by implementing the optional embodiment, when the image transformation of the two-dimensional code image is performed, the two-dimensional code image is subjected to image affine transformation, image perspective transformation, image rigid body transformation or image similarity transformation, so that a new two-dimensional code image can be obtained for subsequent analysis, the effect of extracting a multi-frame two-dimensional code image for performing two-dimensional code identification in the two-dimensional code identification based on video stream can be simulated, the influence of the image quality problem of the invoice image on the success rate of the two-dimensional code identification is effectively reduced, and the success rate of the two-dimensional code identification of the invoice image identification method is improved.
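The four transformation types differ in how many degrees of freedom their 3x3 homogeneous matrices have. The sketch below, written in plain Python without an image library, builds one example matrix of each type and shows the two selection strategies mentioned above (fixed order or random choice). The parameter values, the helper names and the list `TRANSFORMS` are illustrative assumptions; a real implementation would warp the image pixels with these matrices rather than individual points.

```python
import math
import random
from typing import List, Optional, Tuple

Matrix = List[List[float]]
Point = Tuple[float, float]


def rigid(theta: float, tx: float, ty: float) -> Matrix:
    """Rotation plus translation; distances are preserved."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]


def similarity(scale: float, theta: float, tx: float, ty: float) -> Matrix:
    """Rigid transform with an additional uniform scale."""
    c, s = scale * math.cos(theta), scale * math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]


def affine(a: float, b: float, c: float, d: float, tx: float, ty: float) -> Matrix:
    """General linear map plus translation; parallel lines stay parallel."""
    return [[a, b, tx], [c, d, ty], [0.0, 0.0, 1.0]]


def apply(m: Matrix, p: Point) -> Point:
    """Apply a 3x3 homogeneous transform to a 2D point."""
    x, y = p
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (
        (m[0][0] * x + m[0][1] * y + m[0][2]) / w,
        (m[1][0] * x + m[1][1] * y + m[1][2]) / w,
    )


# Example matrices in the order listed in the text (values are arbitrary).
TRANSFORMS: List[Tuple[str, Matrix]] = [
    ("affine", affine(1.0, 0.1, 0.0, 1.0, 0.0, 0.0)),
    ("perspective", [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.001, 0.0, 1.0]]),
    ("rigid", rigid(math.radians(5.0), 2.0, 0.0)),
    ("similarity", similarity(1.1, 0.0, 0.0, 0.0)),
]


def pick_transform(attempt: int, rng: Optional[random.Random] = None) -> Tuple[str, Matrix]:
    """Pick sequentially by attempt number, or randomly when an rng is given."""
    if rng is None:
        return TRANSFORMS[attempt % len(TRANSFORMS)]
    return rng.choice(TRANSFORMS)
```

Only the perspective matrix has a non-trivial last row, which is why `apply` divides by `w`; for the other three types `w` is always 1.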
Optionally, the invoice image identification information produced by the invoice image identification method based on two-dimensional code identification may also be uploaded to a blockchain.
Specifically, the invoice image identification information is obtained by running the invoice image identification method based on two-dimensional code identification, and records the identification process, for example the acquired invoice image, the extracted two-dimensional code image, and the analysis information obtained from the two-dimensional code image. Uploading this information to the blockchain ensures its security as well as fairness and transparency for the user. The user can download the invoice image identification information from the blockchain in order to verify whether it has been tampered with. The blockchain referred to in this example is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods, in which each data block contains information on a batch of network transactions that is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
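The tamper-evidence property relied on here can be illustrated with a simplified hash chain. This sketch is an assumption-laden illustration only: it runs in a single process, has no consensus mechanism or network, and the record fields and helper names (`block_hash`, `append_block`, `verify_chain`) are hypothetical, so it is not a blockchain platform in the sense described above.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS_HASH = "0" * 64


def block_hash(index: int, prev_hash: str, record: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "record": record}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict[str, Any]], record: Dict[str, Any]) -> None:
    """Append a record, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev": prev_hash,
            "record": record,
            "hash": block_hash(index, prev_hash, record),
        }
    )


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["record"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block stores the hash of its predecessor, changing a stored recognition record after the fact causes `verify_chain` to return `False`, which is the kind of check a user could run after downloading the information.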
Example two
Referring to fig. 2, fig. 2 is a schematic structural diagram of an invoice image recognition device based on two-dimensional code recognition according to an embodiment of the present invention. As shown in fig. 2, the invoice image recognition device based on two-dimensional code recognition may include:
an obtaining module 201, configured to obtain an invoice image to be identified;
the OCR analysis module 202 is configured to analyze the invoice image based on a preset OCR algorithm engine to obtain an analysis result corresponding to the invoice image, where the analysis result includes a two-dimensional code image extracted from the invoice image;
the two-dimensional code analysis module 203 is used for analyzing the two-dimensional code image based on a preset two-dimensional code analysis toolkit;
the determining module 204 is configured to, when the two-dimensional code image is successfully analyzed, perform a preset checking operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image to obtain an identification result of the invoice image;
the transformation module 205 is configured to, when the two-dimensional code image is not successfully analyzed, perform a preset image transformation operation on the two-dimensional code image, and after the image transformation of the two-dimensional code image is completed, trigger execution of the step of analyzing the two-dimensional code image based on a preset two-dimensional code analysis toolkit.
For the specific description of the invoice image recognition device based on the two-dimensional code recognition, reference may be made to the specific description of the invoice image recognition method based on the two-dimensional code recognition, and for avoiding repetition, the detailed description is omitted here.
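As a rough sketch of how the five modules could cooperate, the class below wires them together as injected callables, so that the OCR engine, the two-dimensional code decoder, the image transformation and the checking operation remain placeholders. The class and parameter names, the `"qr_image"` key and the default threshold are assumptions; the retry limit mirrors the "greater than a preset number threshold" wording of the method.

```python
from typing import Any, Callable, Dict, Optional


class RecognitionError(Exception):
    """Raised when the two-dimensional code cannot be decoded within the retry budget."""


class InvoiceRecognizer:
    def __init__(
        self,
        ocr_parse: Callable[[Any], Dict[str, Any]],               # OCR analysis module 202
        qr_decode: Callable[[Any], Optional[Dict[str, str]]],     # code analysis module 203
        transform: Callable[[Any, int], Any],                     # transformation module 205
        check: Callable[[Dict[str, str], Dict[str, Any]], Any],   # determining module 204
        max_transforms: int = 3,                                  # assumed threshold value
    ) -> None:
        self.ocr_parse = ocr_parse
        self.qr_decode = qr_decode
        self.transform = transform
        self.check = check
        self.max_transforms = max_transforms

    def recognize(self, invoice_image: Any) -> Any:
        """Run the pipeline on an image supplied by the obtaining module 201."""
        analysis = self.ocr_parse(invoice_image)
        qr_image = analysis["qr_image"]
        transforms_done = 0
        while True:
            qr_info = self.qr_decode(qr_image)
            if qr_info is not None:
                # Decoding succeeded: run the preset checking operation.
                return self.check(qr_info, analysis)
            # Decoding failed: stop once the transform count exceeds the threshold.
            if transforms_done > self.max_transforms:
                raise RecognitionError("invoice image recognition failed")
            qr_image = self.transform(qr_image, transforms_done)
            transforms_done += 1


# Toy stand-ins: the "image" is a string and decoding succeeds after two transforms.
recognizer = InvoiceRecognizer(
    ocr_parse=lambda img: {"qr_image": "qr", "fields": {"bill_number": "12345678"}},
    qr_decode=lambda q: {"bill_number": "12345678"} if q.count("'") >= 2 else None,
    transform=lambda q, n: q + "'",
    check=lambda info, analysis: info == analysis["fields"],
)
result = recognizer.recognize("invoice.png")
```

Passing the modules in as callables keeps the control flow testable without a real OCR engine or decoder, at the cost of hiding the concrete interfaces that an actual device would define.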
Example three
Referring to fig. 3, fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention. As shown in fig. 3, the computer apparatus may include:
a memory 301 storing executable program code;
a processor 302 connected to the memory 301;
the processor 302 calls the executable program code stored in the memory 301 to execute the steps of the invoice image recognition method based on two-dimensional code recognition disclosed in the embodiment of the invention.
Example four
Referring to fig. 4, an embodiment of the present invention discloses a computer storage medium 401. The computer storage medium 401 stores computer instructions which, when called, are used to execute the steps of the invoice image recognition method based on two-dimensional code recognition.
The above-described embodiments of the apparatus are merely illustrative, and the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above detailed description of the embodiments, those skilled in the art will clearly understand that the embodiments may be implemented by software plus a necessary general hardware platform, and may also be implemented by hardware. Based on such understanding, the above technical solutions may be embodied in the form of a software product, which may be stored in a computer-readable storage medium. The storage medium includes a Read-Only Memory (ROM), a Random Access Memory (RAM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-time Programmable Read-Only Memory (OTPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, a magnetic disk memory, a tape memory, or any other computer-readable medium that can be used to carry or store data.
Finally, it should be noted that the invoice image recognition method, device, computer equipment and storage medium based on two-dimensional code recognition disclosed in the embodiments of the present invention are only preferred embodiments of the present invention, and are only used to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An invoice image identification method based on two-dimensional code identification is characterized by comprising the following steps:
acquiring an invoice image to be identified;
analyzing the invoice image based on a preset OCR algorithm engine to obtain an analysis result corresponding to the invoice image, wherein the analysis result comprises a two-dimensional code image extracted from the invoice image;
analyzing the two-dimensional code image based on a preset two-dimensional code analysis toolkit;
when the two-dimensional code image is analyzed successfully, executing a preset checking operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image to obtain an identification result of the invoice image;
and when the two-dimensional code image is not successfully analyzed, executing a preset image transformation operation on the two-dimensional code image, and after the image transformation of the two-dimensional code image is completed, triggering execution of the step of analyzing the two-dimensional code image based on the preset two-dimensional code analysis toolkit.
2. The invoice image recognition method based on two-dimensional code recognition of claim 1, wherein after the two-dimensional code image is not successfully parsed and before the preset image transformation operation is performed on the two-dimensional code image, the method further comprises:
judging whether the number of times of image transformation operation executed by the two-dimensional code image is greater than a preset number threshold;
when the number of times of image transformation operation executed by the two-dimensional code image is judged to be larger than the number threshold, outputting error information to a user, wherein the error information is used for prompting the user that the invoice image identification fails;
and triggering and executing the step of executing the preset image transformation operation on the two-dimensional code image when judging that the number of times of the image transformation operation executed on the two-dimensional code image is not greater than the number threshold.
3. The invoice image recognition method based on two-dimensional code recognition is characterized in that the analysis result corresponding to the invoice image further comprises field information recognized from a field area of the invoice image and a confidence value corresponding to the field information;
and the executing a preset checking operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image to obtain an identification result of the invoice image comprises:
and executing preset checking operation on the basis of the field information, the confidence value corresponding to the field information and the analysis information obtained by analyzing the two-dimensional code image so as to obtain the identification result of the invoice image.
4. The invoice image recognition method based on two-dimensional code recognition of claim 3, wherein the executing a preset checking operation based on the field information, the confidence value corresponding to the field information and the analysis information obtained by analyzing the two-dimensional code image to obtain the identification result of the invoice image comprises:
judging whether the confidence value corresponding to the field information is larger than a confidence threshold value;
when the confidence value corresponding to the field information is judged to be larger than the confidence threshold value, taking the field information as the identification result of the invoice image;
when the confidence value corresponding to the field information is judged to be not larger than the confidence threshold value, judging whether the field information is matched with analysis information obtained by analyzing the two-dimensional code image;
and when the field information is judged to be matched with the analysis information, taking the field information as the identification result of the invoice image, and setting the confidence value corresponding to the field information as a preset target confidence value, wherein the target confidence value is greater than the confidence threshold value.
5. The invoice image recognition method based on two-dimensional code recognition of claim 4, wherein the method further comprises:
and when the field information is judged not to be matched with the analysis information, outputting error information to a user, and adding the invoice image into a preset error invoice image set, wherein the error invoice image set is used for optimizing the OCR algorithm engine, and the error information is used for prompting the user that the invoice image identification fails.
6. The invoice image recognition method based on two-dimensional code recognition of any one of claims 1-5, wherein the two-dimensional code parsing toolkit is a zxing toolkit supporting two-dimensional code image recognition.
7. The invoice image identification method based on two-dimensional code identification according to any one of claims 1-5, characterized in that the image transformation operation is one of an image affine transformation operation, an image perspective transformation operation, an image rigid body transformation operation and an image similarity transformation operation.
8. An invoice image recognition device based on two-dimensional code recognition, characterized in that the device comprises:
the acquiring module is used for acquiring an invoice image to be identified;
the OCR analysis module is used for analyzing the invoice image based on a preset OCR algorithm engine to obtain an analysis result corresponding to the invoice image, wherein the analysis result comprises a two-dimensional code image extracted from the invoice image;
the two-dimensional code analysis module is used for analyzing the two-dimensional code image based on a preset two-dimensional code analysis toolkit;
the determining module is used for executing a preset checking operation based on analysis information obtained by analyzing the two-dimensional code image and an analysis result corresponding to the invoice image when the two-dimensional code image is analyzed successfully, so as to obtain an identification result of the invoice image;
and the transformation module is used for executing a preset image transformation operation on the two-dimensional code image when the two-dimensional code image is not successfully analyzed, and triggering execution of the step of analyzing the two-dimensional code image based on the preset two-dimensional code analysis toolkit after the image transformation of the two-dimensional code image is completed.
9. A computer device, characterized in that the computer device comprises:
a memory storing executable program code;
a processor coupled to the memory;
the processor calls the executable program code stored in the memory to execute the invoice image recognition method based on two-dimensional code recognition according to any one of claims 1-7.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the invoice image recognition method based on two-dimensional code recognition according to any one of claims 1-7.
CN202110960480.6A 2021-08-20 2021-08-20 Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition Pending CN113657132A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110960480.6A CN113657132A (en) 2021-08-20 2021-08-20 Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110960480.6A CN113657132A (en) 2021-08-20 2021-08-20 Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition

Publications (1)

Publication Number Publication Date
CN113657132A true CN113657132A (en) 2021-11-16

Family

ID=78480531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110960480.6A Pending CN113657132A (en) 2021-08-20 2021-08-20 Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition

Country Status (1)

Country Link
CN (1) CN113657132A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114611541A (en) * 2022-03-11 2022-06-10 平安科技(深圳)有限公司 Invoice image recognition method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107239666B (en) Method and system for desensitizing medical image data
WO2020215573A1 (en) Captcha identification method and apparatus, and computer device and storage medium
WO2021068616A1 (en) Method and device for identity authentication, computer device, and storage medium
CN110795714A (en) Identity authentication method and device, computer equipment and storage medium
CN113377667A (en) Scene-based testing method and device, computer equipment and storage medium
CN113657132A (en) Invoice image recognition method, device, equipment and medium based on two-dimensional code recognition
CN110532543A (en) Analysis and processing method, device, computer equipment and the storage medium of evidence material
CN113918467A (en) Financial system testing method, device, equipment and storage medium
CN113724163A (en) Image correction method, device, equipment and medium based on neural network
US8363885B2 (en) Method, device, and program for embedding, displaying, and recognizing data
CN113705468A (en) Digital image identification method based on artificial intelligence and related equipment
CN115314268B (en) Malicious encryption traffic detection method and system based on traffic fingerprint and behavior
CN113627576B (en) Code scanning information detection method, device, equipment and storage medium
Savchenko et al. Deedp: vulnerability detection and patching based on deep learning
CN112967216A (en) Method, device and equipment for detecting key points of face image and storage medium
CN113158988A (en) Financial statement processing method and device and computer readable storage medium
CN113051561A (en) Application program feature extraction method and device and classification method and device
CN113726576B (en) Method, device, equipment and storage medium for constructing network adaptation framework
CN113408555A (en) Image registration method and device and electronic equipment
CN117539452B (en) Face recognition method and device and electronic equipment
CN112363705B (en) System package generation method, device, computer equipment and storage medium
US20230014400A1 (en) Device, system and method for verified self-diagnosis
CN112989114B (en) Video information generation method and device applied to video screening
CN117251725A (en) Method and device for identifying data based on machine learning
CN109034212B (en) Terminal biological identification performance testing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination