CN111209792B

CN111209792B - Image recognition method and device for value-added tax common invoice

Info

Publication number: CN111209792B
Application number: CN201911218244.6A
Authority: CN
Inventors: 肖文星; 陈军民; 张涛; 李燕; 杜丽丽
Original assignee: Henan Institute of Science and Technology
Current assignee: Henan Institute of Science and Technology
Priority date: 2019-12-02
Filing date: 2019-12-02
Publication date: 2023-08-01
Anticipated expiration: 2039-12-02
Also published as: CN111209792A

Abstract

To address the problem that recognition of value-added tax common invoices is currently inaccurate, in particular that characters on the invoice are easily disturbed by noise such as seals, which makes the characters harder to recognize, a method and a device for recognizing value-added tax invoices are provided.

Description

Image recognition method and device for value-added tax common invoice

Technical Field

The invention relates to the technical field of electronic processing of financial information, and in particular to an image recognition method and device for a value-added tax common invoice.

Background

In recent years, with the rapid development of the Chinese economy, the types and number of bills have risen year by year, and the value-added tax common invoice is one of them. Many enterprises and institutions need reimbursement after routine purchasing, and their finance staff must process large numbers of invoices by hand, which consumes considerable manpower and material resources and is inefficient. Automatic recognition of bills can substantially improve the efficiency of finance departments. However, if the automated process has a low effective recognition rate, it not only introduces business risk but also adds to the workload of subsequent manual processing, so improving the accuracy of automated bill processing is necessary.

In existing automatic recognition of invoice images, one approach customizes a form template: recognition areas and recognition attributes are set, special characters are called, option areas are recognized, post-processing is applied according to the recognition attributes, and a structured recognition result is finally output. Another approach is based on the bloom TH-OCR technology: the invoice undergoes several preprocessing operations, including deskewing, color-cast correction, color filtering, noise reduction, binarization and contrast enhancement of the recognition unit, which can be flexibly configured and freely combined to output the best image quality for later recognition.

However, recognition of the value-added tax common invoice still has many problems. Once an invoice is recognized inaccurately, subsequent financial processing is affected, including entry and approval of the reimbursement amount. In particular, characters on the invoice are easily disturbed by noise such as seals, which makes them harder to recognize.

Disclosure of Invention

In view of the above technical problems, the invention provides a method and a device for recognizing value-added tax invoices. The invention mainly applies grayscale conversion, denoising, character-region segmentation and character segmentation to the input image and then recognizes the result with a template feature matching algorithm, thereby effectively recognizing the Chinese character module in the value-added tax invoice, effectively removing noise caused by seals and the like, and improving the ability to distinguish characters of similar shape. The technical scheme is as follows:

According to a first aspect of an embodiment of the present disclosure, there is provided a method for recognizing a value-added tax invoice, including:

S1, acquiring an image of a value-added tax common invoice to obtain an original 24-bit color image of the invoice, extracting the R component of the original color image and using it as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255;

S2, performing regularized denoising on the gray image to reduce noise points and obtain a denoised gray image, and then binarizing the denoised gray image by adaptive threshold segmentation to obtain an adaptive-threshold binarized image of the value-added tax common invoice;

S3, coarsely locating the areas of the taxpayer identification number, the invoice code, the invoice number, the billing date and the amount according to prior position information for these fields, precisely locating the areas by the horizontal projection and vertical crossing number body distance method, and applying character segmentation and normalization to the precisely located areas to obtain the taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized;

S4, recognizing the buyer's taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized by using a template feature matching algorithm, and obtaining a recognition result.
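As a rough illustration of steps S1 and S2 (without the denoising stage), the sketch below takes the R component of a 24-bit RGB image as the gray image and binarizes it with a local-mean adaptive threshold. The function names, the window `radius` and the `offset` are assumptions made for this example and are not taken from the patent; the patent does not state which adaptive threshold rule it uses. One plausible benefit of the R component is that red seal ink has a high red value and so appears nearly as bright as the paper.

```python
import numpy as np

def red_channel_gray(rgb):
    """Use the R component of a 24-bit RGB image (H x W x 3, uint8) as the gray image."""
    rgb = np.asarray(rgb, dtype=np.uint8)
    return rgb[:, :, 0].copy()

def box_mean(gray, radius):
    """Mean over each (2*radius+1)^2 window, computed with an integral image."""
    g = np.pad(gray.astype(np.float64), radius, mode="edge")
    # ii[i, j] holds the sum of g[:i, :j]
    ii = np.pad(g, ((1, 0), (1, 0)), mode="constant").cumsum(axis=0).cumsum(axis=1)
    k = 2 * radius + 1
    h, w = gray.shape
    total = ii[k:k + h, k:k + w] - ii[:h, k:k + w] - ii[k:k + h, :w] + ii[:h, :w]
    return total / (k * k)

def adaptive_threshold(gray, radius=7, offset=20):
    """Assumed rule: a pixel is ink (0) if it is darker than local mean - offset, else 255."""
    local = box_mean(gray, radius)
    return np.where(gray.astype(np.float64) < local - offset, 0, 255).astype(np.uint8)
```

With these hypothetical defaults, dark printed strokes fall well below the local mean and become 0, while a pale red seal stays close to the local mean in the R component and is mapped to 255.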

Further, in step S2, in order to denoise better, the invention selects a non-local mean kernel, so that the similarity between pixels can be quantified according to an edge metric derived from the fuzzy edge complement. Specifically, the regularized denoising comprises the following steps. An edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ is established as in formula (1):

$$ED(\mu_{xy}) = \left| e^{\perp}(x, y) - e(x, y) \right|, \quad (x, y) \in \Omega \tag{1}$$

ED is close to 0 in smooth regions, grows larger near edges, and is close to 0 in noisy regions. Based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, the denoising model of formula (2) is proposed, in which $f = \mu^{*} + \omega$ ($\mu^{*}$ is the original unknown image and $\omega$ is Gaussian noise) and $\delta$ is a positive parameter that controls the gradual decay of $\psi(ED(\mu))$ from 2 to 1. The regularization parameter $\lambda$ balances the approximation term: when $\lambda$ is sufficiently large, the second term of the model is decisive, and when $\lambda \to 0$, the first term controls the whole objective function. The choice of $\lambda$ is therefore important when solving the model; it is related to the variance of the initially added noise, and a corresponding expression for $\lambda$ is given.

Further, in step S2, the regularized denoising comprises using a gradient descent method to obtain the Lagrange equation of the denoising model of formula (2), together with its diffusion function, letting $\Phi(s) = s^{ED(\mu)}$.

Further, in step S2, the Lagrange equation is solved using a method based on partial differential equations, wherein $\mu_{NN}$ is the second derivative in the direction $N$ and $\mu_{TT}$ is the second derivative in the direction $T$ perpendicular to $N$.

Further, in step S2, $\mu_{NN}$ and $\mu_{TT}$ are computed from $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$, which denote second derivatives, with $t$ the transpose operator; the discrete model of equation (4) is then given, and the iteration stop time is determined by an energy check of the images before and after denoising.
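The idea of iterating a gradient-descent update and stopping on an energy check can be sketched as follows. This is a stand-in only: it uses the standard total variation (Rudin-Osher-Fatemi) energy with a smoothed gradient norm, and it does not include the edge detector ED or the function ψ of the proposed model. The parameter values (`lam`, `eps`, `step`, `tol`) are assumptions chosen for an image on a 0 to 255 scale.

```python
import numpy as np

def grad(u):
    """Forward differences; the gradient is zero on the far border (Neumann boundary)."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    """Backward-difference divergence, the negative adjoint of grad."""
    d = np.zeros_like(px)
    d[:, 0] += px[:, 0]
    d[:, 1:] += px[:, 1:] - px[:, :-1]
    d[0, :] += py[0, :]
    d[1:, :] += py[1:, :] - py[:-1, :]
    return d

def energy(u, f, lam, eps):
    """Smoothed TV term plus (lam / 2) times the squared distance to the noisy image f."""
    gx, gy = grad(u)
    return np.sum(np.sqrt(gx**2 + gy**2 + eps**2)) + 0.5 * lam * np.sum((u - f)**2)

def tv_denoise(f, lam=0.1, eps=1.0, step=0.1, max_iter=300, tol=1e-4):
    """Gradient descent on the energy; stop when the energy no longer drops noticeably."""
    f = np.asarray(f, dtype=np.float64)
    u = f.copy()
    prev = energy(u, f, lam, eps)
    for _ in range(max_iter):
        gx, gy = grad(u)
        norm = np.sqrt(gx**2 + gy**2 + eps**2)
        u_new = u + step * (div(gx / norm, gy / norm) - lam * (u - f))
        cur = energy(u_new, f, lam, eps)
        if cur > prev:              # energy check failed: keep the previous image
            break
        u = u_new
        if prev - cur < tol * prev:  # relative energy change is small: stop iterating
            break
        prev = cur
    return u
```

In this sketch the fidelity weight `lam` plays the role the text assigns to λ: a large value keeps the result close to the noisy input, and a small value lets the smoothing term dominate.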

The technical scheme provided by the embodiments of the present disclosure may have the following beneficial effects: by establishing an edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ and proposing a denoising model based on the total variation denoising model and this edge detector, a better denoising effect is obtained when recognizing value-added tax common invoices; and by using the horizontal projection and vertical crossing number body distance method, the different areas of the invoice can be located more accurately, so that invoices affected by noise such as seals can be processed more accurately.

According to a second aspect of the embodiments of the present disclosure, there is provided an image recognition apparatus including:

the image acquisition module, configured to acquire an original 24-bit color image of a value-added tax common invoice and extract the R component of the original color image as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255;

the image denoising module, configured to perform regularized denoising on the gray image to reduce noise points and obtain a denoised gray image, and then binarize the denoised gray image by adaptive threshold segmentation to obtain an adaptive-threshold binarized image of the value-added tax common invoice;

the image positioning module, configured to coarsely locate the areas of the taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount according to prior position information for these fields, precisely locate the areas by the horizontal projection and vertical crossing number body distance method, and apply character segmentation and normalization to the precisely located areas to obtain the taxpayer identification number, invoice code, invoice number, invoicing date and amount to be recognized;

the image recognition module, configured to recognize the buyer's taxpayer identification number, invoice code, invoice number, invoicing date and amount to be recognized by using a template feature matching algorithm, and obtain a recognition result.
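The recognition module's template matching can be illustrated with a very small sketch. The patent does not specify which features or similarity measure its template feature matching algorithm uses, so the pixel-agreement score, the function names and the template dictionary below are all assumptions for illustration. Glyphs and templates are assumed to be binarized (0 = ink, 255 = background) and already normalized to the same size.

```python
import numpy as np

def match_glyph(glyph, templates):
    """Return (label, score) for the template that agrees with the glyph on the most pixels."""
    g = np.asarray(glyph) == 0
    best_label, best_score = None, -1.0
    for label, tmpl in templates.items():
        # Fraction of pixels where glyph and template are both ink or both background.
        score = np.mean(g == (np.asarray(tmpl) == 0))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, float(best_score)

def recognize(glyphs, templates):
    """Concatenate the best-matching label of each segmented character."""
    return "".join(match_glyph(g, templates)[0] for g in glyphs)
```

A practical system would likely reject matches whose score falls below some threshold instead of always returning the best label; that choice is left open here.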

Further, the image denoising module is further configured to select a non-local mean kernel and to determine the denoising model proposed in this application from the total variation denoising model and the edge detector, as follows. An edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ is established as in formula (1):

$$ED(\mu_{xy}) = \left| e^{\perp}(x, y) - e(x, y) \right|, \quad (x, y) \in \Omega \tag{1}$$

ED is close to 0 in smooth regions, grows larger near edges, and is close to 0 in noisy regions. Based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, the denoising model of formula (2) is proposed, in which $f = \mu^{*} + \omega$ ($\mu^{*}$ is the original unknown image and $\omega$ is Gaussian noise) and $\delta$ is a positive parameter that controls the gradual decay of $\psi(ED(\mu))$ from 2 to 1. The regularization parameter $\lambda$ balances the approximation term: when $\lambda$ is sufficiently large, the second term of the model is decisive, and when $\lambda \to 0$, the first term controls the whole objective function. The choice of $\lambda$ is therefore important when solving the model; it is related to the variance of the initially added noise, and a corresponding expression for $\lambda$ is given.

Further, the image denoising module is further configured to use a gradient descent method to obtain the Lagrange equation of the denoising model of formula (2), together with its diffusion function, letting $\Phi(s) = s^{ED(\mu)}$.

Further, the image denoising module is further configured to solve the Lagrange equation using a method based on partial differential equations, wherein $\mu_{NN}$ is the second derivative in the direction $N$ and $\mu_{TT}$ is the second derivative in the direction $T$ perpendicular to $N$.

Further, the image denoising module is further configured such that $\mu_{NN}$ and $\mu_{TT}$ are computed from $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$, which denote second derivatives, with $t$ the transpose operator; the discrete model of equation (4) is then given, and the iteration stop time is determined by an energy check of the images before and after denoising.

The technical scheme provided by the embodiments of the present disclosure may have the following beneficial effects: by establishing an edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ and proposing a denoising model based on the total variation denoising model and this edge detector, a better denoising effect is obtained when recognizing value-added tax common invoices; and by using the horizontal projection and vertical crossing number body distance method, the different areas of the invoice can be located more accurately, so that invoices affected by noise such as seals can be processed more accurately.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.

FIG. 1 is a flow chart illustrating a method for recognizing a value-added tax common invoice, according to an exemplary embodiment.

FIG. 2 is a block diagram illustrating a device for recognizing a value-added tax common invoice, according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.

When the method and the device are used to recognize value-added tax common invoices, an edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ is established, and a denoising model based on the total variation denoising model and this edge detector is proposed for denoising, so that a better denoising effect is obtained during recognition; in addition, the horizontal projection and vertical crossing number body distance method locates the different areas of the invoice more accurately, so that invoices affected by noise such as seals can be processed more accurately. The image recognition method of the present disclosure is described in detail below with reference to the accompanying drawings.

Fig. 1 is a flowchart illustrating an image recognition method according to an exemplary embodiment, which may include the steps of:

S1, acquiring an image of a value-added tax common invoice to obtain an original 24-bit color image of the invoice, extracting the R component of the original color image and using it as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255; the image may be acquired by means such as photographing with a camera;

S3, coarsely locating the areas of the taxpayer identification number, the invoice code, the invoice number, the billing date and the amount according to prior position information for these fields, precisely locating the areas by the horizontal projection and vertical crossing number body distance method, and applying character segmentation and normalization to the precisely located areas to obtain the taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized; locating the regions of the invoice in this way makes it easier to obtain the information that needs to be recognized;
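The precise positioning and character segmentation of step S3 can be sketched with projection profiles. This is a simplified stand-in under stated assumptions: the coarse field region is assumed to be already cropped from the binarized image (0 = ink, 255 = background), plain vertical projection is used in place of the vertical crossing number body distance method, whose details the text does not give, and all function names and the 16 x 16 target size are hypothetical.

```python
import numpy as np

def runs(mask):
    """Return (start, end) pairs, end exclusive, for each run of True values in a 1-D mask."""
    padded = np.concatenate(([False], np.asarray(mask, dtype=bool), [False]))
    change = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(change[0::2].tolist(), change[1::2].tolist()))

def segment_characters(binary, min_pixels=1):
    """Split a binarized field region into (top, bottom, left, right) character boxes."""
    ink = np.asarray(binary) == 0
    boxes = []
    # Horizontal projection: rows that contain ink form text lines.
    for top, bottom in runs(ink.sum(axis=1) >= min_pixels):
        line = ink[top:bottom]
        # Vertical projection within the line: columns that contain ink form characters.
        for left, right in runs(line.sum(axis=0) >= min_pixels):
            boxes.append((top, bottom, left, right))
    return boxes

def normalize(binary, box, size=(16, 16)):
    """Crop one character box and resize it to a fixed grid by nearest-neighbour sampling."""
    top, bottom, left, right = box
    glyph = np.asarray(binary)[top:bottom, left:right]
    rows = np.arange(size[0]) * glyph.shape[0] // size[0]
    cols = np.arange(size[1]) * glyph.shape[1] // size[1]
    return glyph[np.ix_(rows, cols)]
```

Plain vertical projection cannot separate characters that touch or that contain internal gaps, which is presumably why a more elaborate column-wise criterion is used in the method itself.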

Further, in step S2, in order to denoise better, the invention selects a non-local mean kernel, so that the similarity between pixels can be quantified according to an edge metric derived from the fuzzy edge complement. Specifically, the regularized denoising comprises the following steps. An edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ is established as in formula (1):

$$ED(\mu_{xy}) = \left| e^{\perp}(x, y) - e(x, y) \right|, \quad (x, y) \in \Omega \tag{1}$$

ED is close to 0 in smooth regions, grows larger near edges, and is close to 0 in noisy regions. Based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, the denoising model of formula (2) is proposed, in which $f = \mu^{*} + \omega$ ($\mu^{*}$ is the original unknown image and $\omega$ is Gaussian noise) and $\delta$ is a positive parameter that controls the gradual decay of $\psi(ED(\mu))$ from 2 to 1. The regularization parameter $\lambda$ balances the approximation term: when $\lambda$ is sufficiently large, the second term of the model is decisive, and when $\lambda \to 0$, the first term controls the whole objective function. The choice of $\lambda$ is therefore important when solving the model; it is related to the variance of the initially added noise, and a corresponding expression for $\lambda$ is given.

Further, in step S2, the regularized denoising comprises using a gradient descent method to obtain the Lagrange equation of the denoising model of formula (2), together with its diffusion function, letting $\Phi(s) = s^{ED(\mu)}$.

Further, the image denoising module is further configured such that $\mu_{NN}$ and $\mu_{TT}$ are computed from $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$, which denote second derivatives, with $t$ the transpose operator, which gives an equation; the discrete model of (2) is then given, and the iteration stop time is determined by an energy check of the images before and after denoising.

The following are device embodiments of the present disclosure that may be used to perform method embodiments of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the method of the present disclosure.

Fig. 2 is a block diagram illustrating an image recognition apparatus according to an exemplary embodiment. The image recognition means may be implemented as part or all of the terminal device by software, hardware or a combination of both. Referring to fig. 2, the apparatus includes: the system comprises an image acquisition module, an image denoising module, an image positioning module and an image recognition module.

The image area positioning module is configured to coarsely locate the areas of the taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount according to prior position information for these fields, precisely locate the areas by the horizontal projection and vertical crossing number body distance method, and apply character segmentation and normalization to the precisely located areas to obtain the taxpayer identification number, invoice code, invoice number, invoicing date and amount to be recognized;

Further, the image denoising module is further configured to use a gradient descent method to obtain the Lagrange equation of the denoising model of formula (2), together with its diffusion function, letting $\Phi(s) = s^{ED(\mu)}$.

In summary, the device provided in this embodiment establishes an edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ and proposes a denoising model based on the total variation denoising model and this edge detector, so that a better denoising effect is obtained when recognizing value-added tax common invoices; and by using the horizontal projection and vertical crossing number body distance method, the different areas of the invoice can be located more accurately, so that invoices affected by noise such as seals can be processed more accurately.

The specific manner in which the modules of the apparatus in the above embodiments perform their operations has been described in detail in the method embodiments and is not repeated here.

The image recognition device of the value-added tax common invoice can be a mobile phone, a computer, a tablet device and the like.

The value added tax plain invoice image recognition device may include one or more of the following components: a processing component, a memory, an image acquisition component, a power supply component, a multimedia component, an audio component, an input/output (I/O) interface, a sensor component, and a communication component.

The processing component generally controls overall operation of the image recognition device of the value added tax plain invoice, such as operations associated with display, telephone call, data communication, camera operations, and recording operations. The processing component may include one or more processors to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component may include one or more modules that facilitate interactions between the processing component and other components.

The memory is configured to store various types of data to support operations at the device. Examples of such data include instructions for any application or method operating on the device, contact data, phonebook data, messages, pictures, videos, and the like. The memory may be implemented by any type of volatile or nonvolatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.

The power supply component provides power to the various components of the device. The power supply component may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device.

The I/O interface provides an interface between the processing component and a peripheral interface module, which may be a keyboard, click wheel, button, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The image acquisition component can be a CCD camera or a scanning component and is used for acquiring the image of the value-added tax common invoice to be recognized.

The communication component is configured to facilitate communication between the apparatus and other devices in a wired or wireless manner. The device may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component further comprises a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the apparatus may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.

In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as a memory, comprising instructions executable by a processor of an apparatus to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

A non-transitory computer readable storage medium is provided whose instructions, when executed by a processor of an apparatus, cause the apparatus to perform the image recognition method.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. An image recognition method for a value-added tax common invoice, characterized by comprising the following steps:

S1, acquiring an image of a value-added tax common invoice with a camera to obtain an original 24-bit color image of the invoice, and extracting the R component of the original color image as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255;

S2, performing regularized denoising on the gray image to reduce noise points and obtain a denoised gray image, and then binarizing the denoised gray image by adaptive threshold segmentation to obtain an adaptive-threshold binarized image of the value-added tax common invoice; the regularized denoising comprises: establishing an edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ as in formula (1): $ED(\mu_{xy}) = \left| e^{\perp}(x, y) - e(x, y) \right|, \ (x, y) \in \Omega$ (1), where ED is close to 0 in smooth regions, grows larger near edges, and is close to 0 in noisy regions; and, based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, proposing the denoising model of formula (2), where $\lambda$ is the regularization parameter, $f = \mu^{*} + \omega$, $\mu^{*}$ is the original unknown image, $\omega$ is Gaussian noise, and $\delta$ is a positive parameter for controlling the gradual decay of $\psi(ED(\mu))$ from 2 to 1;

2. The image recognition method for a value-added tax common invoice according to claim 1, wherein a gradient descent method is used to obtain the Lagrange equation of the denoising model (2), together with its diffusion function, letting $\Phi(s) = s^{ED(\mu)}$.

3. The image recognition method for a value-added tax common invoice according to claim 2, wherein the Lagrange equation is solved using a method based on partial differential equations, wherein $\mu_{NN}$ is the second derivative in the direction $N$ and $\mu_{TT}$ is the second derivative in the direction $T$ perpendicular to $N$.

4. The image recognition method for a value-added tax common invoice according to claim 3, wherein $\mu_{NN}$ and $\mu_{TT}$ are computed from $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$, which denote second derivatives, with $t$ the transpose operator; the discrete model of equation (4) is then given, and the iteration stop time is determined by an energy check of the images before and after denoising.

5. An image recognition device for a value-added tax common invoice, characterized by comprising:

the image acquisition module, used for acquiring an image of a value-added tax common invoice to obtain an original 24-bit color image of the invoice, and extracting the R component of the original color image as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255;

the image denoising module, used for performing regularized denoising on the gray image to reduce noise points and obtain a denoised gray image, and then binarizing the denoised gray image by adaptive threshold segmentation to obtain an adaptive-threshold binarized image of the value-added tax common invoice; the regularized denoising in the image denoising module comprises: establishing an edge detector $ED(\mu_{xy})$ of the pixel $\mu_{xy}$ at coordinates $(x, y)$ as in formula (1): $ED(\mu_{xy}) = \left| e^{\perp}(x, y) - e(x, y) \right|, \ (x, y) \in \Omega$ (1), where ED is close to 0 in smooth regions, grows larger near edges, and is close to 0 in noisy regions; and, based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, proposing the denoising model of formula (2), where $\lambda$ is the regularization parameter, $f = \mu^{*} + \omega$, $\mu^{*}$ is the original unknown image, $\omega$ is Gaussian noise, and $\delta$ is a positive parameter for controlling the gradual decay of $\psi(ED(\mu))$ from 2 to 1;

the image positioning module, which coarsely locates the areas of the taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount according to prior position information for these fields, precisely locates the areas by the horizontal projection and vertical crossing number body distance method, and applies character segmentation and normalization to the precisely located areas to obtain the taxpayer identification number, invoice code, invoice number, invoicing date and amount to be recognized;

the image recognition module, used for recognizing the buyer's taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized by using a template feature matching algorithm, and obtaining a recognition result.

6. The image recognition device for a value-added tax common invoice according to claim 5, wherein the image denoising module further uses a gradient descent method to obtain the Lagrange equation of the denoising model (2), together with its diffusion function, letting $\Phi(s) = s^{ED(\mu)}$.

7. The image recognition device according to claim 6, wherein the image denoising module further solves the Lagrange equation using a method based on partial differential equations, wherein $\mu_{NN}$ is the second derivative in the direction $N$ and $\mu_{TT}$ is the second derivative in the direction $T$ perpendicular to $N$.

8. The image recognition device according to claim 7, wherein, in the image denoising module, $\mu_{NN}$ and $\mu_{TT}$ are computed from $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$, which denote second derivatives, with $t$ the transpose operator; the discrete model of equation (4) is then given, and the iteration stop time is determined by an energy check of the images before and after denoising.