CN111222412B - Method and device for generating value-added tax ordinary invoice reimbursement information based on image recognition

Info

Publication number
CN111222412B
Authority
CN
China
Prior art keywords
invoice
image
tax
denoising
reimbursement
Legal status
Active
Application number
CN201911210795.8A
Other languages
Chinese (zh)
Other versions
CN111222412A (en
Inventor
肖文星
李敏
赵浩宇
陈文斌
赵珂
Current Assignee
Henan Institute of Science and Technology
Original Assignee
Henan Institute of Science and Technology
Application filed by Henan Institute of Science and Technology
Priority to CN201911210795.8A
Publication of CN111222412A
Application granted
Publication of CN111222412B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

To address the manual processing and low working efficiency involved in generating reimbursement information from value-added tax ordinary invoices, a method and a device for generating value-added tax ordinary invoice reimbursement information based on image recognition are provided to improve the accuracy of automatic invoice processing. Specifically, the method and the device generate the reimbursement information automatically by establishing a correspondence between electronic invoices and financial reimbursement subjects, acquiring an electronic invoice image, performing preprocessing, denoising, region positioning and template matching on the image, and comparing the recognized information.

Description

Method and device for generating value-added tax ordinary invoice reimbursement information based on image recognition
Technical Field
The invention relates to the technical field of financial information electronic processing, in particular to a value-added tax ordinary invoice reimbursement information generation method and device based on image recognition.
Background
In recent years, with the rapid development of the Chinese economy, the types and quantity of bills have risen year by year, and the value-added tax ordinary invoice is one of them. The heavy use of value-added tax invoices poses serious challenges to invoice recognition technology and to the automatic generation of invoice information.
In existing approaches to the automatic recognition of invoice images, one option is to customize a form template: recognition regions and recognition attributes are set, special characters are configured, option regions are recognized, post-processing is performed according to the recognition attributes, and a structured recognition result is finally output. Another option is based on TH-OCR technology, which applies several preprocessing operations to the invoice, including deskewing, color cast correction, color filtering, noise reduction, binarization and contrast enhancement of the recognition unit; these functions can be flexibly configured and freely combined to output the best image quality for later recognition.
However, generating reimbursement information for large numbers of value-added tax ordinary invoices remains a problem. Many enterprises and public institutions need reimbursement after normal purchasing, and their financial departments must process large numbers of invoices manually, which consumes considerable manpower and material resources and is inefficient. Automatic recognition and processing of invoices can therefore improve the working efficiency of financial departments. If the automatic process has a low effective recognition rate, however, it not only brings business risk but also increases the workload of subsequent manual processing, so it is necessary to improve the accuracy of automatic invoice processing.
Disclosure of Invention
To address the above technical problems, the invention provides a method and a device for generating value-added tax ordinary invoice reimbursement information based on image recognition, so as to improve the accuracy of automatic invoice processing. The technical scheme is as follows. According to a first aspect of the embodiments of the present disclosure, a method for generating value-added tax ordinary invoice reimbursement information by image recognition is provided, including:
Step 1, establishing a correspondence between the electronic invoice and the financial reimbursement subject, and generating an electronic invoice-reimbursement correspondence table whose fields comprise: buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, and invoice amount.
Step 2, acquiring an electronic invoice image, and performing preprocessing, denoising, region positioning and template matching on the electronic invoice image to obtain the seller name, seller taxpayer identification number, buyer name, buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, invoice amount, reviewer information and issuer (drawer) information in the electronic invoice image.
Step 3, comparing the recognized buyer taxpayer identification number with the buyer taxpayer identification number in the electronic invoice-reimbursement correspondence table; if the two are consistent, proceeding to step 4, otherwise ending the generation of the invoice reimbursement information.
Step 4, comparing the recognized reviewer information with the issuer information; if the two are consistent, ending the generation of the invoice reimbursement information; if they are inconsistent, automatically filling the invoice code, invoice number, name of goods or taxable labor/services, and invoice amount from the recognition result of the electronic invoice image into the corresponding items of the electronic invoice-reimbursement correspondence table.
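The decision logic of steps 3 and 4 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the class name `RecognizedInvoice`, the function name `generate_reimbursement_row` and all field names are hypothetical, and plain string equality is assumed to be an adequate comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognizedInvoice:
    """Fields read from the invoice image in step 2 (names are illustrative)."""
    buyer_taxpayer_id: str
    invoice_code: str
    invoice_number: str
    item_name: str
    amount: str
    reviewer: str
    issuer: str


def generate_reimbursement_row(invoice: RecognizedInvoice,
                               expected_buyer_taxpayer_id: str) -> Optional[dict]:
    """Return a filled table row, or None when generation ends early."""
    # Step 3: the recognized buyer taxpayer ID must match the table entry.
    if invoice.buyer_taxpayer_id != expected_buyer_taxpayer_id:
        return None
    # Step 4: generation ends when reviewer and issuer information are consistent.
    if invoice.reviewer == invoice.issuer:
        return None
    # Otherwise the recognized values are filled into the table row.
    return {
        "buyer_taxpayer_id": invoice.buyer_taxpayer_id,
        "invoice_code": invoice.invoice_code,
        "invoice_number": invoice.invoice_number,
        "item_name": invoice.item_name,
        "amount": invoice.amount,
    }
```

Returning `None` for both early exits keeps the caller simple; a real system would probably want to record which check failed.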
Step 2 specifically comprises the following steps:
S1, acquiring an image of a value-added tax ordinary invoice to obtain a 24-bit original color image of the invoice, and extracting the R component of the original color image as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255;
S2, performing regularized denoising on the gray image to reduce noise points and obtain a denoised gray image, and then binarizing the denoised gray image by adaptive threshold segmentation to obtain an adaptive-threshold binarized image of the value-added tax ordinary invoice;
S3, roughly locating the regions of the buyer taxpayer identification number, invoice code, invoice number, billing date and amount according to prior information on their positions, precisely locating the regions using horizontal projection and a vertical crossing number body distance method, and performing character segmentation and normalization on the precisely located regions to obtain the buyer taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized;
S4, recognizing the buyer taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized using a template feature matching algorithm, and obtaining a recognition result.
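Steps S1 and S2 (without the denoising stage) can be illustrated with NumPy alone. This is a sketch under stated assumptions: the text does not specify which adaptive thresholding rule is used, so a local-mean rule is assumed here, and the function names, the window size `block` (assumed odd) and the `offset` are hypothetical.

```python
import numpy as np


def red_channel_gray(rgb: np.ndarray) -> np.ndarray:
    """Take the R component of an H x W x 3 uint8 RGB image as the gray image."""
    return rgb[:, :, 0].copy()


def adaptive_threshold(gray: np.ndarray, block: int = 15, offset: float = 5.0) -> np.ndarray:
    """Binarize with a local-mean threshold (assumed rule, block must be odd).

    A pixel becomes 255 when it is brighter than (local mean - offset), else 0.
    """
    g = gray.astype(np.float64)
    pad = block // 2
    padded = np.pad(g, pad, mode="edge")
    # Integral image with a leading row and column of zeros for fast box sums.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    h, w = g.shape
    box_sum = (integral[block:block + h, block:block + w]
               - integral[:h, block:block + w]
               - integral[block:block + h, :w]
               + integral[:h, :w])
    local_mean = box_sum / (block * block)
    return np.where(g > local_mean - offset, 255, 0).astype(np.uint8)
```

Because the threshold follows the local mean, the binarization tolerates gradual brightness changes across the page, which a single global threshold would not.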
Further, in step S2, in order to denoise better, the invention selects a non-local mean kernel, so that the similarity between pixels can be quantified by an edge measure derived from the fuzzy edge complement. Specifically, the regularized denoising process includes the following steps: establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates $(x, y)$, as in formula (1): $ED(\mu_{xy}) = |e(x, y) - e(x, y)|,\ (x, y) \in \Omega$ (1) [the marks that distinguish the two $e$ terms are not preserved in this text]. ED is close to 0 in smooth regions, becomes larger near edges, and is close to 0 in noisy regions. Based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, the denoising model is proposed as follows: [formula image not preserved], where $\lambda$ is the regularization parameter, $f = \mu^* + \omega$ ($\mu^*$ is the original unknown image and $\omega$ is Gaussian noise), [formula image not preserved], and $\delta$ is a positive parameter that controls the gradual decay of $\psi(ED(\mu))$ from 2 to 1. The regularization parameter $\lambda$ balances the approximation term: when $\lambda$ is sufficiently large, the second term of the model dominates, and when $\lambda \to 0$, the first term controls the whole objective function. The choice of $\lambda$ is therefore important when solving the model; it is related to the variance of the initially added noise, and the corresponding expression for $\lambda$ is: [formula image not preserved]
Further, in step S2, the regularized denoising process includes the step of obtaining, by a gradient descent method, the Lagrange equation of the denoising model of formula (2): [formula image not preserved], where the diffusion function is [formula image not preserved]. Let Φ(s) = s ED(u) [expression garbled; remainder not preserved]
Further, in step S2, the Lagrange equation is solved using a method based on partial differential equations: [formula image not preserved], where $\mu_{NN}$ is the second derivative in the direction N and $\mu_{TT}$ is the second derivative in the direction T perpendicular to N.
Further, in step S2, $\mu_{NN}$ and $\mu_{TT}$ are given by: [formula image not preserved], where $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$ denote second derivatives and $t$ is the transpose operator. The discrete model of equation (4) is as follows: [formula image not preserved]. The iteration stop time is determined by an energy check of the images before and after denoising.
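The expressions for $\mu_{NN}$ and $\mu_{TT}$ are not reproduced in this text. As a reference point only, and assuming that N is the unit gradient direction $\nabla\mu / |\nabla\mu|$ and T is the unit direction perpendicular to it (an assumption, since the text does not define N), the textbook directional second derivatives are:

```latex
\begin{aligned}
\mu_{NN} &= N^{t}\,\nabla^{2}\mu\,N
          = \frac{\mu_x^{2}\,\mu_{xx} + 2\,\mu_x \mu_y\,\mu_{xy} + \mu_y^{2}\,\mu_{yy}}{\mu_x^{2} + \mu_y^{2}}, \\
\mu_{TT} &= T^{t}\,\nabla^{2}\mu\,T
          = \frac{\mu_y^{2}\,\mu_{xx} - 2\,\mu_x \mu_y\,\mu_{xy} + \mu_x^{2}\,\mu_{yy}}{\mu_x^{2} + \mu_y^{2}},
\end{aligned}
\qquad \mu_{NN} + \mu_{TT} = \mu_{xx} + \mu_{yy}.
```

Here $\nabla^{2}\mu$ denotes the Hessian matrix of $\mu$. These forms are consistent with the mention of a transpose operator $t$, but they may differ from the patent's own formulas.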
The technical scheme provided by the embodiments of the present disclosure can have the following beneficial effects: by establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates $(x, y)$ and denoising with a model built on the total variation denoising model and this edge detector, a better denoising effect is obtained when recognizing value-added tax ordinary invoices; and by using horizontal projection and the vertical crossing number body distance method, the different regions of the invoice can be located more accurately, so that invoices affected by noise such as seals can be processed more accurately.
According to a second aspect of the embodiments of the present disclosure, there is provided an image-identified value-added tax ordinary invoice reimbursement information generating device, the generating device including:
a data construction module, used for establishing a correspondence between the electronic invoice and the financial reimbursement subject and generating an electronic invoice-reimbursement correspondence table whose fields comprise: buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, and invoice amount;
a data processing module, used for acquiring an electronic invoice image and performing preprocessing, denoising, region positioning and template matching on it to obtain the seller name, seller taxpayer identification number, buyer name, buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, invoice amount, reviewer information and issuer (drawer) information in the electronic invoice image;
a data matching module, used for comparing the recognized buyer taxpayer identification number with the buyer taxpayer identification number in the electronic invoice-reimbursement correspondence table; if the two are consistent, control passes to the data generation module, otherwise the generation of the invoice reimbursement information ends;
a data generation module, used for comparing the recognized reviewer information with the issuer information; if the two are consistent, the generation of the invoice reimbursement information ends; if they are inconsistent, the invoice code, invoice number, name of goods or taxable labor/services, and invoice amount from the recognition result of the electronic invoice image are automatically filled into the corresponding items of the electronic invoice-reimbursement correspondence table.
The data processing module comprises: an image acquisition module, configured to acquire a 24-bit original color image of the value-added tax ordinary invoice and extract its R component as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255;
an image denoising module, configured to perform regularized denoising on the gray image to reduce noise points and obtain a denoised gray image, and then binarize the denoised gray image by adaptive threshold segmentation to obtain an adaptive-threshold binarized image of the value-added tax ordinary invoice;
an image positioning module, configured to roughly locate the regions of the buyer taxpayer identification number, invoice code, invoice number, billing date and amount according to prior information on their positions, precisely locate the regions using horizontal projection and a vertical crossing number body distance method, and perform character segmentation and normalization on the precisely located regions to obtain the buyer taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized;
an image recognition module, configured to recognize the buyer taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized using a template feature matching algorithm, and obtain a recognition result.
Further, the image denoising module is further configured to select a non-local mean kernel and to determine the denoising model proposed in this application from the total variation denoising model and the edge detector, including the following steps: establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates $(x, y)$, as in formula (1): $ED(\mu_{xy}) = |e(x, y) - e(x, y)|,\ (x, y) \in \Omega$ (1) [the marks that distinguish the two $e$ terms are not preserved in this text]. ED is close to 0 in smooth regions, becomes larger near edges, and is close to 0 in noisy regions. Based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, the denoising model is proposed as follows: [formula image not preserved], where $\lambda$ is the regularization parameter, $f = \mu^* + \omega$ ($\mu^*$ is the original unknown image and $\omega$ is Gaussian noise), [formula image not preserved], and $\delta$ is a positive parameter that controls the gradual decay of $\psi(ED(\mu))$ from 2 to 1. The regularization parameter $\lambda$ balances the approximation term: when $\lambda$ is sufficiently large, the second term of the model dominates, and when $\lambda \to 0$, the first term controls the whole objective function. The choice of $\lambda$ is therefore important when solving the model; it is related to the variance of the initially added noise, and the corresponding expression for $\lambda$ is: [formula image not preserved]
Further, the image denoising module is further configured to obtain, by a gradient descent method, the Lagrange equation of the denoising model of formula (2): [formula image not preserved], where the diffusion function is [formula image not preserved]. Let Φ(s) = s ED(u) [expression garbled; remainder not preserved]
Further, the image denoising module is further configured to solve the Lagrange equation using a method based on partial differential equations: [formula image not preserved], where $\mu_{NN}$ is the second derivative in the direction N and $\mu_{TT}$ is the second derivative in the direction T perpendicular to N.
Further, the image denoising module is further configured such that $\mu_{NN}$ and $\mu_{TT}$ are given by: [formula image not preserved], where $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$ denote second derivatives and $t$ is the transpose operator. The discrete model of equation (4) is as follows: [formula image not preserved]. The iteration stop time is determined by an energy check of the images before and after denoising.
The technical scheme provided by the embodiments of the present disclosure can have the following beneficial effects: by establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates $(x, y)$ and denoising with a model built on the total variation denoising model and this edge detector, a better denoising effect is obtained when recognizing value-added tax ordinary invoices; and by using horizontal projection and the vertical crossing number body distance method, the different regions of the invoice can be located more accurately, so that invoices affected by noise such as seals can be processed more accurately.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a flowchart illustrating a method for generating value-added tax ordinary invoice reimbursement information by image recognition, according to an exemplary embodiment.
Fig. 2 is a block diagram illustrating a device for generating value-added tax ordinary invoice reimbursement information by image recognition, according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
When the method and the device are used to recognize value-added tax ordinary invoices, establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates $(x, y)$ and denoising with a model built on the total variation denoising model and this edge detector gives a better denoising effect, and using horizontal projection and the vertical crossing number body distance method locates the different regions of the invoice more accurately, so that invoices affected by noise such as seals can be processed more accurately. The method and the device for generating value-added tax ordinary invoice reimbursement information based on image recognition are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating a method for generating value-added tax ordinary invoice reimbursement information by image recognition according to an exemplary embodiment. The method may include the following steps:
Step 1, establishing a correspondence between the electronic invoice and the financial reimbursement subject, and generating an electronic invoice-reimbursement correspondence table whose fields comprise: buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, and invoice amount.
Step 2, acquiring an electronic invoice image, and performing preprocessing, denoising, region positioning and template matching on the electronic invoice image to obtain the seller name, seller taxpayer identification number, buyer name, buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, invoice amount, reviewer information and issuer (drawer) information in the electronic invoice image.
Step 3, comparing the recognized buyer taxpayer identification number with the buyer taxpayer identification number in the electronic invoice-reimbursement correspondence table; if the two are consistent, proceeding to step 4, otherwise ending the generation of the invoice reimbursement information.
Step 4, comparing the recognized reviewer information with the issuer information; if the two are consistent, ending the generation of the invoice reimbursement information; if they are inconsistent, automatically filling the invoice code, invoice number, name of goods or taxable labor/services, and invoice amount from the recognition result of the electronic invoice image into the corresponding items of the electronic invoice-reimbursement correspondence table.
Step 2 specifically comprises the following steps:
S1, acquiring an image of a value-added tax ordinary invoice to obtain a 24-bit original color image of the invoice, and extracting the R component of the original color image as the gray image to be recognized, wherein the gray value of a pixel on the gray image to be recognized is 0 or 255; the image may be acquired by means such as photographing with a camera;
S2, performing regularized denoising on the gray image to reduce noise points and obtain a denoised gray image, and then binarizing the denoised gray image by adaptive threshold segmentation to obtain an adaptive-threshold binarized image of the value-added tax ordinary invoice;
S3, roughly locating the regions of the buyer taxpayer identification number, invoice code, invoice number, billing date and amount according to prior information on their positions, precisely locating the regions using horizontal projection and a vertical crossing number body distance method, and performing character segmentation and normalization on the precisely located regions to obtain the buyer taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized; locating regions on the invoice makes it easier to obtain the information that needs to be recognized;
S4, recognizing the buyer taxpayer identification number, invoice code, invoice number, billing date and amount to be recognized using a template feature matching algorithm, and obtaining a recognition result.
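The precise positioning in step S3 can be illustrated with projection profiles. This is a simplified sketch: horizontal projection is used to find text rows as described, but a plain vertical projection is substituted for the vertical crossing number body distance method, whose details the text does not give. The function names are hypothetical, and text pixels are assumed to be 0 on a 255 background.

```python
from typing import List, Tuple

import numpy as np


def runs_above(profile: np.ndarray, min_count: int = 1) -> List[Tuple[int, int]]:
    """Return [start, end) index ranges where profile >= min_count."""
    mask = profile >= min_count
    # Pad with False so every run has both a rising and a falling edge.
    edges = np.diff(np.concatenate(([False], mask, [False])).astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))


def locate_characters(binary_region: np.ndarray) -> List[Tuple[int, int, int, int]]:
    """Find text rows by horizontal projection, then characters by vertical projection.

    Returns (top, bottom, left, right) boxes with exclusive bottom/right bounds.
    """
    ink = binary_region == 0
    boxes = []
    for top, bottom in runs_above(ink.sum(axis=1)):   # horizontal projection
        cols = ink[top:bottom].sum(axis=0)            # vertical projection
        for left, right in runs_above(cols):
            boxes.append((top, bottom, left, right))
    return boxes
```

Each returned box can then be cropped and resized to a fixed size, which corresponds to the character segmentation and normalization mentioned in S3. Touching characters would be merged by this simple rule, which is one reason a more elaborate method may be preferred in practice.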
Further, in step S2, in order to denoise better, the invention selects a non-local mean kernel, so that the similarity between pixels can be quantified by an edge measure derived from the fuzzy edge complement. Specifically, the regularized denoising process includes the following steps: establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates $(x, y)$, as in formula (1): $ED(\mu_{xy}) = |e(x, y) - e(x, y)|,\ (x, y) \in \Omega$ (1) [the marks that distinguish the two $e$ terms are not preserved in this text]. ED is close to 0 in smooth regions, becomes larger near edges, and is close to 0 in noisy regions. Based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, the denoising model is proposed as follows: [formula image not preserved], where $\lambda$ is the regularization parameter, $f = \mu^* + \omega$ ($\mu^*$ is the original unknown image and $\omega$ is Gaussian noise), [formula image not preserved], and $\delta$ is a positive parameter that controls the gradual decay of $\psi(ED(\mu))$ from 2 to 1. The regularization parameter $\lambda$ balances the approximation term: when $\lambda$ is sufficiently large, the second term of the model dominates, and when $\lambda \to 0$, the first term controls the whole objective function. The choice of $\lambda$ is therefore important when solving the model; it is related to the variance of the initially added noise, and the corresponding expression for $\lambda$ is: [formula image not preserved]
Further, in step S2, the regularized denoising process includes the step of obtaining, by a gradient descent method, the Lagrange equation of the denoising model of formula (2): [formula image not preserved], where the diffusion function is [formula image not preserved]. Let Φ(s) = s ED(u) [expression garbled; remainder not preserved]
Further, in step S2, the Lagrange equation is solved using a method based on partial differential equations: [formula image not preserved], where $\mu_{NN}$ is the second derivative in the direction N and $\mu_{TT}$ is the second derivative in the direction T perpendicular to N.
Further, in step S2, $\mu_{NN}$ and $\mu_{TT}$ are given by: [formula image not preserved], where $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$ denote second derivatives and $t$ is the transpose operator. The discrete model of the equation is as follows: [formula image not preserved]. The iteration stop time is determined by an energy check of the images before and after denoising.
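Since the formulas of the proposed model are not preserved in this text, the following sketch shows only the general technique it builds on: gradient descent on a smoothed total variation energy with a quadratic fidelity term weighted by λ, stopped by an energy check. It is the standard total variation model with a fixed exponent, not the edge-detector-weighted model of the patent. The function names, the smoothing constant `eps`, the step size and the relative tolerance `tol` are all assumptions, and the stopping rule compares energies of successive iterates.

```python
import numpy as np


def tv_energy(u: np.ndarray, f: np.ndarray, lam: float, eps: float) -> float:
    """Smoothed total variation of u plus (lam / 2) * squared distance to f."""
    ux = np.diff(u, axis=1, append=u[:, -1:])
    uy = np.diff(u, axis=0, append=u[-1:, :])
    tv = np.sum(np.sqrt(ux ** 2 + uy ** 2 + eps ** 2))
    return float(tv + 0.5 * lam * np.sum((u - f) ** 2))


def tv_denoise(f: np.ndarray, lam: float = 0.1, step: float = 0.1,
               eps: float = 1.0, max_iter: int = 200, tol: float = 1e-4) -> np.ndarray:
    """Gradient descent on the smoothed TV energy, stopped by an energy check."""
    f = f.astype(np.float64)
    u = f.copy()
    prev = tv_energy(u, f, lam, eps)
    for _ in range(max_iter):
        ux = np.diff(u, axis=1, append=u[:, -1:])   # forward differences
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux ** 2 + uy ** 2 + eps ** 2)
        px, py = ux / mag, uy / mag
        # Divergence by backward differences (adjoint of the forward gradient).
        div = np.diff(px, axis=1, prepend=0.0) + np.diff(py, axis=0, prepend=0.0)
        u = u + step * (div - lam * (u - f))
        cur = tv_energy(u, f, lam, eps)
        if abs(prev - cur) <= tol * max(abs(prev), 1.0):
            break                                   # energy has stopped changing
        prev = cur
    return u
```

A larger `lam` keeps the result closer to the noisy input, while a smaller `lam` lets the smoothing term dominate, which matches the role of the regularization parameter described above.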
The following are device embodiments of the present disclosure that may be used to perform method embodiments of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the method of the present disclosure.
Fig. 2 is a block diagram illustrating a device for generating value-added tax ordinary invoice reimbursement information by image recognition according to an exemplary embodiment. The generating device may be implemented as part or all of a terminal device by software, hardware or a combination of both. Referring to Fig. 2, the device includes a data construction module, a data processing module, a data matching module and a data generation module.
The data construction module is used for establishing a correspondence between the electronic invoice and the financial reimbursement subject and generating an electronic invoice-reimbursement correspondence table whose fields comprise: buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, and invoice amount.
The data processing module is used for acquiring an electronic invoice image and performing preprocessing, denoising, region positioning and template matching on it to obtain the seller name, seller taxpayer identification number, buyer name, buyer taxpayer identification number, invoice code, invoice number, name of goods or taxable labor/services, invoice amount, reviewer information and issuer (drawer) information in the electronic invoice image.
The data matching module is used for comparing the recognized buyer taxpayer identification number with the buyer taxpayer identification number in the electronic invoice-reimbursement correspondence table; if the two are consistent, control passes to the data generation module, otherwise the generation of the invoice reimbursement information ends.
The data generation module is used for comparing the recognized reviewer information with the issuer information; if the two are consistent, the generation of the invoice reimbursement information ends; if they are inconsistent, the invoice code, invoice number, name of goods or taxable labor/services, and invoice amount from the recognition result of the electronic invoice image are automatically filled into the corresponding items of the electronic invoice-reimbursement correspondence table.
Wherein the data processing module comprises: the image acquisition module, configured to acquire an original 24-bit color image of the value-added tax ordinary invoice and extract the R component of the original color image as the gray image to be identified, wherein the gray value of a pixel point on the gray image to be identified is 0 or 255;
the image denoising module, configured to perform regularization denoising on the gray image to reduce noise points and obtain a denoised gray image, and then perform adaptive-threshold binarization on the denoised gray image to obtain an adaptive-threshold binarized image of the value-added tax ordinary invoice;
the image area positioning module, configured to coarsely position the areas of the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount according to prior position information for these fields, precisely position the areas by adopting horizontal projection and the vertical crossing number distance method, and perform character segmentation and normalization on the precisely positioned areas to obtain the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount to be identified;
the image recognition module, configured to recognize the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount to be identified by using a template feature matching algorithm, obtaining a recognition result.
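Three of the operations named above (R-component extraction, adaptive-threshold binarization, and template matching) can be illustrated with a short pure-Python sketch. The text does not specify the thresholding rule or the matching score, so the local-mean threshold and the pixel-agreement score used here are assumptions, and the function names `red_channel`, `adaptive_binarize`, and `match_template` are hypothetical.

```python
def red_channel(rgb):
    """Take the R component of a 24-bit image given as rows of (R, G, B) tuples."""
    return [[px[0] for px in row] for row in rgb]


def adaptive_binarize(gray, radius=1, offset=0):
    """Assumed rule: a pixel becomes 255 if it is >= the mean of its
    (2*radius+1)^2 neighbourhood minus `offset`, otherwise 0."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            threshold = sum(window) / len(window) - offset
            row.append(255 if gray[y][x] >= threshold else 0)
        out.append(row)
    return out


def match_template(glyph, templates):
    """Return the label of the template that agrees with `glyph` on the most pixels.
    `glyph` and every template are same-sized binary images (lists of rows)."""
    def score(a, b):
        return sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return max(templates, key=lambda label: score(glyph, templates[label]))
```

In this sketch a normalized character image cut out by the positioning module would be passed to `match_template` together with a dictionary of digit templates of the same size.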
Further, the image denoising module is further configured to select a non-local mean kernel and to determine the denoising model proposed in this application from the total variation denoising model and an edge detector, comprising the following steps: establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates (x, y), as in formula (1): ED(μ_xy) = |e(x, y) − e(x, y)|, (x, y) ∈ Ω (1), where ED is close to 0 in smooth regions, becomes larger near edges, and is close to 0 in noisy regions; based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, the denoising model is proposed as follows: where λ is the regularization parameter, $f = \mu^* + \omega$ ($\mu^*$ is the original unknown image and ω is Gaussian noise), and δ is a positive parameter controlling the gradual decay of ψ(ED(μ)) from 2 to 1. The regularization parameter λ balances the approximation term: when λ is sufficiently large, the second term of the model dominates, and when λ → 0, the first term controls the whole objective function. The selection of λ is therefore very important in solving the model; the selection of the regularization parameter is related to the variance of the initially added noise, and the corresponding expression for λ is:
Further, the image denoising module is further configured to perform the step of: obtaining the Lagrange equation of the denoising model (2) by using a gradient descent method: wherein the diffusion function is: let Φ(s) = s ED(u)
Further, in step S2, $\mu_{NN}$ and $\mu_{TT}$ are respectively: where $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$ denote second derivatives and t is the transpose operator; the discrete model given by equation (4) is as follows: and the iteration stop time is determined according to an energy check of the images before and after denoising.
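The general idea of regularized denoising by gradient descent with an energy-based stopping check can be sketched as follows. The formulas of the model described here are not recoverable from this text, so the sketch substitutes the classical smoothed total variation energy, the sum of sqrt(|∇u|² + ε²) plus (λ/2)(u − f)², without the edge-detector-controlled exponent. The function name `tv_denoise`, the parameters `dt`, `eps`, `tol`, `max_iter`, and the relative-energy stopping rule are all assumptions made for illustration.

```python
import math


def tv_denoise(f, lam=0.1, dt=0.1, eps=1.0, tol=1e-4, max_iter=200):
    """Explicit gradient descent on a smoothed total variation energy.
    `f` is the noisy gray image as a list of rows; returns the denoised image."""
    h, w = len(f), len(f[0])

    def grad(u):
        # Forward differences, set to zero across the far border.
        gx = [[u[y][x + 1] - u[y][x] if x + 1 < w else 0.0 for x in range(w)]
              for y in range(h)]
        gy = [[u[y + 1][x] - u[y][x] if y + 1 < h else 0.0 for x in range(w)]
              for y in range(h)]
        return gx, gy

    def energy(u):
        gx, gy = grad(u)
        total = 0.0
        for y in range(h):
            for x in range(w):
                total += math.sqrt(gx[y][x] ** 2 + gy[y][x] ** 2 + eps ** 2)
                total += 0.5 * lam * (u[y][x] - f[y][x]) ** 2
        return total

    u = [[float(v) for v in row] for row in f]
    prev = energy(u)
    for _ in range(max_iter):
        gx, gy = grad(u)
        norm = [[math.sqrt(gx[y][x] ** 2 + gy[y][x] ** 2 + eps ** 2)
                 for x in range(w)] for y in range(h)]
        px = [[gx[y][x] / norm[y][x] for x in range(w)] for y in range(h)]
        py = [[gy[y][x] / norm[y][x] for x in range(w)] for y in range(h)]
        new = [row[:] for row in u]
        for y in range(h):
            for x in range(w):
                # Divergence by backward differences (adjoint of the forward gradient).
                div = px[y][x] - (px[y][x - 1] if x > 0 else 0.0)
                div += py[y][x] - (py[y - 1][x] if y > 0 else 0.0)
                new[y][x] = u[y][x] + dt * (div - lam * (u[y][x] - f[y][x]))
        u = new
        cur = energy(u)
        # Energy check: stop once the energy no longer changes appreciably.
        if abs(prev - cur) <= tol * max(prev, 1e-12):
            break
        prev = cur
    return u
```

A larger `lam` keeps the result closer to the input, while a smaller `lam` smooths more strongly; the step `dt` has to stay small relative to `eps` for the explicit scheme to remain stable.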
In summary, the device provided in this embodiment establishes an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates (x, y) and proposes a denoising model, based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, to perform the denoising treatment, so that a better denoising effect is obtained during recognition of the value-added tax ordinary invoice; and by adopting the horizontal projection and vertical crossing number distance method, the different areas can be positioned more accurately during invoice recognition, so that value-added tax ordinary invoices affected by noise such as a seal can be processed more accurately.
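The horizontal projection step mentioned above can be sketched in a few lines. This illustrates only the projection part; the vertical crossing number distance method is not specified in enough detail here to reproduce. The function names, the convention that ink pixels have value 0, and the `min_count` parameter are assumptions.

```python
def horizontal_projection(binary, ink=0):
    """Count ink pixels in each row of a binarized image (list of rows)."""
    return [sum(1 for px in row if px == ink) for row in binary]


def find_bands(profile, min_count=1):
    """Return (start, end) index pairs of consecutive rows whose ink count
    is at least `min_count`; each pair is a candidate text line."""
    bands = []
    start = None
    for i, count in enumerate(profile):
        if count >= min_count and start is None:
            start = i
        elif count < min_count and start is not None:
            bands.append((start, i - 1))
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands
```

Applying the same two functions to the transposed image gives a column profile, which is one common way to split a located text line into individual characters.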
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
The image recognition device for the value-added tax ordinary invoice can be a mobile phone, a computer, a tablet device and the like.
The image recognition device for the value-added tax ordinary invoice may include one or more of the following components: a processing component, a memory, an image acquisition component, a power supply component, a multimedia component, an audio component, an input/output (I/O) interface, a sensor component, and a communication component.
The processing component generally controls overall operation of the image recognition device for the value-added tax ordinary invoice, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations. The processing component may include one or more processors to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component may include one or more modules that facilitate interactions between the processing component and other components.
The memory is configured to store various types of data to support operations at the device. Examples of such data include instructions for any application or method operating on the device, contact data, phonebook data, messages, pictures, videos, and the like. The memory may be implemented by any type of volatile or nonvolatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
The power supply component provides power to the various components of the device. The power supply component may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device.
The I/O interface provides an interface between the processing component and a peripheral interface module, which may be a keyboard, click wheel, button, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The image acquisition component can be a CCD camera or a scanning component and is used for acquiring the image of the value-added tax ordinary invoice to be identified.
The communication component is configured to facilitate communication between the apparatus and other devices in a wired or wireless manner. The device may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component further comprises a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as a memory, comprising instructions executable by a processor of an apparatus to perform the above-described method. For example, the non-transitory computer readable storage medium may be ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
A non-transitory computer readable storage medium stores instructions which, when executed by a processor of an apparatus, cause the apparatus to perform the image recognition-based value-added tax ordinary invoice reimbursement information generation method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (5)

1. A value-added tax ordinary invoice reimbursement information generation method based on image recognition is characterized by comprising the following steps of:
step 1, establishing a corresponding relation between an electronic invoice and a financial reimbursement subject, and generating an electronic invoice-reimbursement corresponding table, wherein table fields comprise: buyer tax payer identification number, invoice code, invoice number, goods or tax service/service name, invoice amount;
step 2, acquiring an electronic invoice image, and performing preprocessing, denoising, region positioning and template matching on the electronic invoice image to obtain the seller name, seller taxpayer identification number, buyer name, buyer taxpayer identification number, invoice code, invoice number, goods or taxable service name, invoice amount, rechecking information and drawer information in the electronic invoice image; the step 2 specifically comprises the following steps: S1, acquiring an image of the value-added tax ordinary invoice by using a camera to obtain an original 24-bit color image of the value-added tax ordinary invoice, and extracting the R component of the original color image as the gray image to be identified, wherein the gray value of a pixel point on the gray image to be identified is 0 or 255; S2, performing regularization denoising on the gray image to reduce noise points and obtain a denoised gray image, and then performing adaptive-threshold binarization on the denoised gray image to obtain an adaptive-threshold binarized image of the value-added tax ordinary invoice; S3, coarsely positioning the areas of the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount according to prior position information for these fields, precisely positioning the areas by adopting horizontal projection and the vertical crossing number distance method, and performing character segmentation and normalization on the precisely positioned areas to obtain the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount to be identified; S4, recognizing the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount to be identified by using a template feature matching algorithm to obtain a recognition result; the regularization denoising process comprises the following steps: establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates (x, y), as in formula (1): ED(μ_xy) = |e(x, y) − e(x, y)|, (x, y) ∈ Ω (1), where ED is close to 0 in smooth regions, becomes larger near edges, and is close to 0 in noisy regions; based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, proposing the denoising model (2): where λ is the regularization parameter, $f = \mu^* + \omega$, wherein $\mu^*$ is the original unknown image and ω is Gaussian noise, and δ is a positive parameter controlling the gradual decay of ψ(ED(μ)) from 2 to 1; and obtaining the Lagrange equation of the denoising model (2) by using a gradient descent method: wherein the diffusion function is: let Φ(s) = s ED(u)
step 3, comparing the identified buyer taxpayer identification number with the buyer taxpayer identification number in the electronic invoice-reimbursement corresponding table; if the comparison result is consistent, entering step 4; otherwise, ending the generation of the invoice reimbursement information;
step 4, comparing the rechecking information obtained by identification with the drawer information; ending the generation of the invoice reimbursement information if the comparison result is consistent; and automatically filling the invoice code data, the invoice number data, the goods or taxable service name data and the invoice amount data of the electronic invoice image identification result into the corresponding items in the electronic invoice-reimbursement corresponding table if the comparison result is inconsistent.
2. The image recognition-based electronic invoice reimbursement information generation method as recited in claim 1, wherein the Lagrange equation is solved using a partial differential equation-based method: wherein $\mu_{NN}$ is the second derivative in the direction N, and $\mu_{TT}$ is the second derivative in the direction T perpendicular to N.
3. The electronic invoice reimbursement information generation method based on image recognition as claimed in claim 2, wherein $\mu_{NN}$ and $\mu_{TT}$ are respectively:
wherein $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$ denote second derivatives and t is the transpose operator; the discrete model given by equation (4) is as follows:
and the iteration stop time is determined according to an energy check of the images before and after denoising.
4. An image recognition-based value-added tax ordinary invoice reimbursement information generation device is characterized in that the generation device comprises:
the data construction module is used for establishing a corresponding relation between the electronic invoice and the financial reimbursement subjects, generating an electronic invoice-reimbursement corresponding table, and the table fields comprise: buyer tax payer identification number, invoice code, invoice number, goods or tax service/service name, invoice amount;
the data processing module is used for acquiring an electronic invoice image, and performing preprocessing, denoising, region positioning and template matching on the electronic invoice image to obtain the seller name, seller taxpayer identification number, buyer name, buyer taxpayer identification number, invoice code, invoice number, goods or taxable service name, invoice amount, rechecking information and drawer information in the electronic invoice image; the data processing module comprises: the image acquisition module, used for acquiring an image of the value-added tax ordinary invoice to obtain an original 24-bit color image of the value-added tax ordinary invoice, and extracting the R component of the original color image as the gray image to be identified, wherein the gray value of a pixel point on the gray image to be identified is 0 or 255; the image denoising module, used for performing regularization denoising on the gray image to reduce noise points and obtain a denoised gray image, and then performing adaptive-threshold binarization on the denoised gray image to obtain an adaptive-threshold binarized image of the value-added tax ordinary invoice; the image positioning module, used for coarsely positioning the areas of the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount according to prior position information for these fields, precisely positioning the areas by adopting horizontal projection and the vertical crossing number distance method, and performing character segmentation and normalization on the precisely positioned areas to obtain the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount to be identified; the image recognition module, used for recognizing the buyer taxpayer identification number, the invoice code, the invoice number, the invoicing date and the amount to be identified by using a template feature matching algorithm to obtain a recognition result; the regularization denoising process in the image denoising module comprises the following steps: establishing an edge detector $ED(\mu_{xy})$ for the pixel $\mu_{xy}$ at coordinates (x, y), as in formula (1): ED(μ_xy) = |e(x, y) − e(x, y)|, (x, y) ∈ Ω (1), where ED is close to 0 in smooth regions, becomes larger near edges, and is close to 0 in noisy regions; based on the total variation denoising model and the edge detector $ED(\mu_{xy})$, proposing the denoising model (2): where λ is the regularization parameter, $f = \mu^* + \omega$, wherein $\mu^*$ is the original unknown image and ω is Gaussian noise, and δ is a positive parameter controlling the gradual decay of ψ(ED(μ)) from 2 to 1; the image denoising module further obtains the Lagrange equation of the denoising model (2) by using a gradient descent method: wherein the diffusion function is: let Φ(s) = s ED(u)
the data matching module is used for comparing the identified buyer taxpayer identification number with the buyer taxpayer identification number in the electronic invoice-reimbursement corresponding table; if the comparison result is consistent, control passes to the data generation module; otherwise, the generation of the invoice reimbursement information ends;
and the data generation module is used for comparing the rechecking information obtained by identification with the drawer information, ending the generation of the invoice reimbursement information if the comparison result is consistent, and automatically filling the invoice code data, the invoice number data, the goods or taxable service name data and the invoice amount data of the electronic invoice image identification result into the corresponding items in the electronic invoice-reimbursement corresponding table if the comparison result is inconsistent.
5. The image recognition-based value-added tax ordinary invoice reimbursement information generation apparatus as claimed in claim 4, wherein said image denoising module further solves the Lagrange equation using a partial differential equation-based method: wherein $\mu_{NN}$ is the second derivative in the direction N, and $\mu_{TT}$ is the second derivative in the direction T perpendicular to N; in the image denoising module, $\mu_{NN}$ and $\mu_{TT}$ are respectively: wherein $\mu_{xx}$, $\mu_{yy}$ and $\mu_{xy}$ denote second derivatives and t is the transpose operator; the discrete model given by equation (4) is as follows: and the iteration stop time is determined according to an energy check of the images before and after denoising.
CN201911210795.8A 2019-12-02 2019-12-02 Method and device for generating value-added tax ordinary invoice reimbursement information based on image recognition Active CN111222412B (en)

Publications (2)

Publication Number Publication Date
CN111222412A CN111222412A (en) 2020-06-02
CN111222412B true CN111222412B (en) 2023-08-01

Family

ID=70827731

Country Status (1)

Country Link
CN (1) CN111222412B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11030450B2 (en) * 2018-05-31 2021-06-08 Vatbox, Ltd. System and method for determining originality of computer-generated images
CN111861594A (en) * 2020-07-21 2020-10-30 湖南中斯信息科技有限公司 Invoice processing method, invoice processing device, storage medium and processor
CN111784423B (en) * 2020-07-31 2023-08-25 广东电网有限责任公司梅州供电局 Invoice matching method and device, electronic equipment and storage medium
CN113344690A (en) * 2021-06-30 2021-09-03 中国工商银行股份有限公司 Invoice reimbursement processing method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549843A (en) * 2018-03-22 2018-09-18 南京邮电大学 A kind of VAT invoice recognition methods based on image procossing
CN109543690A (en) * 2018-11-27 2019-03-29 北京百度网讯科技有限公司 Method and apparatus for extracting information

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090112743A1 (en) * 2007-10-31 2009-04-30 Mullins Christine M System and method for reporting according to eu vat related legal requirements

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王淑英 ; 李瑞恒 ; .增值税专用发票的使用和管理.河北水利.2012,(07),全文. *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant