CN116052180A - Invoice recognition method and device based on deep learning and electronic equipment - Google Patents

Invoice recognition method and device based on deep learning and electronic equipment

Info

Publication number
CN116052180A
CN116052180A CN202211350976.2A CN202211350976A
Authority
CN
China
Prior art keywords
invoice
recognition
image
area
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211350976.2A
Other languages
Chinese (zh)
Inventor
赵小诣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Financial Technology Co Ltd
Original Assignee
Bank of China Financial Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Financial Technology Co Ltd filed Critical Bank of China Financial Technology Co Ltd
Priority to CN202211350976.2A priority Critical patent/CN116052180A/en
Publication of CN116052180A publication Critical patent/CN116052180A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/15 Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Character Input (AREA)

Abstract

The invention provides an invoice recognition method and device based on deep learning, and electronic equipment. The method comprises the following steps: acquiring an invoice image to be recognized, wherein the invoice image to be recognized comprises a physical invoice image or an electronic invoice image; performing image correction and type recognition on the invoice image based on a deep learning model to obtain a corrected image and an image type of the invoice image; and performing character recognition on each invoice region of the corrected image in combination with the image type to obtain an invoice recognition result of the corrected image. The invention can recognize invoice images of different types, reduces hardware and development costs, and improves recognition efficiency.

Description

Invoice recognition method and device based on deep learning and electronic equipment
Technical Field
The invention relates to the technical field of image recognition of deep learning, in particular to an invoice recognition method and device based on deep learning and electronic equipment.
Background
When performing invoice recognition, the industry generally uses a dedicated invoice recognition module to recognize and parse the invoice image and output the various items of information on the invoice. When users upload invoice image data, differences in how the images are captured often produce different types of perspective distortion. However, most invoice recognition modules on the market require the invoice image to be input in a scan-like form; they have difficulty directly recognizing photographed images of printed invoices, which must undergo a series of processing steps before recognition. This indiscriminately increases the hardware investment required of users.
Therefore, how to recognize different types of invoices in a compatible manner, so as to reduce development costs, is a technical problem that currently needs to be solved.
Disclosure of Invention
The invention provides an invoice recognition method and device based on deep learning, and electronic equipment, which are used for overcoming the defect that the prior art cannot handle different types of invoice recognition compatibly, thereby realizing recognition of both electronic invoices and physical invoices and reducing development costs.
The invention provides an invoice recognition method based on deep learning, which comprises the following steps:
acquiring an invoice image to be recognized, wherein the invoice image to be recognized comprises a physical invoice image or an electronic invoice image;
performing image correction and type recognition on the invoice image based on a deep learning model to obtain a corrected image and an image type of the invoice image;
and performing character recognition on each invoice region of the corrected image in combination with the image type to obtain an invoice recognition result of the corrected image.
According to the deep learning-based invoice recognition method provided by the invention, the deep learning model comprises an invoice body positioning model and an invoice template recognition model;
the performing image correction and type recognition on the invoice image based on the deep learning model to obtain a corrected image and an image type of the invoice image comprises:
performing body positioning on the invoice image based on the invoice body positioning model to obtain a body positioning result, and correcting the invoice image based on the body positioning result to obtain the corrected image;
and performing template recognition on the corrected image based on the invoice template recognition model to obtain a template recognition result, and determining the image type of the corrected image based on the template recognition result.
According to the deep learning-based invoice recognition method provided by the invention, the performing body positioning on the invoice image based on the invoice body positioning model to obtain a body positioning result, and correcting the invoice image based on the body positioning result to obtain the corrected image, comprises:
inputting the invoice image into the invoice body positioning model, and performing coordinate recognition on the table vertices in the invoice image to obtain recognized coordinate points of the vertices;
and aligning the recognized coordinate points with the standard coordinates of the invoice image based on a coordinate transformation method to obtain the body positioning result, and correcting the invoice image based on the body positioning result to obtain the corrected image.
According to the deep learning-based invoice recognition method provided by the invention, the performing character recognition on each invoice region of the corrected image in combination with the image type to obtain the invoice recognition result of the corrected image comprises:
performing character recognition on the invoice head area of the corrected image in combination with the image type to obtain a first recognition result of the invoice head area;
performing character recognition on the invoice tail area of the corrected image to obtain a second recognition result of the invoice tail area;
performing character recognition on the invoice table area of the corrected image to obtain a third recognition result of the invoice table area;
and summarizing the first recognition result, the second recognition result and the third recognition result to obtain the invoice recognition result of the corrected image.
According to the deep learning-based invoice recognition method provided by the invention, the performing character recognition on the invoice table area of the corrected image to obtain a third recognition result of the invoice table area comprises:
intercepting the invoice table area based on the standard coordinates to obtain the invoice body area of the corrected image;
dividing the invoice body area into a detail area and a non-detail area;
performing character recognition on the detail area to obtain a detail recognition result, and performing character recognition on the non-detail area to obtain a non-detail recognition result;
wherein the non-detail area includes a buyer area, a password area and a seller area.
According to the deep learning-based invoice recognition method provided by the invention, the performing character recognition on the detail area to obtain a detail recognition result comprises:
dividing the detail area into a plurality of target areas;
performing character recognition on a first target area to obtain the character information in the first target area, determining a character detection frame based on the character information, and determining the height and width of the character detection frame;
determining a first recognition range based on the height of the character detection frame, and performing character recognition in the first recognition range to obtain a first detail recognition result in the first recognition range;
translating the first recognition range downwards by a target distance to determine a second recognition range, and performing character recognition in the second recognition range to obtain a second detail recognition result in the second recognition range;
wherein the target distance is obtained based on the height of the character detection frame.
According to the deep learning-based invoice recognition method provided by the invention, the performing character recognition in the first recognition range to obtain a first detail recognition result in the first recognition range comprises:
performing character recognition in the first recognition range to obtain the text content and character detection frame information in the first recognition range;
determining the proportion of the character detection frame falling in each target area based on the character detection frame information;
determining the home positions of the text content in the plurality of target areas based on the proportions;
and obtaining the first detail recognition result based on the text content and the home positions of the text content.
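Attributing recognized text to a target area by the proportion of its detection frame can be sketched as follows. This is an illustrative reading of the step above, not the patent's implementation; the function names and the representation of areas as horizontal extents are assumptions.

```python
def overlap_ratio(frame, area):
    """Fraction of the detection frame's width that falls inside a target area.

    frame and area are (x_left, x_right) horizontal extents.
    """
    overlap = max(0.0, min(frame[1], area[1]) - max(frame[0], area[0]))
    width = frame[1] - frame[0]
    return overlap / width if width else 0.0


def home_position(frame, areas):
    """Home position: the index of the target area with the largest proportion."""
    ratios = [overlap_ratio(frame, a) for a in areas]
    return max(range(len(areas)), key=ratios.__getitem__)
```

For example, a detection frame spanning (90, 150) against areas [(0, 100), (100, 220), (220, 300)] overlaps the second area most, so its text content is attributed there.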
The invention also provides an invoice recognition device based on deep learning, which comprises:
an acquisition module, used for acquiring an invoice image to be recognized, wherein the invoice image to be recognized comprises a physical invoice image or an electronic invoice image;
a processing module, used for performing image correction and type recognition on the invoice image based on the deep learning model to obtain a corrected image and an image type of the invoice image;
and a recognition module, used for performing character recognition on each invoice region of the corrected image in combination with the image type to obtain an invoice recognition result of the corrected image.
The invention also provides electronic equipment, comprising a memory, a processor and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the program, implements the deep learning-based invoice recognition method described above.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a deep learning based invoice recognition method as described in any one of the above.
The invention also provides a computer program product comprising a computer program which when executed by a processor implements the deep learning based invoice recognition method as described in any one of the above.
According to the deep learning-based invoice recognition method and device and the electronic equipment, an invoice image to be recognized is acquired, wherein the invoice image to be recognized comprises a physical invoice image or an electronic invoice image; the invoice image is then input into the established deep learning model, and image correction and type recognition are performed on it based on the deep learning model to obtain a corrected image and the image type of the invoice image; finally, character recognition is performed on each invoice region of the corrected image in combination with the image type to obtain an invoice recognition result of the corrected image. The invention can recognize invoice images of different types, reduces hardware and development costs, and improves recognition efficiency.
Drawings
In order to illustrate the invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings described below show some embodiments of the invention, and that a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of an invoice recognition method based on deep learning provided by the invention;
FIG. 2 is a first model structure diagram of the deep learning-based invoice recognition method provided by the invention;
FIG. 3 is a second model structure diagram of the deep learning-based invoice recognition method provided by the invention;
FIG. 4 is a schematic diagram of a text detection frame of the deep learning-based invoice recognition method provided by the invention;
FIG. 5 is a third flow chart of the deep learning-based invoice recognition method provided by the invention;
FIG. 6 is a schematic diagram of the structure of the deep learning-based invoice recognition device provided by the invention;
fig. 7 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the method for identifying an invoice based on deep learning provided by the invention comprises the following steps:
step 110, acquiring an invoice image to be recognized, wherein the invoice image to be recognized comprises a physical invoice image or an electronic invoice image;
step 120, performing image correction and type recognition on the invoice image based on a deep learning model to obtain a corrected image and an image type of the invoice image;
and step 130, performing character recognition on each invoice region of the corrected image in combination with the image type to obtain an invoice recognition result of the corrected image.
First, it should be noted that the execution body of the deep learning-based invoice recognition method provided by the invention may be an electronic device, a component in an electronic device, an integrated circuit, or a chip. The electronic device may be a mobile or non-mobile electronic device. By way of example, the mobile electronic device may be a cell phone, tablet, notebook, palmtop computer, ultra-mobile personal computer (UMPC), netbook or personal digital assistant (PDA), and the non-mobile electronic device may be a server, network attached storage (NAS) or personal computer (PC); the invention is not particularly limited in this respect. The steps of the invention are described in detail below with a computer as the execution body of the deep learning-based invoice recognition method.
In step 110, the invoice image to be recognized may be obtained by photographing a physical invoice, i.e. a printed invoice, or may be an electronic PDF invoice.
The invoice image to be recognized comprises an invoice head, an invoice tail and an invoice body, and records specific information of the invoice, such as the invoice issuer, the buyer, the consumption amount and the consumption date.
In step 120, the deep learning model is a neural network model built on a deep learning algorithm. In this embodiment, one network model handles both invoice body positioning and invoice template recognition for the invoice image to be recognized. The network is highly reusable, has a simple structure and a small number of parameters, and can be deployed directly on a CPU or similar device for fast computation. In addition, the model reduces the need to train on large numbers of invoice image samples, which further reduces development costs.
Using the deep learning model, the invoice image to be recognized can be body-positioned. When the image to be recognized is a physical invoice, the camera angle of manual shooting or wrinkling of the invoice can make the final recognition result inaccurate. Therefore, the deep learning model in this embodiment locates the body part of the invoice in the invoice image, determines and corrects it to obtain a corrected invoice image, and then the corrected image is recognized, which improves the accuracy of invoice recognition.
In addition, the deep learning model in this embodiment can also perform invoice template recognition, i.e. it can accurately judge whether an invoice image is a physical invoice or an electronic invoice, and the two different types of invoice are then processed differently.
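The overall flow described above — body positioning and correction, then template recognition, then type-aware regional recognition — can be sketched as a simple pipeline. All function and label names here are illustrative assumptions; the patent does not name them.

```python
def recognize_invoice(image, locate_and_correct, classify_template, recognize_regions):
    # Step 1: locate the invoice body and correct perspective/wrinkle distortion.
    corrected = locate_and_correct(image)
    # Step 2: template recognition decides physical vs. electronic invoice.
    invoice_type = classify_template(corrected)
    # Step 3: regional character recognition, parameterized by the image type.
    fields = recognize_regions(corrected, invoice_type)
    return {"type": invoice_type, "fields": fields}
```

Passing the three stages in as callables keeps the sketch model-agnostic: any body positioning model, template classifier and OCR routine with these shapes can be plugged in.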
In step 130, character recognition is performed on each invoice region of the corrected image according to the invoice image type obtained in step 120. Characters in this embodiment are recognized using an OCR (optical character recognition) model.
Note that the invoice regions include the invoice head, the invoice tail and the invoice body. For recognition of the invoice head area, the OCR model locates and recognizes the invoice head position of the corrected invoice image, and character recognition is performed with different recognition-area coordinates according to the invoice type. For recognition of the invoice tail, the physical invoice and the electronic invoice share the same invoice tail area template, so a unified OCR model can be used. For recognition of the invoice table, the invoice body can be divided into a detail area and a non-detail area, which are recognized separately. Finally, the recognition results of the areas are summarized to obtain the final recognition result, which is sorted into a json output and displayed, so that the user can check the recognition result of the invoice in time.
According to the deep learning-based invoice recognition method, an invoice image to be recognized is acquired, wherein the invoice image to be recognized comprises a physical invoice image or an electronic invoice image; the invoice image is then input into the established deep learning model, and image correction and type recognition are performed on it to obtain a corrected image and the image type of the invoice image; finally, character recognition is performed on each invoice region of the corrected image in combination with the image type to obtain an invoice recognition result of the corrected image. The invention can recognize invoice images of different types, reduces hardware and development costs, and improves recognition efficiency.
In some alternative embodiments, the deep learning model comprises an invoice body positioning model and an invoice template recognition model;
the performing image correction and type recognition on the invoice image based on the deep learning model to obtain a corrected image and an image type of the invoice image comprises:
performing body positioning on the invoice image based on the invoice body positioning model to obtain a body positioning result, and correcting the invoice image based on the body positioning result to obtain the corrected image;
and performing template recognition on the corrected image based on the invoice template recognition model to obtain a template recognition result, and determining the image type of the corrected image based on the template recognition result.
It can be understood that the deep learning model in this embodiment may comprise two different models, namely the invoice body positioning model and the invoice template recognition model. The two models apply different processing according to their different inputs, respectively correcting the image and recognizing its type.
In the deep learning-based invoice recognition method provided by this embodiment, the two different models respectively perform image correction and type recognition on the different inputs, so that recognition proceeds from the corrected image and its type, which ensures the accuracy and efficiency of invoice recognition.
Referring to fig. 2, the above-mentioned invoice body positioning model and invoice template recognition model are mainly composed of three modules: a cbr convolution module 210, a crc convolution module 220, and a deep convolution module 230. The cbr convolution module 210 includes a first convolution layer 211, a batch normalization layer 212, and a first Relu activation function 213; the crc convolution module 220 includes a second convolution layer 221, a second Relu activation function 222, and a third convolution layer 223; the deep convolution module 230 is composed of a cbr convolution module 210 and a crc convolution module 220.
Referring to fig. 3, fig. 3 is a network structure diagram of the deep learning model of the invention. The network comprises a plurality of cbr convolution modules, crc convolution modules and deep convolution modules; after these modules process the image, tensor concatenation yields different linear layers, and two different linear layers are then concatenated to obtain the final linear layer as the output result.
Specifically, the modules are defined by the following formulas:
conv_cbr: out = relu(bn(conv(in)))
conv_crc: out = conv(relu(conv(in)))
conv_deep: out = conv_crc(conv_cbr(in))
layer_1: x = pictures
layer_out: y = linear(y_cat3)
(The intermediate layer formulas between layer_1 and layer_out, and the table explaining the structures appearing in the formulas, are present only as images in the original publication and cannot be recovered from the text.)
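Assuming a PyTorch-style implementation (the patent does not name a framework), the three modules defined by the formulas above can be sketched as below. Kernel sizes, padding and channel counts are illustrative assumptions; the deep module composes a cbr module followed by a crc module, per the module description.

```python
import torch
import torch.nn as nn


class CBR(nn.Module):
    """conv_cbr: out = relu(bn(conv(in)))"""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(c_out)

    def forward(self, x):
        return torch.relu(self.bn(self.conv(x)))


class CRC(nn.Module):
    """conv_crc: out = conv(relu(conv(in)))"""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv1 = nn.Conv2d(c_in, c_out, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(c_out, c_out, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv2(torch.relu(self.conv1(x)))


class DeepConv(nn.Module):
    """conv_deep: a cbr module followed by a crc module."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.cbr = CBR(c_in, c_out)
        self.crc = CRC(c_out, c_out)

    def forward(self, x):
        return self.crc(self.cbr(x))
```

With 3x3 convolutions and padding 1, each module preserves spatial resolution, so the blocks can be stacked and their outputs concatenated as in the network structure diagram.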
In some alternative embodiments, the performing body positioning on the invoice image based on the invoice body positioning model to obtain a body positioning result, and correcting the invoice image based on the body positioning result to obtain the corrected image, comprises:
inputting the invoice image into the invoice body positioning model, and performing coordinate recognition on the table vertices in the invoice image to obtain recognized coordinate points of the vertices;
and aligning the recognized coordinate points with the standard coordinates of the invoice image based on a coordinate transformation method to obtain the body positioning result, and correcting the invoice image based on the body positioning result to obtain the corrected image.
It can be appreciated that this embodiment is a specific process of image correction based on the invoice body positioning model.
First, the invoice body positioning model is used to recognize the vertex coordinates of the four corners of the invoice table in the input invoice image, and the recognized coordinate points of the vertices are recorded as (x1, y1), (x2, y2), (x3, y3) and (x4, y4).
Then, using the coordinate transformation method, (x1, y1), (x2, y2), (x3, y3) and (x4, y4) are aligned with the standard coordinates to correct the image and obtain the corrected image. The standard coordinates are the table coordinates of a reference invoice image.
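One common realization of this coordinate transformation is a planar homography that maps the four recognized vertices onto the four standard table corners. The sketch below solves for the transform with NumPy only; the function names are illustrative, and in practice a library routine (e.g. OpenCV's getPerspectiveTransform and warpPerspective) would be used to resample the whole image.

```python
import numpy as np


def estimate_homography(src_pts, dst_pts):
    """Solve the 8 unknowns of a projective transform H (h33 fixed to 1)
    from four point correspondences (x_i, y_i) -> (u_i, v_i)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.asarray(A, dtype=float), np.asarray(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def warp_point(H, pt):
    """Apply the homography to one point (with the homogeneous divide)."""
    u, v, w = H @ np.array([pt[0], pt[1], 1.0])
    return (u / w, v / w)
```

Mapping the recognized corners (x1, y1) through (x4, y4) to the standard coordinates of the reference table straightens the photographed invoice; warping every pixel with the same H yields the corrected image.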
In the deep learning-based invoice recognition method provided by this embodiment, the acquired invoice image to be recognized is coordinate-aligned to obtain the corrected image, which reduces recognition errors caused by a poor shooting angle or a wrinkled invoice and improves the recognition accuracy of the invoice image.
In some alternative embodiments, the performing character recognition on each invoice region of the corrected image in combination with the image type to obtain the invoice recognition result of the corrected image comprises:
performing character recognition on the invoice head area of the corrected image in combination with the image type to obtain a first recognition result of the invoice head area;
performing character recognition on the invoice tail area of the corrected image to obtain a second recognition result of the invoice tail area;
performing character recognition on the invoice table area of the corrected image to obtain a third recognition result of the invoice table area;
and summarizing the first recognition result, the second recognition result and the third recognition result to obtain the invoice recognition result of the corrected image.
It can be appreciated that this embodiment is the specific process of character recognition for each invoice region of the corrected image.
First, for recognition of the invoice head area, the invoice head position of the corrected invoice image is located and recognized, and character recognition is performed with different recognition-area coordinates according to the invoice type.
Second, for recognition of the invoice tail, the physical invoice and the electronic invoice share the same invoice tail area template, so a unified OCR model can be used.
Third, for recognition of the invoice table, the invoice body can be divided into a detail area and a non-detail area, which are recognized separately.
Finally, the recognition results of the areas are summarized to obtain the final recognition result, which is sorted into a json output and displayed, so that the user can check the recognition result of the invoice in time.
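A minimal sketch of this final summarizing step, assembling the three regional results into the json output mentioned above; the field names are illustrative assumptions, not from the patent.

```python
import json


def summarize_results(head_result, tail_result, table_result):
    """Merge the first, second and third recognition results into one
    json document for display to the user."""
    merged = {
        "invoice_head": head_result,
        "invoice_tail": tail_result,
        "invoice_table": table_result,
    }
    # ensure_ascii=False keeps Chinese field values readable in the output.
    return json.dumps(merged, ensure_ascii=False, indent=2)
```

The per-region results can be any json-serializable structures, e.g. dicts of field names to recognized text.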
In some alternative embodiments, the performing character recognition on the invoice table area of the corrected image to obtain a third recognition result of the invoice table area comprises:
intercepting the invoice table area based on the standard coordinates to obtain the invoice body area of the corrected image;
dividing the invoice body area into a detail area and a non-detail area;
performing character recognition on the detail area to obtain a detail recognition result, and performing character recognition on the non-detail area to obtain a non-detail recognition result;
wherein the non-detail area includes a buyer area, a password area and a seller area.
In recognition of the invoice table, the corrected image first needs to be cropped with the standard coordinates of the table image in the above embodiment to obtain the body area of the invoice. The invoice body area then needs to be further divided into a detail area and a non-detail area, where the non-detail area includes a buyer area, a password area and a seller area.
Recognizing the non-detail area with the OCR model yields the buyer information, including the buyer's name, taxpayer identification number and telephone number; the password area yields the password information of the invoice; and the seller information includes the seller's name, taxpayer identification number and telephone number.
Recognizing the detail area with the OCR model yields the consumption details of the invoice, such as the consumption type, quantity, unit price, amount and tax.
In the deep learning-based invoice recognition method provided by this embodiment, the table area of the invoice is divided, and character recognition is performed separately on the detail area and the non-detail area, so that recognition results for the different kinds of information in an invoice are obtained. This region-by-region processing reduces the probability of recognition errors and improves recognition accuracy.
In some optional embodiments, the text recognition is performed on the detail area to obtain a detail recognition result, including:
dividing the detail area into a plurality of target areas;
performing character recognition on a first target area to obtain character information in the first target area, determining a character detection frame based on the character information, and determining the height and width of the character detection frame;
determining a first recognition range based on the height of the text detection frame, and performing text recognition in the first recognition range to obtain a first detail recognition result in the first recognition range;
translating the first recognition range downwards by a target distance, determining a second recognition range, and performing character recognition in the second recognition range to obtain a second detail recognition result in the second recognition range;
Wherein the target distance is obtained based on the height of the text detection frame.
It will be appreciated that this embodiment is a specific identification process for an invoice detail area.
First, the detail area is divided into a plurality of target areas according to the vertical lines of the form (typically eight areas, delimited by the seven vertical lines and the frame of the form), which respectively record different types of data in the invoice.
The vertex at the upper left of the invoice main body area may be defined as the origin of coordinates, with the positive x-axis pointing right from the origin and the positive y-axis pointing down. The abscissas of the seven vertical lines of the form may then be denoted x_s1, x_s2, x_s3, x_s4, x_s5, x_s6, and x_s7; that is, the range 0 to x_s1 is the first target area, x_s1 to x_s2 is the second target area, and so on.
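The partition of the detail area by the vertical-line abscissas amounts to a simple interval lookup; a minimal sketch, with illustrative values for x_s1 through x_s7:

```python
import bisect

def column_index(x_center, xs):
    """Return the index of the target area that a point with abscissa
    `x_center` falls into, given the sorted abscissas
    xs = [x_s1, ..., x_s7] of the form's vertical lines (origin at the
    table's upper-left vertex, x increasing rightwards).  Index 0
    covers 0..x_s1, index 1 covers x_s1..x_s2, and so on."""
    return bisect.bisect_right(xs, x_center)

xs = [80, 160, 240, 320, 400, 480, 560]  # illustrative x_s1..x_s7
print(column_index(40, xs))   # first target area -> 0
print(column_index(200, xs))  # between x_s2 and x_s3 -> 2
```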
Then, for the first target area (abscissa range 0 to x_s1), character recognition is performed to obtain the text information in the first target area, for example: accommodation service - accommodation fee. The text detection frame is then determined from the total length and total width of the text, giving a frame of width w and height h with upper-left corner coordinates (m, n). In practice, the width and height of the text detection frame should be slightly greater than the total length and width of the text, to ensure that the text fits within the text detection frame.
After the text detection frame is determined, the first recognition range can be selected for recognition, yielding a recognition result within that range. The width of the first recognition range may match the width of the text detection frame, and its height may span the ordinate range 0 to n + 1.5h. It should be noted that, since there may be a gap of about n between the text detection frame and the top of the table, and a gap of about 0.5h between the frame and the text of the next row (n is roughly equal to 0.5h), the height of the first recognition range is set to n + 1.5h so that no text is missed. Here h is the height of the text detection frame, and n is the ordinate of the upper-left vertex of the text frame.
After the text information of the first recognition range is acquired, a coordinate fusion formula may be used to determine the column to which the text belongs, that is, to determine which target area the text is located in.
After the first recognition range has been recognized, the text detection frame is translated downwards to check whether further service and fee lines exist. The downward translation distance may be chosen as h, giving the second recognition range, whose ordinates span n + h to n + 2.5h; recognition within it yields the result for the second recognition range. By analogy, the range is translated down by h each time until no text recognition result remains, completing the text recognition of the detail area and obtaining the final detail recognition result.
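The sliding recognition loop described above (a first band spanning ordinates 0 to n + 1.5h, then repeated downward translation by h until no text is found) can be sketched as follows; `recognize_in_band` stands in for the actual OCR call and is an assumption:

```python
def recognize_details(recognize_in_band, n, h, max_rows=50):
    """Row-by-row detail recognition: band i spans ordinates
    n + i*h .. n + (i + 1.5)*h (the first band starts at 0).
    `recognize_in_band(top, bottom)` returns the text found in a
    band, or None when the band is empty, which stops the loop."""
    results = []
    for i in range(max_rows):
        top = 0 if i == 0 else n + i * h
        bottom = n + (i + 1.5) * h
        text = recognize_in_band(top, bottom)
        if text is None:
            break
        results.append(text)
    return results

# Toy usage: a fake OCR that yields two detail lines, then nothing.
rows = ["accommodation fee | 2 | 98 | 196", "breakfast | 1 | 20 | 20"]
def make_fake_ocr(texts):
    it = iter(list(texts) + [None])
    return lambda top, bottom: next(it)

print(recognize_details(make_fake_ocr(rows), n=6, h=12))
```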
According to the invoice recognition method based on deep learning provided by this embodiment, line-by-line recognition is carried out on the detail part of the invoice, translating downwards after the recognition of the first recognition range is completed and then recognizing the second recognition range. This better ensures the recognition quality of the detail information and avoids problems such as misaligned or skipped lines caused by adopting a fixed template.
In some optional embodiments, the performing text recognition in the first recognition range to obtain a first detail recognition result in the first recognition range includes:
performing character recognition in the first recognition range to obtain text content and character detection frame information in the first recognition range;
determining the duty ratio of the text detection frame to each target area based on the text detection frame information;
determining home locations of the text content in the plurality of target areas based on the duty cycle;
and obtaining the first detail recognition result based on the text content and the attribution position of the text content.
It will be appreciated that this embodiment is a specific procedure for determining to which target area the text information belongs.
First, character recognition is performed in a first recognition range, and text content and character detection frame information in the first recognition range are obtained. For example, the text content in the first recognition range includes information such as accommodation fee, unit-to-unit, number 2, unit price 98, total amount 196, and the like, which are located in different target areas, respectively. And then, determining the duty ratio of the text detection frame and each target area based on the text detection frame information, determining the attribution positions of the text content in a plurality of target areas based on the duty ratio, and finally obtaining a first detail recognition result based on the text content and the attribution positions of the text content.
For example, referring to fig. 4, the text detection box (text box) in fig. 4 is BCLJ, the first target area (invoice box 1) is EGOM, and the second target area (invoice box 2) is GIQO. This embodiment judges which of the two target areas the text detection box BCLJ overlaps more.
As can be seen from the figure, the overlapping areas between the text detection frame BCLJ and the two target areas are FGKJ and GHLK, respectively; the home position of the text detection frame can be determined by comparing the ratio R of FGKJ to the first target area EGOM with the ratio S of GHLK to the second target area GIQO.
Wherein the calculation formulas for R and S were published as images in the original document (image references BDA0003918852750000171 through BDA0003918852750000176) and are not reproduced here; they compute the ratios of the overlapping areas FGKJ and GHLK to their respective target areas, with the bounding area given by S_bound = S_ACPM.
The invoice recognition method based on deep learning provided by this embodiment can quickly judge how a text detection frame at different positions should be merged into the corresponding table element, providing a fast and accurate basis for backfilling the data of subsequently recognized text. Compared with directly comparing coordinates and computing relative positions, the method of this embodiment is more concise, and it also solves the problem that, when coordinate differences within the table are very small, an accurate judgment threshold is difficult to determine.
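A plain reading of this overlap-ratio judgment (the patent's exact formulas are published only as images) is to compute, for each target area, the fraction of the text detection box that the area covers, and merge the box into the area with the larger fraction. A minimal sketch under that assumption:

```python
def overlap_ratio(box, column):
    """Fraction of the text detection box's area covered by a column
    (target area).  Boxes are (x1, y1, x2, y2).  Since every column
    spans the full row height, only the x-overlap matters here, but
    the 2-D form is kept for generality.  This is a sketch of the
    R/S comparison in the text, not the patent's exact formula."""
    bx1, by1, bx2, by2 = box
    cx1, cy1, cx2, cy2 = column
    w = max(0, min(bx2, cx2) - max(bx1, cx1))
    h = max(0, min(by2, cy2) - max(by1, cy1))
    box_area = (bx2 - bx1) * (by2 - by1)
    return (w * h) / box_area if box_area else 0.0

def assign_column(box, columns):
    """Merge the box into the column that covers the largest
    fraction of it."""
    return max(range(len(columns)),
               key=lambda i: overlap_ratio(box, columns[i]))

cols = [(0, 0, 80, 30), (80, 0, 160, 30)]
print(assign_column((60, 5, 120, 25), cols))  # -> 1
```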
Referring to fig. 5, fig. 5 is a complete flowchart of an invoice recognition method based on deep learning, provided by the invention, including: invoice acquisition flow 510, identification flow 520, detail identification flow 530, and output flow 540;
The invoice acquisition process 510 includes: step 511, acquiring an invoice image;
the identification process 520 includes:
step 521, inputting the invoice image into an invoice main body positioning model for distortion correction to obtain a corrected image;
step 522, inputting the corrected image into an invoice template recognition model to classify the invoice as a machine-printed invoice or an electronic invoice;
step 523, inputting the machine-printed invoice into a machine-printed invoice recognition template;
step 524, inputting the electronic invoice into an electronic invoice recognition template;
step 525, character recognition of the invoice head information;
step 526, character recognition of the invoice tail information;
invoice detail recognition process 530 includes:
step 531, invoice form interception;
step 532, region localization and identification (except for details);
step 533, beginning detail identification;
step 534, identifying the service name and the text detection frame height;
step 535, identifying the present line data;
step 536, translating the text detection frame downwards by h, and identifying the service name;
step 537, judging whether characters exist; if yes, return to step 534; if not, go to step 541;
the output flow 540 includes step 541: and outputting the identification result.
The deep learning-based invoice recognition device provided by the invention is described below, and the deep learning-based invoice recognition device described below and the deep learning-based invoice recognition method described above can be correspondingly referred to each other.
Referring to fig. 6, the deep learning-based invoice recognition device provided by the present invention includes, but is not limited to, the following modules:
an obtaining module 610, configured to obtain an invoice image to be identified, where the invoice image to be identified includes an entity invoice image or an electronic invoice image;
the processing module 620 is configured to perform image correction and type recognition on the invoice image based on a deep learning model, so as to obtain a corrected image and an image type of the invoice image;
and the recognition module 630 is configured to perform text recognition on each invoice region of the corrected image in combination with the image type, so as to obtain an invoice recognition result of the corrected image.
According to the deep learning-based invoice recognition device, the invoice image to be recognized is obtained, wherein the invoice image to be recognized comprises the entity invoice image or the electronic invoice image, then the invoice image is input into the established deep learning model, image correction and type recognition are carried out on the invoice image based on the deep learning model, the correction image and the image type of the invoice image are obtained, and finally the character recognition is carried out on each invoice area of the correction image by combining the image types, so that the invoice recognition result of the correction image is obtained. The invention can identify invoice images of different types, reduces the hardware development cost and improves the identification efficiency.
In some alternative embodiments, the deep learning model includes an invoice body positioning model and an invoice template recognition model;
the processing module comprises a correction sub-module and an identification sub-module:
the correction sub-module is used for carrying out main body positioning on the invoice image based on an invoice main body positioning model to obtain a main body positioning result, and correcting the invoice image based on the main body positioning result to obtain the correction image;
and the identification sub-module is used for carrying out template identification on the correction image based on the invoice template identification model to obtain a template identification result, and determining the image type of the correction image based on the template identification result.
In some alternative embodiments, the correction submodule is specifically configured to:
inputting the invoice image into the invoice main body positioning model, and carrying out coordinate recognition on table vertices in the invoice image to obtain recognition coordinate points of the vertices;
and aligning the identification coordinate point with the standard coordinate of the invoice image based on a coordinate transformation method to obtain a main body positioning result, and correcting the invoice image based on the main body positioning result to obtain the correction image.
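The alignment step above (mapping the recognized table vertices onto the standard coordinates) is, in a typical implementation, a four-point perspective transform; the role usually played by `cv2.getPerspectiveTransform` is shown here as a plain linear solve. The vertex values are illustrative:

```python
import numpy as np

def perspective_transform(src_pts, dst_pts):
    """Solve for the 3x3 homography H mapping the four detected table
    vertices `src_pts` onto the standard coordinates `dst_pts`.  Each
    argument is a list of four (x, y) pairs.  Builds the standard 8x8
    linear system with h33 fixed to 1."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, pt):
    """Apply the homography to one point (homogeneous divide)."""
    u, v, w = H @ np.array([pt[0], pt[1], 1.0])
    return (u / w, v / w)

# Skewed quadrilateral (detected vertices) -> standard rectangle.
src = [(12, 8), (410, 20), (398, 240), (5, 225)]
dst = [(0, 0), (400, 0), (400, 220), (0, 220)]
H = perspective_transform(src, dst)
```

In a full pipeline the resulting H would then be used to warp the whole invoice image (e.g. with `cv2.warpPerspective`), producing the corrected image.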
In some alternative embodiments, the identification module is specifically configured to:
performing character recognition on the invoice head area of the correction image by combining the image type to obtain a first recognition result of the invoice head area;
performing character recognition on the invoice end region of the correction image to obtain a second recognition result of the invoice end region;
performing text recognition on the invoice form area of the correction image to obtain a third recognition result of the invoice form area;
and summarizing the first recognition result, the second recognition result and the third recognition result to obtain an invoice recognition result of the correction image.
In some alternative embodiments, the identification module is further to:
intercepting the invoice form area based on the standard coordinates to obtain an invoice main area of the correction image;
dividing the invoice main area into a detail area and a non-detail area;
performing text recognition on the detail area to obtain a detail recognition result, and performing text recognition on the non-detail area to obtain a non-detail recognition result;
wherein the non-detail area includes a buyer area, a password area, and a seller area.
In some alternative embodiments, the identification module is further to:
dividing the detail area into a plurality of target areas;
performing character recognition on a first target area to obtain character information in the first target area, determining a character detection frame based on the character information, and determining the height and width of the character detection frame;
determining a first recognition range based on the height of the text detection frame, and performing text recognition in the first recognition range to obtain a first detail recognition result in the first recognition range;
translating the first recognition range downwards by a target distance, determining a second recognition range, and performing character recognition in the second recognition range to obtain a second detail recognition result in the second recognition range;
wherein the target distance is obtained based on the height of the text detection frame.
In some alternative embodiments, the identification module is further configured to:
performing character recognition in the first recognition range to obtain text content and text detection frame information in the first recognition range;
determining the duty ratio of the text detection frame to each target area based on the text detection frame information;
Determining home locations of the text content in the plurality of target areas based on the duty cycle;
and obtaining the first detail recognition result based on the text content and the attribution position of the text content.
Fig. 7 illustrates a physical schematic diagram of an electronic device, as shown in fig. 7, which may include: processor 710, communication interface (Communications Interface) 720, memory 730, and communication bus 740, wherein processor 710, communication interface 720, memory 730 communicate with each other via communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform a deep learning based invoice recognition method, the method comprising:
acquiring an invoice image to be identified, wherein the invoice image to be identified comprises an entity invoice image or an electronic invoice image;
performing image correction and type recognition on the invoice image based on a deep learning model to obtain a correction image and an image type of the invoice image;
and carrying out character recognition on each invoice region of the corrected image by combining the image type to obtain an invoice recognition result of the corrected image.
Further, the logic instructions in the memory 730 described above may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand alone product. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product, the computer program product including a computer program, the computer program being storable on a non-transitory computer readable storage medium, the computer program, when executed by a processor, being capable of executing the deep learning-based invoice recognition method provided by the above methods, the method comprising:
Acquiring an invoice image to be identified, wherein the invoice image to be identified comprises an entity invoice image or an electronic invoice image;
performing image correction and type recognition on the invoice image based on a deep learning model to obtain a correction image and an image type of the invoice image;
and carrying out character recognition on each invoice region of the corrected image by combining the image type to obtain an invoice recognition result of the corrected image.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the deep learning based invoice recognition method provided by the above methods, the method comprising:
acquiring an invoice image to be identified, wherein the invoice image to be identified comprises an entity invoice image or an electronic invoice image;
performing image correction and type recognition on the invoice image based on a deep learning model to obtain a correction image and an image type of the invoice image;
and carrying out character recognition on each invoice region of the corrected image by combining the image type to obtain an invoice recognition result of the corrected image.
The apparatus embodiments described above are merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An invoice recognition method based on deep learning is characterized by comprising the following steps:
acquiring an invoice image to be identified, wherein the invoice image to be identified comprises an entity invoice image or an electronic invoice image;
performing image correction and type recognition on the invoice image based on a deep learning model to obtain a correction image and an image type of the invoice image;
and carrying out character recognition on each invoice region of the corrected image by combining the image type to obtain an invoice recognition result of the corrected image.
2. The deep learning based invoice recognition method as claimed in claim 1, wherein the deep learning model includes an invoice body positioning model and an invoice template recognition model;
The image correction and type recognition are carried out on the invoice image based on the deep learning model, so as to obtain a corrected image and an image type of the invoice image, and the method comprises the following steps:
performing main body positioning on the invoice image based on an invoice main body positioning model to obtain a main body positioning result, and correcting the invoice image based on the main body positioning result to obtain the correction image;
and carrying out template recognition on the correction image based on the invoice template recognition model to obtain a template recognition result, and determining the image type of the correction image based on the template recognition result.
3. The deep learning-based invoice recognition method according to claim 2, wherein the performing body positioning on the invoice image based on the invoice body positioning model to obtain a body positioning result, correcting the invoice image based on the body positioning result to obtain the corrected image includes:
inputting the invoice image into the invoice main body positioning model, and carrying out coordinate recognition on table vertices in the invoice image to obtain recognition coordinate points of the vertices;
and aligning the identification coordinate point with the standard coordinate of the invoice image based on a coordinate transformation method to obtain a main body positioning result, and correcting the invoice image based on the main body positioning result to obtain the correction image.
4. The deep learning-based invoice recognition method according to claim 1, wherein the performing text recognition on each invoice region of the corrected image in combination with the image type to obtain an invoice recognition result of the corrected image includes:
performing character recognition on the invoice head area of the correction image by combining the image type to obtain a first recognition result of the invoice head area;
performing character recognition on the invoice end region of the correction image to obtain a second recognition result of the invoice end region;
performing text recognition on the invoice form area of the correction image to obtain a third recognition result of the invoice form area;
and summarizing the first recognition result, the second recognition result and the third recognition result to obtain an invoice recognition result of the correction image.
5. The deep learning-based invoice recognition method as claimed in claim 4, wherein said text recognition of the invoice form area of the corrected image, to obtain a third recognition result of the invoice form area, includes:
intercepting the invoice form area based on the standard coordinates to obtain an invoice main area of the correction image;
Dividing the invoice main area into a detail area and a non-detail area;
performing text recognition on the detail area to obtain a detail recognition result, and performing text recognition on the non-detail area to obtain a non-detail recognition result;
wherein the non-detail area includes a buyer area, a password area, and a seller area.
6. The deep learning-based invoice recognition method as claimed in claim 5, wherein the text recognition is performed on the detail area to obtain a detail recognition result, and the method comprises the following steps:
dividing the detail area into a plurality of target areas;
performing character recognition on a first target area to obtain character information in the first target area, determining a character detection frame based on the character information, and determining the height and width of the character detection frame;
determining a first recognition range based on the height of the text detection frame, and performing text recognition in the first recognition range to obtain a first detail recognition result in the first recognition range;
translating the first recognition range downwards by a target distance, determining a second recognition range, and performing character recognition in the second recognition range to obtain a second detail recognition result in the second recognition range;
Wherein the target distance is obtained based on the height of the text detection frame.
7. The deep learning-based invoice recognition method as claimed in claim 6, wherein said performing text recognition in the first recognition range to obtain a first detail recognition result in the first recognition range includes:
performing character recognition in the first recognition range to obtain text content and character detection frame information in the first recognition range;
determining the duty ratio of the text detection frame to each target area based on the text detection frame information;
determining home locations of the text content in the plurality of target areas based on the duty cycle;
and obtaining the first detail recognition result based on the text content and the attribution position of the text content.
8. An invoice recognition device based on deep learning, which is characterized by comprising:
the system comprises an acquisition module, a recognition module and a storage module, wherein the acquisition module is used for acquiring an invoice image to be recognized, and the invoice image to be recognized comprises an entity invoice image or an electronic invoice image;
the processing module is used for carrying out image correction and type recognition on the invoice image based on the deep learning model to obtain a correction image and an image type of the invoice image;
And the identification module is used for carrying out character identification on each invoice region of the correction image by combining the image type to obtain an invoice identification result of the correction image.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the deep learning based invoice recognition method of any one of claims 1 to 7 when the program is executed by the processor.
10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the deep learning based invoice recognition method of any one of claims 1 to 7.
CN202211350976.2A 2022-10-31 2022-10-31 Invoice recognition method and device based on deep learning and electronic equipment Pending CN116052180A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211350976.2A CN116052180A (en) 2022-10-31 2022-10-31 Invoice recognition method and device based on deep learning and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211350976.2A CN116052180A (en) 2022-10-31 2022-10-31 Invoice recognition method and device based on deep learning and electronic equipment

Publications (1)

Publication Number Publication Date
CN116052180A true CN116052180A (en) 2023-05-02

Family

ID=86130078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211350976.2A Pending CN116052180A (en) 2022-10-31 2022-10-31 Invoice recognition method and device based on deep learning and electronic equipment

Country Status (1)

Country Link
CN (1) CN116052180A (en)

Similar Documents

Publication Publication Date Title
CN109977935B (en) Text recognition method and device
CN107798299B (en) Bill information identification method, electronic device and readable storage medium
CN109829453B (en) Method and device for recognizing characters in card and computing equipment
CN110866495A (en) Bill image recognition method, bill image recognition device, bill image recognition equipment, training method and storage medium
CN102799850B (en) A kind of barcode recognition method and device
CN110674815A (en) Invoice image distortion correction method based on deep learning key point detection
US20200372248A1 (en) Certificate recognition method and apparatus, electronic device, and computer-readable storage medium
CN108717543B (en) Invoice identification method and device and computer storage medium
CN110659647A (en) Seal image identification method and device, intelligent invoice identification equipment and storage medium
TW201543377A (en) Method and apparatus of extracting particular information from standard card
CN106326802B (en) Quick Response Code bearing calibration, device and terminal device
US20220222284A1 (en) System and method for automated information extraction from scanned documents
CN113158895B (en) Bill identification method and device, electronic equipment and storage medium
CN110490190A (en) A kind of structured image character recognition method and system
CN110070491A (en) Bank card picture antidote, device, equipment and storage medium
CN113673519B (en) Character recognition method based on character detection model and related equipment thereof
JP2015191382A (en) Image data processing device, method, and program
CN111783763A (en) Text positioning box correction method and system based on convolutional neural network
CN114332883A (en) Invoice information identification method and device, computer equipment and storage medium
CN114495146A (en) Image text detection method and device, computer equipment and storage medium
CN113221897B (en) Image correction method, image text recognition method, identity verification method and device
CN111199240A (en) Training method of bank card identification model, and bank card identification method and device
CN113487702A (en) Template generation method, image recognition method and device
CN117115823A (en) Tamper identification method and device, computer equipment and storage medium
CN116052180A (en) Invoice recognition method and device based on deep learning and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination