CN114332883A - Invoice information identification method and device, computer equipment and storage medium - Google Patents

Invoice information identification method and device, computer equipment and storage medium

Info

Publication number
CN114332883A
Authority
CN
China
Prior art keywords
invoice
information
sample
text information
key text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210004673.9A
Other languages
Chinese (zh)
Inventor
徐敏
李捷
张玉琦
赵逸如
张瑞雪
周丹雅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Pudong Development Bank Co Ltd
Original Assignee
Shanghai Pudong Development Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Pudong Development Bank Co Ltd filed Critical Shanghai Pudong Development Bank Co Ltd
Priority to CN202210004673.9A
Publication of CN114332883A
Legal status: Pending


Abstract

The application relates to an invoice information identification method and apparatus, a computer-readable storage medium, and a computer program product. The method comprises the following steps: after acquiring an invoice image to be identified, the server inputs the invoice image into a vertex positioning model and obtains vertex coordinate information of the invoice information area of the invoice image; extracts the invoice information area according to the vertex coordinate information of the invoice information area; corrects the invoice information area based on that vertex coordinate information; inputs the corrected invoice information area into an information box detection model and obtains coordinate information of each key text information box in the invoice information area; and extracts the key text information in each key text information box according to the coordinate information of that box. The invoice identification method provided by the application can accurately identify the key information in an invoice.

Description

Invoice information identification method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of big data processing technologies, and in particular, to an invoice information identification method, apparatus, computer device, storage medium, and computer program product.
Background
With the development of image processing technology, data statistics using information extracted from images is widely used.
Taking invoices as an example: an invoice serves as a reimbursement voucher, and identifying the key information on a collected invoice image through image processing technology allows subsequent reimbursement and accounting to be completed quickly. In the related art, key information is identified on collected invoice images only for invoices with a simple, single format.
In practical applications, however, invoices used in different industries and different organizations come in a wide variety of formats, so the related art is not accurate enough at identifying invoice key information when faced with these complex invoice formats.
Disclosure of Invention
The application provides an invoice information identification method, an invoice information identification device, computer equipment, a computer readable storage medium and a computer program product, which can accurately identify key information in an invoice.
In a first aspect, the present application provides an invoice information identification method, including:
inputting the invoice image to be identified into the vertex positioning model, and obtaining vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training a sample invoice image after interference addition processing;
extracting an invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
inputting the invoice information area after correction processing into an information box detection model, and obtaining coordinate information of each key text information box in the invoice information area; the information frame detection model is obtained by training a plurality of sample invoice images with different formats;
and extracting the key text information in the key text information box according to the coordinate information of the key text information box.
In a second aspect, the present application further provides an invoice information recognition apparatus, including:
the first input acquisition module is used for inputting the invoice image to be identified to the vertex positioning model and acquiring vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training a sample invoice image after interference addition processing;
the extraction and correction module is used for extracting the invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
the second input acquisition module is used for inputting the invoice information area subjected to correction processing into the information frame detection model to acquire coordinate information of each key text information frame in the invoice information area; the information frame detection model is obtained by training a plurality of sample invoice images with different formats;
and the extraction module is used for extracting the key text information in the key text information box according to the coordinate information of the key text information box.
In a third aspect, the present application further provides a computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the steps of the method of any one of the above when executing the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method of any one of the above.
In a fifth aspect, the present application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the method of any one of the above.
The application provides an invoice information identification method and apparatus, computer equipment, a computer-readable storage medium, and a computer program product, wherein the method comprises the following steps: after acquiring the invoice image to be identified, the server inputs the invoice image into the vertex positioning model to acquire vertex coordinate information of the invoice information area of the invoice image; extracts the invoice information area according to the vertex coordinate information of the invoice information area; corrects the invoice information area based on that vertex coordinate information; inputs the corrected invoice information area into the information box detection model and obtains coordinate information of each key text information box in the invoice information area; and extracts the key text information in the key text information box according to the coordinate information of the key text information box. The invoice identification method provided by the application performs vertex positioning on the invoice through the vertex positioning model before identifying the invoice key information. Because the vertex positioning model is trained on sample invoice images to which different types of interference have been added, the vertex coordinates of the invoice information area can be extracted quickly and accurately regardless of whether the invoice has a complex background, is folded, deformed, wrinkled, partially occluded, or subject to other interference factors. After the vertex coordinate information of the invoice information area is obtained, the invoice information area can be quickly extracted and corrected, so that the corrected invoice information area is displayed in a form that is as favorable as possible for subsequent information identification, which makes it convenient for the information box detection model to detect the coordinate information of each key text information box in the invoice information area. Because the information box detection model is trained on a plurality of sample invoice images of different formats, the coordinate information of the key text information boxes can be obtained quickly and accurately, and extracting the key text information based on this coordinate information allows the key information in the invoice to be identified accurately.
Drawings
FIG. 1 is a diagram of an exemplary implementation of a method for identifying invoice information;
FIG. 2 is a flow diagram illustrating a method for identifying invoice information in one embodiment;
FIG. 3 is a flow chart illustrating a method for identifying invoice information in another embodiment;
FIG. 4 is a flow chart illustrating a method for identifying invoice information in another embodiment;
FIG. 5 is a flow chart illustrating a method for identifying invoice information in another embodiment;
FIG. 6 is a flow chart illustrating a method for identifying invoice information in another embodiment;
FIG. 7 is a flowchart illustrating an invoice information recognition method according to another embodiment;
FIG. 8 is a flowchart illustrating an invoice information recognition method according to another embodiment;
FIG. 9 is a block diagram of the invoice information recognition device in one embodiment;
FIG. 10 is a diagram showing an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The invoice information identification method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The terminal 102 sends an invoice image to be identified to the server 104 through a network, and the server inputs the invoice image to be identified to the vertex positioning model to obtain vertex coordinate information of an invoice information area of the invoice image to be identified; extracting an invoice information area according to the vertex coordinate information of the invoice information area; correcting the invoice information area based on the vertex coordinate information of the invoice information area; inputting the invoice information area after correction processing into an information box detection model, and obtaining coordinate information of each key text information box in the invoice information area; and extracting the key text information in the key text information box according to the coordinate information of the key text information box. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, scanners and portable wearable devices. The portable wearable device can be a smart watch, smart glasses, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in fig. 2, an invoice information identification method is provided. The method is described by taking its application to the server in fig. 1 as an example, and includes the following steps:
Step S202, inputting the invoice image to be identified into a vertex positioning model, and obtaining vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training on sample invoice images to which a plurality of different types of interference addition processing have been applied.
The invoice image to be identified may be an invoice image captured by a target object (a person who needs to claim reimbursement) with terminal equipment such as a mobile phone, a camera, a computer, a document camera, or a smart band, and the image may contain only one invoice or several invoices; the invoice may be, for example, a taxi invoice, a dining invoice, or a lodging invoice. The vertex positioning model is used to obtain the vertex coordinate information of the invoice image in the invoice to be identified. It should be noted that the boundary of the invoice can be determined from its vertex coordinate information, so that the invoice information area can be extracted quickly, the background irrelevant to the invoice information area can be removed, and the key information in the invoice can subsequently be identified in a more targeted manner. For example, the vertex positioning model may output the vertex coordinate information of the invoice image in the order top left, top right, bottom right, bottom left, which is beneficial for extracting the invoice information area later.
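The disclosure does not specify how the four vertices are ordered; purely as a minimal sketch of the top-left, top-right, bottom-right, bottom-left convention mentioned above, the ordering could be derived from coordinate sums and differences (the function name and approach are assumptions, not part of the patent):

```python
import numpy as np

def order_vertices(pts: np.ndarray) -> np.ndarray:
    """Order four corner points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype=np.float32).reshape(4, 2)
    s = pts.sum(axis=1)              # x + y: smallest at top-left, largest at bottom-right
    d = np.diff(pts, axis=1)[:, 0]   # y - x: smallest at top-right, largest at bottom-left
    return np.array([pts[np.argmin(s)],    # top-left
                     pts[np.argmin(d)],    # top-right
                     pts[np.argmax(s)],    # bottom-right
                     pts[np.argmax(d)]],   # bottom-left
                    dtype=np.float32)
```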
The invoice may be folded, deformed, wrinkled, or damaged because the target object failed to store it properly; the invoice may have a complex background; or, because the target object has many bills to reimburse, the invoice image may be captured after several invoices are stacked and pasted together. The captured invoice image may also be of poor quality due to lighting or equipment problems. All of these factors degrade the extraction of key information from the invoice image. Therefore, before model training, various types of interference addition processing can be applied to the obtained sample invoice images so as to improve the identification accuracy of the vertex positioning model. For example, the sample invoice image may be rotated by different angles; the background of the original invoice may be replaced with a background containing multiple interference factors; a plurality of invoices may be pasted onto one background picture; and multiple invoices may be pasted in a staggered, overlapping manner, among other types of interference addition processing.
It should be noted that, if the invoice image to be identified includes a plurality of invoice images, vertex coordinate information of a plurality of invoice information areas may be obtained through the vertex positioning model.
Step S204, extracting an invoice information area according to the vertex coordinate information of the invoice information area; and performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation conversion processing and/or perspective conversion processing on the invoice image.
The invoice information area is the area that displays all key and non-key information of the invoice. Extracting the invoice information area removes the redundant background of the invoice image, enables targeted identification of the key information in the invoice, improves the efficiency of invoice information identification, and avoids wasting resources. The invoice information area may be extracted by connecting the vertex coordinates in sequence according to the vertex coordinate information of the invoice information area; alternatively, the vertex coordinates may be input in sequence to an invoice information area extraction model to obtain the invoice information area, which is not limited in this application.
Because of the factors described above that may affect subsequent identification, after the vertex coordinate information of the invoice information area is obtained through the vertex positioning model and the invoice information area is extracted according to that information, the extracted area may still be distorted or rotated, and the key information inside it may be folded or otherwise degraded, which is unfavorable for subsequent identification. Therefore, after the invoice information area is obtained, it is corrected, for example by rotation and perspective transformation, so as to obtain an invoice information area with a better display effect and allow the key information in it to be extracted accurately later.
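The disclosure does not name a library for the rotation and perspective transformation; the sketch below shows one common way to implement such a correction step with OpenCV, assuming the four ordered corners produced by the vertex positioning model (the function and variable names are illustrative):

```python
import cv2
import numpy as np

def rectify_invoice_region(image: np.ndarray, corners: np.ndarray) -> np.ndarray:
    """Crop and rectify the invoice information region given its four corners,
    ordered top-left, top-right, bottom-right, bottom-left."""
    tl, tr, br, bl = corners.astype(np.float32)
    width = int(max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl)))
    height = int(max(np.linalg.norm(bl - tl), np.linalg.norm(br - tr)))
    dst = np.array([[0, 0], [width - 1, 0],
                    [width - 1, height - 1], [0, height - 1]], dtype=np.float32)
    matrix = cv2.getPerspectiveTransform(np.array([tl, tr, br, bl], dtype=np.float32), dst)
    return cv2.warpPerspective(image, matrix, (width, height))
```

Because the perspective warp maps the four detected corners onto an axis-aligned rectangle, it removes rotation and perspective distortion in one step, which corresponds to the combined rotation/perspective correction described above.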
It should be noted that, if the invoice image to be identified includes a plurality of invoice images, a plurality of invoice information areas may be extracted according to the vertex coordinate information obtained by the vertex positioning model, and then a plurality of invoice images in the invoice image to be identified may be segmented according to the extracted invoice information areas, so as to identify the key information in the invoice information areas one by one.
Step S206, inputting the invoice information area after correction processing into an information frame detection model, and obtaining coordinate information of each key text information frame in the invoice information area; the information frame detection model is obtained through training of a plurality of sample invoice images of different formats.
After the correction processing yields an invoice information area that is more suitable for key information identification, the invoice information area can be input into the information box detection model (if there are multiple invoice information areas, they can be input one by one) to obtain the coordinate information of the key text information boxes in each invoice information area.
The key text information mainly consists of keys and values. In the image, a key corresponds to a field such as the invoice number, invoice code, amount, car number, mileage, or boarding/alighting time, and a value is the specific content of that field, for example the number 0120389765 corresponding to the invoice number, the code 121212121212 corresponding to the invoice code, or the amount 32 yuan corresponding to the amount field. Because the positions of the keys and values in the invoice image are delimited by text boxes, the information box detection model in the server detects all keys in the invoice information area and the text boxes in which the values corresponding to those keys are located, and obtains the coordinates of the text box of each key and of the value corresponding to that key. The positions of the key information in each invoice information area obtained through the information box detection model provide the premise for subsequently obtaining the key text information at those positions.
It should be noted that the information box detection model is trained on sample invoice images of different formats, for example taxi invoices from across the country, such as taxi invoices from Hangzhou, Beijing, Shanghai, and Guangzhou, so the information box detection model can quickly obtain the coordinate information of the text boxes in which the keys and their corresponding values are located.
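The patent does not disclose the output format of the information box detection model; purely as an illustration of the key/value structure described above, the detections could be represented as follows (the field names, coordinates and scores are hypothetical):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class KeyTextBox:
    """One detected key text information box: either a key (field name) or its value."""
    label: str                       # e.g. "invoice_number" or "amount"
    role: str                        # "key" or "value"
    bbox: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in region pixels
    score: float                     # detection confidence

# Example of what the detection model might return for one taxi invoice region:
detections = [
    KeyTextBox("invoice_number", "key",   (40, 30, 180, 60),   0.98),
    KeyTextBox("invoice_number", "value", (190, 30, 360, 60),  0.97),
    KeyTextBox("amount",         "key",   (40, 300, 120, 330), 0.95),
    KeyTextBox("amount",         "value", (130, 300, 220, 330), 0.96),
]
```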
And step S208, extracting the key text information in the key text information box according to the coordinate information of the key text information box.
After the coordinate information of each key text information box in the invoice information area is obtained through the information box detection model, the key text information at those coordinates can be identified. The key text information may be recognized, for example, by inputting the text box into a text information recognition model, which is not limited in this application.
The application provides an invoice information identification method comprising the following steps: after acquiring the invoice image to be identified, the server inputs the invoice image into the vertex positioning model to acquire vertex coordinate information of the invoice information area of the invoice image; extracts the invoice information area according to the vertex coordinate information of the invoice information area; corrects the invoice information area based on that vertex coordinate information; inputs the corrected invoice information area into the information box detection model and obtains coordinate information of each key text information box in the invoice information area; and extracts the key text information in the key text information box according to the coordinate information of the key text information box. The method performs vertex positioning on the invoice through the vertex positioning model before identifying the invoice key information. Because the vertex positioning model is trained on sample invoice images subjected to multiple types of interference addition processing, the vertex coordinates of the invoice information area can be extracted quickly and accurately regardless of whether the invoice has a complex background, is folded, deformed, wrinkled, partially occluded, or subject to other interference factors. After the vertex coordinate information is obtained, the invoice information area can be quickly extracted and corrected, so that the corrected area is displayed in a form that is as favorable as possible for subsequent information identification and the information box detection model can conveniently detect the coordinate information of each key text information box in it. Because the information box detection model is trained on a plurality of sample invoice images of different formats, the coordinate information of the key text information boxes can be obtained quickly and accurately, and extracting the key text information based on this coordinate information allows the key information in the invoice to be identified accurately.
In an embodiment, as shown in fig. 3, fig. 3 is an alternative embodiment of a method for extracting key text information in a key text information box provided by the present application, where the method embodiment includes the following steps:
step S302, according to the coordinate information of the key text information box, dividing the area where the key text information box is located;
step S304, inputting the area where the segmented key text information box is located into a text information extraction model, and obtaining the key text information in the key text information box.
After the coordinate information of the plurality of key text information boxes in the invoice information area is obtained by the above method, the invoice information area can be segmented based on that coordinate information to obtain the area where each individual key text information box is located, and these areas are then input in turn into the text information extraction model to obtain the key text information corresponding to each box. For example, segmenting the areas where the key text information boxes are located yields the text box areas containing the invoice number, the invoice code, the boarding time, the alighting time, the amount, 32 yuan, 0120389765, 12121212112, and so on; these text box areas are then input in turn to the text information extraction model, which outputs the corresponding key text information such as the invoice number, invoice code, boarding time, alighting time, amount, 32 yuan, 0120389765, and 1212121212.
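As a hedged sketch of this segmentation-and-recognition step, mirroring the hypothetical key/value box structure illustrated earlier, with `recognizer` standing in for the trained text information extraction model (all names here are assumptions):

```python
from typing import Dict, List, Tuple

import numpy as np

Box = Tuple[str, str, Tuple[int, int, int, int]]  # (label, role, bbox), as in the earlier sketch

def crop_box(region: np.ndarray, bbox: Tuple[int, int, int, int]) -> np.ndarray:
    """Cut out the area covered by one key text information box."""
    x_min, y_min, x_max, y_max = bbox
    return region[y_min:y_max, x_min:x_max]

def read_key_text(region: np.ndarray, detections: List[Box], recognizer) -> Dict[str, str]:
    """Feed each segmented key text information box to the text extraction model in turn."""
    results: Dict[str, str] = {}
    for label, role, bbox in detections:
        crop = crop_box(region, bbox)
        results[f"{label}/{role}"] = recognizer(crop)  # recognizer: image -> recognized string
    return results
```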
According to this invoice information identification method, the area where each key text information box is located is segmented using the coordinate information of that box, and the segmented area is then input into the text information extraction model to obtain the text information corresponding to that box. This achieves one-by-one identification of the text information, with high identification efficiency and more accurate key text information.
In an embodiment, as shown in fig. 4, fig. 4 is an alternative embodiment of a method for processing key text information according to an embodiment of the present application, where the embodiment of the method includes the following steps:
step S402, extracting preliminary key text information corresponding to the key text information box according to the coordinate information of the key text information box;
Step S404, performing a verification and regular-expression extraction operation on the preliminary key text information to obtain the key text information in the key text information box; the verification and regular-expression extraction operation is used to convert the preliminary key text information into information that conforms to a standard text form.
Because the key text information in a key text box varies in layout, wording, and format, texts with the same meaning may be expressed in several different ways, laid out differently, or written in different forms. At the same time, based on common conventions, for example that an invoice code is generally 12 digits, a license plate number contains both characters and digits, and a tax number contains only digits, the preliminary key text information obtained by recognition can be verified to check whether it is correct. The key text information can also be processed uniformly, with unified wording, layout, and format, so that financial statements, reimbursement, and other operations can subsequently be performed from the key text information. For example, fields such as the car number and the taxi plate number can all be unified as the license plate number; invoice codes and invoice numbers that wrap across two lines can be expanded into a single line; and boarding and alighting times can be unified into a form such as 2021/12/30/12:30.
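The verification and regular-expression extraction rules are not spelled out in the disclosure; the sketch below only illustrates the kinds of checks and normalizations mentioned above (12-digit invoice code, digits-only tax number, unified license plate and time format), and every field name and rule is an assumption:

```python
import re

def verify_and_normalize(fields: dict) -> dict:
    """Illustrative verification and regular-expression extraction rules."""
    out = dict(fields)

    # Invoice code: keep only digits and check the conventional 12-digit length.
    code = re.sub(r"\D", "", fields.get("invoice_code", ""))
    out["invoice_code"] = code
    out["invoice_code_valid"] = (len(code) == 12)

    # Tax number: by convention digits only, so drop anything else the recognizer produced.
    out["tax_number"] = re.sub(r"\D", "", fields.get("tax_number", ""))

    # Unify "car number" / "taxi plate" style fields under one license-plate key
    # and check that the plate mixes characters and digits.
    plate = (fields.get("car_number") or fields.get("taxi_plate") or "").replace(" ", "").upper()
    out["license_plate"] = plate
    out["license_plate_valid"] = bool(re.search(r"\d", plate) and re.search(r"\D", plate))

    # Unify boarding/alighting times into the 2021/12/30/12:30 style used in the text.
    def unify_time(raw: str) -> str:
        m = re.search(r"(\d{4})\D+(\d{1,2})\D+(\d{1,2})\D+(\d{1,2}):(\d{2})", raw)
        if not m:
            return raw
        y, mo, d, h, mi = m.groups()
        return f"{y}/{int(mo):02d}/{int(d):02d}/{int(h):02d}:{mi}"

    for key in ("boarding_time", "alighting_time"):
        if key in fields:
            out[key] = unify_time(fields[key])
    return out
```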
According to this invoice information identification method, after the key text information is recognized by the text information extraction model, it is further verified and normalized through regular-expression extraction, so that further operations can subsequently be performed from the key text information.
In one embodiment, as shown in fig. 5, fig. 5 is an alternative embodiment of the method for training the vertex positioning model provided in the embodiment of the present application, where the embodiment of the method includes the following steps:
step S502, obtaining a plurality of sample invoice images with different formats and vertex coordinates of sample invoice area calibration in each sample invoice image;
and step S504, performing model training on the initial vertex positioning model based on the vertex coordinates calibrated in the sample invoice images and the sample invoice areas in the sample invoice images to obtain the vertex positioning model.
The sample invoice images of different formats can be obtained from a storage address of the server, and they may be sample invoices from all over the country or various sample invoices from different enterprises and institutions. After the server obtains the sample invoice images of different formats, it can calibrate the vertex coordinates of the sample invoice area in each sample invoice image through a vertex coordinate calibration model or a vertex calibration algorithm, and the vertex positioning model is then obtained by training a neural network model on the sample invoice images of different formats and the calibrated vertex coordinates of the sample invoice areas in those images. Because the vertex positioning model is trained on sample invoice images of many different formats, it can identify the vertex coordinates of invoices of many different formats, which avoids limiting the applicability of the invoice identification method provided by this application.
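The disclosure only says that a neural network model is trained on the calibrated vertex coordinates; one possible realization, assuming PyTorch, a ResNet-18 backbone, and an L2 loss on normalized corner coordinates (none of which are specified in the patent), could look like this:

```python
import torch
import torch.nn as nn
from torchvision import models

class VertexRegressor(nn.Module):
    """Illustrative vertex positioning model: a CNN regressing the 8 values (x, y of four corners)."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)  # torchvision >= 0.13 API; older versions use pretrained=False
        backbone.fc = nn.Linear(backbone.fc.in_features, 8)  # 4 corners * (x, y)
        self.backbone = backbone

    def forward(self, x):
        return self.backbone(x)

def train_vertex_model(loader, epochs=10, lr=1e-4, device="cpu"):
    """Train on (augmented image, calibrated corner coordinates) pairs with an L2 loss."""
    model = VertexRegressor().to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        for images, corners in loader:  # corners: (batch, 8) normalized coordinates
            optimizer.zero_grad()
            loss = criterion(model(images.to(device)), corners.to(device))
            loss.backward()
            optimizer.step()
    return model
```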
In one embodiment, as shown in fig. 6, fig. 6 is an alternative method embodiment for processing an initial sample invoice image provided by the present application, and the method embodiment includes the following steps:
step S602, acquiring a plurality of initial sample invoice images with different formats;
step S604, respectively carrying out interference addition processing on a plurality of initial sample invoice images with different formats to obtain each sample invoice image;
step S606, the sample invoice area in each sample invoice image is subjected to vertex coordinate calibration processing to obtain vertex coordinates calibrated by the sample invoice area in each sample invoice image.
In order to improve the identification accuracy of the vertex positioning model, before training, the server performs interference addition processing on the obtained initial sample invoice images and then performs vertex coordinate calibration processing on the invoice information area of each interference-added initial sample image, thereby obtaining each sample invoice image and the calibrated vertex coordinates of the sample invoice area in each sample invoice image.
The interference addition processing may include, for example: rotating the invoice by various angles, adding a complex background image to the invoice image, overlapping a plurality of invoice images, adding a plurality of invoice images to one background image, adding a plurality of invoices to one background image in a staggered manner, folding the invoice image, adding interference elements such as ink marks, seals and overlapping characters to the invoice image, and adjusting parameters such as the brightness and pixel values of the invoice image, which is not limited in this application. Performing model training with sample invoice images that have undergone interference addition processing makes the resulting vertex positioning model more accurate and improves its recognition effect.
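As an illustrative sketch of a few of the interference addition operations listed above (rotation, pasting onto a complex background, and brightness perturbation), assuming OpenCV and 3-channel images; none of this is mandated by the disclosure:

```python
import random

import cv2
import numpy as np

def add_interference(invoice: np.ndarray, background: np.ndarray) -> np.ndarray:
    """Rotate the invoice, paste it onto a cluttered background, and perturb brightness."""
    # Rotate by a random angle around the invoice centre.
    h, w = invoice.shape[:2]
    angle = random.uniform(-45, 45)
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    rotated = cv2.warpAffine(invoice, matrix, (w, h), borderValue=(255, 255, 255))

    # Paste onto a complex background at a random offset.
    canvas = cv2.resize(background, (w * 2, h * 2)).copy()
    x, y = random.randint(0, w), random.randint(0, h)
    canvas[y:y + h, x:x + w] = rotated

    # Perturb brightness to mimic poor lighting.
    factor = random.uniform(0.6, 1.4)
    return np.clip(canvas.astype(np.float32) * factor, 0, 255).astype(np.uint8)
```

In practice the calibrated vertex coordinates would be transformed with the same rotation matrix and paste offset so that the labels stay aligned with the augmented image.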
In an embodiment, as shown in fig. 7, fig. 7 is an alternative embodiment of a method for training an information frame detection model provided in an embodiment of the present application, where the embodiment of the method includes the following steps:
step S702, acquiring sample invoice information areas in the sample invoice images according to the vertex coordinates marked by the sample invoice areas in the sample invoice images;
step S704, coordinate calibration processing is carried out on the sample key text information boxes in each sample invoice information area, and coordinates of the sample key text information boxes in each sample invoice information area are obtained;
step S706, model training is carried out on the initial information frame detection model based on the coordinates of the sample invoice information areas and the sample key text information frames in the sample invoice information areas, and an information frame detection model is obtained.
When training the information box detection model, the sample invoice images used can be the same as those used for training the vertex positioning model, which uses resources reasonably and avoids the waste of resources and low training efficiency that would result from having to process many different sample invoice images repeatedly. However, the sample invoice images may also be obtained anew when the information box detection model is trained, which is not limited in this application.
For example, when training the information box detection model, the same sample invoice images as those used for the vertex positioning model are used. The sample invoice information area of each sample invoice image is obtained from the calibrated vertex coordinates of that image; coordinate calibration processing is then performed on the sample key text information boxes to obtain their coordinates in each sample invoice information area; and finally the initial information box detection model (a neural network model) is trained on the sample invoice information areas and the coordinates of the sample key text information boxes in each area to obtain the information box detection model. Because the information box detection model is trained on sample invoice images of different formats, it is able to accurately identify the key text information boxes of invoice images of multiple formats, which provides an important premise for subsequently identifying the key text information and thus improves the accuracy of key text information identification.
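The disclosure does not fix a detector architecture; as one assumed realization, a generic object detector such as torchvision's Faster R-CNN could be trained on the calibrated key text information boxes, roughly as sketched below (the data-loading format and hyperparameters are assumptions):

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

def train_box_detector(loader, num_classes, epochs=10, lr=1e-4, device="cpu"):
    """Train a detector on (rectified region image, calibrated key/value boxes) pairs.
    num_classes counts one class per key or value field plus the background class."""
    # torchvision >= 0.13 API; older versions use pretrained=False instead of weights=None
    model = fasterrcnn_resnet50_fpn(weights=None, num_classes=num_classes).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for images, targets in loader:
            # targets: list of dicts with "boxes" (N, 4) and "labels" (N,) tensors
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            losses = model(images, targets)   # dict of detection losses in training mode
            loss = sum(losses.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```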
In an embodiment, as shown in fig. 8, fig. 8 is an alternative embodiment of a method for training a text information extraction model provided in the embodiment of the present application, where the method embodiment includes the following steps:
step S802, obtaining the corresponding area of the sample key text information box and the sample key text information in the area of each sample key text information box according to the coordinates of the sample key text information box in each sample invoice information area;
step S804, model training is carried out on the initial text information extraction model based on the area of each sample key text information box and the sample key text information in the area of each sample key text information box, and a text information extraction model is obtained.
When training the text information extraction model, the sample invoice images used can be the same as those used for training the vertex positioning model, which uses resources reasonably and avoids the waste of resources and low training efficiency that would result from having to process many different sample invoice images repeatedly. However, the sample invoice images may also be obtained anew when the text information extraction model is trained, which is not limited in this application.
For example, when training the text information extraction model, the same sample invoice images as those used for the vertex positioning model are used. The coordinates of the key text information boxes obtained by the information box detection model can then be used to obtain the areas where the corresponding sample key text information boxes are located; the key text information in those areas is calibrated to obtain the sample key text information; and finally the initial text information extraction model is trained on the areas where the sample key text information boxes are located and the sample key text information in those areas to obtain the text information extraction model. Because the text information extraction model is trained on sample invoice images of different formats, the trained model is able to identify the key text information of invoice images of multiple formats, which improves the general applicability of this application.
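The text information extraction model is likewise unspecified; a common choice for reading cropped text boxes is a CRNN-style recognizer trained with a CTC loss, sketched here only under that assumption (character set, input size, architecture and hyperparameters are all illustrative):

```python
import torch
import torch.nn as nn

class SimpleRecognizer(nn.Module):
    """Minimal CRNN-style recognizer: CNN features -> bidirectional LSTM -> per-column character scores."""
    def __init__(self, num_chars: int):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.rnn = nn.LSTM(128 * 8, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, num_chars + 1)  # +1 for the CTC blank symbol

    def forward(self, x):                         # x: (batch, 1, 32, W) grayscale crops
        f = self.cnn(x)                           # (batch, 128, 8, W // 4)
        f = f.permute(0, 3, 1, 2).flatten(2)      # (batch, W // 4, 128 * 8)
        out, _ = self.rnn(f)
        return self.fc(out).log_softmax(2)        # (batch, W // 4, num_chars + 1)

def train_recognizer(loader, num_chars, epochs=10, device="cpu"):
    """Train on (cropped box image, label indices) pairs using CTC loss."""
    model = SimpleRecognizer(num_chars).to(device)
    criterion = nn.CTCLoss(blank=num_chars, zero_infinity=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(epochs):
        for crops, targets, target_lengths in loader:  # targets: concatenated label indices
            log_probs = model(crops.to(device)).permute(1, 0, 2)  # (T, batch, C) as CTCLoss expects
            input_lengths = torch.full((crops.size(0),), log_probs.size(0), dtype=torch.long)
            loss = criterion(log_probs, targets.to(device), input_lengths, target_lengths)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```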
It should be understood that, although the steps in the flowcharts of the embodiments described above are displayed in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in these flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and their execution order is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the application further provides an invoice information identification apparatus for implementing the invoice information identification method described above. The solution provided by the apparatus is similar to that described for the method, so for the specific limitations in the one or more embodiments of the invoice information identification apparatus provided below, reference may be made to the limitations on the invoice information identification method above, and details are not repeated here.
In one embodiment, as shown in fig. 9, there is provided an invoice information identification apparatus including: a first input obtaining module 902, an extraction rectification module 904, a second input obtaining module 906, and an extraction module 908, wherein:
a first input obtaining module 902, configured to input the invoice image to be identified to the vertex positioning model, and obtain vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training a sample invoice image after interference addition processing;
the extraction and correction module 904 is used for extracting the invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
a second input obtaining module 906, configured to input the invoice information area after the correction processing into the information box detection model, and obtain coordinate information of each key text information box in the invoice information area; the information frame detection model is obtained by training a plurality of sample invoice images with different formats;
the extracting module 908 is configured to extract the key text information in the key text information box according to the coordinate information of the key text information box.
In an embodiment, the extracting module 908 is specifically configured to segment an area where the key text information box is located according to the coordinate information of the key text information box; and inputting the area where the segmented key text information box is located into a text information extraction model to obtain the key text information in the key text information box.
In one embodiment, the extraction module 908 further comprises a check extraction unit,
the verification extraction unit is used for extracting preliminary key text information corresponding to the key text information box according to the coordinate information of the key text information box; performing a verification and regular-expression extraction operation on the preliminary key text information to obtain the key text information in the key text information box; the verification and regular-expression extraction operation is used to convert the preliminary key text information into information that conforms to a standard text form.
In one embodiment, the apparatus further includes a model training module, configured to obtain a plurality of sample invoice images of different formats and vertex coordinates of sample invoice region calibration in each sample invoice image; and performing model training on the initial vertex positioning model based on the vertex coordinates calibrated in the sample invoice images and the sample invoice areas in the sample invoice images to obtain the vertex positioning model.
In an embodiment, the model training module is specifically configured to obtain a plurality of initial sample invoice images of different formats; respectively carrying out interference addition processing on a plurality of initial sample invoice images with different formats to obtain each sample invoice image; and carrying out vertex coordinate calibration processing on the sample invoice area in each sample invoice image to obtain the vertex coordinate calibrated in the sample invoice area in each sample invoice image.
In one embodiment, the model training module is further configured to obtain a sample invoice information area in each sample invoice image according to a vertex coordinate calibrated for the sample invoice area in each sample invoice image;
carrying out coordinate calibration processing on the sample key text information boxes in each sample invoice information area to obtain the coordinates of the sample key text information boxes in each sample invoice information area;
and performing model training on the initial information frame detection model based on the coordinates of the sample invoice information areas and the sample key text information frames in the sample invoice information areas to obtain an information frame detection model.
In one embodiment, the model training module is further configured to obtain, according to coordinates of sample key text information boxes in each sample invoice information area, an area where a corresponding sample key text information box is located and sample key text information in the area where each sample key text information box is located; and performing model training on the initial text information extraction model based on the area of each sample key text information box and the sample key text information in the area of each sample key text information box to obtain a text information extraction model.
The modules in the above invoice information identification apparatus may be implemented wholly or partially in software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or independent of, a processor of the computer device, or stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing the reference invoice image. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an invoice information recognition method.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply; a particular computing device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
inputting the invoice image to be identified into the vertex positioning model, and obtaining vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training a sample invoice image after interference addition processing;
extracting an invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
inputting the invoice information area after correction processing into an information box detection model, and obtaining coordinate information of each key text information box in the invoice information area; the information frame detection model is obtained by training a plurality of sample invoice images with different formats;
and extracting the key text information in the key text information box according to the coordinate information of the key text information box.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
dividing the area where the key text information box is located according to the coordinate information of the key text information box; and inputting the area where the segmented key text information box is located into a text information extraction model to obtain the key text information in the key text information box.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
extracting preliminary key text information corresponding to the key text information box according to the coordinate information of the key text information box; performing a verification and regular-expression extraction operation on the preliminary key text information to obtain the key text information in the key text information box; the verification and regular-expression extraction operation is used to convert the preliminary key text information into information that conforms to a standard text form.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring sample invoice images of a plurality of different formats and vertex coordinates of sample invoice area calibration in each sample invoice image; and performing model training on the initial vertex positioning model based on the vertex coordinates calibrated in the sample invoice images and the sample invoice areas in the sample invoice images to obtain the vertex positioning model.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring a plurality of initial sample invoice images with different formats; respectively carrying out interference addition processing on a plurality of initial sample invoice images with different formats to obtain each sample invoice image; and carrying out vertex coordinate calibration processing on the sample invoice area in each sample invoice image to obtain the vertex coordinate calibrated in the sample invoice area in each sample invoice image.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
acquiring a sample invoice information area in each sample invoice image according to the vertex coordinates of the sample invoice area calibration in each sample invoice image;
carrying out coordinate calibration processing on the sample key text information boxes in each sample invoice information area to obtain the coordinates of the sample key text information boxes in each sample invoice information area;
and performing model training on the initial information frame detection model based on the coordinates of the sample invoice information areas and the sample key text information frames in the sample invoice information areas to obtain an information frame detection model.
In one embodiment, the processor, when executing the computer program, further performs the steps of:
obtaining the area where the corresponding sample key text information box is located and the sample key text information in the area where the sample key text information box is located according to the coordinates of the sample key text information box in each sample invoice information area; and performing model training on the initial text information extraction model based on the area of each sample key text information box and the sample key text information in the area of each sample key text information box to obtain a text information extraction model.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
inputting the invoice image to be identified into the vertex positioning model, and obtaining vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training a sample invoice image after interference addition processing;
extracting an invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
inputting the invoice information area after correction processing into an information box detection model, and obtaining coordinate information of each key text information box in the invoice information area; the information frame detection model is obtained by training a plurality of sample invoice images with different formats;
and extracting the key text information in the key text information box according to the coordinate information of the key text information box.
In one embodiment, the computer program when executed by the processor further performs the steps of:
dividing the area where the key text information box is located according to the coordinate information of the key text information box; and inputting the area where the segmented key text information box is located into a text information extraction model to obtain the key text information in the key text information box.
In one embodiment, the computer program when executed by the processor further performs the steps of:
extracting preliminary key text information corresponding to the key text information box according to the coordinate information of the key text information box; performing a verification and regular-expression extraction operation on the preliminary key text information to obtain the key text information in the key text information box; the verification and regular-expression extraction operation is used to convert the preliminary key text information into information that conforms to a standard text form.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring sample invoice images of a plurality of different formats and vertex coordinates of sample invoice area calibration in each sample invoice image; and performing model training on the initial vertex positioning model based on the vertex coordinates calibrated in the sample invoice images and the sample invoice areas in the sample invoice images to obtain the vertex positioning model.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a plurality of initial sample invoice images with different formats; respectively carrying out interference addition processing on a plurality of initial sample invoice images with different formats to obtain each sample invoice image; and carrying out vertex coordinate calibration processing on the sample invoice area in each sample invoice image to obtain the vertex coordinate calibrated in the sample invoice area in each sample invoice image.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a sample invoice information area in each sample invoice image according to the vertex coordinates calibrated for the sample invoice area in each sample invoice image;
carrying out coordinate calibration processing on the sample key text information boxes in each sample invoice information area to obtain the coordinates of the sample key text information boxes in each sample invoice information area;
and performing model training on the initial information frame detection model based on each sample invoice information area and the coordinates of the sample key text information boxes in each sample invoice information area, so as to obtain the information frame detection model.
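As an assumed, non-patent-specified realization of the information frame detection model, an off-the-shelf object detector can be fine-tuned on the calibrated key text information box coordinates, with each field type treated as a class. The torchvision sketch below feeds the whole training set as a single batch purely for brevity.

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Class 0 is background; classes 1..4 stand for illustrative key text fields
# such as invoice code, invoice number, date, and amount.
NUM_CLASSES = 1 + 4

def train_box_detector(images, targets, epochs=5):
    """images: list of (3, H, W) float tensors of rectified invoice areas.
    targets: list of dicts with 'boxes' (N, 4) in xyxy pixels and 'labels' (N,)."""
    model = fasterrcnn_resnet50_fpn(num_classes=NUM_CLASSES)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)
    model.train()
    for _ in range(epochs):
        loss_dict = model(images, targets)  # detection losses in training mode
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model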
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining the area where the corresponding sample key text information box is located and the sample key text information in the area where the sample key text information box is located according to the coordinates of the sample key text information box in each sample invoice information area; and performing model training on the initial text information extraction model based on the area of each sample key text information box and the sample key text information in the area of each sample key text information box to obtain a text information extraction model.
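The text information extraction model can likewise be read as a standard sequence recognizer applied to each cropped box. The CRNN-with-CTC outline below is a common choice presented only as an illustrative assumption, with placeholder layer sizes and character-set size.

import torch
import torch.nn as nn

class CRNN(nn.Module):
    """Convolutional features + bidirectional LSTM + per-timestep classifier,
    trained with CTC loss against the calibrated sample key text."""
    def __init__(self, num_chars):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
        )
        self.rnn = nn.LSTM(128 * 8, 256, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, num_chars + 1)    # +1 for the CTC blank symbol

    def forward(self, x):                           # x: (N, 1, 32, W) grayscale crops
        feat = self.cnn(x)                          # (N, 128, 8, W/4)
        feat = feat.permute(0, 3, 1, 2).flatten(2)  # (N, W/4, 128 * 8)
        seq, _ = self.rnn(feat)                     # (N, W/4, 512)
        return self.fc(seq)                         # per-timestep character logits

# Training would pair each crop with its labelled text and optimize nn.CTCLoss
# over the log-softmax of these logits (details omitted in this sketch).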
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the steps of:
inputting the invoice image to be identified into the vertex positioning model, and obtaining vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training on sample invoice images subjected to interference addition processing;
extracting an invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
inputting the invoice information area after correction processing into an information box detection model, and obtaining coordinate information of each key text information box in the invoice information area; the information frame detection model is obtained by training on a plurality of sample invoice images with different formats;
and extracting the key text information in the key text information box according to the coordinate information of the key text information box.
In one embodiment, the computer program when executed by the processor further performs the steps of:
dividing the area where the key text information box is located according to the coordinate information of the key text information box; and inputting the area where the segmented key text information box is located into a text information extraction model to obtain the key text information in the key text information box.
In one embodiment, the computer program when executed by the processor further performs the steps of:
extracting preliminary key text information corresponding to the key text information box according to the coordinate information of the key text information box; performing a check and regular-expression extraction operation on the preliminary key text information to obtain the key text information in the key text information box; the check and regular-expression extraction operation is used to convert the preliminary key text information into information that conforms to a standard text form.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring sample invoice images of a plurality of different formats and the vertex coordinates calibrated for the sample invoice area in each sample invoice image; and performing model training on the initial vertex positioning model based on each sample invoice image and the vertex coordinates calibrated for the sample invoice area in each sample invoice image, so as to obtain the vertex positioning model.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a plurality of initial sample invoice images with different formats; respectively carrying out interference addition processing on a plurality of initial sample invoice images with different formats to obtain each sample invoice image; and carrying out vertex coordinate calibration processing on the sample invoice area in each sample invoice image to obtain the vertex coordinate calibrated in the sample invoice area in each sample invoice image.
In one embodiment, the computer program when executed by the processor further performs the steps of:
acquiring a sample invoice information area in each sample invoice image according to the vertex coordinates calibrated for the sample invoice area in each sample invoice image;
carrying out coordinate calibration processing on the sample key text information boxes in each sample invoice information area to obtain the coordinates of the sample key text information boxes in each sample invoice information area;
and performing model training on the initial information frame detection model based on each sample invoice information area and the coordinates of the sample key text information boxes in each sample invoice information area, so as to obtain the information frame detection model.
In one embodiment, the computer program when executed by the processor further performs the steps of:
obtaining the area where the corresponding sample key text information box is located and the sample key text information in the area where the sample key text information box is located according to the coordinates of the sample key text information box in each sample invoice information area; and performing model training on the initial text information extraction model based on the area of each sample key text information box and the sample key text information in the area of each sample key text information box to obtain a text information extraction model.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be regarded as falling within the scope of this specification.
The above examples express only several embodiments of the present application, and although their description is relatively specific and detailed, they are not to be construed as limiting the scope of the application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and such variations and modifications fall within the protection scope of the application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. An invoice information identification method, characterized in that the method comprises:
inputting an invoice image to be identified into a vertex positioning model, and obtaining vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training on sample invoice images subjected to various types of interference addition processing;
extracting the invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
inputting the invoice information area after correction processing into an information box detection model, and obtaining coordinate information of each key text information box in the invoice information area; the information frame detection model is obtained by training on a plurality of sample invoice images with different formats;
and extracting the key text information in the key text information box according to the coordinate information of the key text information box.
2. The method according to claim 1, wherein extracting the key text information in the key text information box according to the coordinate information of the key text information box comprises:
dividing the area where the key text information box is located according to the coordinate information of the key text information box;
and inputting the area where the segmented key text information box is located into a text information extraction model to obtain the key text information in the key text information box.
3. The method according to claim 1, wherein extracting the key text information in the key text information box according to the coordinate information of the key text information box comprises:
extracting preliminary key text information corresponding to the key text information box according to the coordinate information of the key text information box;
performing a check and regular-expression extraction operation on the preliminary key text information to obtain the key text information in the key text information box; the check and regular-expression extraction operation is used to convert the preliminary key text information into information conforming to a standard text form.
4. The method according to any one of claims 1-3, wherein the training process of the vertex positioning model comprises:
obtaining a plurality of sample invoice images with different formats and vertex coordinates marked in a sample invoice area in each sample invoice image;
and performing model training on an initial vertex positioning model based on each sample invoice image and vertex coordinates calibrated in a sample invoice area in each sample invoice image to obtain the vertex positioning model.
5. The method according to claim 4, wherein the obtaining of the plurality of sample invoice images with different formats and of the vertex coordinates marked in the sample invoice area in each sample invoice image comprises:
acquiring a plurality of initial sample invoice images with different formats;
respectively carrying out interference addition processing on the plurality of initial sample invoice images with different formats to obtain each sample invoice image;
and carrying out vertex coordinate calibration processing on the sample invoice area in each sample invoice image to obtain the vertex coordinate calibrated in the sample invoice area in each sample invoice image.
6. The method of claim 4, wherein the training process of the information frame detection model comprises:
acquiring a sample invoice information area in each sample invoice image according to the vertex coordinates of the sample invoice area calibration in each sample invoice image;
carrying out coordinate calibration processing on the sample key text information boxes in the sample invoice information areas to obtain coordinates of the sample key text information boxes in the sample invoice information areas;
and performing model training on an initial information frame detection model based on each sample invoice information area and the coordinates of the sample key text information frames in each sample invoice information area, so as to obtain the information frame detection model.
7. The method of claim 6, wherein the training process of the text information extraction model comprises:
obtaining the area where the corresponding sample key text information box is located and the sample key text information in the area where the sample key text information box is located according to the coordinates of the sample key text information boxes in the sample invoice information area;
and performing model training on an initial text information extraction model based on the area of each sample key text information box and the sample key text information in the area of each sample key text information box to obtain the text information extraction model.
8. An invoice information recognition apparatus, characterized in that the apparatus comprises:
the system comprises a first input acquisition module, a first storage module and a second storage module, wherein the first input acquisition module is used for inputting an invoice image to be identified to a vertex positioning model and acquiring vertex coordinate information of an invoice information area of the invoice image to be identified; the vertex positioning model is obtained by training a sample invoice image after interference addition processing;
the extraction correction module is used for extracting the invoice information area according to the vertex coordinate information of the invoice information area; performing correction processing on the invoice information area based on the vertex coordinate information of the invoice information area, wherein the correction processing comprises performing rotation transformation processing and/or perspective transformation processing on an invoice image;
the second input obtaining module is used for inputting the invoice information area subjected to correction processing into the information box detection model to obtain coordinate information of each key text information box in the invoice information area; the information frame detection model is obtained by training a plurality of sample invoice images with different formats;
and the extraction module is used for extracting the key text information in the key text information box according to the coordinate information of the key text information box.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202210004673.9A 2022-01-04 2022-01-04 Invoice information identification method and device, computer equipment and storage medium Pending CN114332883A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210004673.9A CN114332883A (en) 2022-01-04 2022-01-04 Invoice information identification method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210004673.9A CN114332883A (en) 2022-01-04 2022-01-04 Invoice information identification method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114332883A true CN114332883A (en) 2022-04-12

Family

ID=81024795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210004673.9A Pending CN114332883A (en) 2022-01-04 2022-01-04 Invoice information identification method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114332883A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115116060A (en) * 2022-08-25 2022-09-27 深圳前海环融联易信息科技服务有限公司 Key value file processing method, device, equipment, medium and computer program product
CN116824604A (en) * 2023-08-30 2023-09-29 江苏苏宁银行股份有限公司 Financial data management method and system based on image processing
CN116824604B (en) * 2023-08-30 2023-11-21 江苏苏宁银行股份有限公司 Financial data management method and system based on image processing

Similar Documents

Publication Publication Date Title
CN107798299B (en) Bill information identification method, electronic device and readable storage medium
WO2019174130A1 (en) Bill recognition method, server, and computer readable storage medium
CN112528863A (en) Identification method and device of table structure, electronic equipment and storage medium
CN110866495A (en) Bill image recognition method, bill image recognition device, bill image recognition equipment, training method and storage medium
WO2023015922A1 (en) Image recognition model training method and apparatus, device, and storage medium
WO2019071662A1 (en) Electronic device, bill information identification method, and computer readable storage medium
WO2021012382A1 (en) Method and apparatus for configuring chat robot, computer device and storage medium
CN108491866B (en) Pornographic picture identification method, electronic device and readable storage medium
CN114332883A (en) Invoice information identification method and device, computer equipment and storage medium
CN113837151B (en) Table image processing method and device, computer equipment and readable storage medium
US20220092353A1 (en) Method and device for training image recognition model, equipment and medium
CN110689658A (en) Taxi bill identification method and system based on deep learning
WO2020125062A1 (en) Image fusion method and related device
US11727701B2 (en) Techniques to determine document recognition errors
WO2022126978A1 (en) Invoice information extraction method and apparatus, computer device and storage medium
CN113011144A (en) Form information acquisition method and device and server
CN113158895A (en) Bill identification method and device, electronic equipment and storage medium
CN112132812A (en) Certificate checking method and device, electronic equipment and medium
CN114708461A (en) Multi-modal learning model-based classification method, device, equipment and storage medium
CN112862703B (en) Image correction method and device based on mobile photographing, electronic equipment and medium
CN117115823A (en) Tamper identification method and device, computer equipment and storage medium
CN112581344A (en) Image processing method and device, computer equipment and storage medium
CN114495146A (en) Image text detection method and device, computer equipment and storage medium
CN114550189A (en) Bill recognition method, device, equipment, computer storage medium and program product
CN111241974B (en) Bill information acquisition method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination