CN112801041A - Financial data reimbursement method, device, equipment and storage medium - Google Patents

Publication number
CN112801041A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110249954.6A
Other languages
Chinese (zh)
Inventor
王小媞
詹明捷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd

Classifications

    • G06V30/414 Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
    • G06F40/189 Automatic justification
    • G06F40/30 Semantic analysis
    • G06Q40/125 Finance or payroll
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G06V30/10 Character recognition

Abstract

The embodiments of the present application provide a financial data reimbursement method, device, equipment and storage medium. The method includes: identifying an acquired bill image to obtain bill data; identifying an acquired reimbursement bill image to obtain data to be reimbursed; in response to the data to be reimbursed meeting a first reimbursement requirement and the bill data meeting a second reimbursement requirement, determining financial data associated with the reimbursement bill image; and performing reimbursement processing on at least part of the bill data based on the financial data.

Description

Financial data reimbursement method, device, equipment and storage medium
Technical Field
The embodiments of the present application relate to the technical field of character recognition, and in particular, but not exclusively, to a financial data reimbursement method, device, equipment and storage medium.
Background
Reimbursement processes within companies are relatively complicated: after completing the online approval process, the applicant must collate the various bills, which finance staff then review manually. During the review, finance staff spend a great deal of time checking each reimbursement bill against its corresponding bills. In particular, when a single reimbursement bill covers multiple bills, finance staff must devote even more effort to checking in order to avoid errors. This increases the difficulty of review and lengthens the reimbursement cycle for the applicant.
Disclosure of Invention
The embodiment of the application provides a technical scheme for reimbursement of financial data.
The technical scheme of the embodiment of the application is realized as follows:
In a first aspect, an embodiment of the present application provides a financial data reimbursement method, the method including:
identifying an acquired bill image to obtain bill data;
identifying an acquired reimbursement bill image to obtain data to be reimbursed;
in response to the data to be reimbursed meeting a first reimbursement requirement and the bill data meeting a second reimbursement requirement, determining financial data associated with the reimbursement bill image;
and performing reimbursement processing on at least part of the bill data based on the financial data.
An embodiment of the present application provides a financial data reimbursement device, the device including:
a first identification module, configured to identify an acquired bill image to obtain bill data;
a second identification module, configured to identify an acquired reimbursement bill image to obtain data to be reimbursed;
a first determination module, configured to determine financial data associated with the reimbursement bill image in response to the data to be reimbursed meeting a first reimbursement requirement and the bill data meeting a second reimbursement requirement;
and a first reimbursement module, configured to perform reimbursement processing on at least part of the bill data based on the financial data.
Correspondingly, an embodiment of the present application provides a computer storage medium, where computer-executable instructions are stored on the computer storage medium, and after being executed, the computer-executable instructions can implement the above-mentioned method steps.
An embodiment of the present application provides an electronic device, where the electronic device includes a memory and a processor, where the memory stores computer-executable instructions, and the processor can implement the steps of the method when executing the computer-executable instructions on the memory.
The embodiments of the present application provide a financial data reimbursement method, device, equipment and storage medium. The acquired bill image and the acquired reimbursement bill image are identified automatically to obtain the bill data and the data to be reimbursed, so that both images are identified without manual work and the association between the bills can be determined. It is then judged whether the data to be reimbursed meets the first reimbursement requirement and whether the bill data meets the second reimbursement requirement; when both requirements are met, the financial data associated with the reimbursement bill image is determined automatically, so that the bill data is effectively checked. Finally, the bill data is reimbursed using the checked financial data, which realizes automatic reimbursement and saves manpower and material resources.
Drawings
Fig. 1 is a schematic flow chart of an implementation of a financial data reimbursement method according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of another implementation of the financial data reimbursement method according to an embodiment of the present application;
Fig. 3 is a schematic view of an application scenario of the financial data reimbursement method according to an embodiment of the present application;
Fig. 4 is a schematic view of another application scenario of the financial data reimbursement method according to an embodiment of the present application;
Fig. 5 is a schematic view of another application scenario of the financial data reimbursement method according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a financial data reimbursement device according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, specific technical solutions of the present invention will be described in further detail below with reference to the accompanying drawings in the embodiments of the present application. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first/second/third" are used only to distinguish similar objects and do not denote a particular order. Where permitted, the specific order or sequence of "first/second/third" may be interchanged, so that the embodiments of the application described herein can be practiced in an order other than that shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms and expressions used in them are explained as follows.
1) Optical Character Recognition (OCR): a technology that optically converts the printed characters of a paper document into a black-and-white dot-matrix image file, and then uses recognition software to convert the characters in the image into text format for further editing and processing by word-processing software.
2) Document structuring: a structured document is organized by logical structures such as titles, chapters and paragraphs. Structuring builds a framework for a document, much like writing an outline before writing an article. It keeps the document orderly rather than disordered, with the parts closely related to form a whole.
An exemplary application of the financial data reimbursement system provided by the embodiments of the present application is described below, wherein the terminal in the system provided by the embodiments of the present application may be implemented as various types of electronic devices such as a notebook computer with an image capture function, a tablet computer, a desktop computer, a mobile device (e.g., a personal digital assistant, a dedicated messaging device, a portable game device), and the like.
In the following, an exemplary application will be described when the system for reimbursement of financial data is implemented as an electronic device.
Fig. 1 is a schematic flow chart of an implementation of a financial data reimbursement method according to an embodiment of the present application. The method is described below with reference to the steps shown in Fig. 1:
Step S101, identifying the acquired bill image to obtain bill data.
In some embodiments, the bill image includes at least one invoice to be identified; the number of invoices to be identified may be one, two, or more. For example, the bill image is an image of an A4 sheet on which a plurality of invoices are pasted.
In some possible implementations, for each invoice to be identified in the bill image, a bill template matching the invoice is used to perform character recognition on each character area. The association between adjacent character areas in the invoice is then used to perform document structuring on the character recognition result, yielding structured bill data. The association between adjacent character areas can be understood as the correspondence between the fixed text printed on an invoice and the variable text filled into it. For example, if the fixed text in one character area is "purchaser's opening bank and account number", the adjacent character area holds the specific opening bank and account number, such as "Shanghai A Sub-branch 2408668-"; likewise, if the fixed text in one character area is "purchaser name", the adjacent character area holds the name, such as "Company A". Matching the characters in the recognition result according to these correspondences and then outputting them yields structured bill data. When the bill image includes a plurality of invoices to be identified, a structured character recognition result is obtained for each invoice, and together these results form the bill data.
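The pairing of fixed text with the adjacent filled-in text can be sketched as follows. This is a minimal illustration, not the patented method: the `TextRegion` type, the `FIXED_LABELS` set, the coordinates and the nearest-neighbour rule are all assumptions introduced here, and the recognized strings are taken as given rather than produced by an OCR engine.

```python
from dataclasses import dataclass


@dataclass
class TextRegion:
    text: str  # recognized characters in this region
    x: float   # left edge of the region in the invoice image
    y: float   # top edge of the region in the invoice image


# Hypothetical fixed labels that a bill template might define.
FIXED_LABELS = {"purchaser name", "purchaser's opening bank and account number"}


def structure_regions(regions, fixed_labels=FIXED_LABELS):
    """Pair each fixed-label region with the nearest region to its right or below."""
    labels = [r for r in regions if r.text in fixed_labels]
    values = [r for r in regions if r.text not in fixed_labels]
    structured = {}
    for label in labels:
        # Assumption: the filled-in value is never above or left of its label.
        candidates = [v for v in values if v.x >= label.x and v.y >= label.y]
        if not candidates:
            continue
        nearest = min(
            candidates,
            key=lambda v: (v.x - label.x) ** 2 + (v.y - label.y) ** 2,
        )
        structured[label.text] = nearest.text
    return structured


regions = [
    TextRegion("purchaser name", 0, 0),
    TextRegion("Company A", 120, 0),
    TextRegion("purchaser's opening bank and account number", 0, 40),
    TextRegion("Shanghai A Sub-branch", 120, 40),
]
print(structure_regions(regions))
```

The output is a dictionary keyed by the fixed text, which is one simple form that "structured bill data" could take.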
Step S102, identifying the acquired reimbursement bill image to obtain data to be reimbursed.
In some embodiments, the reimbursement bill image is an image whose content includes the reimbursement bill to be identified. A reimbursement bill is the voucher that an enterprise or company uses to reimburse expenses. When an employee pays out of pocket while handling business for the company, the employee fills in an expense reimbursement bill after the business is completed, attaches the invoices to be reimbursed, and is then reimbursed through the finance department. For example, the reimbursement bill image is an image captured of a reimbursement bill filled in by an employee of company A; the image may include only the reimbursement bill, or the reimbursement bill together with other background areas.
In some possible implementations, character recognition is performed on the reimbursement bill to be identified in the reimbursement bill image, and the character recognition result is structured using the semantic relations among the different character areas of the reimbursement bill, so that the characters in the recognition result are matched to one another; that is, the resulting data to be reimbursed is structured. In a specific example, the reimbursement bill is divided into a plurality of character areas along the table lines in the reimbursement bill image, and the associations among the character areas are analyzed to determine the output positions of the characters in the recognition result, that is, which characters are adjacent to which. For example, suppose one character area holds the fixed text "reimbursement item" and the character area matched to it holds the specific item "production fee for January 2021". When the recognition result is output, "reimbursement item" is output first, immediately followed by "production fee for January 2021", based on the association between the two character areas. The structured data to be reimbursed is thus obtained and output.
Step S103, in response to the data to be reimbursed meeting the first reimbursement requirement and the bill data meeting the second reimbursement requirement, determining financial data associated with the reimbursement bill image.
In some embodiments, the first reimbursement requirement is that the number of pieces of approval pass information included in the data to be reimbursed equals a preset value. The number of pieces of approval pass information in the data to be reimbursed is determined, and in response to that number being equal to the preset value, the data to be reimbursed is determined to meet the first reimbursement requirement. The preset value may be determined from the number of important approval items; for example, if the approval items include 10 important items, the preset value may be set to 10. Data to be reimbursed that meets the first reimbursement requirement has therefore passed layer-by-layer approval. The bill data is output after character detection and character recognition are performed on the bill image and the character recognition result is structured; the data to be reimbursed is output in the same way from the reimbursement bill image. The degree of matching between the bill data and the data to be reimbursed is then judged, and if they match, it is further judged whether the bill data satisfies the external reimbursement limit, for example, whether it complies with the finance department's rules on reimbursement items.
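The first reimbursement requirement amounts to a count comparison, which might be sketched as below. The record layout (a list of dictionaries with a `passed` flag) and the function name are assumptions for illustration only.

```python
def meets_first_requirement(approval_records, preset_value):
    """Return True when the number of passed approvals equals the preset value."""
    passed = sum(1 for record in approval_records if record["passed"])
    return passed == preset_value


# Hypothetical data: 10 important approval items, all passed.
records = [{"approver": f"approver_{i}", "passed": True} for i in range(10)]
print(meets_first_requirement(records, preset_value=10))
print(meets_first_requirement(records[:9], preset_value=10))
```

With the preset value set to 10, the full list satisfies the requirement and the nine-record list does not.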
In some possible implementations, the second reimbursement requirement is that the bill data matches the data to be reimbursed and the bill data satisfies the external reimbursement limit. Whether the bill data meets the second reimbursement requirement may be checked by the following process:
First, a first matching degree between the bill data and the data to be reimbursed is determined.
In some embodiments, the reimbursement items in the data to be reimbursed are first classified; then, for each resulting category, the data belonging to that category is searched for in the bill data; finally, it is judged whether the data of that category in the bill data matches the data of that category in the data to be reimbursed.
Second, in response to the first matching degree being greater than or equal to a preset matching degree threshold, it is determined that the bill data matches the data to be reimbursed, and it is then determined whether the bill data satisfies the external reimbursement limit.
In some embodiments, the matching degree is judged for each category in the data to be reimbursed, and if the bill data of every category matches the data to be reimbursed, the bill data is determined to match the data to be reimbursed. For example, if the catering reimbursement amount in the data to be reimbursed is 1000 yuan, the invoices belonging to the catering category are searched for in the bill data and their total amount is determined. If the total amount of the catering invoices is less than or equal to 1000 yuan, the catering data to be reimbursed matches the bill data; if the total amount is greater than 1000 yuan, the catering data to be reimbursed partially matches the bill data; and if no catering invoice is found in the bill data, the data to be reimbursed does not match the bill data.
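The three outcomes in the catering example (matched, partially matched, not matched) can be expressed as a per-category comparison. The sketch below assumes that the data to be reimbursed is a mapping from category to claimed amount and that each invoice is a dictionary with `category` and `amount` keys; these structures and names are illustrative assumptions.

```python
from collections import defaultdict


def match_category(claimed_amount, invoice_amounts):
    """Classify one category following the catering example in the text."""
    if not invoice_amounts:
        return "not matched"
    total = sum(invoice_amounts)
    return "matched" if total <= claimed_amount else "partially matched"


def match_bill_to_claim(claim, invoices):
    """Group invoice amounts by category and compare each with the claimed amount."""
    by_category = defaultdict(list)
    for invoice in invoices:
        by_category[invoice["category"]].append(invoice["amount"])
    return {
        category: match_category(amount, by_category.get(category, []))
        for category, amount in claim.items()
    }


claim = {"catering": 1000, "accommodation": 500}
invoices = [
    {"category": "catering", "amount": 400},
    {"category": "catering", "amount": 500},
]
print(match_bill_to_claim(claim, invoices))
```

Here the catering total of 900 does not exceed the claimed 1000, so catering is matched, while accommodation has no invoices and is not matched.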
Third, in response to the bill data satisfying the external reimbursement limit, it is determined that the bill data meets the second reimbursement requirement.
In some embodiments, if the bill data both matches the data to be reimbursed and satisfies the external reimbursement limit, the bill data is determined to meet the second reimbursement requirement. In this way, the invoices to be identified can be checked automatically through the first to third steps above.
In some possible implementations, the data to be reimbursed is checked by examining whether it includes a certain number of pieces of approval pass information, that is, whether the reimbursement bill has passed layer-by-layer approval. If the data to be reimbursed includes the preset number of pieces of approval pass information, it is determined to meet the first reimbursement requirement, which indicates that the reimbursement bill has passed layer-by-layer approval. Finally, when the data to be reimbursed meets the first reimbursement requirement and the bill data meets the second reimbursement requirement, the financial data is determined. The financial data associated with the reimbursement bill image includes the financial information needed to reimburse the bill data, such as the applicant's information determined from the reimbursement bill image, and the bank account number and payment account number bound to the applicant. The external reimbursement limit can be understood as the requirements that the finance department sets for invoices submitted for reimbursement, for example, the amount of a single invoice, the invoice type and the invoice date. Judging whether each invoice to be identified meets these requirements completes the check of that invoice.
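A check against the external reimbursement limit could look like the sketch below. The text names only the kinds of limit (single-invoice amount, invoice type, invoice date); the concrete values in `EXTERNAL_LIMITS`, the invoice dictionary layout and the function names are hypothetical.

```python
from datetime import date

# Hypothetical limits; the concrete values are not given in the text.
EXTERNAL_LIMITS = {
    "max_single_amount": 5000,
    "allowed_types": {"catering", "accommodation", "taxi"},
    "earliest_date": date(2021, 1, 1),
}


def limit_violations(invoice, limits=EXTERNAL_LIMITS):
    """List the reasons, if any, why one invoice breaks the external limits."""
    reasons = []
    if invoice["amount"] > limits["max_single_amount"]:
        reasons.append("amount exceeds single-invoice limit")
    if invoice["type"] not in limits["allowed_types"]:
        reasons.append("invoice type not reimbursable")
    if invoice["date"] < limits["earliest_date"]:
        reasons.append("invoice date too early")
    return reasons


def satisfies_external_limit(invoices, limits=EXTERNAL_LIMITS):
    """The bill data satisfies the limit only if every invoice has no violation."""
    return all(not limit_violations(invoice, limits) for invoice in invoices)


ok_invoice = {"amount": 300, "type": "catering", "date": date(2021, 1, 15)}
bad_invoice = {"amount": 9000, "type": "gift", "date": date(2020, 12, 1)}
print(limit_violations(ok_invoice))
print(limit_violations(bad_invoice))
```

Returning the list of reasons rather than a bare boolean makes it possible to tell the applicant why an invoice was rejected, which is a design choice of this sketch rather than something the text requires.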
Step S104, performing reimbursement processing on at least part of the bill data based on the financial data.
In some embodiments, for a given category of reimbursement item, if the total amount of the invoices belonging to that category in the bill data is less than or equal to the amount to be reimbursed for that category shown in the reimbursement bill image, the total invoice amount for the category does not exceed the amount filled in on the reimbursement bill. For example, if the catering amount to be reimbursed filled in on the reimbursement bill is 1000 and the total amount of the catering invoices in the bill data is 900, all of these invoices to be identified are reimbursed. If the catering amount to be reimbursed filled in on the reimbursement bill is 1000 and the total amount of the catering invoices in the bill data is 1200, catering invoices totaling 1000 are selected from the bill data and reimbursed, and the remaining catering invoices, totaling 200, are not reimbursed. In some possible implementations, if only part of the bill data is reimbursed, the invoices to be identified that correspond to the remaining data can be determined and fed back to the applicant's terminal, so that the applicant can withdraw those invoices.
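Partial reimbursement within one category can be sketched as splitting the invoices into a reimbursed set and a returned set. The text does not say how the reimbursed subset is chosen, so the greedy in-order rule below is an assumption; it never exceeds the claimed amount but is not guaranteed to reach it exactly. The function name is hypothetical.

```python
def split_for_reimbursement(invoice_amounts, claimed_amount):
    """Greedily accept invoices, in the given order, while the claimed amount is not exceeded."""
    reimbursed, returned = [], []
    running_total = 0
    for amount in invoice_amounts:
        if running_total + amount <= claimed_amount:
            reimbursed.append(amount)
            running_total += amount
        else:
            # This invoice would push the total over the claim; hand it back.
            returned.append(amount)
    return reimbursed, returned


# Total 900 against a claim of 1000: everything is reimbursed.
print(split_for_reimbursement([400, 500], 1000))
# Total 1200 against a claim of 1000: invoices worth 200 are returned.
print(split_for_reimbursement([600, 400, 200], 1000))
```

The returned list corresponds to the remaining invoices that would be fed back to the applicant's terminal.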
In the embodiments of the present application, the acquired bill image and the acquired reimbursement bill image are identified automatically to obtain the bill data and the data to be reimbursed, so that both images are identified without manual work and the association between the bills can be determined. After the data to be reimbursed and the bill data have been checked, the financial data associated with the reimbursement bill image is determined automatically, so that the bill data is effectively checked. Finally, the bill data is reimbursed using the checked financial data, which realizes automatic reimbursement and saves manpower and material resources.
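The control flow of steps S103 and S104 can be summarized in a short sketch. All four callables passed to `process_claim` are hypothetical stand-ins for the checks and actions described above, and the data passed in is assumed to have been recognized already in steps S101 and S102.

```python
def process_claim(data_to_reimburse, bill_data,
                  check_first, check_second,
                  lookup_financial_data, reimburse):
    """Gate reimbursement on both requirements, then reimburse with the financial data."""
    if not check_first(data_to_reimburse):
        return None  # first reimbursement requirement not met
    if not check_second(bill_data, data_to_reimburse):
        return None  # second reimbursement requirement not met
    financial_data = lookup_financial_data(data_to_reimburse)
    return reimburse(bill_data, financial_data)


# Toy stand-ins, for illustration only.
result = process_claim(
    {"applicant": "employee_1"},
    [{"amount": 900}],
    check_first=lambda claim: True,
    check_second=lambda bills, claim: True,
    lookup_financial_data=lambda claim: {"account": "hypothetical-account"},
    reimburse=lambda bills, fin: ("paid", sum(b["amount"] for b in bills), fin["account"]),
)
print(result)
```

Passing the checks in as functions keeps the sketch independent of how each requirement is actually evaluated.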
In some embodiments, the invoice is matched against the bill templates in a bill template library to obtain a confidence for each template, the bill template with high confidence is taken as the template to be called, and bill identification is carried out automatically, which saves manpower and material resources. That is, step S101 may be implemented by the steps shown in Fig. 2. Fig. 2 is a schematic flow chart of another implementation of the financial data reimbursement method provided in an embodiment of the present application, described below with reference to the steps shown in Figs. 1 and 2:
Step S201, extracting the image area where each invoice to be identified is located in the bill image to obtain at least one area image.
In some embodiments, one invoice to be identified corresponds to one area image. The bill image may contain one or more invoices to be identified. The bill image may be obtained by capturing an image of a sheet of paper on which one or more invoices to be identified are pasted, or by capturing images of a plurality of invoices to be identified and stitching the captured images together. For example, the categories of invoices to be identified in the bill image include value-added tax invoices, electronic invoices, special invoices, common invoices (such as catering invoices, accommodation invoices, taxi invoices, gas station refueling invoices or stationery invoices), machine-printed invoices and the like.
In the bill image, the image area where each invoice to be identified is located is determined and cropped out, yielding a plurality of area images, each of whose content includes one invoice to be identified. In some possible implementations, if the bill image includes 3 invoices to be identified, the image areas where the 3 invoices are located are extracted separately to obtain the area image of each invoice. For example, if the bill image includes a catering ticket, an accommodation ticket and a taxi ticket, the image areas where the three invoices are located are cropped out separately to obtain 3 area images: one including the catering ticket, one including the accommodation ticket and one including the taxi ticket.
In some possible implementations, after the image areas where the invoices to be identified are located are cropped out of the bill image, a plurality of image areas are obtained. In response to an area image not being upright, the content of the area image is corrected, and the corrected image is used as the area image. Correcting the content of each image area in this way makes the content upright and easy to identify, and can be realized through the following process:
First, in the bill image, the image area where each invoice to be identified is located is cropped out to obtain at least two cropped images.
In some embodiments, the vertices of the image area where each invoice to be identified is located are found in the bill image through target detection, and the image area is cropped out using those vertices to obtain the area image. For example, the 4 vertices of each invoice to be identified are detected in the bill image, and the region they enclose is cropped out to obtain a cropped image of that invoice. For example, if the bill image includes a catering ticket, an accommodation ticket and a taxi ticket, the 4 vertices of each of the 3 invoices are detected and the corresponding image areas are cropped out: the image area of the catering ticket is cropped out using its 4 detected vertices, the image area of the accommodation ticket using its 4 detected vertices, and the image area of the taxi ticket using its 4 detected vertices.
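As a rough illustration of cropping an invoice by its detected vertices, the sketch below takes four vertices and cuts the enclosing axis-aligned rectangle out of an image stored as a nested list of pixel rows. The vertex detector is assumed to exist and is not shown; `crop_by_vertices` and the toy image are hypothetical, and a real system would more likely apply a perspective transform to the quadrilateral.

```python
def crop_by_vertices(image, vertices):
    """Crop the axis-aligned bounding box of the given (x, y) vertices.

    `image` is a list of rows, each row a list of pixel values.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Clamp the box to the image bounds.
    left = max(min(xs), 0)
    right = min(max(xs), len(image[0]) - 1)
    top = max(min(ys), 0)
    bottom = min(max(ys), len(image) - 1)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 4x5 "image" whose pixel value encodes its position as row * 10 + column.
image = [[r * 10 + c for c in range(5)] for r in range(4)]
invoice_vertices = [(1, 1), (3, 1), (3, 2), (1, 2)]
print(crop_by_vertices(image, invoice_vertices))
```

Running the crop once per detected invoice yields one cropped image per invoice, as in the three-ticket example.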
Second, in response to a cropped image not being upright, the content of the cropped image is corrected, and the corrected image is used as the area image.
In some embodiments, after each invoice to be identified in the bill image has been cropped out to obtain a cropped image, angle correction is applied to the content of the cropped image, such as characters, numbers or patterns: the tilt angle of the content is corrected to 0 degrees, that is, the content is rotated upright so that tilted characters or numbers become upright. In other implementations, if the cropped image itself is tilted, its tilt angle is likewise adjusted to 0 degrees so that the cropped image is also upright. Correcting the tilt of both the cropped image of the invoice to be identified and the characters within it makes the content of the resulting area image upright and easier to identify.
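The tilt correction can be illustrated on the vertex coordinates alone. The sketch below estimates the tilt from the top edge of an invoice and rotates the four vertices about their centroid so that the tilt becomes 0 degrees. The function names and the vertex ordering (top-left, top-right, bottom-right, bottom-left) are assumptions; rotating the actual pixels would normally be delegated to an image library.

```python
import math


def tilt_angle_degrees(top_left, top_right):
    """Angle of the top edge relative to the horizontal axis, in degrees."""
    dx = top_right[0] - top_left[0]
    dy = top_right[1] - top_left[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_degrees):
    """Rotate a point about a center by the given angle."""
    theta = math.radians(angle_degrees)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (
        center[0] + dx * math.cos(theta) - dy * math.sin(theta),
        center[1] + dx * math.sin(theta) + dy * math.cos(theta),
    )


def deskew_vertices(vertices):
    """Rotate the vertices about their centroid so that the top edge is horizontal."""
    angle = tilt_angle_degrees(vertices[0], vertices[1])
    cx = sum(x for x, _ in vertices) / len(vertices)
    cy = sum(y for _, y in vertices) / len(vertices)
    return [rotate_point(v, (cx, cy), -angle) for v in vertices]


# A rectangle tilted by 45 degrees: top-left, top-right, bottom-right, bottom-left.
tilted = [(0, 0), (10, 10), (5, 15), (-5, 5)]
upright = deskew_vertices(tilted)
print(tilt_angle_degrees(tilted[0], tilted[1]))
print(tilt_angle_degrees(upright[0], upright[1]))
```

After correction the top edge has a tilt of approximately 0 degrees, which is the upright state the text describes.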
And S202, identifying the area image to obtain the bill data.
In some embodiments, the overall recognition is performed on each regional image, a bill template with a high confidence coefficient is called from a preset bill target library for character detection and character recognition, and the association relationship between different character regions in the regional image is combined to perform structural processing on the character recognition result, so that structured bill data is obtained.
In some possible implementation manners, the identification of the region image is implemented by searching a ticket template with a higher matching degree with the region image in a ticket template library, that is, the step S202 may be implemented by:
Step S231, obtaining the invoice category to which the area image belongs.
In some embodiments, for each of the obtained area images, the invoice category to which the area image belongs is obtained by classifying the invoice to be identified presented in the area image. The invoice category is, for example, a value-added tax invoice, an electronic ticket, a special invoice or a general invoice.
Step S232, searching a preset bill template library for a target bill template matched with the invoice category.
In some embodiments, according to the invoice category, the confidence between each bill template of that category in the preset bill template library and the area image is determined, and a bill template whose confidence is greater than or equal to a preset confidence threshold is used as the target bill template of the area image.
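The template lookup of step S232 can be sketched as follows. This is a minimal illustration under stated assumptions: templates are plain dictionaries with a "category" key, the confidence is supplied by a caller-provided function `confidence_fn`, and, when several templates pass the threshold, the one with the highest confidence is chosen. All of these names and the tie-breaking choice are hypothetical and not taken from the application.

```python
def find_target_template(templates, invoice_category, confidence_fn, threshold=0.8):
    """Return the most confident template of the category, or None.

    None means that no template of this category reaches the preset
    confidence threshold, i.e. the target bill template is not found.
    """
    scored = [(confidence_fn(t), t)
              for t in templates
              if t["category"] == invoice_category]
    if not scored:
        return None  # the library has no template of this category
    best_conf, best = max(scored, key=lambda pair: pair[0])
    return best if best_conf >= threshold else None
```

A `None` result corresponds to the case handled later in the description, where the target bill template is not found and a fallback path is taken.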
Step S233, in response to the target bill template being found, performing character recognition on the character areas in the area image based on the target bill template to obtain a character recognition result.
In some embodiments, finding a target bill template in the preset bill template library means that the library contains a bill template whose confidence with the area image is greater than or equal to the preset confidence threshold; the target bill template is then called to perform character detection and character recognition on each character area in the area image, so as to obtain a character recognition result. As shown in fig. 3, the bill image includes 3 invoices to be identified. For the invoice 301 to be identified, whose invoice category is a special invoice, a target bill template with a high confidence with the area image of the invoice 301 is searched for among the bill templates in the preset bill template library. By calling the target bill template, character detection is performed on the area image of the invoice 301 to obtain each character area that contains characters; for example, each character area can be selected with a rectangular frame or a frame of another shape. Character recognition is then performed on each character area, for example using OCR technology, to obtain the character recognition result. The character recognition result includes the characters in every character area in the area image.
Step S234, obtaining the bill data based on the character recognition result and the association relationships between different character areas.
In some embodiments, in the case that the bill image includes a single area image, the association relationships between adjacent character areas in the invoice to be identified are determined according to the target bill template of the area image. Based on these association relationships, the character recognition result is structured to match the characters, so that a structured character output result is obtained. Take the invoice 303 to be identified in fig. 3 as an example; the invoice 303 is a special invoice for the xxx road and bridge toll. For the character area 331 containing "invoice code", the adjacent character areas include the character area 332 containing "111111111111" and the character area 333 containing "invoice number". When the association relationship between the adjacent character areas 331 and 332 is found to be a dependency relationship, the relative output positions of the characters in the two areas are determined and the characters are matched accordingly: the characters in the character area 332 are output in the same row as, and after, the characters in the character area 331. When the association relationship between the adjacent character areas 331 and 333 is found to be an independent relationship, the characters in the character area 331 and the characters in the character area 333 are output in two separate rows, independent of each other.
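The row layout described for the invoice in fig. 3 can be sketched as below. The sketch is a simplification under stated assumptions: character areas arrive in reading order, only consecutive areas are compared, and the relation labels "dependent" and "independent" as well as the function name are hypothetical.

```python
def structure_text(regions, relations):
    """Group recognized text into output rows.

    regions:   list of (region_id, text) in reading order.
    relations: dict mapping (id_a, id_b) to "dependent" or "independent".
    A region that depends on the previous one joins its row; any other
    region (independent or unknown relation) starts a new row.
    """
    rows = []
    prev_id = None
    for region_id, text in regions:
        if prev_id is not None and relations.get((prev_id, region_id)) == "dependent":
            rows[-1].append(text)
        else:
            rows.append([text])
        prev_id = region_id
    return [" ".join(row) for row in rows]
```

With the three character areas of the example, the value of the invoice code lands in the same row as its label, and the "invoice number" label starts a row of its own.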
In the case that the bill image includes multiple area images, the data of the different invoices need to be associated with each other in order to match the characters in the character recognition result. In some possible implementations, different invoices to be identified are associated through the dates in the multiple area images. For example, if the invoices to be identified include travel invoices, the departure time and the end time are determined from the invoices to be identified so that the invoices of the same trip are associated together. In other implementations, the output dimension of the character recognition result can be set according to the requirements of the reimbursement items in the reimbursement process, so that the character recognition result is output as a whole according to the requirements of different reimbursement items; this facilitates subsequent accounting of the amount of each type of invoice. For example, for traffic invoices, it can be judged whether the reimbursement amount exceeds the limit, or whether the traffic charges incurred in one day are within the limit. It can also be judged whether the total amount of the bills within a certain period meets the requirement, for example, whether the travel invoices incurred within one week exceed the limit. Alternatively, whether the amount of a single invoice exceeds the limit can be checked from the structured character output result, for example, whether the amount of a catering invoice is within the amount limit.
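The per-day and per-week checks mentioned above can be sketched with the standard library. The record layout (dictionaries with "category", "date" and "amount" keys), the function names and the use of ISO calendar weeks are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import date

def daily_totals(invoices):
    """Sum invoice amounts per calendar day."""
    totals = defaultdict(float)
    for inv in invoices:
        totals[inv["date"]] += inv["amount"]
    return dict(totals)

def over_limit_days(invoices, category, daily_limit):
    """Days on which the total of one invoice category exceeds the limit."""
    selected = [inv for inv in invoices if inv["category"] == category]
    return sorted(day for day, total in daily_totals(selected).items()
                  if total > daily_limit)

def weekly_total(invoices, category, iso_year, iso_week):
    """Total of one invoice category within one ISO calendar week."""
    return sum(inv["amount"] for inv in invoices
               if inv["category"] == category
               and tuple(inv["date"].isocalendar())[:2] == (iso_year, iso_week))
```

A single-invoice check is simply a comparison of one invoice's amount with its limit, so it is not shown separately.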
In this way, by calling a target bill template with a high confidence from the preset bill template library and performing character detection, character recognition and document structuring on the area image, the characters of the bill can be recognized and associated automatically, which saves manpower and material resources.
In some embodiments, if the target bill template is not found in the preset bill template library, the bill data can be obtained in the following two ways. The first way is shown in steps S235 to S237:
Step S235, in response to the target bill template not being found, performing character recognition on the area image to obtain a first global recognition result.
In some embodiments, a target bill template with a high confidence is searched for in the preset bill template library according to the confidence between the area image and each bill template; if the confidences of all bill templates in the preset bill template library are lower than the preset confidence threshold, the target bill template is not found. This can happen, for example, because invoices of the same category differ between regions, because the preset bill template library is updated too slowly to cover the current variants of a bill category, or because the preset bill template library contains no template for the invoice (for example, none of the same category). In these cases, when bill template matching is performed on the area image, even the bill template with the highest confidence does not reach the preset confidence threshold. Character recognition is then performed on the area image as a whole to obtain an overall character recognition result, namely the first global recognition result; the first global recognition result can subsequently be corrected and verified manually in combination with the semantic information of the area image to obtain an accurate recognition result.
Step S236, adjusting the first global recognition result based on the semantic information in the area image to obtain an intermediate output result, and taking the intermediate output result as the bill data.
In some embodiments, the semantic information in the area image describes the picture content of the area image and indicates the semantics of each object in that content, including low-level feature semantics such as the color, texture and shape of the area image, as well as attribute features. In some possible implementations, after overall character recognition is performed on the area image, the typesetting of the characters in the first global recognition result is adjusted according to the semantic information in the area image, so that the resulting intermediate output result is consistent with the semantic information and can be used as the bill data. Alternatively, the intermediate output result is sent to a collation node so as to obtain the bill data from the collation node; that is, the process proceeds to step S237.
Step S237, sending the area image and the intermediate output result to a collation node to acquire the bill data from the collation node.
In some embodiments, the collation node may be a manual collation node, to which the area image and the intermediate output result are sent. For example, the intermediate output result and the area image are sent to the manual collation node of the financial staff, where the actual face content of the invoice is entered manually; or the financial staff corrects the intermediate output result based on the area image to obtain accurate bill data.
Steps S235 to S237 above provide one way of "recognizing the at least two region images to obtain the bill data": if the preset bill template library does not include the target bill template, the area image is checked manually to obtain bill data with higher accuracy.
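The control flow of steps S235 to S237 can be summarized in a short sketch. The three callables (`ocr_fn`, `adjust_fn`, `collation_fn`) are hypothetical placeholders for the global character recognition, the semantic adjustment and the collation node; the application does not define such interfaces.

```python
def recognize_without_template(region_image, ocr_fn, adjust_fn, collation_fn=None):
    """Fallback path used when no target bill template is found.

    S235: recognize the whole area image (first global recognition result).
    S236: adjust that result using semantic information of the image.
    S237: optionally hand image and intermediate result to a collation node.
    """
    first_global_result = ocr_fn(region_image)
    intermediate = adjust_fn(region_image, first_global_result)
    if collation_fn is None:
        # The intermediate output result is used directly as the bill data.
        return intermediate
    # The collation node returns the checked bill data.
    return collation_fn(region_image, intermediate)
```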
The second way is shown in steps S238 to S240:
Step S238, in response to the target bill template not being found, outputting return prompt information to obtain an invoice image corresponding to the area image.
In some embodiments, in the case that the preset bill template library does not include the target bill template, the area image that is not matched to a target bill template may be returned to the input end, so that the reimburser re-enters an image with higher picture quality, namely a high-quality invoice image corresponding to the area image. To this end, prompt information is output to notify the reimburser that an invoice has been returned. If the type of the invoice to be identified in the area image can be recognized, the prompt information is generated based on the picture content of the area image so that it matches the area image; for example, if the invoice to be identified in the area image is a toll ticket, the prompt information can be a message such as "the toll ticket has been returned". The prompt information can be output in the form of text, voice or image.
Step S239, determining invoice information of the invoice image.
In some embodiments, the invoice information includes: the invoice header, the serial code and number, the copy number and its purpose, the customer name, the bank account number, the name of the goods (products) or business item, the unit of measure, the quantity, the unit price, the amount, the amount in words and in figures, the handler, the entity's seal, the invoicing date and the like.
Step S240, searching the preset bill template library for a bill template matched with the invoice information, taking that bill template as the target bill template, and performing character recognition on the area image to obtain the bill data.
In some embodiments, the bill template matched with the invoice information is a bill template whose layout information matches the invoice information. For example, if the invoice information indicates that the invoice is a traffic ticket, the traffic bill templates are searched for in the bill template library; then, by analyzing the specific bill content included in the invoice information, a bill template whose typesetting is highly similar to that content is selected from the traffic bill templates as the target bill template. Finally, character recognition is performed on the area image based on the target bill template to obtain the bill data.
Steps S238 to S240 above provide another way of obtaining the target bill template: the area image that is not matched to a target bill template in the preset bill template library is returned, and the reimburser is prompted to re-enter the invoice image corresponding to the area image. With the re-entered high-quality invoice image, a target bill template can be matched for the invoice to be identified, which improves the success rate of bill template matching and, in turn, the accuracy of the character recognition performed on the area image.
In other embodiments, after the area image that is not matched to a target bill template is returned in step S238, the reimburser may also be prompted to enter the invoice information corresponding to the invoice in the area image themselves. For example, the reimburser selects the option matching the area image from a plurality of invoice information options provided by the system, or enters the invoice information manually. In this way, when bill template matching is performed on the area image, the confidence between the area image and the bill templates does not need to be evaluated; the target bill template matched with the invoice information can be called directly, which improves the speed and accuracy of bill template matching.
In some embodiments, in the case that no target bill template exists in the preset bill template library, a new bill template may be generated based on the invoice information of the invoice to be identified, so as to update the preset bill template library; this may be implemented through the following process:
First, in response to no bill template matched with the invoice information being found, a new bill template is generated based on the invoice information.
In some embodiments, if the confidence between the typesetting of every bill template in the preset bill template library and the invoice information is less than the confidence threshold, or if none of the bill template categories in the preset bill template library corresponds to the category of the invoice to be identified indicated by the invoice information, it is determined that no bill template matched with the invoice information is found in the preset bill template library. In this case, a new bill template may be generated by analyzing the invoice information. For example, the preset bill template library may include the category of the invoice to be identified, but because different regions typeset invoices of the same category differently, the existing bill templates of that category do not match the invoice to be identified; the typesetting of the invoice can then be analyzed according to the invoice information to generate a new bill template. Alternatively, if the invoice is a niche invoice whose category does not exist in the preset bill template library, the typesetting of the niche invoice can be analyzed according to the invoice information to generate a new bill template.
Second, the new bill template is added to the preset bill template library.
In some embodiments, after a new bill template is generated by analyzing the invoice information, the new bill template is added to the preset bill template library. In some possible implementations, the bill templates in the preset bill template library can also be screened at regular intervals, so that the bill template library is kept up to date.
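The update of the template library can be sketched as a small in-memory store. The class name, its methods and the screening rule (dropping templates that have not been used for longer than a given period) are assumptions made for illustration; the application only states that templates are added and periodically screened.

```python
from datetime import datetime, timedelta

class TemplateLibrary:
    """Hypothetical in-memory bill template library."""

    def __init__(self):
        self._entries = {}  # name -> {"template": ..., "last_used": datetime}

    def add(self, name, template, now):
        """Add (or replace) a bill template."""
        self._entries[name] = {"template": template, "last_used": now}

    def get(self, name, now):
        """Return a template and record that it was used at time `now`."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        entry["last_used"] = now
        return entry["template"]

    def prune(self, now, max_idle):
        """Remove templates unused for longer than max_idle; return their names."""
        stale = [name for name, entry in self._entries.items()
                 if now - entry["last_used"] > max_idle]
        for name in stale:
            del self._entries[name]
        return stale

    def names(self):
        return sorted(self._entries)
```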
In the embodiments of the present application, the preset bill template library is updated so that it keeps up with iterative changes in invoice typesetting, and bill templates with high confidence can therefore be matched to the area images.
In some embodiments, for an acquired reimbursement bill image whose picture content includes the reimbursement bill to be identified, the content of the reimbursement bill may be recognized in the following two ways to obtain structured reimbursement bill content; that is, step S102 may be implemented in the following two ways:
The first way is as follows:
Step S121, identifying the table lines in the reimbursement bill image to obtain a plurality of table areas formed by intersecting table lines.
In some embodiments, the reimbursement bill image includes many table lines, and different table lines intersect to form rectangles. The reimbursement bill image is therefore divided so that each rectangle forms one table area, and a plurality of table areas is obtained.
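How intersecting table lines yield table areas can be sketched for the simplest case. The sketch assumes that line detection has already produced the x coordinates of the vertical lines and the y coordinates of the horizontal lines, and that every line spans the whole table (no merged cells); both the function name and these simplifications are assumptions.

```python
def table_regions(vertical_xs, horizontal_ys):
    """Return one (left, top, right, bottom) rectangle per table cell.

    Cells are listed row by row, from top-left to bottom-right.
    """
    xs = sorted(set(vertical_xs))
    ys = sorted(set(horizontal_ys))
    regions = []
    for top, bottom in zip(ys, ys[1:]):
        for left, right in zip(xs, xs[1:]):
            regions.append((left, top, right, bottom))
    return regions
```

Each returned rectangle can then be cropped and passed to character recognition on its own, as described in the next step.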
Step S122, recognizing the characters in each table area to obtain a table recognition result.
In some embodiments, the characters in each table area are recognized by performing OCR on that table area. As shown in fig. 4, by identifying the table lines in the reimbursement bill image 401, the reimbursement bill image 401 is divided into a plurality of table areas, for example, the table area 402 where "business trip name" is located, the table area 403 where "start/stop point" is located, and the table area 404 where "subtotal" is located.
Step S123, matching the characters in the table recognition results corresponding to different table areas based on the association relationships between the different table areas to obtain the data to be reimbursed.
In some embodiments, the association relationships between different table areas are determined by performing semantic analysis on the multiple table areas. For example, the content of the table areas adjacent to any given table area in the reimbursement bill image is analyzed to determine their association relationships with that table area. As shown in fig. 4, for the table area 405 where "business trip allowance" is located, the adjacent table areas include: the table area 406 of "zhang san", the table area 403 of "start and stop points", the table area 407 of "standard", the table area 408 of "days", the table area 409 of "amount", and the table area 410 of "accommodation fee". By performing semantic analysis on the picture content of these table areas, it is determined that the table area 405 is independent of the other adjacent table areas, whereas the table area 405, the table area 407, the table area 408 and the table area 409 are associated with each other. Based on these association relationships, the output positions of the characters in the table recognition result are adjusted so that the characters in the adjusted output are associated with each other; that is, structured data to be reimbursed is output. For example, in the output data to be reimbursed, "business trip allowance" is output in one row, and "standard" and "50" are output in the next row; "days" and "3" are output in the same row as "standard", separated by a semicolon; and "amount" and "150" are also output in the same row as "standard", separated by a semicolon.
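The output layout of the example above (a heading row followed by one row of semicolon-separated entries) can be sketched as follows. The function name is hypothetical, and joining each field name and its value with a single space is an assumption; the description only fixes the row placement and the semicolon separator.

```python
def format_group(header, fields):
    """Format one group of associated table areas.

    header: text of the heading table area.
    fields: list of (name, value) pairs from the associated table areas.
    Returns two rows: the header, then the pairs separated by semicolons.
    """
    detail = "; ".join(f"{name} {value}" for name, value in fields)
    return f"{header}\n{detail}"
```

Calling `format_group(header, [("standard", "50"), ("days", "3"), ("amount", "150")])` places the three associated entries in the row below the header text.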
In the first way, the table lines in the reimbursement bill image are identified and each area enclosed by intersecting table lines is treated as a unit, so that the characters in each table area are recognized in a more targeted manner; the characters in each table area are then analyzed through semantic analysis or association analysis, which yields higher accuracy.
The second way is as follows:
Step S124, determining the reimbursement bill type of the reimbursement bill in the reimbursement bill image.
In some embodiments, reimbursement bill types include, but are not limited to: office expenses, travel expenses, rental expenses, consulting fees, special expenses, daily expenses and the like. In some possible implementations, the reimbursement bill type is determined by recognizing the header name of the reimbursement bill image; for example, if the header name is a travel reimbursement bill, the reimbursement bill type is the travel class. Alternatively, the reimbursement bill type is determined by recognizing the fixed fields in the reimbursement bill image.
Step S125, searching a preset layout template library for a target layout template matched with the reimbursement bill type.
In some embodiments, since the reimbursement bills of each institution usually have fixed layout templates, once the reimbursement bill type is determined, the layout template corresponding to that type can be looked up in the preset layout template library. For example, if the reimbursement bill is a travel reimbursement bill, the layout template belonging to the travel class is searched for in the preset layout template library to obtain the target layout template.
Step S126, in response to the target layout template being found, determining a reference area including a fixed field and an area to be identified including a variable field.
In some embodiments, the reference area is an area containing a fixed field. As shown in fig. 4, since "business trip name" in the table area 402 is a fixed field, the table area 402 is a reference area; since "zhang san" in the table area 406 is not a fixed field, the table area 406 is not a reference area. The area to be identified is used for entering the variable field that matches the fixed field. The area to be identified associated with a reference area contains field content that is associated with the characters of the reference area and is a variable field. For example, the fixed field in the table area 402 is "business trip name", and the field associated with it is the specific name of the person on the business trip, "zhang san"; therefore, the area to be identified associated with the table area 402 is the table area 406.
Step S127, recognizing the characters in the reimbursement bill image based on the reference area and the area to be identified to obtain the data to be reimbursed.
In some embodiments, after the target layout template is determined, the reference area marked in the target layout template and the area to be identified associated with the reference area can be obtained by analysis. Calling the target layout template of the same type from the preset layout template library according to the reimbursement bill type improves the efficiency of character recognition on the reimbursement bill image.
In some embodiments, in the process of matching a layout template to the reimbursement bill image, the target area to be identified of a reference area can be obtained by first performing character recognition on the whole image and then searching the character recognition result for the part that matches the reference area; this can be implemented through the following steps:
First, overall recognition is performed on the characters in the reimbursement bill image to obtain a second global recognition result.
In some embodiments, in the process of matching a layout template to the reimbursement bill image, OCR technology is used to perform overall character recognition on the reimbursement bill image to obtain a character recognition result, namely the second global recognition result. As shown in fig. 4, the second global recognition result is the result of performing overall character recognition on the reimbursement bill image 401.
Second, the partial recognition result matched with each reference area is searched for in the second global recognition result.
In some embodiments, the fixed field of each marked reference area, namely the partial recognition result, is looked up in the second global recognition result. As shown in fig. 4, taking the table area 402 as a reference area, the partial recognition result matching this reference area is "business trip name".
Third, based on the partial recognition result, the target area to be identified associated with the reference area corresponding to the partial recognition result is determined.
In some embodiments, a recognition result associated with the partial recognition result is determined in the second global recognition result, and the area to be identified to which the associated recognition result belongs is the target area to be identified. For example, if the partial recognition result is "business trip name" and the recognition result associated with it in the second global recognition result is "zhang san", the target area to be identified is the table area 406 where "zhang san" is located.
Fourth, based on the association relationship between each reference area and its target area to be identified, the fixed fields located in the reference areas are matched with the variable fields located in the target areas to be identified in the second global recognition result to obtain the data to be reimbursed.
In some embodiments, for each reference area, the target area to be identified corresponding to that reference area is determined in the second global recognition result. Then, based on the association relationship between each reference area and its target area to be identified, a matching relationship between the fixed fields and the variable fields in the second global recognition result is established, and the data to be reimbursed is output based on this matching relationship. In this way, the output positions of the fixed field of each reference area and the variable field of its target area to be identified can be determined, which structures the second global recognition result as a document and places the fixed fields and the variable fields at reasonable positions in the output data to be reimbursed. For example, in fig. 4, since the table area 402 serving as a reference area and the table area 406 serving as an area to be identified are associated, the fields in the two areas may be output in the same row, with "business trip name" first and "zhang san" after it.
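The four steps above can be sketched on a toy global recognition result. The sketch assumes that each recognized token carries the coordinates of its top-left corner and that the variable field associated with a fixed field is the nearest non-fixed token to its right on the same row; this geometric rule, the tolerance value and all names are hypothetical stand-ins for the association recorded in the layout template.

```python
def pair_fixed_and_variable(tokens, fixed_fields, row_tolerance=10):
    """Pair each fixed field with its associated variable field.

    tokens:       list of dicts with "text", "x" and "y" keys
                  (a stand-in for the second global recognition result).
    fixed_fields: set of fixed-field texts marked in the layout template.
    Returns a dict mapping fixed-field text to variable-field text.
    """
    pairs = {}
    for ref in tokens:
        if ref["text"] not in fixed_fields:
            continue  # only reference areas carry fixed fields
        candidates = [t for t in tokens
                      if t["text"] not in fixed_fields
                      and abs(t["y"] - ref["y"]) <= row_tolerance
                      and t["x"] > ref["x"]]
        if candidates:
            nearest = min(candidates, key=lambda t: t["x"] - ref["x"])
            pairs[ref["text"]] = nearest["text"]
    return pairs
```

Each resulting pair can then be output in one row, with the fixed field first and the variable field after it.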
In the embodiments of the present application, the reimbursement bill image is matched against the layout templates so that a layout template of the same type is called to perform character recognition on the reimbursement bill, and the character recognition result is structured through the reference areas marked in the layout template and their associated areas to be identified, which improves the accuracy and readability of the obtained data to be reimbursed.
In some embodiments, if the preset layout template library does not include a template of the reimbursement bill type, a target layout template of the same type cannot be found. In this case, a new layout template may be generated based on the reimbursement bill type in combination with the fixed fields of the reference areas in the reimbursement bill, and the new layout template is stored in the preset layout template library to update it. The updated preset layout template library can thus cover more types of reimbursement bills, which improves the accuracy of template matching for reimbursement bill images.
In some embodiments, after character recognition is performed on the bill image and the reimbursement bill image, the matching degree between the obtained bill data and the data to be reimbursed needs to be determined, so as to check the invoices to be identified in the bill image against the reimbursement bill, and then to determine whether the bill data meets the second reimbursement requirement; this can be implemented through the following process:
Step S151, classifying the data to be reimbursed based on the fixed fields in the reference areas of the reimbursement bill image to obtain a reimbursement category set.
In some embodiments, for the data to be reimbursed presented by the reimbursement bill image, the reimbursement categories included in the reimbursement bill image can be obtained by analyzing the fixed fields in the reference areas of the reimbursement bill image. For example, if the fixed fields include accommodation, transportation and catering, the reimbursement categories include accommodation fees, transportation fees and catering fees.
Step S152, determining, in the bill data, the single-class data of the invoices to be identified corresponding to each reimbursement category.
In some embodiments, the bill image includes multiple invoices to be identified, which may be invoices of the same category or invoices of different categories. After the reimbursement categories included in the reimbursement bill are determined, the invoices to be identified are classified in the bill data according to the reimbursement category set, so as to obtain the bill data corresponding to the invoices of each reimbursement category, namely the single-class data. For example, if the reimbursement category is catering, the bill data corresponding to the catering invoices is determined, that is, the single-class data of the catering category.
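The grouping of step S152 can be sketched as follows. The record layout (one dictionary per invoice with a "category" key) and the function name are assumptions for illustration.

```python
def single_class_data(bill_data, reimbursement_categories):
    """Group invoice records by the reimbursement categories of the bill.

    Invoices whose category is not listed on the reimbursement bill are
    left out; a listed category without invoices maps to an empty list.
    """
    grouped = {category: [] for category in reimbursement_categories}
    for invoice in bill_data:
        if invoice["category"] in grouped:
            grouped[invoice["category"]].append(invoice)
    return grouped
```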
Step S153, for each reimbursement category, determining the matching degree between the data to be reimbursed corresponding to that reimbursement category and the single-class data of that reimbursement category.
In some embodiments, the data to be reimbursed and the bill data are checked against each other by reimbursement category: for each reimbursement category in the reimbursement bill, it is checked whether the data provided by the invoices of that category is consistent with the data filled in on the reimbursement bill, so that the reimbursement bill and the invoices are checked against each other at the data level.
The single-class data of each reimbursement category includes the total invoice amount, the invoicing date, the customer name, the entity's seal and the like of that reimbursement category. The matching degree between the data to be reimbursed corresponding to each reimbursement category and the single-class data of that reimbursement category is determined by comparing whether each item in the single-class data matches the data to be reimbursed of that reimbursement category.
Step S154, in response to the matching degree being greater than or equal to a preset matching degree threshold, determining that the bill data matches the data to be reimbursed.
In some embodiments, if the single-class data of a reimbursement category in the bill data is consistent with the data of that reimbursement category filled in on the reimbursement bill, the matching degree is greater than or equal to the preset matching degree threshold. For example, if the invoice dates are consistent with the dates filled in on the reimbursement bill, the total invoice amount is less than or equal to the reimbursement amount filled in on the reimbursement bill, and the invoicing entity is consistent with the invoicing entity filled in on the reimbursement bill, then the invoices are consistent with the reimbursement bill; that is, the bill data matches the data to be reimbursed. The invoices are then confirmed to have passed the audit, and the personal information of the reimburser, the bank account information and the like can be further confirmed for reimbursement.
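One possible reading of the matching degree is sketched below. The application does not define how the degree is computed; here it is assumed to be the fraction of the three example checks (dates, total amount, invoicing entity) that pass, and the key names "amount", "dates", "date" and "issuer" are hypothetical.

```python
def matching_degree(claimed, invoices):
    """Fraction of consistency checks passed for one reimbursement category.

    claimed:  dict with "amount" (claimed total), "dates" (set of dates)
              and "issuer", as filled in on the reimbursement bill.
    invoices: single-class data, a list of dicts with "amount", "date"
              and "issuer".
    """
    if not invoices:
        return 0.0  # nothing to support the claim
    checks = [
        sum(inv["amount"] for inv in invoices) <= claimed["amount"],
        all(inv["date"] in claimed["dates"] for inv in invoices),
        all(inv["issuer"] == claimed["issuer"] for inv in invoices),
    ]
    return sum(checks) / len(checks)

def is_match(claimed, invoices, threshold=1.0):
    """Step S154: compare the matching degree with a preset threshold."""
    return matching_degree(claimed, invoices) >= threshold
```

With the default threshold of 1.0, a category matches only when all three checks pass, which corresponds to the fully consistent case described above.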
In the embodiment of the application, the content filled in the reimbursement bill image is checked against the content presented by the invoice to be identified to determine whether the bill data is consistent with the data to be reimbursed, so that the reimbursement bill and the invoice can be checked automatically, which improves the efficiency of checking reimbursement bills against invoices in the financial reimbursement process.
In some embodiments, the bill data of the invoice to be identified is checked against the reimbursement requirement to determine whether the invoice in the bill image meets the reimbursement requirement, which can be implemented by the following process:
Step S161, in the bill data, determining the single-bill data belonging to each invoice to be identified.
In some embodiments, for each invoice to be identified in the bill image, the single-bill data of that invoice is determined, such as the amount of the single bill, the date of the bill, the customer name, etc.
Step S162, determining whether the single-bill data meets the external reimbursement limit, so as to determine whether the bill data meets the second reimbursement requirement.
In some embodiments, it is judged whether the single-bill data meets the external reimbursement limit, and in response to the single-bill data meeting the external reimbursement limit, it is determined that the bill data meets the second reimbursement requirement. Each item in the single-bill data is judged; if the single-bill data of every invoice to be identified in the bill image meets the external reimbursement limit, the bill data is determined to meet the second reimbursement requirement.
In some possible implementations, first, the single-bill amount in the single-bill data is determined, and/or the target invoice type with additional detail requirements is determined, and/or the identification information of the invoice to be identified corresponding to the single-bill data is determined; then, in response to the single-bill amount being less than or equal to a preset amount upper limit, and/or the detail data of the target invoice type matching the detail requirement, and/or the identification information being contained in a preset bill identification library, it is determined that the bill data meets the external reimbursement limit, so as to determine that the bill data meets the second reimbursement requirement.
Based on this, the checking whether the bill data satisfies the external reimbursement limit can be realized in various ways:
Mode one: first, the single-bill amount in the single-bill data is determined.
In some embodiments, for the single-bill data of a single invoice to be identified, the amount of the invoice, namely the single-bill amount, is determined; for example, for a catering invoice, the amount consumed on that invoice is determined.
Second, in response to the single-bill amount being less than or equal to the preset amount upper limit, it is determined that the bill data meets the external reimbursement limit.
In some embodiments, if the single-bill amount is less than or equal to the preset single-bill amount upper limit, it may be further judged whether the total amount of all invoices belonging to the same category as the invoice exceeds a preset total amount upper limit; if the total amount is less than or equal to the preset total amount upper limit, it is determined that the bill data meets the external reimbursement limit.
Mode two: first, a target invoice type with additional detail requirements is determined.
In some embodiments, the target invoice type may be set in the external reimbursement limit, or may be determined based on how detailed the picture content of the invoice to be identified is; for example, an accommodation invoice that does not indicate at which hotel the expense was incurred.
Second, in response to the detail data of the target invoice type matching the detail requirement, it is determined that the bill data meets the external reimbursement limit.
In some embodiments, first, the detail data of the invoices matching the target invoice type is found in the bill data; then, the matching degree between the detail data of the target invoice type and the detail requirement is judged. A high matching degree indicates that an appropriate detail description has been attached to the invoice type with additional detail requirements, and therefore it is determined that the bill data meets the external reimbursement limit.
Mode three: first, the identification information of the invoice to be identified corresponding to the single-bill data is determined.
In some embodiments, the identification information of the invoice to be identified includes the invoice number, the invoice code and other information that can uniquely identify the invoice to be identified.
Second, in response to the identification information being contained in a preset bill identification library, it is determined that the bill data meets the external reimbursement limit.
In some embodiments, the preset bill identification library is a number library in which invoice numbers can be checked on an official website. If the invoice number of the invoice to be identified is contained in the preset bill identification library, the invoice to be identified is a valid invoice, and it is further determined that the bill data meets the external reimbursement limit.
In the embodiment of the present application, modes one to three may be three parallel ways of checking whether the bill data meets the external reimbursement limit; alternatively, a priority relationship or a progressive relationship may be set among any two or all three of the modes. For example, the priority of mode three is set higher than that of mode one, and the priority of mode one is set higher than that of mode two: first, it is judged whether the identification information is contained in the preset bill identification library; then, if the identification information is contained in the library, the single-bill amount of the valid bill is judged; finally, if the single-bill amount of the valid bill is less than or equal to the amount upper limit, it is judged whether the detail data of the target invoice type matches the detail requirement, and if so, it is determined that the bill data meets the external reimbursement limit.
In another implementation, a progressive relationship may be set for modes one and three only. For example, first, it is judged whether the identification information is contained in the preset bill identification library; then, if the identification information is contained in the library, the single-bill amount of the valid bill is judged; finally, if the single-bill amount of the valid bill is less than or equal to the amount upper limit, it is determined that the bill data meets the external reimbursement limit.
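The progressive relationship among the three modes (mode three, then mode one, then mode two) can be sketched as below. The names `SingleBill`, `ExternalLimit` and `meets_external_limit` are hypothetical, the bill identification library is modelled as an in-memory set rather than an official lookup service, and the detail check is simplified to "details are present", whereas the embodiment speaks of a matching degree between detail data and the detail requirement.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional, Set


@dataclass
class SingleBill:
    """Single-bill data of one invoice to be identified (hypothetical layout)."""
    invoice_number: str
    invoice_type: str
    amount: Decimal
    details: Optional[str] = None  # e.g. the hotel name on an accommodation invoice


@dataclass
class ExternalLimit:
    """External reimbursement limit (assumed representation)."""
    valid_numbers: Set[str]                     # stands in for the bill identification library
    amount_cap: Decimal                         # preset single-bill amount upper limit
    types_requiring_details: Set[str] = field(default_factory=set)


def meets_external_limit(bill: SingleBill, limit: ExternalLimit) -> bool:
    # Mode three: the invoice must be found in the identification library.
    if bill.invoice_number not in limit.valid_numbers:
        return False
    # Mode one: the single-bill amount must not exceed the upper limit.
    if bill.amount > limit.amount_cap:
        return False
    # Mode two: invoice types with detail requirements must carry details.
    if bill.invoice_type in limit.types_requiring_details and not bill.details:
        return False
    return True
```

Placing the identification check first means an invalid invoice is rejected before any amount or detail rule is evaluated; dropping the last check gives the two-mode variant described for modes one and three.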
In the embodiment of the application, the invoice to be identified is checked in multiple modes to determine whether the bill data of the invoice to be identified meets the reimbursement requirement or not, so that the invoice data and the reimbursement requirement can be automatically checked.
In some embodiments, after the data to be reimbursed and the bill data have passed the checks, the invoice to be identified is reimbursed, that is, step S103 may be implemented by:
Step S131, in response to the data to be reimbursed meeting the first reimbursement requirement and the bill data meeting the second reimbursement requirement, determining at least the financial account and the reimburser information bound to the reimbursement bill image.
In some embodiments, in the case where the data to be reimbursed meets the first reimbursement requirement and the bill data meets the second reimbursement requirement, the basic information of the reimburser is determined in the reimbursement bill image, including: the name of the business traveler, the department to which the traveler belongs, the projects involved, the contact information and the like. By analyzing the reimburser information, the financial accounts bound to the reimbursement bill image are determined, including the payment account, the collection account and the like. Remark information is also determined for each payment.
Step S132, taking at least the financial account and the reimburser information as the financial data.
In some embodiments, remark information for the bill data is determined based on the reimburser information in the financial data. For example, the remark on a payment may indicate the project under which the expense was incurred, so that the amount already spent on that project can be recorded, and may also include the details of each item in the identification result. The financial account, the reimburser information, the remark information and the like are taken as the financial data to realize reimbursement processing of the bill data.
In the embodiment of the application, after the reimbursement bill and the invoice are checked, financial data such as a money amount, a payment account, a collection account, remark information and the like are determined by extracting reimburser information from an image of the reimbursement bill, so that automatic reimbursement is realized.
In some embodiments, whether to reimburse all or only part of the amount in the data to be reimbursed may be determined based on the relationship between the amount in the bill data and the amount in the data to be reimbursed, as follows:
First, the total amount of the invoices to be identified in each reimbursement category is determined in the bill image.
In some embodiments, the invoices to be identified in the bill image are classified according to the reimbursement categories included in the reimbursement bill, and for the invoices to be identified of each reimbursement category, the total amount of that category is determined. For example, the total amount of the catering invoices to be identified is determined.
Second, the amount to be reimbursed of each reimbursement category is determined in the data to be reimbursed.
In some embodiments, the amount to be reimbursed for the reimbursement category is determined in the reimbursement bill image. For example, after the total amount of the catering invoices to be identified is determined, the filled-in amount to be reimbursed for catering is determined in the data to be reimbursed.
Third, in response to the total amount being less than or equal to the amount to be reimbursed, the invoices to be identified of each reimbursement category are reimbursed based on the financial data.
In some embodiments, if the total amount of the invoices of the reimbursement category is less than or equal to the reimbursement amount filled in the reimbursement bill, this indicates that the total amount of the invoices to be identified of that category matches the data to be reimbursed, and the invoices to be identified are reimbursed based on the payment account and the collection account in the financial data.
Fourth, in response to the total amount being greater than the amount to be reimbursed, a plurality of candidate invoices whose amounts sum to no more than the amount to be reimbursed are determined in the bill data.
In some embodiments, if the total amount of the invoices of the reimbursement category is greater than the reimbursement amount filled in the reimbursement bill, the total amount of the invoices to be identified of that category does not match the data to be reimbursed, and a plurality of candidate invoices whose amounts sum to less than or equal to the reimbursement amount are determined among the invoices to be identified of the category. For example, the catering invoices to be identified are 10 invoices of 150 yuan each, that is, the total amount is 1500 yuan, but the reimbursement amount filled in the reimbursement bill is 1000 yuan; since the amount of each invoice to be identified is 150 yuan, the candidate invoices are 6 of the 10 invoices.
Fifth, the candidate invoices are reimbursed based on the financial data.
In some embodiments, the total amount of the plurality of candidate invoices is reimbursed based on the payment account and the collection account in the financial data. For example, the catering invoices to be identified are 10 invoices of 150 yuan each and the reimbursement amount filled in the reimbursement bill is 1000 yuan, so the final reimbursed amount is 900 yuan, the total of the 6 candidate invoices.
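The embodiment does not state how the candidate invoices are chosen when the invoice total exceeds the amount to be reimbursed. One possible approach, shown here purely as an assumption, is a subset-sum search that picks the invoices whose total is the largest value not exceeding the claimed amount. The function name `select_candidates` is hypothetical, and amounts are taken as integers (for example whole yuan or fen) to keep the arithmetic exact.

```python
from typing import Dict, List


def select_candidates(amounts: List[int], cap: int) -> List[int]:
    """Return indices of invoices whose amounts sum to the largest total <= cap.

    `amounts` are integer invoice amounts; `cap` is the amount to be
    reimbursed that was filled in for the category.
    """
    # reachable[s] holds one list of invoice indices whose amounts sum to s.
    reachable: Dict[int, List[int]] = {0: []}
    for i, amount in enumerate(amounts):
        # Iterate over a snapshot so that each invoice is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + amount
            if new_total <= cap and new_total not in reachable:
                reachable[new_total] = chosen + [i]
    return reachable[max(reachable)]


# The example from the text: ten 150-yuan invoices, 1000 yuan claimed.
picked = select_candidates([150] * 10, 1000)
print(len(picked), sum(150 for _ in picked))  # 6 invoices, 900 yuan
```

With unequal amounts the same search still returns a subset with the highest total within the cap, for instance invoices of 700 and 300 yuan out of 700, 400 and 300 yuan when 1000 yuan is claimed.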
In other embodiments, after the bill data is reimbursed, the reimburser can be reminded through the system by the following process:
First, the amount in the bill data for which reimbursement has been completed is determined.
In some embodiments, after payment is completed for the bill data, the amount of the payment, i.e., the reimbursed amount, is obtained.
Second, prompt information is generated and output based on the reimbursed amount.
In some embodiments, prompt information matching the reimbursed amount may be generated; for example, the prompt information may include the amount paid and remark information about the project under which the payment was generated. The prompt information may be delivered by, but is not limited to, mail, short message, telephone voice and the like.
In the embodiment of the application, after the bill data is reimbursed, reimbursers are reminded in time through the system, so that the reimbursers can know reimbursement conditions in time.
In the following, an exemplary application of the embodiment of the present application in a practical scenario is described, taking intelligent reimbursement based on financial data identification as an example.
In some embodiments, the reimbursement process of a company is relatively cumbersome: the reimburser needs to paste various bills onto a sheet of A4 paper and, after the online approval process is completed, submit the reimbursement bill approved by the leaders at each level, together with the paper on which the bills are pasted, to the finance department as the basis for reimbursement. In addition, the finance department usually needs to spend a large amount of manpower and material resources checking each reimbursement bill, and checking one reimbursement bill requires checking several items of bill information, such as invoicing date, unit, amount, type, project, etc. Especially in concentrated reimbursement periods such as the end of the year, financial staff are under greater working pressure and checking errors often occur, which increases the workload of the financial staff and lengthens the reimbursement cycle.
In addition, since the bills are pasted by hand, they are likely to be pasted crookedly, and when a single reimbursement bill involves many bills, financial staff need to spend considerable effort checking them. The association between the bills and the reimbursement bill also has to be verified by financial staff, which undoubtedly increases the difficulty of auditing in the reimbursement process.
Based on this, the embodiment of the present application provides a method for reimbursing financial data. First, a plurality of bills pasted on one sheet of paper (corresponding to the bill image in the above embodiments) are identified by mixed bill identification to determine the template corresponding to each invoice, and the corresponding template is called to complete the identification of each invoice. Where an invoice is difficult to identify or the identification result is abnormal, the identification result of the invoice can first be output to a manual checking node and rechecked by financial staff. Then, based on the data presented in the reimbursement bill image and the bill identification result, it is judged whether the reimbursement data submitted by the reimburser meets the reimbursement requirements (for example, whether the reimbursement data is true and reliable, whether the bills provided are genuine and valid, and whether the value and type of the bills provided are consistent with the content of the reimbursement bill), and reimbursement data meeting the requirements passes the approval process. The method can be realized by the following process:
First, the bill image is identified.
In some embodiments, a scanned copy or bill image (i.e., an image of the bills pasted on paper) is input and region extraction is performed on the image. The four vertices of each bill in the image can be found through target detection, and each bill is cut out using its four vertices to obtain one or more cutout images (one cutout image corresponds to one bill). The image can be rectified while it is cut out, so as to correct a skewed cutout image. As shown in fig. 3, the bill image 300 is an image on which 3 invoices are pasted, namely invoice 301, invoice 302 and invoice 303; invoice 301 is an xx value-added tax special invoice, invoice 302 is a general quota invoice of the national tax bureau of x city, xx province, and invoice 303 is an xxx road and bridge toll special invoice. The identification result 304 is the character identification result obtained by performing character detection, character recognition and document structuring on invoice 301, that is, the bill data corresponding to invoice 301.
A preset bill template library is obtained by predefining multiple bill templates, including: value-added tax invoices, electronic tickets, special invoices, general invoices and the like. Each cutout image is classified, and the target bill template matching the cutout image is confirmed based on the confidence, so that the template is called to perform character detection, character recognition and document structuring on the cutout image. Character detection obtains each character area containing characters in the cutout image; for example, a character area can be selected with a rectangular frame or a frame of another shape, and character recognition is then performed on the character area to obtain a character recognition result. Based on the relationship between adjacent character areas, document structuring can be performed on the character recognition results to output a structured bill recognition result (each bill yields one set of recognition results, such as the recognition result 304 in fig. 3 and the recognition result 504 in fig. 5). In fig. 5, the bill image includes 3 general quota invoices of the x provincial national tax bureau with different denominations to be recognized, where invoice 501 is a ten-yuan quota invoice, invoice 502 is a five-yuan quota invoice, and invoice 503 is a two-yuan quota invoice; character detection, character recognition and document structuring are performed on invoice 501 to obtain the structured bill recognition result 504. The invoices in fig. 5 may be uploaded to the automatic reimbursement system by local upload 521, or may be obtained through local URL 522, so that text detection 523 of the invoices to be identified is performed automatically.
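The confidence-based choice of a target bill template can be sketched as follows. The function name `pick_template`, the dictionary of per-template confidences and the `min_confidence` threshold are assumptions made for illustration; the classifier that produces the confidences is outside the scope of this sketch.

```python
from typing import Dict, Optional


def pick_template(confidences: Dict[str, float],
                  min_confidence: float = 0.5) -> Optional[str]:
    """Return the template with the highest confidence for one cutout image.

    `confidences` maps a template name in the preset bill template library
    to the classifier's confidence for the cutout image. `None` signals
    that no suitable target bill template was found.
    """
    if not confidences:
        return None
    name, best = max(confidences.items(), key=lambda item: item[1])
    return name if best >= min_confidence else None


scores = {
    "value-added tax invoice": 0.91,
    "electronic ticket": 0.04,
    "general invoice": 0.05,
}
print(pick_template(scores))  # value-added tax invoice
```

Returning `None` leaves room for the fallback paths in which no target bill template is found, such as whole-image recognition followed by manual checking.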
And secondly, identifying the reimbursement bill image.
The first step and the second step have no precedence relation in the execution sequence, the first step and the second step represent two processes, the first step represents a process of identifying the bill image to obtain bill data, and the second step represents a process of identifying the reimbursement bill image to obtain to-be-reimbursed data.
In some embodiments, the data in the reimbursement form may be identified by form identification. For example, the method can be implemented in the following two ways:
Mode one: the table lines in the reimbursement bill are identified, the characters in each area are recognized taking the area enclosed by intersecting table lines as a unit, and the recognition results are structured based on the characters in adjacent areas. As shown in the reimbursement bill image 401 of fig. 4, the reimbursement bill is divided, on the basis of its table lines, into character areas each formed by a rectangular area. The association between different character areas is found through semantic analysis, and structured data is output based on the association to obtain the structured data to be reimbursed. For example, the output data to be reimbursed consists of entries that pair each expense category with its amount field and the amount written in words, such as an entry for the catering fee and an entry for the accommodation fee.
Mode two: the reimbursement bill of each company usually has a fixed layout template, and the header name of the reimbursement bill, such as "travel reimbursement bill", can be used as an index. A template of the same type is called from a plurality of pre-stored layout templates, and reimbursement bill identification is completed and structured data is output based on the reference areas marked in the template (such as "date") and the areas to be identified associated with the reference areas (such as "December 12, 2020"). In the template matching process, character recognition can first be performed on the whole image, and then the part of the character recognition result that exactly matches the characters in a reference area is found, so that the target area to be identified corresponding to that reference area is located in the reimbursement bill and image identification is completed.
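Mode two can be illustrated with the sketch below, which looks up each fixed field of a layout template in the whole-image recognition result and reads the value from the associated area. The `TextBox` structure, the function names, and in particular the rule that the area to be identified is the nearest text box to the right on the same row are assumptions; a real layout template could encode the association between reference area and area to be identified in other ways.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TextBox:
    """One recognised text area from whole-image character recognition."""
    text: str
    x: float  # left edge of the box
    y: float  # vertical centre of the box


def right_neighbour(ref: TextBox, boxes: List[TextBox],
                    row_tolerance: float = 10.0) -> Optional[TextBox]:
    """Nearest box to the right of `ref` on (roughly) the same row."""
    same_row = [
        b for b in boxes
        if b is not ref and abs(b.y - ref.y) <= row_tolerance and b.x > ref.x
    ]
    return min(same_row, key=lambda b: b.x, default=None)


def extract_fields(boxes: List[TextBox], fixed_fields: List[str]) -> Dict[str, str]:
    """Pair each fixed field (reference area) with its variable value."""
    result: Dict[str, str] = {}
    for name in fixed_fields:
        # Exact match between recognised text and the template's fixed field.
        ref = next((b for b in boxes if b.text == name), None)
        if ref is None:
            continue
        target = right_neighbour(ref, boxes)
        if target is not None:
            result[name] = target.text
    return result
```

For a recognition result containing "date" followed on the same row by "December 12, 2020", `extract_fields(boxes, ["date"])` would yield that date as the value of the `date` field.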
Third, it is checked whether the reimbursement data meets the reimbursement requirement.
In some embodiments, the check may cover a number of cases:
Case one: judging whether the information in the reimbursement bill matches the bills actually submitted, that is, whether the data to be reimbursed matches the bill data. The data is considered not to match if at least one of the following holds: the amount of the bills is less than the amount in the reimbursement bill, the type of a bill does not match the bill type indicated in the reimbursement bill, the date of a bill does not match, or the company name on a bill does not match.
Case two: judging whether the single-bill amount is within the upper limit; whether bill types with detail requirements are accompanied by the details; and whether the number on the bill can be found in a preset identification information library, that is, whether it can be found through official channels, so as to judge whether the bill is suspected of being forged; for example, the verification may be performed by calling a third-party website.
Case three: querying the internal system as to whether the reimbursement bill has passed approval at every level.
And fourthly, carrying out reimbursement examination and approval.
In some embodiments, for reimbursement bills that pass all the checks in the first to third steps, the payment operation is completed automatically by calling the relevant financial data, such as a bank card number, based on the basic information of the reimburser in the reimbursement bill. That is, based on the recognition result, the amount to be paid, the payment account, the collection account, the remark information (the project under which the expense was incurred, so that the amount already spent on that project can be recorded, and specifically the content of each item in the recognition result) and the like are determined and the payment is completed; the reimburser is notified through the system that the reimbursement has been approved and is reminded to confirm receipt of the payment. The reminder may be delivered by, but is not limited to, mail, short message, telephone voice and the like.
In the embodiment of the application, first, the bill is matched against each template to obtain a confidence for each, and the template with the highest confidence is determined as the identification template to be called; then, through bill identification and reimbursement bill identification, the two identification results are matched to check the reliability of the reimbursement data; finally, automatic reimbursement approval is carried out, the payment is made and the reimburser is reminded. In this way, the identification and association of the bills and the reimbursement bill are realized automatically, which saves manpower and material resources; the association result is also checked, and whether the reimbursement data is true and valid is judged based on the respective filling specifications and payment regulations, which further saves manpower and material resources.
The embodiment of the present application provides a device for reimbursing financial data. Fig. 6 is a schematic structural diagram of the device for reimbursing financial data provided by the embodiment of the present application. As shown in fig. 6, the device 600 for reimbursing financial data includes:
the first identification module 601 is configured to identify an acquired bill image to obtain bill data;
the second identification module 602 is configured to identify the acquired reimbursement note image to obtain data to be reimbursed;
a first determining module 603 configured to determine financial data associated with the reimbursement slip image in response to the pending reimbursement data satisfying a first reimbursement requirement and the instrument data satisfying a second reimbursement requirement;
a first reimbursement module 604, configured to reimburse at least a portion of the data in the bill data based on the financial data.
In the above apparatus, the first identification module 601 includes:
the first extraction submodule is used for extracting an image area where the invoice to be identified is located in the bill image to obtain at least one area image;
and the first identification submodule is used for identifying the area image to obtain the bill data.
In the above apparatus, one invoice to be identified corresponds to one area image, the apparatus further includes:
and the correction processing module is used for, in response to the area image not being upright, performing correction processing on the picture content in the area image, and taking the image obtained after the correction processing as the area image.
In the above apparatus, the first identification submodule includes:
the first acquisition unit is used for acquiring the invoice category to which the area image belongs;
the first searching unit is used for searching a target bill template matched with the invoice type in a preset bill template library;
the first identification unit is used for responding to the searched target bill template and carrying out character identification on the character area in the area image based on the target bill template to obtain a character identification result;
and the first determining unit is used for obtaining the bill data based on the character recognition result and the incidence relation among different character areas.
In the above apparatus, the first identification submodule includes:
the second determining unit is used for, in response to the target bill template not being found, performing character recognition on the area image to obtain a first global recognition result; adjusting the first global recognition result based on semantic information in the area image to obtain an intermediate output result, and using the intermediate output result as the bill data, or sending the area image and the intermediate output result to a check node to acquire the bill data from the check node;
or,
the third determining unit is used for responding to the target bill template which is not found, outputting return prompt information to acquire an invoice image corresponding to the area image; determining invoice information of the invoice image, searching a bill template matched with the invoice information in the preset bill template library, using the bill template as the target bill template, and performing character recognition on the area image to obtain the bill data.
In the above apparatus, the apparatus further comprises:
the first generation module is used for responding to the condition that the bill template matched with the invoice information is not found, and generating a new bill template based on the invoice information;
and the first adding module is used for adding the new bill template to the preset bill template library.
In the above apparatus, the second identifying module 602 includes:
the second identification submodule is used for identifying table lines in the reimbursement bill image to obtain a plurality of table areas formed by intersecting the table lines;
the third identification submodule is used for identifying characters in the table area to obtain a table identification result;
and the first matching submodule is used for matching characters in the table identification results corresponding to different table areas based on the incidence relation among the different table areas to obtain the data to be reimbursed.
In the above apparatus, the second identifying module 602 includes:
the first determining submodule is used for determining the type of the reimbursement bill in the reimbursement bill image;
the first searching sub-module is used for searching a target layout template matched with the reimbursement bill type in a preset layout template library;
the second determining submodule is used for responding to the searched target layout template and determining a reference area comprising fixed fields and an area to be identified comprising variable fields in the target layout template;
and the third identification submodule is used for identifying characters in the reimbursement bill image based on the reference area and the area to be identified to obtain the data to be reimbursed.
In the above apparatus, the third identifying sub-module includes:
the second identification unit is used for carrying out overall identification on the characters in the reimbursement note image to obtain a second overall identification result;
a second searching unit, configured to search, in the second global recognition result, a partial recognition result that matches each of the reference regions;
a fourth determination unit configured to determine, based on the partial recognition result, a target to-be-recognized region associated with a reference region corresponding to the partial recognition result;
and the first matching unit is used for matching the fixed characters positioned in the reference area and the variable fields positioned in the target area to be recognized in the second global recognition result based on the incidence relation between each reference area and the target area to be recognized to obtain the data to be reimbursed.
In the above apparatus, the apparatus further comprises:
the second generation module is used for generating a new layout template based on the reimbursement bill type in response to the target layout template not being found;
and the first updating module is used for updating the preset layout template base based on the new layout template.
In the above apparatus, the first reimbursement requirement is that the number of pieces of approval-pass information included in the data to be reimbursed is equal to a preset value; the second reimbursement requirement is that the bill data matches the data to be reimbursed and the bill data satisfies the external reimbursement limit.
In the above apparatus, the apparatus further comprises:
the first classification module is used for classifying the data to be reimbursed based on a fixed field in a reference area of the reimbursement bill image to obtain a reimbursement class set;
the second determining module is used for determining the single-class data of the invoice to be identified corresponding to each reimbursement class in the bill data;
the third determining module is used for determining the matching degree between the data to be reimbursed corresponding to each reimbursement category and the single-class data of each reimbursement category for each reimbursement category;
the fourth determining module is used for determining, in response to the matching degree being greater than or equal to a preset matching degree threshold, that the bill data matches the data to be reimbursed;
the fifth determining module is used for determining the single-ticket data belonging to each invoice to be identified in the bill data;
the sixth determining module is used for determining the single-ticket amount in the single-ticket data, and/or determining the target invoice types with additional detail requirements, and/or determining the identification information of the invoice to be identified corresponding to the single-ticket data;
and the seventh determining module is used for determining that the bill data satisfies the external reimbursement limit, and determining that the bill data satisfies the second reimbursement requirement, in response to the single-ticket amount being less than or equal to a preset amount upper limit, and/or in response to the detail data of the target invoice type matching the detail requirement, and/or in response to the identification information being contained in a preset bill identification library.
In the above apparatus, the first determining module 603 includes:
the third determining submodule is used for determining at least the financial account and the reimburser information bound to the reimbursement bill image, in response to the data to be reimbursed meeting the first reimbursement requirement and the bill data meeting the second reimbursement requirement;
and the fourth determining submodule is used for taking the financial account and the reimburser information as the financial data.
In the above apparatus, the first reimbursement module 604 includes:
the fifth determining submodule is used for determining the total amount of the invoices to be identified in each reimbursement category in the bill image;
a sixth determining submodule, configured to determine an amount to be reimbursed for each reimbursement category in the data to be reimbursed;
a first reimbursement submodule, configured to reimburse the invoices to be identified of each reimbursement category based on the financial data in response to the total amount being less than or equal to the amount to be reimbursed;
a seventh determining sub-module, configured to determine, in response to the total amount being greater than the to-be-reimbursed amount, a plurality of candidate invoices with a sum of amounts less than or equal to the to-be-reimbursed amount among the to-be-identified invoices of the reimbursed category;
and the second reimbursement submodule is used for reimbursing the candidate invoices based on the financial data.
In the above apparatus, the apparatus further comprises:
the eighth determining module is used for determining the amount of the reimbursed money in the bill data;
and the third generation module is used for generating and outputting prompt information based on the reimbursed amount.
It should be noted that the above description of the embodiment of the apparatus, similar to the above description of the embodiment of the method, has similar beneficial effects as the embodiment of the method. For technical details not disclosed in the embodiments of the apparatus of the present application, reference is made to the description of the embodiments of the method of the present application for understanding.
It should be noted that, in the embodiments of the present application, if the method for reimbursing financial data is implemented in the form of a software functional module and is sold or used as a standalone product, it may also be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially or partially implemented in the form of a software product, which is stored in a storage medium and includes several instructions to enable an electronic device (which may be a terminal, a server, etc.) to execute all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a hard disk drive, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present application are not limited to any specific combination of hardware and software.
Correspondingly, an embodiment of the present application further provides a computer program product, where the computer program product includes computer-executable instructions, and after the computer-executable instructions are executed, the steps in the method for reimbursing financial data provided by the embodiment of the present application can be implemented.
Accordingly, an embodiment of the present application further provides a computer storage medium, where computer-executable instructions are stored on the computer storage medium, and when executed by a processor, the computer-executable instructions implement the steps of the method for reimbursing financial data provided in the foregoing embodiment.
Accordingly, an embodiment of the present application provides an electronic device. Fig. 7 is a schematic structural diagram of the electronic device provided in the embodiment of the present application. As shown in Fig. 7, the electronic device 700 includes: a processor 701, at least one communication bus, a communication interface 702, at least one external communication interface, and a memory 703. The communication interface 702 is configured to enable connection and communication between these components. The communication interface 702 may include a display screen, and the external communication interface may include a standard wired interface and a wireless interface. The processor 701 is configured to execute an image processing program stored in the memory to implement the steps of the method for reimbursing financial data provided in the foregoing embodiments.
The above descriptions of the embodiments of the financial data reimbursement apparatus, the electronic device and the storage medium are similar to the descriptions of the method embodiments, with similar technical details and beneficial effects, and are not repeated here for brevity. For technical details not disclosed in the embodiments of the apparatus for reimbursement of financial data, the electronic device and the storage medium of the present application, reference is made to the description of the method embodiments of the present application.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not imply an execution order; the execution order of each process should be determined by its function and inherent logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present application. The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit. Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof that contribute to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code. The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (18)

1. A method of reimbursing financial data, the method comprising:
identifying the acquired bill image to obtain bill data;
identifying the acquired reimbursement bill image to obtain data to be reimbursed;
in response to the data to be reimbursed meeting a first reimbursement requirement and the bill data meeting a second reimbursement requirement, determining financial data associated with the reimbursement bill image;
and performing reimbursement processing on at least part of the data in the bill data based on the financial data.
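The four steps of claim 1 can be read as a gate followed by a payment step. The following is a minimal sketch of that control flow, assuming recognition has already produced structured data; the names `reimburse`, `FinancialData`, the callback parameters and the dictionary keys are hypothetical and do not come from the application, and amounts are assumed to be integer cents.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FinancialData:
    # Hypothetical container for the data bound to the reimbursement bill image.
    account: str
    reimburser: str


def reimburse(
    bill_data: list,
    data_to_reimburse: dict,
    meets_first: Callable[[dict], bool],
    meets_second: Callable[[list, dict], bool],
    lookup_financial_data: Callable[[dict], FinancialData],
) -> Optional[dict]:
    """Return a payment instruction, or None when either requirement fails."""
    # Both requirements must hold before any financial data is looked up.
    if not (meets_first(data_to_reimburse) and meets_second(bill_data, data_to_reimburse)):
        return None
    financial = lookup_financial_data(data_to_reimburse)
    # This sketch reimburses every recognized invoice; amounts are integer cents.
    total = sum(item["amount"] for item in bill_data)
    return {"account": financial.account, "payee": financial.reimburser, "amount": total}
```

Passing the two requirement checks as callbacks keeps the gate independent of how each requirement is actually evaluated.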
2. The method of claim 1, wherein said identifying the acquired bill image to obtain bill data comprises:
extracting an image area where an invoice to be identified is located in the bill image to obtain at least one area image;
and identifying the area image to obtain the bill data.
3. The method according to claim 1 or 2, wherein an invoice to be identified corresponds to an area image, and before the identifying the area image, the method further comprises:
and in response to the area image being in a non-upright state, performing correction processing on the picture content of the area image, and taking the image obtained after the correction processing as the area image.
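As an illustration of the correction step, the sketch below undoes a rotation that is a multiple of 90 degrees on a row-major pixel grid. This is a simplifying assumption: the application does not specify the correction algorithm, and real invoices may need arbitrary-angle deskewing. The function names and the `detected_rotation` input (assumed to come from an upstream orientation detector) are hypothetical.

```python
def rotate_90_clockwise(pixels):
    """Rotate a row-major pixel grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def correct_orientation(pixels, detected_rotation):
    """Undo a clockwise rotation of 0, 90, 180 or 270 degrees.

    `detected_rotation` is assumed to be supplied by an orientation
    detector that is outside the scope of this sketch.
    """
    detected_rotation %= 360
    if detected_rotation % 90 != 0:
        raise ValueError("this sketch only handles multiples of 90 degrees")
    # Rotating clockwise by the remaining angle brings the grid back upright.
    turns = (360 - detected_rotation) // 90 % 4
    corrected = [list(row) for row in pixels]
    for _ in range(turns):
        corrected = rotate_90_clockwise(corrected)
    return corrected
```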
4. The method of claim 2, wherein the identifying the area image to obtain the bill data comprises:
acquiring the invoice category to which the area image belongs;
searching a preset bill template library for a target bill template matching the invoice category;
in response to the target bill template being found, performing character recognition on the character area in the area image based on the target bill template to obtain a character recognition result;
and obtaining the bill data based on the character recognition result and the incidence relation among different character areas.
5. The method of claim 4, wherein the identifying the area image to obtain the bill data comprises:
in response to the target bill template not being found, performing character recognition on the area image to obtain a first global recognition result; adjusting the first global recognition result based on semantic information in the area image to obtain an intermediate output result; and using the intermediate output result as the bill data, or sending the area image and the intermediate output result to a check node to acquire the bill data from the check node;
or,
in response to the target bill template not being found, outputting return prompt information to obtain an invoice image corresponding to the area image; determining invoice information of the invoice image; searching the preset bill template library for a bill template matching the invoice information and using it as the target bill template; and performing character recognition on the area image to obtain the bill data.
6. The method of claim 5, further comprising:
in response to not finding a bill template matching the invoice information, generating a new bill template based on the invoice information;
and adding the new bill template to the preset bill template library.
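The lookup-then-fallback behaviour of claims 4 to 6 can be sketched as a dictionary search that creates and stores a template on a miss. The library layout (a plain `dict` keyed by a string), the `"type"` key in the invoice information and the shape of a template are all assumptions made for illustration; the application does not define them.

```python
def resolve_template(library, invoice_category, invoice_info):
    """Find a bill template, falling back to invoice info, then creating one.

    `library` is a hypothetical dict mapping a key to a template dict.
    """
    # First attempt: search by the invoice category of the area image.
    template = library.get(invoice_category)
    if template is not None:
        return template
    # Second attempt: search by information taken from the returned invoice image.
    key = invoice_info["type"]
    template = library.get(key)
    if template is None:
        # Still nothing: build a placeholder template and add it to the library.
        fields = sorted(name for name in invoice_info if name != "type")
        template = {"type": key, "fields": fields}
        library[key] = template
    return template
```

Because the new template is stored under the same key used for the second search, a later invoice of the same type is resolved without creating another template.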
7. The method according to any one of claims 1 to 6, wherein the identifying the acquired reimbursement slip image to obtain the data to be reimbursed comprises:
identifying table lines in the reimbursement bill image to obtain a plurality of table areas formed by intersecting the table lines;
identifying characters in the table area to obtain a table identification result;
and matching characters in the table identification results corresponding to different table areas based on the incidence relation among the different table areas to obtain the data to be reimbursed.
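One way to picture the table-based recognition of claim 7 is sketched below: the coordinates of detected table lines define a grid of cells, recognized text is assigned to the cell containing its centre point, and each label cell is paired with the cell to its right. The line coordinates and recognized words are assumed to come from earlier detection and OCR steps, and the "label on the left, value on the right" association is an assumption of this sketch, not something the application states.

```python
def table_cells(xs, ys):
    """Build (row, col, box) cells from vertical line xs and horizontal line ys."""
    xs, ys = sorted(xs), sorted(ys)
    return [
        (r, c, (xs[c], ys[r], xs[c + 1], ys[r + 1]))
        for r in range(len(ys) - 1)
        for c in range(len(xs) - 1)
    ]


def assign_text(cells, words):
    """Map each recognized word (text, cx, cy) to the cell containing its centre."""
    grid = {}
    for text, cx, cy in words:
        for r, c, (x0, y0, x1, y1) in cells:
            if x0 <= cx < x1 and y0 <= cy < y1:
                grid[(r, c)] = text
                break
    return grid


def pair_label_value(grid):
    """Assumed association: even columns hold labels, the next column the value."""
    return {
        grid[(r, c)]: grid.get((r, c + 1), "")
        for (r, c) in grid
        if c % 2 == 0
    }
```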
8. The method according to any one of claims 1 to 7, wherein the identifying the acquired reimbursement slip image to obtain the data to be reimbursed comprises:
determining a reimbursement bill type in the reimbursement bill image;
searching a target layout template matched with the reimbursement bill type in a preset layout template library;
in response to the target layout template being found, determining a reference area comprising a fixed field and an area to be identified comprising a variable field in the target layout template;
and identifying characters in the reimbursement bill image based on the reference area and the area to be identified to obtain the data to be reimbursed.
9. The method according to claim 8, wherein the identifying the characters in the reimbursement bill image based on the reference area and the area to be identified to obtain the data to be reimbursed comprises:
integrally identifying characters in the reimbursement bill image to obtain a second global identification result;
searching a partial recognition result matched with each reference region in the second global recognition result;
determining a target region to be recognized associated with a reference region corresponding to the partial recognition result based on the partial recognition result;
and matching the fixed characters positioned in the reference area and the variable fields positioned in the target area to be recognized in the second global recognition result based on the association relationship between each reference area and the target area to be recognized to obtain the data to be reimbursed.
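The reference-area matching of claim 9 can be sketched as follows: each fixed field is located in the global recognition result, the associated area to be recognized is derived from it, and the variable text falling inside that area is attached to the fixed field. Representing the association as a relative offset `(dx, dy, w, h)` from the reference box is an assumption of this sketch, as are the function names and the `(text, box)` format of the recognition result.

```python
def _center_in(box, region):
    """True when the centre of `box` lies inside `region` (both x0, y0, x1, y1)."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return region[0] <= cx <= region[2] and region[1] <= cy <= region[3]


def extract_fields(ocr_items, template):
    """Pair each fixed field with the variable text in its associated region.

    ocr_items: list of (text, (x0, y0, x1, y1)) from a global recognition pass.
    template:  {fixed_text: (dx, dy, w, h)}, a hypothetical layout template.
    """
    result = {}
    for fixed_text, (dx, dy, w, h) in template.items():
        # Partial recognition result that matches the reference area.
        ref = next((box for text, box in ocr_items if text == fixed_text), None)
        if ref is None:
            continue
        # Target area to be recognized, positioned relative to the reference box.
        x0, y0 = ref[0] + dx, ref[1] + dy
        target = (x0, y0, x0 + w, y0 + h)
        values = [
            text
            for text, box in ocr_items
            if text != fixed_text and _center_in(box, target)
        ]
        result[fixed_text] = " ".join(values)
    return result
```

Anchoring the target area to the detected position of the fixed field, rather than to absolute page coordinates, lets the same template tolerate shifts between scans.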
10. The method of claim 9, wherein after searching for a target layout template matching the reimbursement bill type in a preset layout template library, the method further comprises:
in response to not finding the target layout template, generating a new layout template based on the reimbursement bill type;
and updating the preset layout template base based on the new layout template.
11. The method according to any one of claims 1 to 10, wherein the first reimbursement requirement is that the number of pieces of approval-pass information included in the data to be reimbursed is equal to a preset value; the second reimbursement requirement is that the bill data matches the data to be reimbursed and the bill data satisfies the external reimbursement limit.
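The first reimbursement requirement reduces to a count comparison. A minimal sketch, assuming the data to be reimbursed carries a list of signature records with an `approved` flag (both the key names and the record shape are hypothetical):

```python
def meets_first_requirement(data_to_reimburse, required_approvals):
    """Check that the number of approval-pass entries equals the preset value."""
    approvals = [
        entry
        for entry in data_to_reimburse.get("signatures", [])
        if entry.get("approved")
    ]
    return len(approvals) == required_approvals
```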
12. The method of claim 11, wherein prior to determining financial data associated with the reimbursement bill image in response to the data to be reimbursed satisfying a first reimbursement requirement and the bill data satisfying a second reimbursement requirement, the method further comprises:
classifying the data to be reimbursed based on a fixed field in a reference area of the reimbursement bill image to obtain a reimbursement class set;
determining, in the bill data, the single-class data of the invoices to be identified corresponding to each reimbursement category;
for each reimbursement category, determining the matching degree between the data to be reimbursed corresponding to each reimbursement category and the single-class data of each reimbursement category;
in response to the matching degree being greater than or equal to a preset matching degree threshold, determining that the bill data matches the data to be reimbursed;
determining single-ticket data belonging to each invoice to be identified in the bill data;
determining the single-ticket amount in the single-ticket data, and/or determining the target invoice types with additional detail requirements, and/or determining the identification information of the invoice to be identified corresponding to the single-ticket data;
and in response to the single-ticket amount being less than or equal to a preset amount upper limit, and/or in response to the detail data of the target invoice type matching the detail requirement, and/or in response to the identification information being contained in a preset bill identification library, determining that the bill data satisfies the external reimbursement limit, and determining that the bill data satisfies the second reimbursement requirement.
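The second reimbursement requirement combines a per-category match with per-invoice limits. The sketch below is one possible reading: the application does not define how the matching degree is computed, so the ratio of the smaller to the larger amount is used here purely as an assumption. The claim joins the three external checks with "and/or"; this sketch applies all three. All names, dictionary keys and the use of integer cents are hypothetical.

```python
def matching_degree(claimed, invoiced):
    """Assumed metric: ratio of the smaller amount to the larger one."""
    if claimed == invoiced:
        return 1.0
    return min(claimed, invoiced) / max(claimed, invoiced)


def meets_second_requirement(claimed, invoices, threshold, amount_cap,
                             detail_required, known_ids):
    """claimed: {category: cents}; invoices: list of dicts with
    'id', 'category', 'amount' (cents) and optional 'details'."""
    # Single-class data: total invoiced amount per reimbursement category.
    by_category = {}
    for inv in invoices:
        by_category[inv["category"]] = by_category.get(inv["category"], 0) + inv["amount"]
    for category, claimed_amount in claimed.items():
        if matching_degree(claimed_amount, by_category.get(category, 0)) < threshold:
            return False
    # External reimbursement limit, checked on each single invoice.
    for inv in invoices:
        if inv["amount"] > amount_cap:
            return False
        if inv["category"] in detail_required and not inv.get("details"):
            return False
        if inv["id"] not in known_ids:
            return False
    return True
```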
13. The method of any of claims 1 to 12, wherein determining financial data associated with the reimbursement bill image in response to the data to be reimbursed satisfying a first reimbursement requirement and the bill data satisfying a second reimbursement requirement comprises:
in response to the data to be reimbursed meeting the first reimbursement requirement and the bill data meeting the second reimbursement requirement, determining at least the financial account and the reimburser information bound to the reimbursement bill image;
and taking the financial account number and the reimburser information as the financial data.
14. The method of any one of claims 1 to 13, wherein said performing reimbursement processing on at least part of the data in the bill data based on the financial data comprises:
determining the total amount of the invoices to be identified of each reimbursement category in the bill image;
determining the amount to be reimbursed for each reimbursement category in the data to be reimbursed;
in response to the total amount being less than or equal to the amount to be reimbursed, performing reimbursement for the invoices to be identified of each reimbursement category based on the financial data;
in response to the total amount being greater than the amount to be reimbursed, determining, among the invoices to be identified of the reimbursement category, a plurality of candidate invoices whose amounts sum to less than or equal to the amount to be reimbursed;
and performing reimbursement for the plurality of candidate invoices based on the financial data.
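Choosing candidate invoices whose amounts sum to no more than the amount to be reimbursed is a subset-sum selection. The application only requires that the sum stay within the limit; the sketch below additionally assumes that the subset with the largest such sum is preferred, and uses a simple dynamic programme over reachable totals. Function names are hypothetical and amounts are integer cents.

```python
def select_candidates(amounts, limit):
    """Return indices of a subset whose sum is the largest value <= limit."""
    # reachable maps each achievable total to one index list producing it.
    reachable = {0: []}
    for index, amount in enumerate(amounts):
        for total, chosen in list(reachable.items()):
            new_total = total + amount
            if new_total <= limit and new_total not in reachable:
                reachable[new_total] = chosen + [index]
    return reachable[max(reachable)]


def invoices_to_reimburse(amounts, amount_to_reimburse):
    """Reimburse everything when the total fits, otherwise pick candidates."""
    if sum(amounts) <= amount_to_reimburse:
        return list(range(len(amounts)))
    return select_candidates(amounts, amount_to_reimburse)
```

For example, with invoices of 3000, 2500 and 1800 cents and 5000 cents to be reimbursed, the first and third invoices are selected (4800 cents in total).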
15. The method of any of claims 1 to 14, wherein after said performing reimbursement processing on the bill data based on the financial data, the method further comprises:
determining the amount of the completed reimbursement in the bill data;
and generating and outputting prompt information based on the reimbursed amount.
16. An apparatus for reimbursement of financial data, the apparatus comprising:
the first identification module is used for identifying the acquired bill image to obtain bill data;
the second identification module is used for identifying the acquired reimbursement bill image to obtain data to be reimbursed;
a first determination module, configured to determine financial data associated with the reimbursement bill image in response to the data to be reimbursed meeting a first reimbursement requirement and the bill data meeting a second reimbursement requirement;
and the first reimbursement module is used for reimbursing at least part of data in the bill data based on the financial data.
17. A computer storage medium having computer-executable instructions stored thereon that, when executed, perform the method steps of any of claims 1 to 15.
18. An electronic device, comprising a memory having computer-executable instructions stored thereon and a processor capable of performing the method steps of any of claims 1-15 when executing the computer-executable instructions on the memory.
CN202110249954.6A 2021-03-08 2021-03-08 Financial data reimbursement method, device, equipment and storage medium Pending CN112801041A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110249954.6A CN112801041A (en) 2021-03-08 2021-03-08 Financial data reimbursement method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110249954.6A CN112801041A (en) 2021-03-08 2021-03-08 Financial data reimbursement method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112801041A true CN112801041A (en) 2021-05-14

Family

ID=75816659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110249954.6A Pending CN112801041A (en) 2021-03-08 2021-03-08 Financial data reimbursement method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112801041A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194400A (en) * 2017-05-31 2017-09-22 北京天宇星空科技有限公司 A kind of finance reimbursement unanimous vote is according to picture recognition processing method
CN107358232A (en) * 2017-06-28 2017-11-17 中山大学新华学院 Invoice recognition methods and identification and management system based on plug-in unit
CN109934554A (en) * 2019-01-29 2019-06-25 远光软件股份有限公司 A kind of method, electric terminal and storage medium for examining invoice
CN109977957A (en) * 2019-03-04 2019-07-05 苏宁易购集团股份有限公司 A kind of invoice recognition methods and system based on deep learning
CN110264288A (en) * 2019-05-20 2019-09-20 深圳壹账通智能科技有限公司 Data processing method and relevant apparatus based on information discriminating technology
CN111931664A (en) * 2020-08-12 2020-11-13 腾讯科技(深圳)有限公司 Mixed note image processing method and device, computer equipment and storage medium
CN112241727A (en) * 2020-10-30 2021-01-19 深圳供电局有限公司 Multi-ticket identification method and system and readable storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239881A (en) * 2021-06-03 2021-08-10 上海中通吉网络技术有限公司 Invoice reimbursement method
CN113326895A (en) * 2021-06-25 2021-08-31 湖南星汉数智科技有限公司 Passenger ticket travel itinerary identification method and device, computer equipment and storage medium
CN113704823A (en) * 2021-08-30 2021-11-26 长城计算机软件与系统有限公司 Reimbursement processing method, system, storage medium and electronic equipment
CN113704823B (en) * 2021-08-30 2024-03-29 新长城科技有限公司 Reimbursement processing method, reimbursement processing system, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
US20230377032A1 (en) System and method for processing transaction records for users
CN109887153B (en) Finance and tax processing method and system
CN108090823B (en) Accounting data management system based on software as a service (SaaS)
CN112801041A (en) Financial data reimbursement method, device, equipment and storage medium
US9916606B2 (en) System and method for processing a transaction document including one or more financial transaction entries
JP6179848B2 (en) Book creation system, method and program, and print slip
US9449347B2 (en) Method and apparatus for processing receipts
JP6712738B1 (en) Voucher judging device, accounting processor, voucher judging program, voucher judging system and voucher judging method
US20140064618A1 (en) Document information extraction using geometric models
JP2006511896A (en) Receipt and related data capture, storage and processing systems and methods
CN110956739A (en) Bill identification method and device
CN109299762A (en) A kind of business finance reimbursement management system based on big data
CN107798515A (en) A kind of method that database automatically generates accounting voucher
US20140046791A1 (en) Information processing device, information processing method, information processing program, and recording medium in which information processing program is recorded
US20140198969A1 (en) Device and Method for Contribution Accounting
US20140268250A1 (en) Systems and methods for receipt-based mobile image capture
CN110648211A (en) Data validation
CN109271951A (en) A kind of method and system promoting book keeping operation review efficiency
JP6635563B1 (en) Journal element analysis device, accounting processing system, journal element analysis method, journal element analysis program
CN111914729A (en) Voucher association method and device, computer equipment and storage medium
CN113850659A (en) Reimbursement data generation method and device, electronic equipment and storage medium
US20210256288A1 (en) Bill identification method, device, electronic device and computer-readable storage medium
JP6528074B1 (en) Accounting processor, accounting method, accounting program
CN112668335A (en) Method for identifying and extracting business license structured information by using named entity
JP6402397B1 (en) Accounting device, accounting method, accounting program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination