EP3494531A1 - System and method for generating consolidated data for electronic documents - Google Patents

System and method for generating consolidated data for electronic documents

Info

Publication number
EP3494531A1
EP3494531A1 EP17837779.2A EP17837779A EP3494531A1 EP 3494531 A1 EP3494531 A1 EP 3494531A1 EP 17837779 A EP17837779 A EP 17837779A EP 3494531 A1 EP3494531 A1 EP 3494531A1
Authority
EP
European Patent Office
Prior art keywords
electronic document
expense
template
determined
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP17837779.2A
Other languages
German (de)
English (en)
Other versions
EP3494531A4 (fr
Inventor
Noam Guzman
Isaac SAFT
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vatbox Ltd
Original Assignee
Vatbox Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/361,934 (US20170154385A1)
Application filed by Vatbox Ltd
Publication of EP3494531A1
Publication of EP3494531A4

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12 Accounting
    • G06Q40/123 Tax preparation or submission
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/08 Payment architectures
    • G06Q20/14 Payment architectures specially adapted for billing systems
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/405 Establishing or using transaction specific rules
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing
    • G06Q30/0635 Processing of requisition or of purchase orders

Definitions

  • the present disclosure relates generally to verifying files in data systems, and more particularly to verifying requests based on contents of electronic documents.
  • a customer may input credit card information pursuant to a payment, and the merchant may verify the credit card information in real-time before authorizing the sale. The verification typically includes determining whether the provided information is valid (i.e., that a credit card number, expiration date, PIN code, and/or customer name match known information).
  • a purchase order may be generated for the customer.
  • the purchase order provides evidence of the order such as, for example, a purchase price, goods and/or services ordered, and the like.
  • an invoice for the order may be generated. While the purchase order is usually used to indicate which products are requested and an estimate or offering for the price, the invoice is usually used to indicate which products were actually provided and the final price for the products. Frequently, the purchase price as demonstrated by the invoice for the order is different from the purchase price as demonstrated by the purchase order. As an example, if a guest at a hotel initially orders a 3-night stay but ends up staying a fourth night, the total price of the purchase order may reflect a different total price than that of the subsequent invoice.
  • existing image recognition solutions may be unable to accurately identify some or all special characters (e.g., "!", "@", "#", "$", "%", "&", etc.).
  • some existing image recognition solutions may inaccurately identify a dash included in a scanned receipt as the number "1".
  • some existing image recognition solutions cannot identify special characters such as the dollar sign, the yen symbol, etc.
  • such solutions may face challenges in preparing recognized information for subsequent use. Specifically, many such solutions either produce output in an unstructured format, or can only produce structured output if the input electronic documents are specifically formatted for recognition by an image recognition system. The resulting unstructured output typically cannot be processed efficiently. In particular, such unstructured output may contain duplicates, and may include data that requires subsequent processing prior to use.
  • Deductible expenses are expenses that are subtracted from a company's income before it is subject to taxation. Standard business deductions may include, for example, general and administrative expenses, business-related travel and entertainment expenses, automobile expenses, and employee benefits. Some business expenses are "current" and must be deducted in the year that they are paid, while others are "capitalized" and, therefore, are spread out or depreciated over time.
  • Certain embodiments disclosed herein include a method for generating consolidated data based on electronic documents.
  • the method comprises: analyzing a first electronic document to determine at least one transaction parameter, the first electronic document indicating a transaction including at least one expense, wherein the first electronic document includes at least partially unstructured data; creating a template for the first electronic document, wherein the template is a structured dataset including the determined at least one transaction parameter; retrieving, based on the template, a second electronic document, wherein the second electronic document indicates evidence of the transaction; determining at least one deductible expense of the at least one expense based on at least one deduction rule, the template, and the second electronic document; and generating consolidation metadata based on the determined at least one deductible expense.
  • Certain embodiments disclosed herein also include a non-transitory computer readable medium having stored thereon instructions for causing a processing circuitry to perform a process generating consolidated data based on electronic documents, the process comprising: analyzing a first electronic document to determine at least one transaction parameter, the first electronic document indicating a transaction including at least one expense, wherein the first electronic document includes at least partially unstructured data; creating a template for the first electronic document, wherein the template is a structured dataset including the determined at least one transaction parameter; retrieving, based on the template, a second electronic document, wherein the second electronic document indicates evidence of the transaction; determining at least one deductible expense of the at least one expense based on at least one deduction rule, the template, and the second electronic document; and generating consolidation metadata based on the determined at least one deductible expense.
  • Certain embodiments disclosed herein also include a system for generating consolidated data based on electronic documents.
  • the system comprises: a processing circuitry; and a memory, the memory containing instructions that, when executed by the processing circuitry, configure the system to: analyze a first electronic document to determine at least one transaction parameter, the first electronic document indicating a transaction including at least one expense, wherein the first electronic document includes at least partially unstructured data; create a template for the first electronic document, wherein the template is a structured dataset including the determined at least one transaction parameter; retrieve, based on the template, a second electronic document, wherein the second electronic document indicates evidence of the transaction; determine at least one deductible expense of the at least one expense based on at least one deduction rule, the template, and the second electronic document; and generate consolidation metadata based on the determined at least one deductible expense.
  • Figure 1 is a network diagram utilized to describe the various disclosed embodiments.
  • Figure 2 is a schematic diagram of a validation system according to an embodiment.
  • Figure 3 is a flowchart illustrating a method for consolidating electronic documents according to an embodiment.
  • Figure 4 is a flowchart illustrating a method for creating a dataset based on at least one electronic document according to an embodiment.
  • the various disclosed embodiments include a method and system for consolidating electronic documents.
  • a dataset is created based on a first expense report electronic document indicating information related to a transaction.
  • a template of transaction attributes is created based on the first electronic document dataset.
  • a second evidencing electronic document providing evidence of the transaction may be retrieved.
  • the expense report electronic document and the evidencing electronic document may be compared to determine whether there is a difference in values of one or more transaction parameters indicated therein. When there is a difference, a cause of the difference may be determined.
  • deduction rules are retrieved.
  • one or more deductible expenses may be determined. Metadata indicating the determined deductible expenses is generated and sent to an enterprise system.
  • a consolidated expense report electronic document may be generated.
  • the consolidated expense report electronic document indicates expenses for the different enterprises.
  • the metadata of the deductible expenses may be utilized as consolidation data for reporting consolidated expenses.
  • the disclosed embodiments allow for automatic consolidation of electronic documents with respect to deductible expenses indicated therein. More specifically, the disclosed embodiments include providing structured dataset templates for electronic documents, thereby allowing for retrieving evidencing documents based on electronic expense reports that are unstructured, semi-structured, or otherwise lacking a known structure. For example, the disclosed embodiments may be used to effectively analyze images of scanned expense reports for transactions, thereby allowing for more accurate recognition of portions of the expense reports requiring evidence and, consequently, of appropriate documentation evidencing the transactions. The determined deductible expenses may be utilized to create consolidated expense reports indicating valid expenses.
  • Fig. 1 shows an example network diagram 100 utilized to describe the various disclosed embodiments.
  • a consolidated data generator 120, an enterprise system 130, a database 140, and a plurality of web sources 150-1 through 150-N (hereinafter referred to individually as a web source 150 and collectively as web sources 150, merely for simplicity purposes), are communicatively connected via a network 110.
  • the network 110 may be, but is not limited to, a wireless, cellular or wired network, a local area network (LAN), a wide area network (WAN), a metro area network (MAN), the Internet, the worldwide web (WWW), similar networks, and any combination thereof.
  • the enterprise system 130 is associated with an enterprise, and may store data related to purchases made by the enterprise or representatives of the enterprise as well as enterprise characteristic parameters indicating characteristics of the enterprise such as, but not limited to, country of formation, revenue data, structural data, and the like.
  • the enterprise may be, but is not limited to, a business whose employees may purchase goods and services subject to VAT taxes while abroad.
  • the enterprise system 130 may be, but is not limited to, a server, a database, an enterprise resource planning system, a customer relationship management system, or any other system storing relevant data.
  • the purchase-related data may include, for example, electronic documents.
  • Each electronic document stored by the enterprise system 130 may show, e.g., an expense report, or an evidence of a transaction (e.g., a receipt, an invoice, a purchase confirmation, and the like).
  • Data included in each electronic document may be structured, semi-structured, unstructured, or a combination thereof.
  • the structured or semi-structured data may be in a format that is not recognized by the consolidated data generator 120 and, therefore, may be treated as unstructured data.
  • the database 140 may store metadata generated by the consolidated data generator 120 to be utilized for generating consolidated expense report electronic documents.
  • the web sources 150 may store evidencing electronic documents, deduction rules, or both.
  • the evidencing electronic documents may be utilized as evidence for granting requests such as, for example, invoices, tax receipts, order confirmations, and the like.
  • the deduction rules may define expenses that may be deducted (e.g., based on types, amounts, etc.), and may further define such deductible expenses with respect to characteristics of the enterprise such as, but not limited to, country of incorporation, revenue data, structural data (e.g., subsidies), and the like.
  • the web sources 150 may include, but are not limited to, servers or devices of merchants, tax authority servers, accounting servers, a database associated with an enterprise, and the like.
  • the web source 150-1 may be a merchant server storing image files showing invoices for transactions made by a merchant associated with the merchant server.
  • the web source 150-2 may be a tax authority server storing deduction rules for expenses incurred in a particular country.
  • the consolidated data generator 120 is configured to create a template based on transaction parameters identified using machine vision of a first expense report electronic document indicating information related to a transaction including one or more expenses.
  • the consolidated data generator 120 may be configured to retrieve the expense report electronic document from, e.g., the enterprise system 130. Based on the created template, the consolidated data generator 120 is configured to retrieve, from one of the web sources 150, a second evidencing electronic document indicating information evidencing the transaction.
  • the consolidated data generator 120 is configured to create datasets based on electronic documents including data at least partially lacking a known structure (e.g., unstructured data, semi-structured data, or structured data having an unknown structure). To this end, the consolidated data generator 120 may be further configured to utilize optical character recognition (OCR) or other image processing to determine data in the electronic document.
  • the consolidated data generator may therefore include or be communicatively connected to a recognition processor (e.g., the recognition processor 235, Fig. 2).
  • the consolidated data generator 120 is configured to analyze the created dataset for the expense report electronic document to identify transaction parameters related to transactions indicated in the expense report electronic document.
  • the transaction parameters indicate information of one or more expenses.
  • the consolidated data generator 120 is configured to create a template based on the dataset.
  • Each template is a structured dataset including the identified transaction parameters for a transaction.
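As a rough illustration of what such a template might look like, the sketch below models it as a Python dataclass. The class and field names (`Template`, `Expense`, `transaction_id`, `merchant_id`, `date`, `expenses`) are assumptions chosen for this example; the disclosure does not enumerate specific fields.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class Expense:
    """One expense of a transaction (hypothetical shape)."""
    description: str
    price: Decimal


@dataclass
class Template:
    """Structured dataset holding the transaction parameters found in a document."""
    transaction_id: Optional[str] = None
    merchant_id: Optional[str] = None
    date: Optional[str] = None
    expenses: List[Expense] = field(default_factory=list)

    def total(self) -> Decimal:
        # Sum with a Decimal start value so an empty template yields Decimal("0").
        return sum((e.price for e in self.expenses), Decimal("0"))
```

Keeping each parameter in a named field is what later lets rules and comparisons address one field at a time instead of scanning the whole document.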
  • the consolidated data generator 120 is configured to retrieve a second evidencing electronic document.
  • the retrieved evidencing electronic document matches the expense report electronic document, for example, with respect to a set of uniquely identifying transaction parameters in each of the evidencing electronic document and the expense report electronic document.
  • the retrieved evidencing electronic document may have the same transaction identifier number, or may have the same date and merchant identifier. If no matching second evidencing electronic document can be retrieved, the consolidated data generator 120 may be configured to determine that no expenses indicated in the expense report electronic document are deductible.
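The matching condition described above can be sketched as follows, assuming both documents have already been reduced to plain dictionaries. The key names and the helper functions `is_match` and `find_evidence` are hypothetical and not part of the disclosure.

```python
from typing import Optional


def is_match(report: dict, evidence: dict) -> bool:
    """Return True when the documents share a uniquely identifying key set."""
    rid = report.get("transaction_id")
    if rid is not None and rid == evidence.get("transaction_id"):
        return True
    # Fall back to the (date, merchant identifier) pair.
    date, merchant = report.get("date"), report.get("merchant_id")
    return (
        date is not None
        and merchant is not None
        and date == evidence.get("date")
        and merchant == evidence.get("merchant_id")
    )


def find_evidence(report: dict, candidates: list) -> Optional[dict]:
    """Return the first matching evidencing document, or None if there is none."""
    return next((doc for doc in candidates if is_match(report, doc)), None)
```

A `None` result corresponds to the case in which no evidencing document can be retrieved, so that no expense in the report would be treated as deductible.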
  • Using structured templates for determining whether expenses are deductible allows for more efficient and accurate determination than, for example, by utilizing unstructured data.
  • corresponding deduction rules may be analyzed only with respect to relevant portions of an expense report electronic document (e.g., portions included in specific fields of a structured template), thereby reducing the number of instances of application of each rule as well as reducing false positives due to applying rules to data that is likely unrelated to each rule.
  • the uniquely identifying transaction parameters utilized to retrieve the corresponding evidencing electronic document may be extracted from specific fields of the created template rather than requiring comparison to all unstructured data of the expense report electronic document.
  • the consolidated data generator 120 is configured to determine whether there is a difference in values of one or more transaction parameters in the compared electronic documents.
  • the comparison may include comparing transaction parameters in the created template to transaction parameters indicated in the evidencing electronic document.
  • the transaction parameters compared to determine the difference may be parameters related to expenses, and more specifically may be parameters requiring evidence in order to be successfully deducted.
  • the compared transaction parameters may include a price per expense (e.g., good or service purchased).
  • the difference may be, for example, a numerical difference (e.g., a difference in price, quantity, or both), a proportion, and the like.
  • comparing the electronic documents may further include creating an evidencing template for the evidencing electronic document and comparing the evidencing template to the expense report template with respect to corresponding fields of the transaction parameters to be compared. For example, data indicated in a "price" field of each template may be compared. Comparing data of structured templates may further allow for more accurate and efficient determination of differences.
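A field-by-field comparison of two templates might be sketched like this. The function name `diff_fields` and the dictionary representation of a template are assumptions made for illustration.

```python
def diff_fields(report: dict, evidence: dict, fields: list) -> dict:
    """Map each differing field to its (report value, evidence value) pair."""
    return {
        name: (report.get(name), evidence.get(name))
        for name in fields
        if report.get(name) != evidence.get(name)
    }
```

Only the listed fields are inspected, which mirrors the idea of comparing corresponding fields rather than all data in both documents.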
  • the consolidated data generator 120 may be further configured to determine a cause of the difference.
  • the cause of the difference may be determined based on one or more causation rules with respect to the determined difference and the compared transaction parameters.
  • the causation rules may be related to, for example, differences due to additional or refunded purchases, differences in currency exchange rates, taxes (e.g., VATs) that were not charged at a time of purchase, incidental charges, gratuity charges, and other potential reasons that may be indicated by the values of the transaction parameters.
  • a difference of +$100.51 may be associated with an additional night's stay at a hotel and a difference of -$100.51 may be associated with a refund for a night's stay when the type of room has a cost of $100.51 per night.
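One possible causation rule for the hotel example is sketched below: a difference that is an exact multiple of the nightly rate is attributed to additional or refunded nights. The function name and the returned labels are hypothetical, and a real rule set would likely cover more causes.

```python
from decimal import Decimal


def hotel_difference_cause(difference: Decimal, nightly_rate: Decimal) -> str:
    """Classify a price difference relative to a known nightly rate."""
    if difference == 0:
        return "no difference"
    if nightly_rate > 0 and abs(difference) % nightly_rate == 0:
        nights = int(abs(difference) / nightly_rate)
        kind = "additional" if difference > 0 else "refunded"
        return f"{nights} {kind} night(s)"
    return "unknown"
```

`Decimal` is used rather than `float` so that amounts such as 100.51 compare exactly.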
  • the expense is deductible.
  • certain predetermined causes may be associated with nondeductible expenses. For example, a cause of difference indicating a refund of one night's hotel stay may result in determining that the full expense (i.e., the expense including the refunded night) is not to be deducted.
  • it may be determined whether the expense is partially deductible based on the cause of the difference. For example, if the cause of the difference is a partial refund (e.g., a refund of one night's hotel stay out of 3 total nights), the non-refunded portion of the expense may be determined as the expense to be deducted.
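The partial-refund case can be worked through numerically. The sketch below assumes a flat nightly rate and uses the hypothetical helper `deductible_after_refund`; the rate of 100.51 is borrowed from the earlier hotel example.

```python
from decimal import Decimal


def deductible_after_refund(total_nights: int, refunded_nights: int,
                            nightly_rate: Decimal) -> Decimal:
    """Return the non-refunded portion of a hotel expense."""
    if not 0 <= refunded_nights <= total_nights:
        raise ValueError("refunded nights must be between 0 and total nights")
    return (total_nights - refunded_nights) * nightly_rate
```

For three nights with one night refunded at 100.51 per night, the non-refunded portion is 2 × 100.51 = 201.02.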
  • the consolidated data generator 120 is configured to retrieve deduction rules from one or more of the web sources 150.
  • the retrieved deduction rules may be retrieved based on the country of formation of the business (e.g., from a web source 150 of a tax authority associated with the country of formation), the structure of the enterprise, the most recent annual revenue for the enterprise, combinations thereof, and the like.
  • the consolidated data generator 120 is configured to apply the retrieved rules to data of the expense report template, the evidencing electronic document, the enterprise characteristics, or a combination thereof, in order to determine whether each expense indicated in the expense report electronic document is deductible. Further, a deductible amount may be determined for each expense. The deductible amount may be determined, for example, as a proportion of the total amount of the expense or a partial amount of the expense (for example, if the expense is determined to be partially deductible as described herein above).
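A minimal sketch of applying deduction rules as proportions follows. It assumes each rule maps an expense type to a deductible fraction; the rule format, the function name `apply_rules`, and any rates passed in are illustrative only and do not reflect actual tax rules.

```python
from decimal import Decimal


def apply_rules(expenses: dict, rules: dict) -> dict:
    """Return the deductible amount per expense.

    expenses: name -> (expense type, amount)
    rules:    expense type -> deductible fraction (hypothetical rule format)
    Expenses whose type has no rule are treated as non-deductible and omitted.
    """
    deductible = {}
    for name, (kind, amount) in expenses.items():
        rate = rules.get(kind)
        if rate is not None:
            deductible[name] = (amount * rate).quantize(Decimal("0.01"))
    return deductible
```

Because the expense type sits in its own field, each rule is applied only to the expenses it concerns.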
  • the consolidated data generator 120 is configured to generate metadata based on the determined deductible expense.
  • the metadata may include, for example, a deductible amount, an indication of which expenses of the transaction are deductible, the cause of a difference between the expense report and the evidencing document, a combination thereof, and the like.
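The metadata described above could be assembled as a simple dictionary. The key names and the function `build_metadata` are assumptions made for this sketch.

```python
from decimal import Decimal
from typing import Optional


def build_metadata(deductions: dict, cause: Optional[str] = None) -> dict:
    """Summarize deductible expenses (name -> amount) as metadata."""
    metadata = {
        "deductible_expenses": sorted(deductions),
        "deductible_amount": sum(deductions.values(), Decimal("0")),
    }
    if cause is not None:
        # Cause of any difference between the report and the evidencing document.
        metadata["difference_cause"] = cause
    return metadata
```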
  • the consolidated data generator 120 may be further configured to generate a notification including the metadata.
  • the generated metadata may be utilized as consolidated data for creating a consolidated expense report.
  • the consolidated data generator 120 may be configured to generate a consolidated expense report electronic document based on a plurality of sets of metadata.
  • the sets of metadata may be related to different expense reports, and may be further related to expense reports from different enterprises.
  • the consolidated expense report electronic document may be utilized for consolidating expenses for, e.g., tax reporting purposes.
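Consolidating several sets of metadata might look like the sketch below, which totals deductible amounts per enterprise. The `enterprise` and `deductible_amount` keys and the function name `consolidate` are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def consolidate(metadata_sets: list) -> dict:
    """Total the deductible amounts of many metadata sets, grouped by enterprise."""
    totals = defaultdict(lambda: Decimal("0"))
    for metadata in metadata_sets:
        totals[metadata["enterprise"]] += metadata["deductible_amount"]
    return dict(totals)
```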
  • Fig. 2 is an example schematic diagram of the consolidated data generator 120 according to an embodiment.
  • the consolidated data generator 120 includes a processing circuitry 210 coupled to a memory 215, a storage 220, and a network interface 240.
  • the consolidated data generator 120 may include an optical character recognition (OCR) processor 230.
  • the components of the consolidated data generator 120 may be communicatively connected via a bus 250.
  • the processing circuitry 210 may be realized as one or more hardware logic components and circuits.
  • illustrative types of hardware logic components include field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), system-on-a-chip systems (SOCs), general-purpose microprocessors, microcontrollers, digital signal processors (DSPs), and the like, or any other hardware logic components that can perform calculations or other manipulations of information.
  • the memory 215 may be volatile (e.g., RAM, etc.), non-volatile (e.g., ROM, flash memory, etc.), or a combination thereof.
  • computer readable instructions to implement one or more embodiments disclosed herein may be stored in the storage 220.
  • the memory 215 is configured to store software.
  • Software shall be construed broadly to mean any type of instructions, whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. Instructions may include code (e.g., in source code format, binary code format, executable code format, or any other suitable format of code).
  • the instructions, when executed by the one or more processors, cause the processing circuitry 210 to perform the various processes described herein. Specifically, the instructions, when executed, cause the processing circuitry 210 to generate consolidated data based on electronic documents, as discussed herein.
  • the storage 220 may be magnetic storage, optical storage, and the like, and may be realized, for example, as flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVDs), or any other medium which can be used to store the desired information.
  • the OCR processor 230 may include, but is not limited to, a feature and/or pattern recognition processor (RP) 235 configured to identify patterns, features, or both, in unstructured data sets. Specifically, in an embodiment, the OCR processor 230 is configured to identify at least characters in the unstructured data. The identified characters may be utilized to create a dataset including data required for verification of a request.
  • the network interface 240 allows the consolidated data generator 120 to communicate with the enterprise system 130, the database 140, the web sources 150, or a combination thereof, for the purpose of, for example, collecting metadata, retrieving data, storing data, and the like.
  • Fig. 3 is an example flowchart 300 illustrating a method for generating consolidated data based on electronic documents according to an embodiment.
  • the method may be performed by a consolidated data generator (e.g., the consolidated data generator 120).
  • a dataset is created based on a first expense report electronic document including information related to a transaction.
  • the transaction includes one or more expenses.
  • the expense report electronic document may include, but is not limited to, unstructured data, semi-structured data, structured data with structure that is unanticipated or unannounced, or a combination thereof.
  • S310 may further include analyzing the expense report electronic document using optical character recognition (OCR) to determine data in the electronic document, identifying key fields in the data, identifying values in the data, or a combination thereof.
  • analyzing the expense report dataset may include, but is not limited to, determining transaction parameters such as, but not limited to, at least one entity identifier (e.g., a consumer enterprise identifier, a merchant enterprise identifier, or both), information related to the transaction (e.g., a date, a time, a price, a type of good or service sold, etc.), or both.
  • analyzing the expense report dataset may also include identifying the expenses of the transaction based on the expense report dataset.
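To make the key-field and value identification concrete, the sketch below pulls a few transaction parameters out of text that an OCR step is assumed to have already produced. The regular expressions, the field labels they look for, and the function name `extract_parameters` are illustrative assumptions; real documents would need far more tolerant patterns.

```python
import re

# Hypothetical patterns for a receipt-like layout; not taken from the disclosure.
PATTERNS = {
    "transaction_id": re.compile(r"Transaction ID[:\s]+(\d+)", re.IGNORECASE),
    "date": re.compile(r"Date[:\s]+(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "price": re.compile(r"Total[:\s]+\$?(\d+\.\d{2})", re.IGNORECASE),
}


def extract_parameters(ocr_text: str) -> dict:
    """Return the transaction parameters that could be located in the text."""
    parameters = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            parameters[name] = match.group(1)
    return parameters
```

Fields that cannot be found are simply left out, so later steps can tell a missing value from an empty one.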
  • a template is created based on the expense report dataset.
  • the template may be, but is not limited to, a data structure including a plurality of fields.
  • the fields may include the identified transaction parameters.
  • the fields may be predefined.
  • Creating templates from electronic documents allows for faster processing due to the structured nature of the created templates. For example, query and manipulation operations may be performed more efficiently on structured datasets than on datasets lacking such structure. Further, by organizing information from electronic documents into structured datasets, the amount of storage required for saving information contained in electronic documents may be significantly reduced. Electronic documents are often images that require more storage space than datasets containing the same information. For example, datasets representing data from 100,000 image electronic documents can be saved as data records in a text file. A size of such a text file would be significantly less than the size of the 100,000 images.
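Saving templates as data records in a text file could be done in many ways; the sketch below assumes one JSON object per line, a format chosen for this example rather than one named in the disclosure.

```python
import json


def to_text_records(records: list) -> str:
    """Serialize templates (plain dicts) as one JSON record per line."""
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


def from_text_records(text: str) -> list:
    """Parse the text produced by to_text_records back into dicts."""
    return [json.loads(line) for line in text.splitlines() if line]
```

Each record occupies a short line of text, which is the basis for the storage saving over keeping the original images.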
  • a second evidencing electronic document is retrieved.
  • the retrieved evidencing electronic document indicates evidence of the transaction of the expense report electronic document.
  • S340 includes searching, based on a set of uniquely identifying transaction parameters in the template, in at least one web source.
  • a transaction identification number "123456789" indicated in a "Transaction ID" field of the first template may be utilized as a search query to find the second electronic document based on, e.g., metadata of the second electronic document including the transaction identification number "123456789.”
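Building the search query from the template's uniquely identifying fields might be sketched as follows. Preferring the transaction ID and falling back to the date and merchant identifier is an assumption of this example, as are the key names and the function `build_query`.

```python
def build_query(template: dict) -> dict:
    """Pick the uniquely identifying parameters to search web sources with."""
    if template.get("transaction_id"):
        return {"transaction_id": template["transaction_id"]}
    if template.get("date") and template.get("merchant_id"):
        return {"date": template["date"], "merchant_id": template["merchant_id"]}
    raise ValueError("template has no uniquely identifying parameters")
```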
  • S340 includes selecting the at least one web source based on the first template.
  • At S350, based on the template and the retrieved evidencing electronic document, it may be determined whether there is a difference in one or more transaction parameters.
  • S350 includes comparing transaction parameters in the created template to corresponding data of the evidencing electronic document.
  • S350 may also include creating a template for the evidencing electronic document and comparing data in one or more fields of the expense report electronic document to data in corresponding fields of the evidencing electronic document.
  • each expense associated with the different compared transaction parameters may be determined to be non-deductible.
  • S350 may further include determining a cause of the difference. Based on the determined cause of the difference, it may be determined whether one or more of the expenses indicated in the expense report electronic document is non-deductible or only partially deductible. In some implementations, when the difference is a difference in amount (e.g., a different price for one of the expenses indicated in the expense report and in the invoice for the transaction), the higher value of the amount may be determined as non-deductible.
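The field-by-field comparison of S350 might look like the sketch below. It assumes that both the expense report and the evidencing document have already been reduced to templates represented as plain dictionaries; the function name `differing_fields` and the sample field names are illustrative assumptions, and determining the cause of a difference is left out.

```python
def differing_fields(report, evidence, fields):
    """Return {field: (report_value, evidence_value)} for every field that differs."""
    return {
        name: (report.get(name), evidence.get(name))
        for name in fields
        if report.get(name) != evidence.get(name)
    }


# Assumed sample templates: the report states a higher price than the invoice.
report = {"Transaction ID": "123456789", "Price": "120.00", "Date": "2016-08-05"}
evidence = {"Transaction ID": "123456789", "Price": "100.00", "Date": "2016-08-05"}

diffs = differing_fields(report, evidence, ("Transaction ID", "Price", "Date"))

# Expenses tied to a differing parameter can then be flagged for further handling.
flagged = sorted(diffs)
```

A non-empty result indicates a difference in one or more transaction parameters; how each flagged expense is then treated (non-deductible or partially deductible) depends on the determined cause.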
  • one or more deduction rules are retrieved from web sources.
  • the deduction rules may be retrieved based on enterprise characteristics related to an enterprise (e.g., an enterprise associated with the expense report electronic document), and may be further retrieved based on the transaction parameters for the transaction.
  • the deduction rules may vary based on the country of formation of the enterprise, the structure of the enterprise (e.g., with respect to subsidiary and parent companies), the revenues of the enterprise, and the like.
  • the retrieved deduction rules are applied to the transaction parameters indicated in the expense report electronic document, the evidencing electronic document, or both.
  • the deduction rules may not be applied with respect to transaction parameters of expenses that were determined to be non-deductible.
  • the results of applying the deduction rules may include, but are not limited to, a determination of each deductible expense, a deduction amount for each deductible expense, or both.
  • metadata including the determined deductible expenses, deduction amounts, or both, is generated. The metadata may be utilized with metadata of other expense reports to generate a consolidated expense report, thereby consolidating expense reports.
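Applying deduction rules and consolidating the resulting metadata can be illustrated as follows. The rule format used here (a deductible fraction per expense category) is purely an assumption for the sake of the example, as are the names `RULES`, `apply_deduction_rules`, and `consolidate`; actual deduction rules would be retrieved based on enterprise characteristics and transaction parameters.

```python
from decimal import Decimal

# Hypothetical rules: fraction of each expense category that is deductible.
RULES = {"lodging": Decimal("1.0"), "meals": Decimal("0.5")}


def apply_deduction_rules(expenses, rules, non_deductible=frozenset()):
    """Return metadata listing each deductible expense and its deduction amount."""
    deductions = {}
    for name, (category, amount) in expenses.items():
        if name in non_deductible:
            continue  # rules are not applied to expenses already found non-deductible
        rate = rules.get(category, Decimal("0"))
        if rate > 0:
            deductions[name] = Decimal(amount) * rate
    return {"deductible_expenses": sorted(deductions), "deduction_amounts": deductions}


def consolidate(metadata_items):
    """Sum deduction amounts across the metadata of several expense reports."""
    return sum(
        (amt for m in metadata_items for amt in m["deduction_amounts"].values()),
        Decimal("0"),
    )


# Assumed sample reports; "dinner" was previously determined to be non-deductible.
report_a = {
    "hotel": ("lodging", "200.00"),
    "dinner": ("meals", "80.00"),
    "taxi": ("transport", "30.00"),
}
meta_a = apply_deduction_rules(report_a, RULES, non_deductible={"dinner"})
meta_b = apply_deduction_rules({"lunch": ("meals", "40.00")}, RULES)
total = consolidate([meta_a, meta_b])
```

`Decimal` is used instead of floating point so that monetary amounts are not subject to binary rounding error.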
  • a notification may be generated.
  • the notification may include the metadata, an indication of the deductible expenses, the determined cause of the difference, or a combination thereof.
  • the notification may indicate the non-deductible expenses.
  • Fig. 4 is an example flowchart S310 illustrating a method for creating a dataset based on an electronic document according to an embodiment.
  • the electronic document is obtained.
  • Obtaining the electronic document may include, but is not limited to, receiving the electronic document (e.g., receiving a scanned image) or retrieving the electronic document (e.g., retrieving the electronic document from a consumer enterprise system, a merchant enterprise system, or a database).
  • the electronic document is analyzed.
  • the analysis may include, but is not limited to, using optical character recognition (OCR) to determine characters in the electronic document.
  • the key fields may include, but are not limited to, a merchant's name and address, a date, a currency, a good or service sold, a transaction identifier, an invoice number, and so on.
  • An electronic document may include unnecessary details that would not be considered to be key values. As an example, a logo of the merchant may not be required and, thus, is not a key value.
  • a list of key fields may be predefined, and pieces of data that may match the key fields are extracted. Then, a cleaning process is performed to ensure that the information is accurately presented. For example, if the OCR results in data presented as "121 1212005", the cleaning process will convert this data to "12/12/2005".
  • the cleaning process may be performed using external information resources, such as dictionaries, calendars, and the like.
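One way such a cleaning step could be realized for dates is sketched below. It assumes that the OCR misread each slash as a single character (possibly with stray whitespace) and that dates are written month first; both are assumptions made for this example, and `clean_ocr_date` is a hypothetical helper. The calendar check via `datetime.strptime` stands in for the external calendar resource.

```python
import re
from datetime import datetime


def clean_ocr_date(raw):
    """Try to recover an MM/DD/YYYY date from OCR output whose slashes were misread."""
    compact = re.sub(r"\s+", "", raw)  # drop whitespace introduced by the OCR
    # Two digits, any one character, two digits, any one character, four digits.
    match = re.fullmatch(r"(\d{2}).(\d{2}).(\d{4})", compact)
    if match is None:
        return None
    candidate = "/".join(match.groups())
    try:
        # Calendar check: rejects impossible dates such as 99/99/2005.
        datetime.strptime(candidate, "%m/%d/%Y")
    except ValueError:
        return None
    return candidate


cleaned = clean_ocr_date("121 1212005")
```

Returning `None` for values that cannot be repaired lets the caller treat the field as incomplete rather than store a guessed date.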
  • it is checked whether the extracted pieces of data are complete. For example, if the merchant name can be identified but its address is missing, then the key field for the merchant address is incomplete. An attempt to complete the missing key field values is performed. This attempt may include querying external systems and databases, correlation with previously analyzed invoices, or a combination thereof. Examples of external systems and databases may include business directories, Universal Product Code (UPC) databases, parcel delivery and tracking systems, and so on.
  • S430 results in a complete set of the predefined key fields and their respective values.
  • a structured dataset is generated.
  • the generated dataset includes the identified key fields and values.
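The completeness check and dataset generation described above can be sketched together. The key field names, the `build_dataset` function, and the `directory` lookup (standing in for an external system such as a business directory) are assumptions chosen for illustration, not part of the disclosure.

```python
import json

# Hypothetical predefined list of key fields.
KEY_FIELDS = ("merchant_name", "merchant_address", "date", "currency", "transaction_id")


def build_dataset(extracted, lookup=None):
    """Fill missing key fields from an optional lookup; return (dataset, missing keys)."""
    lookup = lookup or {}
    dataset = {}
    missing = []
    for key in KEY_FIELDS:
        value = extracted.get(key) or lookup.get(key)
        if value:
            dataset[key] = value
        else:
            missing.append(key)  # still incomplete after the completion attempt
    return dataset, missing


# Assumed OCR output: the address is missing and a logo was picked up.
extracted = {
    "merchant_name": "Example Store",
    "date": "12/12/2005",
    "currency": "USD",
    "transaction_id": "123456789",
    "logo": "<image data>",
}
directory = {"merchant_address": "1 Example Street"}

dataset, missing = build_dataset(extracted, directory)
record = json.dumps(dataset, sort_keys=True)  # one text record per document
```

Only predefined key fields are copied, so unnecessary details such as the logo do not reach the structured dataset.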
  • any reference to an element herein using a designation such as "first," "second," and so forth does not generally limit the quantity or order of those elements. Rather, these designations are generally used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be employed there or that the first element must precede the second element in some manner. Also, unless stated otherwise, a set of elements comprises one or more elements.
  • the phrase "at least one of" followed by a listing of items means that any of the listed items can be utilized individually, or any combination of two or more of the listed items can be utilized. For example, if a system is described as including "at least one of A, B, and C," the system can include A alone; B alone; C alone; A and B in combination; B and C in combination; A and C in combination; or A, B, and C in combination.
  • the various embodiments disclosed herein can be implemented as hardware, firmware, software, or any combination thereof.
  • the software is preferably implemented as an application program tangibly embodied on a program storage unit or computer readable medium consisting of parts, or of certain devices and/or a combination of devices.
  • the application program may be uploaded to, and executed by, a machine comprising any suitable architecture.
  • the machine is implemented on a computer platform having hardware such as one or more central processing units ("CPUs"), a memory, and input/output interfaces.
  • the computer platform may also include an operating system and microinstruction code.
  • a non-transitory computer readable medium is any computer readable medium except for a transitory propagating signal.

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Technology Law (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A system and method for generating consolidated data based on electronic documents. The method includes: analyzing a first electronic document to determine at least one transaction parameter, wherein the first electronic document indicates a transaction including at least one expense and includes at least partially unstructured data; creating a template for the first electronic document, wherein the template is a structured dataset including the determined at least one transaction parameter; retrieving, based on the template, a second electronic document, wherein the second electronic document indicates evidence of the transaction; determining at least one deductible expense from the at least one expense based on at least one deduction rule, the template, and the second electronic document; and generating consolidation metadata based on the determined at least one deductible expense.
EP17837779.2A 2016-08-05 2017-08-04 System and method for generating consolidated data for electronic documents Withdrawn EP3494531A4 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201662371221P 2016-08-05 2016-08-05
US15/361,934 US20170154385A1 (en) 2015-11-29 2016-11-28 System and method for automatic validation
PCT/US2017/045554 WO2018027158A1 (fr) 2016-08-05 2017-08-04 System and method for generating consolidated data for electronic documents

Publications (2)

Publication Number Publication Date
EP3494531A1 (fr) 2019-06-12
EP3494531A4 EP3494531A4 (fr) 2020-04-08

Family

ID=61073095

Family Applications (1)

Application Number Title Priority Date Filing Date
EP17837779.2A Withdrawn EP3494531A4 (fr) 2016-08-05 2017-08-04 System and method for generating consolidated data for electronic documents

Country Status (3)

Country Link
EP (1) EP3494531A4 (fr)
CN (1) CN109791643A (fr)
WO (1) WO2018027158A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12001446B2 (en) 2022-04-12 2024-06-04 Thinking Machine Systems Ltd. System and method for extracting data from invoices and contracts

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100161616A1 (en) * 2008-12-16 2010-06-24 Carol Mitchell Systems and methods for coupling structured content with unstructured content
US8774516B2 (en) * 2009-02-10 2014-07-08 Kofax, Inc. Systems, methods and computer program products for determining document validity
US8861861B2 (en) * 2011-05-10 2014-10-14 Expensify, Inc. System and method for processing receipts and other records of users
CN103843315B (zh) * 2011-10-01 2018-08-17 甲骨文国际公司 移动费用解决方案体系架构及方法

Also Published As

Publication number Publication date
WO2018027158A1 (fr) 2018-02-08
EP3494531A4 (fr) 2020-04-08
CN109791643A (zh) 2019-05-21

Similar Documents

Publication Publication Date Title
US11062132B2 (en) System and method for identification of missing data elements in electronic documents
US10509811B2 (en) System and method for improved analysis of travel-indicating unstructured electronic documents
US11138372B2 (en) System and method for reporting based on electronic documents
US20170323006A1 (en) System and method for providing analytics in real-time based on unstructured electronic documents
US20170169292A1 (en) System and method for automatically verifying requests based on electronic documents
US20180011846A1 (en) System and method for matching transaction electronic documents to evidencing electronic documents
EP3494495A1 (fr) System and method for completing electronic documents
US20170323157A1 (en) System and method for determining an entity status based on unstructured electronic documents
US20180025225A1 (en) System and method for generating consolidated data for electronic documents
EP3526760A1 (fr) System and method for generating a modified evidencing electronic document including missing elements
EP3430540A1 (fr) System and method for automatically generating reporting data based on electronic documents
US20180046663A1 (en) System and method for completing electronic documents
US20180025438A1 (en) System and method for generating analytics based on electronic documents
US10387561B2 (en) System and method for obtaining reissues of electronic documents lacking required data
US20180025224A1 (en) System and method for identifying unclaimed electronic documents
US20170169519A1 (en) System and method for automatically verifying transactions based on electronic documents
EP3494496A1 (fr) System and method for generating reports based on electronic documents
WO2017201012A1 (fr) System and method for providing analytics in real-time based on unstructured electronic documents
EP3417383A1 (fr) Automatic verification of requests based on electronic documents
EP3494531A1 (fr) System and method for generating consolidated data for electronic documents
WO2018034941A1 (fr) System and method for generating analytics based on electronic documents
WO2017201292A1 (fr) System and method for encrypting data in electronic documents
WO2017142615A1 (fr) System and method for managing data integrity
US20170193609A1 (en) System and method for automatically monitoring requests indicated in electronic documents
EP3497589A1 (fr) System and method for identifying electronic documents

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20190222

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
A4 Supplementary search report drawn up and despatched

Effective date: 20200306

RIC1 Information provided on ipc code assigned before grant

Ipc: G06F 16/00 20190101ALI20200302BHEP

Ipc: G06Q 20/38 20120101ALI20200302BHEP

Ipc: G06Q 30/04 20120101ALI20200302BHEP

Ipc: G06Q 20/04 20120101ALI20200302BHEP

Ipc: G06K 9/00 20060101ALI20200302BHEP

Ipc: G06K 9/34 20060101ALI20200302BHEP

Ipc: G06Q 30/06 20120101ALI20200302BHEP

Ipc: G06Q 10/10 20120101ALI20200302BHEP

Ipc: G06Q 20/14 20120101ALI20200302BHEP

Ipc: G06Q 40/00 20120101AFI20200302BHEP

Ipc: G06Q 20/40 20120101ALI20200302BHEP

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20201006