US10817656B2 - Methods and devices for enabling computers to automatically enter information into a unified database from heterogeneous documents - Google Patents
- Publication number
- US10817656B2 (application US15/821,682)
- Authority
- US
- United States
- Prior art keywords
- program code
- image file
- computer
- text
- fields
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/174—Form filling; Merging
-
- G06K9/00449—
-
- G06K9/00456—
-
- G06K9/344—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/412—Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G06K2209/01—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
Definitions
- The present disclosure relates to methods and devices for enabling computers to automatically enter information into a unified database from heterogeneous documents.
- OCR optical character recognition
- PDF portable document format
- The illustrative embodiments provide for a computer-implemented method of enabling a computer to automatically enter information into a unified database from heterogeneous documents.
- the computer-implemented method includes receiving, at a processor, an image file.
- the computer-implemented method also includes displaying, by the processor, the image file in a first area of a window rendered on a tangible display device.
- the computer-implemented method also includes displaying, by the processor, fields for data entry in a second area of the window.
- the computer-implemented method also includes performing, by the processor, optical character recognition on the image file.
- the computer-implemented method also includes identifying, by the processor, at least one parameter of text in the image file.
- the computer-implemented method also includes comparing, by the processor, the at least one parameter of the text to at least one of a plurality of stored parameters.
- the computer-implemented method also includes sorting, by the processor, the text according to the at least one of the plurality of stored parameters into a plurality of categories, wherein sorted text is formed.
- the computer-implemented method also includes auto-populating and displaying, by the processor, the fields in the second area of the window based on the sorted text.
- the illustrative embodiments also contemplate a non-transitory computer-recordable storage medium storing program code, which when executed by a processor, performs the above method.
- the illustrative embodiments also contemplate a computer including a processor and a non-transitory computer-recordable storage medium storing program code, which when executed by the processor, performs the above method.
- FIG. 1 illustrates a sample screenshot of a user interface for software configured to receive image files and auto populate specific fields in a unified database, in accordance with an illustrative embodiment.
- FIG. 2 illustrates another sample screenshot of a user interface for software configured to receive image files and auto populate specific fields in a unified database, in accordance with an illustrative embodiment.
- FIG. 3 illustrates a flowchart of a method for receiving image files and auto populating specific fields in a unified database, in accordance with an illustrative embodiment.
- FIG. 4 illustrates a data processing system, in accordance with an illustrative embodiment.
- unified database is defined as one or more databases, whether relational databases, content addressable databases, or other types of databases, which together are directed towards a common enterprise and use a common set of identifiers.
- databases when taken together, could contain information regarding employee records, tax information, and other information, that use a common system of identifiers.
- employee name would be the name of a field throughout all databases so that confusion is avoided when working with the databases in the context of a single enterprise.
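- As a minimal sketch of the common-identifier idea, the following Python example uses two in-memory SQLite tables to stand in for separate databases of one enterprise. The table names, the columns other than the employee name field, and the sample values are hypothetical and are not taken from the disclosure.

```python
import sqlite3

# Two hypothetical tables stand in for separate databases that share the
# common identifier "employee_name".
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee_records (employee_name TEXT PRIMARY KEY, department TEXT);
    CREATE TABLE tax_information (employee_name TEXT, tax_year INTEGER, withheld REAL);
    """
)
conn.execute("INSERT INTO employee_records VALUES (?, ?)", ("John Doe", "Payroll"))
conn.execute("INSERT INTO tax_information VALUES (?, ?, ?)", ("John Doe", 2017, 1250.0))

# Because both tables use the same field name, they can be joined directly
# without any mapping between differently named identifiers.
rows = conn.execute(
    """
    SELECT employee_name, department, tax_year, withheld
    FROM employee_records
    JOIN tax_information USING (employee_name)
    """
).fetchall()
print(rows)
```

Keeping one field name across all tables is what lets the join be written without a translation layer.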
- This wage garnishment example is just one example.
- the human resource department or the third party vendor also must process tax information such as data entered into W-2s, taxes paid to multiple government agencies, and many others.
- the illustrative embodiments recognize and take into account that even when this data comes in the form of electronic files displayable on a computer, a human user must take an inordinate and undesirable amount of time to enter the correct information into the unified database of the human resources department or the third-party vendor.
- the illustrative embodiments provide for methods and devices that address these issues and provide a means for enabling computers to automatically enter information into a unified database from heterogeneous documents.
- the illustrative embodiments take advantage of OCR technology, but also utilize a database of common terms to identify candidates for entries into a field of a unified database.
- the illustrative embodiments automatically populate fields of interest, and then display the populated fields so that a user can verify the entries.
- the computer can automatically verify the entries into the fields to confirm that they relate to an employee.
- the computer can verify that “John Doe” is a valid entry by confirming that “John Doe” actually is an employee recorded in the unified database.
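- The verification step described above might be sketched as a simple lookup. This is an illustration only: the `employees` table, the `is_known_employee` helper, and the case-insensitive comparison are assumptions, not details stated in the disclosure.

```python
import sqlite3


def is_known_employee(conn, candidate_name):
    """Return True when candidate_name matches a recorded employee."""
    row = conn.execute(
        "SELECT 1 FROM employees WHERE employee_name = ? COLLATE NOCASE LIMIT 1",
        (candidate_name.strip(),),
    ).fetchone()
    return row is not None


# Hypothetical stand-in for the unified database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_name TEXT)")
conn.execute("INSERT INTO employees VALUES ('John Doe')")

print(is_known_employee(conn, "John Doe"))
print(is_known_employee(conn, "Jane Roe"))
```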
- the illustrative embodiments further recognize and take into account the user interface problem of operating multiple windows of different software products; one to view the documents, and another to perform data entry. Switching between windows is inconvenient and wastes time during data entry.
- the illustrative embodiments also provide a means for displaying a single window which allows for selection of an image file for processing, displays the image file, and presents fields for entering data into the unified database.
- the illustrative embodiments address these and other issues by providing for methods and devices for enabling computers to automatically enter information into a unified database from heterogeneous documents.
- attention is now turned to the figures.
- FIG. 1 illustrates a sample screenshot of a user interface for software configured to receive image files and auto populate specific fields in a unified database, in accordance with an illustrative embodiment.
- Screenshot 100 is displayed on a tangible display device, such as display 414 of FIG. 4, and is generated by a processor, such as processor unit 404 of FIG. 4, executing program code designed to render screenshot 100 and provide functionality for at least some of the images shown.
- This program code may be implemented as software, firmware, or both.
- Screenshot 100 shows two primary areas, area 102 and area 104 .
- An “area”, as used herein, is a portion of a display on a device that shows part of the screenshot.
- Area 102 is used to display information related to the document or documents to be processed.
- Area 104 is used to display information useful for entering information into the unified database.
- Attention is first turned to area 102.
- instructions 106 , instructions 108 , and/or select files 110 are provided to prompt a user to access the files from which data is to be processed.
- Title 112 may be provided to remind the user as to which types of files are to be processed.
- While this illustrative embodiment describes a method for presenting a display for a user to retrieve desired image files, the illustrative embodiments also contemplate automatically presenting a user with image files for processing.
- the illustrative embodiments further contemplate automatically selecting and processing image files such that a user is not involved in the process of converting heterogeneous image files into entries into a unified database.
- area 104 is also displayed on screenshot 100 .
- Title 114 indicates to a user the nature of what is displayed in area 104 , which in this case is details of the agency notice displayed in area 102 that are to be entered in fields in area 104 for subsequent entry into the unified database. Ultimately the purpose of this data entry is to assist the enterprise in properly complying with the requirements of a specifically received agency notice.
- the use of area 104 is described with respect to FIG. 2 .
- FIG. 2 illustrates another sample screenshot of a user interface for software configured to receive image files and auto populate specific fields in a unified database, in accordance with an illustrative embodiment.
- Screenshot 200 is related to screenshot 100 in that screenshot 200 is taken after an image file has been uploaded and is being displayed in area 102 .
- Screenshot 200 shows an example of how a heterogeneous image file can be processed and its relevant information transmitted to fields in area 104 for entry into a unified database.
- Screenshot 200 is displayed on a tangible display device, such as display 414 of FIG. 4, as a result of a processor, such as processor unit 404 of FIG. 4, executing a program embodied either as program code on a non-transitory computer-recordable storage medium, or as firmware.
- Whether implemented as program code on a non-transitory computer-recordable storage medium or as firmware, the term “program” shall be used, though the term “program” excludes purely signal based media.
- area 102 shows image 202 of agency notice 204 .
- Agency notice 204, in this illustrative embodiment, relates to tax information that by law must be processed by the enterprise.
- the program loads the image file of agency notice 204 and performs optical character recognition (OCR) on the file.
- the program compares text extracted from the file based on the OCR to a plurality of terms stored in a database in order to characterize the text.
- the extracted text can be compared not only by text matching, but also by analyzing a location from where text was lifted, and according to patterns of text.
- the program can determine that the name of the “company” in this particular agency notice is “Automatic Data Processing” based on the location of this term in agency notice 204 as well as the recognizable pattern of a sender's address bar near the top of the page. Additionally, the term “ADP” is associated with the company.
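- The three kinds of comparison described above (text matching, location on the page, and surrounding pattern) can be sketched as follows. This is a hedged illustration, not the patented implementation: the `OcrLine` type, the `find_company_name` helper, the 25% header cutoff, and the street-address pattern are all assumptions, and the OCR output is supplied by hand rather than produced by an OCR engine.

```python
import re
from dataclasses import dataclass


@dataclass
class OcrLine:
    text: str
    top: float  # vertical position: 0.0 is the top of the page, 1.0 the bottom


# Hypothetical stored terms mapping recognized text to a canonical company name.
KNOWN_COMPANY_TERMS = {
    "automatic data processing": "Automatic Data Processing",
    "adp": "Automatic Data Processing",
}
# Hypothetical pattern for the street line of a sender's address block.
STREET_PATTERN = re.compile(r"^\d+\s+\w+")


def find_company_name(lines, header_limit=0.25):
    """Return a company name found near the top of the page, else None."""
    ordered = sorted(lines, key=lambda line: line.top)
    for index, line in enumerate(ordered):
        if line.top > header_limit:  # location check: only the page header
            break
        canonical = KNOWN_COMPANY_TERMS.get(line.text.strip().lower())  # text match
        if canonical is None:
            continue
        following = ordered[index + 1].text if index + 1 < len(ordered) else ""
        if STREET_PATTERN.match(following.strip()):  # pattern check: address block
            return canonical
    return None


notice = [
    OcrLine("Automatic Data Processing", 0.05),
    OcrLine("1 Example Road", 0.08),
    OcrLine("Notice of Tax Due", 0.40),
]
print(find_company_name(notice))
```

Requiring all three signals to agree is one possible design; a real system might instead weight them or accept any one of them.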
- such a comparison is not necessary.
- the user can simply read the page and enter the term “Automatic Data Processing” or possibly “ADP” in field 116 , which is the “company name” to be entered into the unified database.
- the user can likewise fill out other fields in area 104 .
- sample answers are automatically generated and automatically copied into the relevant fields in area 104 .
- field 116 will be auto-populated with the term “Automatic Data Processing” or perhaps “ADP”.
- the remaining fields and button selections will likewise be auto-populated and auto-selected.
- a user will review the automatically supplied entries into the fields shown in area 104 .
- The user can then submit the entries, which are then transferred to the unified database for further processing and appropriate action.
- the user can make adjustments to the field entries and button selections prior to submission of the data.
- Submission is automatic, and a user is not required at all. In this case, all processing takes place out of sight of a user, with data automatically being input into the unified database.
- this particular illustrative embodiment is less useful the more heterogeneous the documents being processed. For example, when tax documents are received from a wide variety of companies in a wide variety of different formats, then the likelihood of errors in automatic population of the fields of interest increases. When the probability of such errors increases, adding a human reviewer to the process can increase the accuracy of the data transfer process.
- the illustrative embodiments provide an integrated technology for reviewing heterogeneous image documents for text and entering this text data appropriately into a unified database.
- the illustrative embodiments may auto populate fields in one illustrative embodiment, thereby substantially increasing the speed of such data processing.
- The illustrative embodiments enable computers to automatically enter information into a unified database from heterogeneous documents, thereby accomplishing a technical effect.
- Another technical effect of the illustrative embodiments is enabling an improved user interface for human users so that human users may more efficiently use a computer to accomplish desired data entry tasks.
- The illustrative embodiments are implemented solely in a computer, are intrinsically a part of the operation of a computer, and relate only to improving computer functionality and presentation. Thus, the illustrative embodiments cannot be accomplished by a human being, but rather only by a computer improved using the techniques described herein.
- FIG. 1 and FIG. 2 do not necessarily limit other illustrative embodiments or the claims. Many variations are possible, based on many different types of documents, fields of interest for data entry, or other enterprise goals.
- the illustrative embodiments may also be extended.
- the illustrative embodiments contemplate automatically processing multiple image documents simultaneously.
- The illustrative embodiments contemplate collating information for entry into fields which request information regarding, for example, how many times a given item is referenced across multiple documents.
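- Collating a reference count across several documents might look like the sketch below. The `count_references` helper, the document names, and the sample text are hypothetical. Case-insensitive substring counting is an assumption made for brevity.

```python
from collections import Counter


def count_references(documents, item):
    """Count case-insensitive occurrences of item in each document's text."""
    needle = item.lower()
    counts = Counter()
    for name, text in documents.items():
        counts[name] = text.lower().count(needle)
    return counts


# Hypothetical OCR text extracted from three image files.
documents = {
    "notice_a": "ADP received the order. ADP must respond.",
    "notice_b": "No reference here.",
    "notice_c": "Contact adp.",
}
counts = count_references(documents, "ADP")
total = sum(counts.values())
print(counts, total)
```

The total could then be placed in whichever field requests the cross-document count.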
- the illustrative embodiments also contemplate processing either homogenous or heterogeneous file types and formats, such as but not limited to .png, .pdf, .jpg, .jif, and many other file types.
- the illustrative embodiments are not necessarily limited to the examples given above.
- FIG. 3 illustrates a flowchart of a method for receiving image files and auto populating specific fields in a unified database, in accordance with an illustrative embodiment.
- Method 300 may be implemented using a data processing system, such as data processing system 400 of FIG. 4 .
- Method 300 is a variation of the methods described above with respect to FIG. 1 and FIG. 2 .
- Method 300 is only performable by a computer and accomplishes the technical effects described above with respect to FIG. 2 .
- Method 300 may be characterized as a method of enabling a computer to automatically enter information into a unified database from heterogeneous documents.
- Method 300 includes receiving, at a processor, an image file (operation 302 ). Method 300 also includes displaying, by the processor, the image file in a first area of a window rendered on a tangible display device (operation 304 ). Method 300 also includes displaying, by the processor, fields for data entry in a second area of the window (operation 306 ).
- Method 300 also includes performing, by the processor, optical character recognition on the image file (operation 308 ).
- Method 300 also includes identifying, by the processor, at least one parameter of text in the image file (operation 310 ).
- This parameter or parameters may take many different forms, as described above.
- a parameter may be the text itself for text matching, a location of the text in the image file, surrounding text for pattern recognition matching, pre-stored codes, words, or phrases, color used in the image file, image file type, and potentially many others.
- the purpose of the parameter or parameters is to enable the computer to recognize appropriate text from potentially many different heterogeneous image files for entry into one or more specific fields for ultimate entry into a unified database.
- Method 300 also includes comparing, by the processor, the at least one parameter of the text to at least one of a plurality of stored parameters (operation 312 ).
- Method 300 also includes sorting, by the processor, the text according to the at least one of the plurality of stored parameters into a plurality of categories, wherein sorted text is formed (operation 314 ). Sorting into a plurality of categories is specifically related to determining which alphanumeric text sequences should be applied to which fields in the second area of the display window. For example, the phrase “Automatic Data Processing” can be recognized as belonging to the category of “company name” and thus assigned to a field accordingly.
- method 300 also includes auto-populating and displaying, by the processor, the fields in the second area of the window based on the sorted text (operation 316 ).
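- Operations 310 through 316 can be illustrated with a short sketch that compares recognized text against stored parameters, sorts it into categories, and fills the matching fields. Everything named here is an assumption for illustration: the `STORED_PARAMETERS` patterns, the category names, and the helper functions are hypothetical, and the OCR step (operation 308) is represented by a hand-written list of text lines.

```python
import re

# Hypothetical stored parameters: one pattern per category.
STORED_PARAMETERS = {
    "company_name": re.compile(r"^(automatic data processing|adp)$", re.IGNORECASE),
    "notice_date": re.compile(r"^\d{2}/\d{2}/\d{4}$"),
    "amount_due": re.compile(r"^\$\d[\d,]*\.\d{2}$"),
}


def sort_text(ocr_lines):
    """Sort recognized lines into categories; the first match per category wins."""
    sorted_text = {}
    for line in ocr_lines:
        candidate = line.strip()
        for category, pattern in STORED_PARAMETERS.items():
            if category not in sorted_text and pattern.match(candidate):
                sorted_text[category] = candidate
                break
    return sorted_text


def auto_populate(fields, sorted_text):
    """Fill each field from the sorted text, leaving unmatched fields empty."""
    return {name: sorted_text.get(name, "") for name in fields}


ocr_lines = ["Automatic Data Processing", "Notice of Tax Due", "11/22/2017", "$1,250.00"]
fields = ["company_name", "notice_date", "amount_due", "employee_name"]
populated = auto_populate(fields, sort_text(ocr_lines))
print(populated)
```

Fields with no matching text are left empty so that a reviewer can see what still needs to be entered by hand.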
- Method 300 may be varied by including different operations, or by including additional operations, or by potentially using fewer operations. Some of these additional illustrative embodiments follow, and are shown in FIG. 3 as boxes surrounded by dotted lines to indicate that they are optional additional steps taken with respect to method 300.
- method 300 may also include submitting the fields as entries into a unified database (operation 318 ).
- method 300 may also include receiving, prior to submitting, user input from a user input device indicating that the fields are correct (operation 320 ).
- a user could possibly edit the entries in the fields prior to submission.
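- The optional submission flow (operations 318 and 320, with user edits applied before submission) might be sketched as below. The `submit_fields` helper and the list standing in for the unified database are assumptions made for illustration.

```python
def submit_fields(auto_populated, user_edits, confirmed, database):
    """Append the reviewed entry to the database only after user confirmation."""
    if not confirmed:
        return False
    entry = {**auto_populated, **user_edits}  # user edits override auto-populated values
    database.append(entry)
    return True


# A plain list stands in for the unified database in this sketch.
database = []
auto_populated = {"company_name": "ADP", "employee_name": "Jon Doe"}

# Without confirmation, nothing is submitted.
first_attempt = submit_fields(auto_populated, {}, False, database)

# The user corrects one field and confirms the entries.
second_attempt = submit_fields(auto_populated, {"employee_name": "John Doe"}, True, database)
print(first_attempt, second_attempt, database)
```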
- method 300 may also include automatically taking an action, based on the entries in the unified database, required by an order stated in a document from which the image file was made (operation 322 ).
- An example of such an action would be, responsive to receiving a court order, withholding wages from an employee's paycheck and paying the withheld wages to a designated payee.
- Another example would be to populate a paystub and transmit the paystub to an employee or others authorized to receive the paystub.
- Many different actions are possible, and such actions are not necessarily limited to a human resources context.
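- The wage withholding example can be sketched with simple arithmetic. This is a simplified assumption: the `apply_withholding` helper and the amounts are hypothetical, and any limits that an actual order or law would impose on the withheld amount are not modeled.

```python
from decimal import Decimal


def apply_withholding(gross_pay, ordered_amount):
    """Split gross pay into the employee's net pay and the payee's payment."""
    withheld = min(gross_pay, ordered_amount)  # never withhold more than is earned
    return {"net_pay": gross_pay - withheld, "payee_payment": withheld}


result = apply_withholding(Decimal("2000.00"), Decimal("350.00"))
print(result)
```

`Decimal` is used instead of floating point so that currency amounts are represented exactly.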
- the first area and the second area are displayed side by side in the window, whereby use of multiple display windows is avoided.
- the image comprises a plurality of images taken from a plurality of heterogeneous image files, and wherein auto-populating is performed for different sets of fields for each one of the plurality of heterogeneous image files.
- displaying the image file and displaying the fields is performed on a web browser of a local computer, and wherein receiving, performing, identifying, comparing, sorting, and auto-populating are performed by a remote server as software as a service.
- the computer-implemented method is performed on a single local computer.
- Data processing system 400 in FIG. 4 is an example of a data processing system that may be used to implement the illustrative embodiments, such as screenshot 100 of FIG. 1, screenshot 200 of FIG. 2, method 300 of FIG. 3, or any other module or system or process disclosed herein.
- data processing system 400 includes communications fabric 402 , which provides communications between processor unit 404 , memory 406 , persistent storage 408 , communications unit 410 , input/output (I/O) unit 412 , and display 414 .
- Processor unit 404 serves to execute instructions for software that may be loaded into memory 406 .
- This software may be an associative memory, content addressable memory, or software for implementing the processes described elsewhere herein.
- software loaded into memory 406 may be software for executing method 300 of FIG. 3 .
- Processor unit 404 may be a number of processors, a multi-processor core, or some other type of processor, depending on the particular implementation.
- a number, as used herein with reference to an item, means one or more items.
- processor unit 404 may be implemented using a number of heterogeneous processor systems in which a main processor is present with secondary processors on a single chip.
- processor unit 404 may be a symmetric multi-processor system containing multiple processors of the same type.
- Memory 406 and persistent storage 408 are examples of storage devices 416 .
- a storage device is any piece of hardware that is capable of storing information, such as, for example, without limitation, data, program code in functional form, and/or other suitable information either on a temporary basis and/or a permanent basis.
- Storage devices 416 may also be referred to as computer-readable storage devices in these examples.
- Memory 406 in these examples, may be, for example, a random-access memory or any other suitable volatile or non-volatile storage device.
- Persistent storage 408 may take various forms, depending on the particular implementation.
- persistent storage 408 may contain one or more components or devices.
- persistent storage 408 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above.
- the media used by persistent storage 408 also may be removable.
- a removable hard drive may be used for persistent storage 408 .
- Communications unit 410 in these examples, provides for communications with other data processing systems or devices.
- communications unit 410 is a network interface card.
- Communications unit 410 may provide communications through the use of either or both physical and wireless communications links.
- Input/output (I/O) unit 412 allows for input and output of data with other devices that may be connected to data processing system 400 .
- input/output (I/O) unit 412 may provide a connection for user input through a keyboard, a mouse, and/or some other suitable input device. Further, input/output (I/O) unit 412 may send output to a printer.
- Display 414 provides a mechanism to display information to a user.
- Instructions for the operating system, applications, and/or programs may be located in storage devices 416 , which are in communication with processor unit 404 through communications fabric 402 .
- the instructions are in a functional form on persistent storage 408 . These instructions may be loaded into memory 406 for execution by processor unit 404 .
- the processes of the different embodiments may be performed by processor unit 404 using computer implemented instructions, which may be located in a memory, such as memory 406 .
- These instructions are referred to as program code, computer-usable program code, or computer-readable program code that may be read and executed by a processor in processor unit 404.
- the program code in the different embodiments may be embodied on different physical or computer-readable storage media, such as memory 406 or persistent storage 408 .
- Program code 418 is located in a functional form on computer-readable media 420 that is selectively removable and may be loaded onto or transferred to data processing system 400 for execution by processor unit 404 .
- Program code 418 and computer-readable media 420 form computer program product 422 in these examples.
- computer-readable media 420 may be computer-readable storage media 424 or computer-readable signal media 426 .
- Computer-readable storage media 424 may include, for example, an optical or magnetic disk that is inserted or placed into a drive or other device that is part of persistent storage 408 for transfer onto a storage device, such as a hard drive, that is part of persistent storage 408 .
- Computer-readable storage media 424 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory, that is connected to data processing system 400 . In some instances, computer-readable storage media 424 may not be removable from data processing system 400 .
- program code 418 may be transferred to data processing system 400 using computer-readable signal media 426 .
- Computer-readable signal media 426 may be, for example, a propagated data signal containing program code 418 .
- Computer-readable signal media 426 may be an electromagnetic signal, an optical signal, and/or any other suitable type of signal. These signals may be transmitted over communications links, such as wireless communications links, optical fiber cable, coaxial cable, a wire, and/or any other suitable type of communications link.
- the communications link and/or the connection may be physical or wireless in the illustrative examples.
- program code 418 may be downloaded over a network to persistent storage 408 from another device or data processing system through computer-readable signal media 426 for use within data processing system 400 .
- program code stored in a computer-readable storage medium in a server data processing system may be downloaded over a network from the server to data processing system 400 .
- the data processing system providing program code 418 may be a server computer, a client computer, or some other device capable of storing and transmitting program code 418 .
- the different components illustrated for data processing system 400 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented.
- the different illustrative embodiments may be implemented in a data processing system including components in addition to or in place of those illustrated for data processing system 400 .
- Other components shown in FIG. 4 can be varied from the illustrative examples shown.
- the different embodiments may be implemented using any hardware device or system capable of running program code.
- the data processing system may include organic components integrated with inorganic components and/or may be comprised entirely of organic components excluding a human being.
- a storage device may be comprised of an organic semiconductor.
- processor unit 404 may take the form of a hardware unit that has circuits that are manufactured or configured for a particular use. This type of hardware may perform operations without needing program code to be loaded into a memory from a storage device to be configured to perform the operations.
- When processor unit 404 takes the form of a hardware unit, processor unit 404 may be a circuit system, an application specific integrated circuit (ASIC), a programmable logic device, or some other suitable type of hardware configured to perform a number of operations.
- With a programmable logic device, the device is configured to perform the number of operations. The device may be reconfigured at a later time or may be permanently configured to perform the number of operations.
- Examples of programmable logic devices include, for example, a programmable logic array, programmable array logic, a field programmable logic array, a field programmable gate array, and other suitable hardware devices.
- program code 418 may be omitted because the processes for the different embodiments are implemented in a hardware unit.
- processor unit 404 may be implemented using a combination of processors found in computers and hardware units.
- Processor unit 404 may have a number of hardware units and a number of processors that are configured to run program code 418 . With this depicted example, some of the processes may be implemented in the number of hardware units, while other processes may be implemented in the number of processors.
- A storage device in data processing system 400 is any hardware apparatus that may store data.
- Memory 406, persistent storage 408, and computer-readable media 420 are examples of storage devices in a tangible form.
- A bus system may be used to implement communications fabric 402 and may be comprised of one or more buses, such as a system bus or an input/output bus.
- The bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system.
- A communications unit may include one or more devices used to transmit and receive data, such as a modem or a network adapter.
- A memory may be, for example, memory 406, or a cache, such as found in an interface and memory controller hub that may be present in communications fabric 402.
- The different illustrative embodiments can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment containing both hardware and software elements.
- Some embodiments are implemented in software, which includes but is not limited to forms such as, for example, firmware, resident software, and microcode.
- A computer-usable or computer-readable medium can generally be any tangible apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- The computer-usable or computer-readable medium can be, for example, without limitation, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, or a propagation medium.
- Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, and an optical disk.
- Optical disks may include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W), and DVD.
- A computer-usable or computer-readable medium may contain or store a computer-readable or computer-usable program code such that when the computer-readable or computer-usable program code is executed on a computer, the execution of this computer-readable or computer-usable program code causes the computer to transmit another computer-readable or computer-usable program code over a communications link.
- This communications link may use a medium that is, for example without limitation, physical or wireless.
- A data processing system suitable for storing and/or executing computer-readable or computer-usable program code will include one or more processors coupled directly or indirectly to memory elements through a communications fabric, such as a system bus.
- The memory elements may include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some computer-readable or computer-usable program code to reduce the number of times code may be retrieved from bulk storage during execution of the code.
- I/O devices can be coupled to the system either directly or through intervening I/O controllers. These devices may include, for example, without limitation, keyboards, touch screen displays, and pointing devices. Different communications adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems and network adapters are non-limiting examples of the currently available types of communications adapters.
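The role of cache memories described above, holding program code temporarily so that it is retrieved from bulk storage fewer times, can be illustrated with a minimal sketch. This is not part of the patent disclosure: the `CodeCache` class, the `make_bulk_storage` helper, and the module names are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict


class CodeCache:
    """Hypothetical cache placed in front of a bulk storage reader."""

    def __init__(self, fetch_from_bulk: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_bulk       # assumed slow path to bulk storage
        self._cache: Dict[str, bytes] = {}  # temporary storage of program code
        self.bulk_reads = 0                 # number of retrievals from bulk storage

    def load(self, name: str) -> bytes:
        # Go to bulk storage only when the code is not already cached.
        if name not in self._cache:
            self.bulk_reads += 1
            self._cache[name] = self._fetch(name)
        return self._cache[name]


def make_bulk_storage() -> Callable[[str], bytes]:
    # Stand-in for bulk storage: maps hypothetical module names to code bytes.
    store = {"ocr_module": b"\x01\x02", "db_module": b"\x03"}
    return lambda name: store[name]


cache = CodeCache(make_bulk_storage())
for _ in range(3):
    cache.load("ocr_module")  # only the first call reaches bulk storage
cache.load("db_module")
print(cache.bulk_reads)
```

Four loads are requested here, but bulk storage is read only once per distinct module, which is the reduction in retrievals that the description attributes to cache memories.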
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- Character Discrimination (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/821,682 US10817656B2 (en) | 2017-11-22 | 2017-11-22 | Methods and devices for enabling computers to automatically enter information into a unified database from heterogeneous documents |
Publications (2)
Publication Number | Publication Date |
---|---|
US20190155887A1 (en) | 2019-05-23 |
US10817656B2 (en) | 2020-10-27 |
Family
ID=66533069
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
US15/821,682 | Active | US10817656B2 (en) | 2017-11-22 | 2017-11-22 | Methods and devices for enabling computers to automatically enter information into a unified database from heterogeneous documents |
Country Status (1)
Country | Link |
---|---|
US (1) | US10817656B2 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110555372A (en) * | 2019-07-22 | 2019-12-10 | 深圳壹账通智能科技有限公司 | Data entry method, device, equipment and storage medium |
US20220076208A1 (en) * | 2020-09-04 | 2022-03-10 | Scopeasy Construction Software Limited | Methods and systems for processing training records and documents of employees |
Citations (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030112270A1 (en) * | 2000-12-22 | 2003-06-19 | Merchant & Gould, P.C. | Litigation management system and method |
US20030179400A1 (en) * | 2002-03-22 | 2003-09-25 | Intellectual Property Resources, Inc. | Data capture during print process |
US20040181749A1 (en) * | 2003-01-29 | 2004-09-16 | Microsoft Corporation | Method and apparatus for populating electronic forms from scanned documents |
US6886136B1 (en) | 2000-05-05 | 2005-04-26 | International Business Machines Corporation | Automatic template and field definition in form processing |
US6898316B2 (en) * | 2001-11-09 | 2005-05-24 | Arcsoft, Inc. | Multiple image area detection in a digital image |
US6950553B1 (en) | 2000-03-23 | 2005-09-27 | Cardiff Software, Inc. | Method and system for searching form features for form identification |
US7069240B2 (en) * | 2002-10-21 | 2006-06-27 | Raphael Spero | System and method for capture, storage and processing of receipts and related data |
US7103198B2 (en) * | 2002-05-06 | 2006-09-05 | Newsoft Technology Corporation | Method for determining an adjacency relation |
US20070168382A1 (en) | 2006-01-03 | 2007-07-19 | Michael Tillberg | Document analysis system for integration of paper records into a searchable electronic database |
US20090044095A1 (en) * | 2007-08-06 | 2009-02-12 | Apple Inc. | Automatically populating and/or generating tables using data extracted from files |
US20090132605A1 (en) * | 2007-04-19 | 2009-05-21 | 2C Change A/S | Handling of data in a data sharing system |
US20090208103A1 (en) * | 2007-04-22 | 2009-08-20 | Bo-In Lin | Control of optical character recognition (OCR) processes to generate user controllable final output documents |
US7729928B2 (en) * | 2005-02-25 | 2010-06-01 | Virtual Radiologic Corporation | Multiple resource planning system |
US20100138343A1 (en) * | 2007-12-31 | 2010-06-03 | Bank Of America Corporation | Dynamic hold decisioning |
US7974877B2 (en) * | 2005-06-23 | 2011-07-05 | Microsoft Corporation | Sending and receiving electronic business cards |
US20120040717A1 (en) * | 2010-08-16 | 2012-02-16 | Veechi Corp | Mobile Data Gathering System and Method |
US20120166206A1 (en) * | 2010-12-23 | 2012-06-28 | Case Commons, Inc. | Method, computer readable medium, and apparatus for constructing a case management system |
US20140219583A1 (en) * | 2011-06-07 | 2014-08-07 | Amadeus S.A.S. | Personal information display system and associated method |
US20170109610A1 (en) * | 2013-03-13 | 2017-04-20 | Kofax, Inc. | Building classification and extraction models based on electronic forms |
US9753908B2 (en) * | 2007-11-05 | 2017-09-05 | The Neat Company, Inc. | Method and system for transferring data from a scanned document into a spreadsheet |
US10558880B2 (en) * | 2015-11-29 | 2020-02-11 | Vatbox, Ltd. | System and method for finding evidencing electronic documents based on unstructured data |
Patent Citations (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6950553B1 (en) | 2000-03-23 | 2005-09-27 | Cardiff Software, Inc. | Method and system for searching form features for form identification |
US6886136B1 (en) | 2000-05-05 | 2005-04-26 | International Business Machines Corporation | Automatic template and field definition in form processing |
US20030112270A1 (en) * | 2000-12-22 | 2003-06-19 | Merchant & Gould, P.C. | Litigation management system and method |
US6898316B2 (en) * | 2001-11-09 | 2005-05-24 | Arcsoft, Inc. | Multiple image area detection in a digital image |
US20030179400A1 (en) * | 2002-03-22 | 2003-09-25 | Intellectual Property Resources, Inc. | Data capture during print process |
US7103198B2 (en) * | 2002-05-06 | 2006-09-05 | Newsoft Technology Corporation | Method for determining an adjacency relation |
US7069240B2 (en) * | 2002-10-21 | 2006-06-27 | Raphael Spero | System and method for capture, storage and processing of receipts and related data |
US20040181749A1 (en) * | 2003-01-29 | 2004-09-16 | Microsoft Corporation | Method and apparatus for populating electronic forms from scanned documents |
US7305129B2 (en) | 2003-01-29 | 2007-12-04 | Microsoft Corporation | Methods and apparatus for populating electronic forms from scanned documents |
US7729928B2 (en) * | 2005-02-25 | 2010-06-01 | Virtual Radiologic Corporation | Multiple resource planning system |
US7974877B2 (en) * | 2005-06-23 | 2011-07-05 | Microsoft Corporation | Sending and receiving electronic business cards |
US20070168382A1 (en) | 2006-01-03 | 2007-07-19 | Michael Tillberg | Document analysis system for integration of paper records into a searchable electronic database |
US20090132605A1 (en) * | 2007-04-19 | 2009-05-21 | 2C Change A/S | Handling of data in a data sharing system |
US20090208103A1 (en) * | 2007-04-22 | 2009-08-20 | Bo-In Lin | Control of optical character recognition (OCR) processes to generate user controllable final output documents |
US20090044095A1 (en) * | 2007-08-06 | 2009-02-12 | Apple Inc. | Automatically populating and/or generating tables using data extracted from files |
US9753908B2 (en) * | 2007-11-05 | 2017-09-05 | The Neat Company, Inc. | Method and system for transferring data from a scanned document into a spreadsheet |
US20100138343A1 (en) * | 2007-12-31 | 2010-06-03 | Bank Of America Corporation | Dynamic hold decisioning |
US20120040717A1 (en) * | 2010-08-16 | 2012-02-16 | Veechi Corp | Mobile Data Gathering System and Method |
US20120166206A1 (en) * | 2010-12-23 | 2012-06-28 | Case Commons, Inc. | Method, computer readable medium, and apparatus for constructing a case management system |
US20140219583A1 (en) * | 2011-06-07 | 2014-08-07 | Amadeus S.A.S. | Personal information display system and associated method |
US20170109610A1 (en) * | 2013-03-13 | 2017-04-20 | Kofax, Inc. | Building classification and extraction models based on electronic forms |
US10558880B2 (en) * | 2015-11-29 | 2020-02-11 | Vatbox, Ltd. | System and method for finding evidencing electronic documents based on unstructured data |
Non-Patent Citations (1)
Title |
---|
Denoue et al., "FormCracker: Interactive Web-based Form Filling," DocEng2010, Sep. 21-24, 2010, Manchester, United Kingdom, 4 pages. |
Also Published As
Publication number | Publication date |
---|---|
US20190155887A1 (en) | 2019-05-23 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10546351B2 (en) | System and method for automatic generation of reports based on electronic documents | |
US10783367B2 (en) | System and method for data extraction and searching | |
US10354000B2 (en) | Feedback validation of electronically generated forms | |
US10366123B1 (en) | Template-free extraction of data from documents | |
US10013411B2 (en) | Automating data entry for fields in electronic documents | |
US11810070B2 (en) | Classifying digital documents in multi-document transactions based on embedded dates | |
US20110052075A1 (en) | Remote receipt analysis | |
US11625660B2 (en) | Machine learning for automatic extraction and workflow assignment of action items | |
US20110166934A1 (en) | Targeted advertising based on remote receipt analysis | |
US20150186739A1 (en) | Method and system of identifying an entity from a digital image of a physical text | |
US9256805B2 (en) | Method and system of identifying an entity from a digital image of a physical text | |
US20180011846A1 (en) | System and method for matching transaction electronic documents to evidencing electronic documents | |
US10679230B2 (en) | Associative memory-based project management system | |
CN111914729A (en) | Voucher association method and device, computer equipment and storage medium | |
US10817656B2 (en) | Methods and devices for enabling computers to automatically enter information into a unified database from heterogeneous documents | |
CN115809653A (en) | Intelligent contract auditing method and system | |
US10942963B1 (en) | Method and system for generating topic names for groups of terms | |
US20170148033A1 (en) | Preventing restricted trades using physical documents | |
US20170147978A1 (en) | Executing shipments based on physical trade documents | |
WO2017033200A1 (en) | Electronic sorting and classification of documents | |
CN115471228A (en) | Financial business certificate checking method, device, equipment and storage medium | |
CN115880703A (en) | Form data processing method and device, electronic equipment and storage medium | |
US11093899B2 (en) | Augmented reality document processing system and method | |
KR20200045041A (en) | Method for Managing Integration Welfare Support for the Low-income Independents | |
US20240143642A1 (en) | Document Matching Using Machine Learning |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: ADP, LLC, NEW JERSEY. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KUTTY, SANJAY;HONGGUO, AN;VINNAKOTA, SUBHASH C.;AND OTHERS;REEL/FRAME:044203/0108. Effective date: 20171121 |
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
FEPP | Fee payment procedure | Free format text: PETITION RELATED TO MAINTENANCE FEES GRANTED (ORIGINAL EVENT CODE: PTGR); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
FEPP | Fee payment procedure | Free format text: PETITION RELATED TO MAINTENANCE FEES GRANTED (ORIGINAL EVENT CODE: PTGR); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
AS | Assignment | Owner name: ADP, INC., NEW JERSEY. Free format text: CHANGE OF NAME;ASSIGNOR:ADP, LLC;REEL/FRAME:058959/0729. Effective date: 20200630 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |