EP0932873A2 - Method and apparatus for scanning and managing representations of documents

Method and apparatus for scanning and managing representations of documents

Info

Publication number
EP0932873A2
EP0932873A2 EP97913708A EP97913708A EP0932873A2 EP 0932873 A2 EP0932873 A2 EP 0932873A2 EP 97913708 A EP97913708 A EP 97913708A EP 97913708 A EP97913708 A EP 97913708A EP 0932873 A2 EP0932873 A2 EP 0932873A2
Authority
EP
European Patent Office
Prior art keywords
sub
dim
text
document
case
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP97913708A
Other languages
English (en)
French (fr)
Inventor
Michael L. Saltsman
Luke A. Spence
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tempest Software Inc
Original Assignee
Tempest Software Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/93: Document management systems
    • G06F16/40: Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data

Definitions

  • the present invention relates to a method and apparatus for a scanning and document management system using hardware and computer software technology. More particularly the present invention relates to the industry catering to users of document management systems and instruction of computers via digitized images.
  • the document management industry is constantly struggling with problems associated with efficient and cost-effective management of voluminous documents.
  • the industry lacks a cost-efficient computerized document management system capable of handling the input of a voluminous number of documents into a computer system.
  • one of the biggest stumbling blocks with the present technology is the inability to quickly and efficiently input a voluminous number of documents into an imaging database.
  • Current document management systems employ a complicated arrangement of user interfaces that require extensive training in order to adequately utilize the system. Databases with complicated user interfaces are very expensive to implement in an organization. This presents a cost barrier to an organization contemplating installation of such a system.
  • the present invention relates to a method and apparatus for a scanning and document management system using hardware and computer software technology. More particularly, the present invention relates to an efficient methodology for scanning, generating instructions to be performed by a computer based on unique images digitized into the computer, converting to editable and searchable text, organizing and separating in electronic storage, labeling, annotating, viewing, accessing, manipulating, searching and printing of documents.
  • a document consists of one or more pages or sheets containing text and/or graphic symbols. The pages or sheets comprising a document may be derived from any source, including scanned pages and captured video.
  • a document may be in color, in black and white or both.
  • the invention is useful for various commercial applications, for example in a law practice as a case litigation support imaging database that keeps track of a plurality of documents produced during litigation.
  • the present invention comprises a scanning and document management system with a mechanism for generating instructions to be performed by a computer based on unique images digitized into the computer, which preferably includes a scanning device electronically coupled to a software-enabled computer.
  • the enabling software for scanning, converting to editable and searchable text, organizing and separating in electronic storage, labeling, annotating, viewing, accessing, manipulating, searching, and printing of documents is resident on the computer.
  • the software module for controlling the scanning device may be loaded as part of the scanning device, or the scanning device and the computer may each be loaded with portions of the software module.
  • the method and apparatus for generating instructions to be performed by a computer based on unique images digitized into the computer is facilitated by a unique graphic pattern embodied on a physical medium, such as a sheet of paper or video tape.
  • the image is electronically digitized into a computer.
  • the image on a sheet of paper is digitized by scanning the sheet into the computer.
  • the image on a video tape may be digitized into the computer by processing the analog video signals through a commercially available hardware add-on board installed in the computer.
  • the add-on board contains electronic circuitry capable of converting analog video signals to digital video signals.
  • Commercially available presentation software permits viewing of the digital video on the computer monitor. Additionally, the software contains functionality to freeze a particular video frame to save as an individual digitized image.
  • This image may contain a unique graphic pattern designed to provide one or more instructions to the computer.
  • a special software module examines and interprets the digitized image and thus produces one or more computer instructions.
  • the computer executes the generated instructions to produce a desired result.
  • An example of a practical application of this invention is in the use of a physical document separator, containing a unique graphic pattern (image), to mark the beginning and end of each document in a large volume of documents scanned collectively into a computer as a single electronic file.
  • a software module is used to examine and recognize the document separator images and thus produce and store a separate electronic file for each individual document contained in the single larger electronic file containing a plurality of documents. This is particularly desirable in various commercial applications such as in a law practice case litigation support imaging database where large volumes of documents are scanned to be electronically managed and used.
  • Figure 1 is a block diagram of a computer system with attached accessories according to the present invention
  • Figure 2 is a flowchart depicting the Document Separator logic of the preferred embodiment of the present invention
  • Figure 3 is a screen shot showing the PC START screen
  • Figure 4 is a screen shot showing the Main Information Screen (MIS);
  • Figure 5 is a screen shot of Watermark Professional Editor;
  • Figure 6 shows the contents of the CDINFO.DAT file
  • Figure 7 is a screen shot of the Briefing Tool
  • Figure 8 is a screen shot of the Transcript Viewer
  • Figure 9 is a screen shot of the ISYS Query screen
  • Figure 10 is a screen shot of the ISYS Search screen
  • Figure 11 is a screen shot of the Word Wheel search screen
  • Figure 12 is a screen shot of the System Settings screen
  • Figure 13 is a screen shot of the Preferences screen
  • Figure 14 is a screen shot of the Report Names screen which enables running of reports
  • Figure 15 is a screen shot of the Report Names screen which enables naming of reports
  • Figure 16 is a screen shot of the Duplicates Screen
  • Figure 17 is a screen shot of the Exhibit List screen
  • Figure 18 is a screen shot of the Litigation Bates Number Label Maker screen
  • Figure 19 is a screen shot of the Litigation Document Number Label Maker screen
  • Figure 20 is a screen shot of the Tempest Image Printer screen
  • Figure 21 is a screen shot of Luke's Watermark Scan Utility screen
  • Figure 22 depicts the Document Separator of the preferred embodiment
  • Figure 23 is an example of a multi-page TIFF file
  • Figure 24 shows a page divided into sixty-four sections
  • Figure 25 shows further division of the sixty-four sections into four quadrants
  • Figure 26 is an overlay of the 64 sections and 4 quadrants on a document page
  • Figure 27 is an overlay of the 64 sections and 4 quadrants on the designated Document Separator
  • Figure 28 shows documents separated after execution of the Document Separator module
  • Figure 29 is a screen shot of the Watermark Exhibit Scan Utility; and
  • Figure 30 is a screen shot of Luke's Automated OCR'ing Utility.
  • software for the present invention is developed using Microsoft's Visual Basic programming language in 32-bit mode.
  • the software portion of the preferred embodiment uses Microsoft Jet as the database engine.
  • the computer system 1-100 generally comprises a video display system 1-110, a keyboard 1-120, and a mouse 1-130.
  • the computer system 1-100 also preferably includes various standard components, including at least one central processing unit (CPU), memory, a hard drive, a CD-ROM drive, a floppy disk drive, one or more buses, and a power supply.
  • the computer system 1-100 of the preferred embodiment includes a 200 MHz Pentium MMX CPU, 32 megabytes of random access memory (RAM), 4 gigabytes of hard disk space, a 24X CD-ROM drive, a 3.5-inch 1.4 megabyte floppy disk drive, a 17-inch monitor with 1024x768 resolution and a similarly equipped video card.
  • the software program is stored on a CD-ROM disk 1-400, floppy disks 1-500 and/or hard drive of the computer 1-100 for execution by the CPU.
  • the preferred embodiment of the present invention also includes a high-speed scanner 1-200, such as the commercially available Fujitsu scanner M3093GX, connected to the computer system 1-100.
  • the preferred Fujitsu scanner is rated with a scanning speed of 27 pages per minute and is capable of 200 to 400 DPI resolution.
  • any industry standard, i.e., TWAIN compliant, scanner with scanning speed of 24 to 30 pages per minute and 200 to 400 DPI resolution will satisfy the requirements of the preferred embodiment.
  • the preferred embodiment uses the optional sheet feeder 1-210 of the scanner 1-200 to facilitate high speed scanning of documents into the system.
  • the present invention preferably implements a document separator 1-300 placed at the end of each document to separately define the beginning and end of each document in a stack.
  • a high-speed laser printer 1-600 also preferably is attached to the computer system 1-100.
  • the preferred embodiment uses the foregoing components to practice the present invention, as described below. One skilled in the art will understand, however, that modifications or omissions may be made to the list of preferred components without departing from the principles of the present invention.
  • the preferred embodiment of the present invention uses software containing various programming modules.
  • the software portion of the preferred embodiment of the present invention is incorporated in its entirety and attached as the Appendix.
  • the PC START module functions as the main menu for the database management system.
  • the PC START module is an overlay program that acts as the master control program to manage a plurality of databases.
  • each document database contains information on a lawsuit or legal case, although the present invention has other applications.
  • the PC START screen provides the user with options to perform various functions on each case database. For example, the user can select functions such as Load Case 3-100, Create Case 3-200, Delete Case 3-300, Repair Case 3-400 and Compress Case 3-500.
  • the user can perform various system administration functions by selecting System Admin 3-600, produce labels by selecting Make Labels 3-700 or print documents by selecting Print Docs 3-800.
  • the Current Cases list-box 3-900 permits the user to select which case database(s) will be impacted by selection of one or more of the foregoing functions.
  • the Load Case module enables the user to load the selected case 3-900 into the computer system's memory 1-100. In order to load a case, the case must exist in the Current Cases list-box 3-900.
  • the Main Information Screen (MIS), Figure 4, is displayed upon selection of the Load Case 3-100 option from the PC START screen (Figure 3).
  • the MIS, Figure 4 contains the document image and document briefing information.
  • Document briefing is accomplished by the entry of summary text into various fields on the MIS such as "To" 4-110, "From" 4-120, "CC's" 4-130, "Description" 4-140 and "Comments" 4-150.
  • Each of the foregoing briefing fields can be searched while in the MIS, Figure 4, by simply double-clicking on the desired field to search.
  • Case Issues 4-200 on the MIS, Figure 4 enable the user to designate special characteristics of the currently displayed document in document thumbnail 4-300. The special characteristics are designated by placing a checkmark in the box provided to the right of each Case Issue 4-200.
  • the use of Case Issues enables a novice user to produce sophisticated reports without any programming on the part of the user. For example, the user can easily produce a report containing all documents in the case database which involve expert testimony. Pages 15 to 17 of the Appendix show the programming logic to produce the reports based on case issues defined and selected by the user.
  • Case Issues 4-200 are user-defined and thus are not predetermined by the system. The user defines case issues relevant to a case database in the Preferences screen, Figure 13.
  • the Preferences screen, Figure 13 is selected by clicking on the Settings button 12-200 of the System Settings screen, Figure 12.
  • the user performs a one-time setup of chosen case issues by entering a case issue in each field of the data entry row 13-100.
  • the case issues entered in the data entry row 13-100 are relationally connected to the Case Issues 4-200 on the MIS, Figure 4, the report screens, Figures 14 and 15, and all associated program code and tables.
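The Case Issues mechanism described above can be pictured as a set of user-defined flags on each document record, with a report being a filter over those flags. The following is only a minimal sketch under that assumption; the names `DocumentRecord` and `issue_report` and the sample data are hypothetical and are not taken from the Appendix.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DocumentRecord:
    """One briefed document; `issues` holds the checked Case Issues."""
    doc_number: str
    description: str
    issues: Set[str] = field(default_factory=set)


def issue_report(records: List[DocumentRecord], issue: str) -> List[str]:
    """Return the document numbers of all records flagged with `issue`."""
    return sorted(r.doc_number for r in records if issue in r.issues)


# Hypothetical sample data for illustration only.
records = [
    DocumentRecord("D0001", "Letter to counsel", {"Expert Testimony"}),
    DocumentRecord("D0002", "Invoice", {"Damages"}),
    DocumentRecord("D0003", "Deposition exhibit", {"Expert Testimony", "Damages"}),
]
print(issue_report(records, "Expert Testimony"))  # ['D0001', 'D0003']
```

Because the issue names are plain data rather than code, a user can define new issues without any programming, which is the property the text emphasizes.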
  • the Preferences screen, Figure 13, additionally permits the user to determine system folders 13-200 in which files related to a particular case database will reside on the computer system 1-100.
  • the document thumbnail 4-300 will be displayed in an individual window if the user double-clicks on the document thumbnail 4-300 using the mouse 1-130 or selects the Go To button 4-410.
  • Any commercially available image-viewing program may be used to display the thumbnail document 4-300.
  • the preferred embodiment of the present invention uses the commercially available Watermark Professional Edition image viewer. While in the image viewer, Figure 5, the document display size may be changed to suit the user's needs.
  • the user can print the document, annotate the document using text or audio annotations and highlight significant portions of the document.
  • the system is capable of automatically determining the location of a user-requested document in a multi-volume CD-ROM document image database. This feature is facilitated by the system's ability to automatically keep track of the location of all documents as they are scanned into the system, even if more than one CD-ROM is required to store the scanned document images.
  • This functionality is useful in circumstances where the user's computer system has a single CD-ROM drive. The functionality is particularly helpful on laptop computers, which customarily have a single CD-ROM drive.
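One way to picture the multi-volume lookup is an index that records, at scan time, which volume holds each document, and that later tells the user which disc to insert. This is a hedged sketch only: the class `VolumeIndex`, its methods, and the volume labels are hypothetical, and the actual layout of the CDINFO.DAT file shown in Figure 6 is not assumed here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VolumeIndex:
    """Maps each document number to the CD-ROM volume that stores its image."""
    locations: Dict[str, str] = field(default_factory=dict)

    def record(self, doc_number: str, volume_label: str) -> None:
        # Called as each document is scanned and written to a volume.
        self.locations[doc_number] = volume_label

    def prompt_for(self, doc_number: str, mounted_volume: str) -> Optional[str]:
        """Return a disc-swap prompt, or None if the right volume is mounted."""
        needed = self.locations[doc_number]
        if needed == mounted_volume:
            return None
        return f"Please insert volume {needed} to view document {doc_number}."


index = VolumeIndex()
index.record("D0001", "CASE_VOL1")
index.record("D0950", "CASE_VOL2")
print(index.prompt_for("D0950", mounted_volume="CASE_VOL1"))
```

On a machine with a single CD-ROM drive, the prompt is what spares the user from guessing which disc holds the requested image.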
  • the preferred embodiment maintains two different versions of each document in each case database: (1) the scanned image and (2) the full-text OCR version.
  • the scanned image version is displayed in thumbnail mode 4-300 and in the image viewer program.
  • the scanned image version is the more complete and accurate version of a document because it is essentially an identical image of the document as it was scanned into the system. While the scanned image is ideal for viewing an identical copy of the original document, the scanned image lacks the capability to be searched for text.
  • the physical limitation of the scanned image version necessitates a full-text version of the document.
  • the full-text version is obtained through the use of optical character recognition (OCR) software to convert the scanned image into a text-only file (see discussion below on the OCR process).
  • the Briefing Tool, Figure 7 of the preferred embodiment enables the user to view the scanned image of the document thumbnail 4-300 by clicking on the Brief button 4-420 on the MIS, Figure 4.
  • the Briefing Tool, Figure 7, differs from the image viewer, Figure 5, in that the Briefing Tool displays the document image at all times while it is open. It is desirable to have the functionality of an image viewer that remains on top of all other windows at all times because it permits the user to have both the MIS, Figure 4, and Briefing Tool, Figure 7, up on the monitor at the same time to facilitate efficient briefing of the document in the MIS. This functionality facilitates multiple users briefing the same case database at the same time without the need to have actual physical document copies accessible to each briefer. Additionally, the Briefing Tool reduces the likelihood of lost physical document copies. As evident from the programming source code on page 26 of the Appendix, the Briefing Tool, Figure 7, of the preferred embodiment is written in TMS/Sequia OCX technology.
  • the Transcript Viewer is an ASCII transcript viewer. It displays an entire page of a deposition proceeding at a time.
  • the Transcript Viewer is initiated by clicking on the Transcr button 4-440 on the MIS, Figure 4.
  • the Transcript Viewer has the capability to automatically recognize various transcript types, such as different transcript types received from court reporters, and to automatically perform all necessary formatting and loading into the system. This feature is particularly desirable because state of the art systems lack this feature and thereby require significant labor to format transcripts prior to use on other state of the art document management systems.
  • the Transcript Viewer has a Find Word 8-100 feature, which enables the user to search transcripts for particular text strings.
  • A further feature of the Transcript Viewer in this preferred embodiment is the ability to use it in conjunction with any word processing software, such as Microsoft Word or Corel's WordPerfect.
  • the combination use enables the user to highlight any selected portion of a transcript, then click on the Copy button 8-200 in the Transcript Viewer.
  • the copied information can then be pasted into the word processor.
  • the Copy 8-200 function accomplishes much more than the average cut and paste routine.
  • the Copy 8-200 function provides additional valuable information relating to the copied transcript text.
  • the Copy 8-200 function automatically generates a document reference including information such as the page and line number of the copied text and the name of the transcript. This automatic functionality facilitates faster deposition summaries and easier incorporation of deposition quotes in litigation pleadings.
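The Copy behavior described above, copied text plus an automatically generated page and line reference, can be sketched as follows. This is an illustration under assumptions: the function `cite_selection`, the tuple layout of transcript lines, and the `page:line` citation format are hypothetical choices, not the format used by the preferred embodiment.

```python
from typing import List, Tuple


def cite_selection(transcript_name: str,
                   lines: List[Tuple[int, int, str]],
                   start: int, end: int) -> str:
    """Build quoted text plus a page:line reference for lines[start..end].

    Each entry in `lines` is (page_number, line_number, text).
    """
    selected = lines[start:end + 1]
    if not selected:
        raise ValueError("empty selection")
    first_page, first_line, _ = selected[0]
    last_page, last_line, _ = selected[-1]
    quote = " ".join(text.strip() for _, _, text in selected)
    ref = f"{transcript_name}, {first_page}:{first_line}-{last_page}:{last_line}"
    return f'"{quote}" ({ref})'


# Hypothetical transcript fragment for illustration only.
transcript = [
    (12, 24, "Q. Did you sign it?"),
    (12, 25, "A. Yes,"),
    (13, 1, "I did."),
]
print(cite_selection("Smith Dep.", transcript, 1, 2))
# "A. Yes, I did." (Smith Dep., 12:25-13:1)
```

Pasting the returned string into a word processor gives the quote and its source in one step, which is the time saving the text attributes to the Copy function.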
  • the Transcript Viewer allows condensed printing of transcripts - up to four pages of transcripts on a single physical printed page.
  • the ISYS Query screen, Figure 9 of the preferred embodiment enables the user to perform full-text searches on the OCR version of the documents in the case database.
  • the ISYS Query screen, Figure 9 is initiated by clicking on the FullText button 4-470 on the MIS, Figure 4.
  • the preferred embodiment of the present invention uses a commercially available search engine called ISYS, produced by Odyssey Development Corporation, for performing the full-text searches.
  • the third party ISYS Search screen, Figure 10 is initiated by clicking on the Q button 9-100.
  • the user enters queries in the query field 10-100 using search connectors 10-200 as needed.
  • the user may choose to use the Word Wheel icon 10-300 to search for the number of occurrences of any given specific word.
  • the Word Wheel Search screen, Figure 11 is initiated when the user clicks on the Word Wheel icon 10-300 on the ISYS Search screen, Figure 10.
  • the user enters the specific word to search for in the query field 11-100 in Figure 11.
  • the user must select one of the search methods 11-200.
  • the self-explanatory choices are "Starts with" and "Sounds like."
  • the result of the single-word Word Wheel search is displayed in the result list-box 11-300.
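To make the two search methods concrete, the sketch below counts word occurrences and filters them by prefix or by phonetic code. It is an assumption that "Sounds like" behaves like the classic American Soundex code; the text does not say which phonetic algorithm ISYS uses, and the names `soundex` and `word_wheel` are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Classic Soundex letter groups (assumed here; ISYS may use something else).
_CODES = {}
for _digit, _letters in (("1", "BFPV"), ("2", "CGJKQSXZ"), ("3", "DT"),
                         ("4", "L"), ("5", "MN"), ("6", "R")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Return the four-character American Soundex code for `word`."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def word_wheel(texts: Iterable[str], query: str,
               method: str = "starts_with") -> List[Tuple[str, int]]:
    """Return (word, occurrence count) pairs matching `query`."""
    counts = Counter(w for t in texts for w in re.findall(r"[a-z]+", t.lower()))
    if method == "starts_with":
        prefix = query.lower()
        matches = [w for w in counts if w.startswith(prefix)]
    elif method == "sounds_like":
        target = soundex(query)
        matches = [w for w in counts if soundex(w) == target]
    else:
        raise ValueError(f"unknown search method: {method}")
    return sorted((w, counts[w]) for w in matches)


docs = ["Robert signed the contract.",
        "Rupert reviewed the contract and the contracts annex."]
print(word_wheel(docs, "contr"))
print(word_wheel(docs, "Robert", method="sounds_like"))
```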
  • the Add button 4-450 and the Delete button 4-460 on the MIS, Figure 4 of the preferred embodiment enable the user to perform standard add and delete processing of case database records.
  • the Reports button 4-480 on the MIS, Figure 4 enables the user to print reports as defined by the user's System Settings, Figure 12, Reports button 12-300 and Rpt Names button 12-400.
  • Report Screen, Figure 15 enables the user to name reports.
  • Report Screen, Figure 14, enables the user to run reports.
  • the Print Ser button 4-490 on the MIS, Figure 4 provides the functionality to produce an image of an index card that looks exactly like the MIS, Figure 4.
  • the Duplicates button 12-500 on the System Settings screen, Figure 12 performs a check for duplicate records in the case database.
  • the result of the duplicates check is displayed on the Duplicates Screen, Figure 16.
  • the Duplicates Screen is a grid type screen that includes an entry for every document in the database.
  • the duplicates checking routine performs approximately 14 separate checks on document numbers.
  • Duplicate records are grouped together based on document numbers. Duplicate records are highlighted and deleted by pressing the delete key.
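The grouping step can be sketched as bucketing records by a normalized document number. The text mentions approximately 14 separate checks without listing them, so the single normalization rule below (ignore case, spaces, and hyphens) is purely a hypothetical stand-in, as are the function names.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize(doc_number: str) -> str:
    """Hypothetical normalization: ignore case, whitespace, and hyphens."""
    return re.sub(r"[\s\-]", "", doc_number).upper()


def find_duplicates(doc_numbers: Iterable[str]) -> Dict[str, List[str]]:
    """Group document numbers that normalize to the same key.

    Only groups with more than one member are returned, since those are
    the candidate duplicates a user would review and delete.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for number in doc_numbers:
        groups[normalize(number)].append(number)
    return {key: members for key, members in groups.items() if len(members) > 1}


print(find_duplicates(["ABC-001", "abc 001", "ABC-002"]))
# {'ABC001': ['ABC-001', 'abc 001']}
```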
  • the Duplicates Screen has the added functionality of providing the database administrator the capability to perform mass maintenance functions on all records in the database. For example, the administrator can globally make changes to all data in a case database.
  • Various reports can also be produced using the Duplicates Screen.
  • The reporting capabilities found here, i.e., Figure 16, are separate and distinct from other reporting features of the system. All appearance changes on this screen, Figure 16, such as hiding various columns for printing purposes, are temporary, while all data content changes, such as global find and replace, are permanent.
  • the Exhibit List button 12-600 on the System Settings screen, Figure 12 displays the Exhibit List screen, Figure 17.
  • the Exhibit List screen, Figure 17 allows the user to modify parts of the data for use in an automatic exhibit list which can be printed from the Report Screens, Figures 14 and 15.
  • the preferred embodiment of the present invention provides the user with the option of selecting the user's choice of full-text search engine, Figure 9.
  • the system of the preferred embodiment is designed to automatically detect what particular component is installed on the user's computer system 1-100 and to automatically develop necessary links to enable use of the installed components as the user's full-text search engine.
  • the Create Case module permits the user to add a new case database.
  • a case database must be added to the computer system 1-100 before any of the database actions listed in Figure 3 can be performed on the documents and the associated information which comprise a case database.
  • a series of dialogue boxes step the user through the process of creating a new database. The procedure for adding a new case database is initiated by clicking on the Create Case button 3-200 on the PC START screen, Figure 3.
  • the Delete Case module permits the user to delete a case database from the computer system 1-100. In order to delete a case, the case must exist in the Current Cases list-box 3-900 and must be selected by the user.
  • the delete case function is password protected to prevent unauthorized deletion of case databases.
  • a series of dialogue boxes step the user through the process of deleting a database. The procedure for deleting a case database is initiated by clicking on the Delete Case button 3-300 on the PC START screen, Figure 3.
  • the Repair Case module permits the user to perform various repair functions on a case database residing on the computer system 1-100.
  • In order to repair a case, the case must exist in the Current Cases list-box 3-900 and must be selected by the user.
  • the Repair Case module corrects technical problems with the case database such as damaged or defective files. It is critical to repair a case when a damaged or defective file prevents the system from functioning as intended.
  • a series of dialogue boxes step the user through the process of repairing a database. The procedure for repairing a case database is initiated by clicking on the Repair Case button 3-400 on the PC START screen, Figure 3.
  • the Compress Case module permits the user to perform various maintenance functions on a case database residing on the computer system 1-100.
  • In order to compress a case, the case must exist in the Current Cases list-box 3-900 and must be selected by the user.
  • the Compress Case module performs tasks such as reorganization and defragmentation of the selected case database.
  • case databases are added and deleted as needed. However, the system does not physically delete a case database and its associated files until the case database is compressed. Thus, it is desirable to compress a case to free up storage space occupied by logically deleted files.
  • Compressing a case improves overall performance of the system and allows faster processing of information contained in the case database.
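The distinction between logical deletion and physical removal on compression can be illustrated with a small in-memory model. This is only a sketch of the general technique; the class `CaseStore` and its methods are hypothetical and say nothing about how Microsoft Jet implements compaction.

```python
from typing import List


class CaseStore:
    """Toy store where delete only marks a record and compress removes it."""

    def __init__(self) -> None:
        self._records: List[list] = []  # each entry is [name, deleted_flag]

    def add(self, name: str) -> None:
        self._records.append([name, False])

    def delete(self, name: str) -> None:
        # Logical delete: the record stays in storage but is hidden.
        for record in self._records:
            if record[0] == name:
                record[1] = True

    def visible(self) -> List[str]:
        return [name for name, deleted in self._records if not deleted]

    def physical_count(self) -> int:
        return len(self._records)

    def compress(self) -> int:
        """Physically drop logically deleted records; return how many were freed."""
        before = len(self._records)
        self._records = [r for r in self._records if not r[1]]
        return before - len(self._records)


store = CaseStore()
for name in ("Case A", "Case B", "Case C"):
    store.add(name)
store.delete("Case B")
print(store.visible(), store.physical_count())  # ['Case A', 'Case C'] 3
print(store.compress(), store.physical_count())  # 1 2
```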
  • a series of dialogue boxes step the user through the process of compressing a database. The procedure for compressing a case database is initiated by clicking on the Compress Case button 3-500 on the PC START screen, Figure 3.
  • the Make Labels module allows the user to print litigation label numbers based on either the Bates numbers stamped on documents produced during litigation or the case database system-specific Document Number.
  • Figure 18 demonstrates the preferred embodiment's form for the Bates number based label maker.
  • Figure 19 demonstrates the preferred embodiment's form for the Document number based label maker.
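Generating a run of sequential labels is simple to sketch. The prefix, zero-padding width, and function name below are hypothetical; the forms in Figures 18 and 19 may expose different options.

```python
from typing import List


def bates_labels(prefix: str, start: int, count: int, width: int = 6) -> List[str]:
    """Return `count` sequential labels such as TSI000001, TSI000002, ..."""
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    return [f"{prefix}{number:0{width}d}" for number in range(start, start + count)]


print(bates_labels("TSI", 1, 3))  # ['TSI000001', 'TSI000002', 'TSI000003']
```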
  • the Print Docs module enables the user to perform high-speed document production.
  • the module contains functionality which acts as a batch print utility for delayed or scheduled printing of scanned image files on the high-speed laser printer 1-600.
  • the batch print capability of the Print Docs module is very useful as a cost-effective method for automated and scheduled high-speed document production.
  • the Tempest Image Printer screen enables the user to select one or more files to be printed. By simply clicking on check boxes, the system is capable of batching all files for deferred printing and production of selected documents on high-speed network printers. This feature eliminates costs associated with manual production of documents by clerks - a process that requires clerks to manually locate and copy selected documents for production.
  • the preferred embodiment of the present invention uses the commercially available Hewlett Packard 5Si model with its standard memory upgraded to 32 megabytes. The upgraded 5Si reduces the processing load on the resources of the computer system 1-100 and effectively improves overall system performance.
  • the present invention envisions a system whereby the batch print functionality may be easily expanded by one skilled in the art to enable multiple printers to simultaneously share the burden of printing high-volume batched documents. The accelerated print management produces significant time and labor savings.
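One plausible way to let several printers share a batch is to hand each job to the printer with the fewest pages queued so far. The text does not specify a scheduling policy, so the greedy least-loaded strategy below, along with the function and printer names, is an assumption for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def distribute(jobs: List[Tuple[str, int]], printers: List[str]) -> Dict[str, List[str]]:
    """Assign (filename, page_count) jobs to printers, largest jobs first.

    Each job goes to whichever printer currently has the fewest queued pages.
    """
    if not printers:
        raise ValueError("at least one printer is required")
    heap = [(0, name) for name in printers]  # (queued pages, printer name)
    heapq.heapify(heap)
    plan: Dict[str, List[str]] = {name: [] for name in printers}
    for filename, pages in sorted(jobs, key=lambda job: -job[1]):
        load, name = heapq.heappop(heap)
        plan[name].append(filename)
        heapq.heappush(heap, (load + pages, name))
    return plan


batch = [("a.tif", 10), ("b.tif", 40), ("c.tif", 30), ("d.tif", 20)]
print(distribute(batch, ["P1", "P2"]))
# {'P1': ['b.tif', 'a.tif'], 'P2': ['c.tif', 'd.tif']}
```

In this example both printers end up with 50 pages, so neither sits idle while the other finishes.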
  • Luke's Watermark Scan Utility module is the mechanism of the preferred embodiment of the present invention by which all documents that belong to a case database are scanned into the computer system 1-100. The documents are scanned into the computer system 1-100 using the attached high-speed scanner 1-200. Luke's Watermark Scan Utility screen, Figure 21, is displayed when the scanning utility module is initiated.
  • the scanning utility of the preferred embodiment will work with any industry standard, i.e., TWAIN compliant, scanner. However, the utility can be modified to work with specific high-speed scanners.
  • the scan utility of the preferred embodiment includes special driver software for the preferred Fujitsu high-speed scanner. Scanner setup is accomplished by means of clicking on the Setup button 21-100 in Figure 21.
  • the scan utility automatically links the scanned image into the case database and gives it a document number and bates label number.
  • the automatic linking and setup of the scanned images into the case database is highly desirable because it saves significant manual effort that would otherwise be required.
  • the Scan button 21-200 initiates scanning of as many documents as are in the scanner's 1-200 sheet feeder 1-210.
  • the Single button 21-300 initiates scanning of only a single page of a document.
  • the Scan90 button 21-400 makes a 90-degree orientation adjustment to a document as it is scanned into the system. This eliminates the need for the user to adjust the scanned image when viewing it.
  • the Save button 21-500 permanently stores scanner settings.
  • the scan utility keeps a running list of all scanned images that have not yet been processed by the OCR routine, i.e., converted to the full-text version of the scanned image. This is desirable because the OCR list generated by the scan utility is used in the OCR process, so the user does not have to manually determine which scanned images still need OCR processing.
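The two bookkeeping duties of the scan utility, assigning numbers to each scanned image and remembering which images still await OCR, can be sketched together. Everything named here (`ScanRegistry`, the `DOC` prefix, the five-digit numbering) is hypothetical and chosen only to illustrate the idea.

```python
from collections import deque
from typing import Deque, Dict, Optional


class ScanRegistry:
    """Links scanned images to document numbers and tracks pending OCR work."""

    def __init__(self, prefix: str = "DOC") -> None:
        self.prefix = prefix
        self.next_number = 1
        self.documents: Dict[str, str] = {}   # document number -> image path
        self.pending_ocr: Deque[str] = deque()

    def register(self, image_path: str) -> str:
        """Assign the next document number and queue the image for OCR."""
        doc_number = f"{self.prefix}{self.next_number:05d}"
        self.next_number += 1
        self.documents[doc_number] = image_path
        self.pending_ocr.append(doc_number)
        return doc_number

    def next_for_ocr(self) -> Optional[str]:
        """Pop the oldest document still awaiting OCR, or None if none remain."""
        return self.pending_ocr.popleft() if self.pending_ocr else None


registry = ScanRegistry()
registry.register("scan_0001.tif")
registry.register("scan_0002.tif")
print(registry.next_for_ocr(), list(registry.pending_ocr))  # DOC00001 ['DOC00002']
```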
  • the Document Separator module is designed to facilitate efficient scanning of large volumes of documents in a continuous stream.
  • Documents are typically scanned one document at a time in order to signal to the system the end of one document and the start of another. Scanning one document at a time is a laborious and time-consuming manual procedure. It is desirable to perform continuous scanning of documents because it eliminates the need to manually signal the end of each document (containing one or more physical pages) and the start of the next document.
  • continuous scanning of documents is achieved by placing a physical document separator between each of the documents before scanning.
  • the physical document separator, Figure 22 is a physical sheet of paper that contains a predetermined unique image pattern.
  • Figure 23 shows an example of a continuous stream of documents separated by the designated uniquely patterned document separator.
  • the document separator module examines and processes each multi-page TIFF file to produce individual documents and saves each such document as a separate electronic file.
  • the logic of Document Separator is described in the flowchart in Figure 2. Specifically, the process begins by loading and displaying each scanned image on the computer screen. The width and height of the scanned sheet is determined and the sheet is divided into sixty-four (64) sections, Figure 24. The 64 sections are further divided into four (4) quadrants, Figure 25.
  • Figure 26 demonstrates overlay of the 64 sections and 4 quadrants on a page of a document.
  • Figure 27 demonstrates overlay of the 64 sections and 4 quadrants on a page that is the designated document separator with the predetermined unique graphic pattern. Thereafter, the software module takes one thousand (1000) color samples, or pixel values, from each quadrant. Each color sample is compared against the predetermined unique graphic pattern of the designated document separator (the Document Separator template) to determine whether the scanned page is the separator. If the scanned page does not match the template identically or substantially, it is marked as part of the ongoing document and the process repeats for the next page. If the page is the designated document separator, the end of the current document is indicated and all scanned pages preceding the separator are saved as a single document in a separate electronic file.
  • Figure 28 displays the end result of the foregoing process for the example multi-page TIFF file shown in Figure 23.
  • At the completion of the document separator routine, four documents (Figure 28) have been extracted from the single multi-page TIFF file in Figure 23. This process is repeated until all documents have been separated in a similar manner.
  • The utility of the present invention lies in part in the time saved by continuously scanning large volumes of documents using a high-speed scanner.
  • The Watermark Exhibit Scan Utility module is an automatic trial exhibit maker. It allows exhibit descriptions to be added as the documents are being scanned.
  • Luke's Automated OCR'ing Utility module enables OCR software, such as OmniPage Pro, to run in batch mode at scheduled times. It corrects errors and loads the assigned documents into the OCR program.
  • NewCaseName = Left$(NewCaseName, 8)
  • ChrName = Mid$(NewCaseName, Y, 1)
  • Select Case UCase(ChrName)
  • CurDrv = Left(CurDir, 2) ' Get current drive letter.
  • TmpPath = UCase(HomeDir & "\" & NewCaseName) ' Make path specification.
  • TmpPath = UCase(HomeDir & "\" & NewCaseName & "\IMAGES")
  • TmpPath = UCase(HomeDir & "\" & NewCaseName & "\DOCS")
  • TmpPath = UCase(HomeDir & "\" & NewCaseName & "\DEPOS")
  • CurDrv = Left(CurDir, 2) ' Get current drive letter.
  • NewCaseName = Left$(NewCaseName, 8)
  • ChrName = Mid$(NewCaseName, Y, 1)
  • Select Case UCase(ChrName)
  • LoadAccess = "msarn200.exe " & Database & " /ini pchaser.ini"
  • X = Shell(LoadAccess, 1)
  • Rem ProgramPath.DataField = "DbPath"
  • Rem MsgBox gs_FileName$
  • Rem REMOVE THIS HARD CODE
  • LoadAccess = "C:\PCHASER\msarn200.exe " & " C:\PCHASER\" & gs_FileName$ & " /ini pchaser.ini"
  • Data2.DatabaseName = gs_FileName$
  • LoadAccess = "msarn200.exe " & ProgramPath.Caption & " /ini pchaser.ini"
  • ReportFileName$ = "RPT20.RPT"
  • ReportFileName$ = List1.Text & ".rpt"
  • End Select
  • CrystalReport1.DataFiles(0) = App.Path & "\" & gs_FileName$
  • CrystalReport1.ReportFileName = App.Path & "\" & ReportFileName$
  • MyCriteria = gs_FieldName$ & " like " & "'*" & txtSearch.Text & "*'"
  • Me.Left = (Screen.Width - Me.Width) / 2
  • MyCriteria = gs_FieldName$ & " like " & "'*" & txtSearch.Text & "*'"
  • frmImageViewer.vdVBX.Visible = False
  • vdVBX.Width = frmImageViewer.ScaleWidth
  • vdVBX.Height = frmImageViewer.ScaleHeight - vdVBX.Top
  • FontSize = 18
  • frmIntro.CurrentX = 72: frmIntro.CurrentY = 202
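
The separator test described above (four quadrants, one thousand samples per quadrant, comparison against the Document Separator template) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the application. Pages are modeled as grids of grayscale pixel values at least 2 x 2 in size, samples are taken at evenly spaced positions, and the names quadrant_samples, is_separator, split_documents, PIXEL_TOLERANCE and MATCH_THRESHOLD are hypothetical. The text does not say how sample positions are chosen or how close a "substantial" match must be, so the tolerance and the 95% threshold are placeholders. The 64-section grid is not modeled.

```python
from typing import List, Sequence

# Assumed representation: a page is a list of rows of grayscale values (0-255).
Page = Sequence[Sequence[int]]

SAMPLES_PER_QUADRANT = 1000  # figure taken from the description
PIXEL_TOLERANCE = 16         # assumption: allowed difference per sample
MATCH_THRESHOLD = 0.95       # assumption: share of samples that must match


def quadrant_samples(page: Page, n: int = SAMPLES_PER_QUADRANT) -> List[List[int]]:
    """Return n pixel samples from each of the four quadrants of a page."""
    height, width = len(page), len(page[0])
    half_h, half_w = height // 2, width // 2
    bounds = [
        (0, half_h, 0, half_w),           # top-left
        (0, half_h, half_w, width),       # top-right
        (half_h, height, 0, half_w),      # bottom-left
        (half_h, height, half_w, width),  # bottom-right
    ]
    samples = []
    for top, bottom, left, right in bounds:
        q_w = right - left
        total = (bottom - top) * q_w
        quadrant = []
        for i in range(n):
            # Evenly spaced positions in row-major order (assumed strategy).
            idx = (i * total) // n
            quadrant.append(page[top + idx // q_w][left + idx % q_w])
        samples.append(quadrant)
    return samples


def is_separator(page: Page, template: Page) -> bool:
    """True if every quadrant of the page substantially matches the template."""
    for p_samples, t_samples in zip(quadrant_samples(page), quadrant_samples(template)):
        hits = sum(abs(p - t) <= PIXEL_TOLERANCE for p, t in zip(p_samples, t_samples))
        if hits / len(t_samples) < MATCH_THRESHOLD:
            return False
    return True


def split_documents(pages: Sequence[Page], template: Page) -> List[List[Page]]:
    """Group a continuous page stream into documents at each separator sheet."""
    documents: List[List[Page]] = []
    current: List[Page] = []
    for page in pages:
        if is_separator(page, template):
            if current:               # a separator closes the current document
                documents.append(current)
            current = []              # the separator sheet itself is discarded
        else:
            current.append(page)      # page belongs to the ongoing document
    if current:                       # assumption: trailing pages form a last document
        documents.append(current)
    return documents
```

For a stream of page, page, separator, page, split_documents would return two documents. Keeping pages that follow the last separator as a final document is also an assumption, since the description only covers pages that precede a separator.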

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Processing Or Creating Images (AREA)
EP97913708A 1996-10-22 1997-10-21 Method and apparatus for scanning and managing document images Withdrawn EP0932873A2 (de)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US2898596P 1996-10-22 1996-10-22
US2942596P 1996-10-22 1996-10-22
US29425P 1996-10-22
US28985P 1996-10-22
PCT/US1997/018935 WO1998018092A2 (en) 1996-10-22 1997-10-21 Method and apparatus for scanning and managing document images

Publications (1)

Publication Number Publication Date
EP0932873A2 (de) 1999-08-04

Family

ID=26704356

Family Applications (1)

Application Number Title Priority Date Filing Date
EP97913708A Withdrawn EP0932873A2 (de) 1996-10-22 1997-10-21 Method and apparatus for scanning and managing document images

Country Status (3)

Country Link
EP (1) EP0932873A2 (de)
AU (1) AU5083598A (de)
WO (1) WO1998018092A2 (de)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2420422C (en) * 2000-08-31 2009-10-06 Ontrack Data International, Inc. System and method for data management

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
JPH0827831B2 (ja) * 1986-05-16 1996-03-21 Hitachi, Ltd. Image registration method
JPH03202966A (ja) * 1989-12-28 1991-09-04 Toshiba Corp Electronic filing apparatus
EP0592079A2 (de) * 1992-09-20 1994-04-13 Sun Microsystems, Inc. Automated software installation and operating environment configuration in a computer system

Non-Patent Citations (1)

Title
See references of WO9818092A2 *

Also Published As

Publication number Publication date
AU5083598A (en) 1998-05-15
WO1998018092A2 (en) 1998-04-30
WO1998018092A3 (en) 1998-08-20

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19990521

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH DE ES FR GB IE IT LI NL

17Q First examination report despatched

Effective date: 20010709

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20020122