AU2001284369A1 - System and method for automatic preparation and searching of scanned documents

Info

Publication number
AU2001284369A1
Authority
AU
Australia
Prior art keywords
data
searching
image
digital format
automatic preparation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2001284369A
Inventor
Emil Shteinvil
Yonatan Pesach Stern
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ignite Olive Software Solutions Inc
Original Assignee
Olive Software Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-08-24
Filing date
2001-08-24
Publication date
2002-03-04
Application filed by Olive Software Inc
Publication of AU2001284369A1
Current legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Character Discrimination (AREA)
  • Radar Systems Or Details Thereof (AREA)
  • Electrotherapy Devices (AREA)

Abstract

A system and a method for converting microfilm data into a digital format for publishing through a network such as the Internet. First, an image of the microfilm is created, preferably in the TIFF format. Next, the words of the image are recognized through a process of OCR (optical character recognition), with an associated probability of error. The image data can then be converted into a digital format for publication, for example as XML data. Preferably, the user is able to perform a keyword search on the digital format data. More preferably, the keyword search is an adaptive search.
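
The pipeline in the abstract (recognized words carrying an error probability, conversion to XML, keyword search over the result) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the record does not give the XML schema, so the `page` and `word` elements and the `errorProb` attribute are hypothetical, and the abstract does not define "adaptive search", so the rule below (tolerate looser string matches for words the OCR step was less sure about) is only one plausible reading, not the patented method. The scanning and OCR steps are not shown; the sketch starts from words that have already been recognized.

```python
from __future__ import annotations

import difflib
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class OcrWord:
    text: str          # word as recognized by the OCR step
    error_prob: float  # estimated probability that the recognition is wrong


def words_to_xml(page_id: str, words: list[OcrWord]) -> str:
    """Serialize recognized words and their error estimates as XML.

    The element and attribute names are hypothetical, not taken from the patent.
    """
    page = ET.Element("page", id=page_id)
    for w in words:
        el = ET.SubElement(page, "word", errorProb=f"{w.error_prob:.2f}")
        el.text = w.text
    return ET.tostring(page, encoding="unicode")


def keyword_search(xml_text: str, keyword: str, base_threshold: float = 0.9) -> list[str]:
    """Return the words that match the keyword closely enough.

    Assumed rule: the similarity threshold is lowered in proportion to the
    word's OCR error estimate, so unreliable words may match more loosely.
    """
    hits = []
    for el in ET.fromstring(xml_text).iter("word"):
        error_prob = float(el.get("errorProb"))
        threshold = base_threshold - 0.3 * error_prob
        similarity = difflib.SequenceMatcher(
            None, keyword.lower(), (el.text or "").lower()
        ).ratio()
        if similarity >= threshold:
            hits.append(el.text)
    return hits


words = [
    OcrWord("election", 0.02),
    OcrWord("e1ection", 0.40),  # typical OCR confusion of "l" and "1"
    OcrWord("weather", 0.05),
]
xml_text = words_to_xml("p1", words)
print(keyword_search(xml_text, "election"))
```

In this sketch the misrecognized word "e1ection" is still found because its high error estimate relaxes the match threshold, while a clean but unrelated word such as "weather" is not. A real system would likely index the words rather than scan every element per query.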

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US22751200P 2000-08-24 2000-08-24
US60227512 2000-08-24
PCT/IL2001/000797 WO2002017166A2 (en) 2000-08-24 2001-08-24 System and method for automatic preparation and searching of scanned documents

Publications (1)

Publication Number Publication Date
AU2001284369A1 (en) 2002-03-04

Family

ID=22853387

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2001284369A Abandoned AU2001284369A1 (en) 2000-08-24 2001-08-24 System and method for automatic preparation and searching of scanned documents

Country Status (6)

Country Link
EP (1) EP1312039B1 (en)
AT (1) ATE322051T1 (en)
AU (1) AU2001284369A1 (en)
DE (1) DE60118399T2 (en)
IL (1) IL154586A0 (en)
WO (1) WO2002017166A2 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US7050630B2 (en) * | 2002-05-29 | 2006-05-23 | Hewlett-Packard Development Company, L.P. | System and method of locating a non-textual region of an electronic document or image that matches a user-defined description of the region
FR2880709B1 (en) * | 2005-01-11 | 2014-04-25 | Vision Objects | METHOD OF SEARCHING, RECOGNIZING AND LOCATING INK, DEVICE, PROGRAM AND LANGUAGE CORRESPONDING
FR2880708A1 (en) * | 2005-01-11 | 2006-07-14 | Vision Objects Sa | Term e.g. typed character, searching method for digital handwritten document, involves converting handwritten data into intermediate data, in intermediate format, in form of segmentation graph, and searching terms on intermediate data
EP1684199A3 (en) * | 2005-01-19 | 2008-07-09 | Olive Software, Inc. | Digitization of microfiche
DK176835B1 (en) | 2008-03-07 | 2009-11-23 | Jala Aps | Method of scanning, medium containing a program for carrying out the method and system for carrying out the method
DK176834B1 (en) | 2008-03-07 | 2009-11-23 | Jala Aps | Procedure for scanning
DE102009031872A1 (en) | 2009-07-06 | 2011-01-13 | Siemens Aktiengesellschaft | Method and device for automatically searching for documents in a data memory
US9087059B2 (en) | 2009-08-07 | 2015-07-21 | Google Inc. | User interface for presenting search results for multiple regions of a visual query
US8670597B2 (en) | 2009-08-07 | 2014-03-11 | Google Inc. | Facial recognition with social network aiding
US9135277B2 (en) | 2009-08-07 | 2015-09-15 | Google Inc. | Architecture for responding to a visual query
US9405772B2 (en) | 2009-12-02 | 2016-08-02 | Google Inc. | Actionable search results for street view visual queries
US9183224B2 (en) | 2009-12-02 | 2015-11-10 | Google Inc. | Identifying matching canonical documents in response to a visual query
US8811742B2 (en) | 2009-12-02 | 2014-08-19 | Google Inc. | Identifying matching canonical documents consistent with visual query structural information
US8977639B2 (en) | 2009-12-02 | 2015-03-10 | Google Inc. | Actionable search results for visual queries
US9176986B2 (en) | 2009-12-02 | 2015-11-03 | Google Inc. | Generating a combination of a visual query and matching canonical document
US8805079B2 (en) | 2009-12-02 | 2014-08-12 | Google Inc. | Identifying matching canonical documents in response to a visual query and in accordance with geographic information
US9852156B2 (en) | 2009-12-03 | 2017-12-26 | Google Inc. | Hybrid use of location sensor data and visual query to return local listings for visual query
WO2012075315A1 (en) * | 2010-12-01 | 2012-06-07 | Google Inc. | Identifying matching canonical documents in response to a visual query
US8935246B2 (en) | 2012-08-08 | 2015-01-13 | Google Inc. | Identifying textual terms in response to a visual query
US20230214579A1 (en) * | 2021-12-31 | 2023-07-06 | Microsoft Technology Licensing, Llc | Intelligent character correction and search in documents

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5265242A (en) * | 1985-08-23 | 1993-11-23 | Hiromichi Fujisawa | Document retrieval system for displaying document image data with inputted bibliographic items and character string selected from multiple character candidates
FR2768825B1 (en) * | 1997-09-22 | 2001-01-26 | Aerospatiale | DEVICE FOR DIGITIZING AND SEARCHING DOCUMENTS
CA2326901A1 (en) * | 1998-04-01 | 1999-10-07 | William Peterman | System and method for searching electronic documents created with optical character recognition

Also Published As

Publication number Publication date
WO2002017166A2 (en) 2002-02-28
EP1312039B1 (en) 2006-03-29
WO2002017166A3 (en) 2002-06-13
ATE322051T1 (en) 2006-04-15
DE60118399T2 (en) 2006-12-07
DE60118399D1 (en) 2006-05-18
IL154586A0 (en) 2003-09-17
EP1312039A2 (en) 2003-05-21

Similar Documents

Publication Publication Date Title
AU2001284369A1 (en) System and method for automatic preparation and searching of scanned documents
US7245765B2 (en) Method and apparatus for capturing paper-based information on a mobile computing device
NZ336512A (en) Image based document processing system comparing recognition results from primary and secondary lists
US20020002461A1 (en) Data processing system for vocalizing web content
US5963966A (en) Automated capture of technical documents for electronic review and distribution
CN100372372C (en) Free text and attribute search of electronic program guide data
CN1343337B (en) Method and device for producing annotation data including phonemes data and decoded word
US7450760B2 (en) System and method for capturing and processing business data
CN1691631A (en) Method for management of vcards
US20100100371A1 (en) Method, System, and Apparatus for Message Generation
US20030005045A1 (en) Device and program for structured document generation data structure of structural document
CN1723458A (en) Method and system for utilizing video content to obtain text keywords or phrases for providing content related links to network-based resources
JPH05268459A (en) Format transmission method
CN102915437A (en) Text information identification method and system
CN110737629A (en) method and system for archiving electronic files
GB2307619A (en) Internet information access system
US8467609B2 (en) Document management device and document management method with identification, classification, search, and save functions
CN101651938A (en) Telephone number recognition system for mobile terminal and application method thereof
US20100034460A1 (en) Document management system and remote document management method with identification, classification, search, and save functions
CN101261645B (en) Method and apparatus for obtaining multiple layer information
CN101872344A (en) Control method for image scanning
CN115774805A (en) File intelligent query method and system based on digital processing
US20020069224A1 (en) Markup language document conversion apparatus and method
CN103684991A (en) Junk mail filtering method based on mail features and content
JP3443515B2 (en) Facsimile electronic mail device