AU2001284369A1 - System and method for automatic preparation and searching of scanned documents - Google Patents
System and method for automatic preparation and searching of scanned documents
- Publication number
- AU2001284369A1
- Authority
- AU
- Australia
- Prior art keywords
- data
- searching
- image
- digital format
- automatic preparation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Library & Information Science (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Character Discrimination (AREA)
- Radar Systems Or Details Thereof (AREA)
- Electrotherapy Devices (AREA)
Abstract
A system and a method for converting microfilm data into a digital format for publishing through a network such as the Internet. First, an image of the microfilm is created, preferably in the TIFF format. Next, the words of the image are recognized through optical character recognition (OCR), each with an associated probability of error. The image data can then be converted into a digital format for publication, for example as XML data. Preferably, the user is able to perform a keyword search on the digital format data. More preferably, the keyword search is an adaptive search.
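The pipeline in the abstract (recognized words with an error probability, XML output, keyword search) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the sample OCR output, the element and attribute names (`page`, `word`, `p_error`), the file name, and above all the matching rule. The abstract does not define "adaptive search"; the sketch reads it as a fuzzy match whose required similarity drops as the OCR error probability of a word rises.

```python
import difflib
import xml.etree.ElementTree as ET

# Hypothetical OCR output: (word, estimated probability that the word is wrong).
OCR_WORDS = [
    ("microfilm", 0.02),
    ("arch1ve", 0.40),  # a misrecognized "archive"
    ("newspaper", 0.05),
]


def words_to_xml(image_name, words):
    """Serialize recognized words and their error estimates as XML text."""
    page = ET.Element("page", {"image": image_name})
    for text, p_error in words:
        word = ET.SubElement(page, "word", {"p_error": f"{p_error:.2f}"})
        word.text = text
    return ET.tostring(page, encoding="unicode")


def search(xml_text, keyword, base_threshold=0.9):
    """Return stored words that match the keyword.

    Assumed rule (not from the patent): the more doubtful the OCR result,
    the lower the similarity required for a match.
    """
    hits = []
    for word in ET.fromstring(xml_text).iter("word"):
        p_error = float(word.get("p_error"))
        threshold = base_threshold - 0.5 * p_error
        ratio = difflib.SequenceMatcher(
            None, keyword.lower(), word.text.lower()
        ).ratio()
        if ratio >= threshold:
            hits.append(word.text)
    return hits


xml_text = words_to_xml("reel_0001.tif", OCR_WORDS)
print(search(xml_text, "archive"))  # prints ['arch1ve']
```

Storing the error estimate next to each word lets the search stay strict for confidently recognized words while still finding likely misreadings. A production system would presumably also keep word coordinates so that hits can be highlighted on the TIFF image, but that is outside what the abstract states.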
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US22751200P | 2000-08-24 | 2000-08-24 | |
US60227512 | 2000-08-24 | ||
PCT/IL2001/000797 WO2002017166A2 (en) | 2000-08-24 | 2001-08-24 | System and method for automatic preparation and searching of scanned documents |
Publications (1)
Publication Number | Publication Date |
---|---|
AU2001284369A1 (en) | 2002-03-04 |
Family
ID=22853387
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001284369A (Abandoned), published as AU2001284369A1 (en) | 2000-08-24 | 2001-08-24 | System and method for automatic preparation and searching of scanned documents |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP1312039B1 (en) |
AT (1) | ATE322051T1 (en) |
AU (1) | AU2001284369A1 (en) |
DE (1) | DE60118399T2 (en) |
IL (1) | IL154586A0 (en) |
WO (1) | WO2002017166A2 (en) |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7050630B2 (en) * | 2002-05-29 | 2006-05-23 | Hewlett-Packard Development Company, L.P. | System and method of locating a non-textual region of an electronic document or image that matches a user-defined description of the region |
FR2880709B1 (en) * | 2005-01-11 | 2014-04-25 | Vision Objects | METHOD OF SEARCHING, RECOGNIZING AND LOCATING INK, DEVICE, PROGRAM AND LANGUAGE CORRESPONDING |
FR2880708A1 (en) * | 2005-01-11 | 2006-07-14 | Vision Objects Sa | Term e.g. typed character, searching method for digital handwritten document, involves converting handwritten data into intermediate data, in intermediate format, in form of segmentation graph, and searching terms on intermediate data |
EP1684199A3 (en) * | 2005-01-19 | 2008-07-09 | Olive Software, Inc. | Digitization of microfiche |
DK176835B1 (en) | 2008-03-07 | 2009-11-23 | Jala Aps | Method of scanning, medium containing a program for carrying out the method and system for carrying out the method |
DK176834B1 (en) | 2008-03-07 | 2009-11-23 | Jala Aps | Procedure for scanning |
DE102009031872A1 (en) | 2009-07-06 | 2011-01-13 | Siemens Aktiengesellschaft | Method and device for automatically searching for documents in a data memory |
US9087059B2 (en) | 2009-08-07 | 2015-07-21 | Google Inc. | User interface for presenting search results for multiple regions of a visual query |
US8670597B2 (en) | 2009-08-07 | 2014-03-11 | Google Inc. | Facial recognition with social network aiding |
US9135277B2 (en) | 2009-08-07 | 2015-09-15 | Google Inc. | Architecture for responding to a visual query |
US9405772B2 (en) | 2009-12-02 | 2016-08-02 | Google Inc. | Actionable search results for street view visual queries |
US9183224B2 (en) | 2009-12-02 | 2015-11-10 | Google Inc. | Identifying matching canonical documents in response to a visual query |
US8811742B2 (en) | 2009-12-02 | 2014-08-19 | Google Inc. | Identifying matching canonical documents consistent with visual query structural information |
US8977639B2 (en) | 2009-12-02 | 2015-03-10 | Google Inc. | Actionable search results for visual queries |
US9176986B2 (en) | 2009-12-02 | 2015-11-03 | Google Inc. | Generating a combination of a visual query and matching canonical document |
US8805079B2 (en) | 2009-12-02 | 2014-08-12 | Google Inc. | Identifying matching canonical documents in response to a visual query and in accordance with geographic information |
US9852156B2 (en) | 2009-12-03 | 2017-12-26 | Google Inc. | Hybrid use of location sensor data and visual query to return local listings for visual query |
WO2012075315A1 (en) * | 2010-12-01 | 2012-06-07 | Google Inc. | Identifying matching canonical documents in response to a visual query |
US8935246B2 (en) | 2012-08-08 | 2015-01-13 | Google Inc. | Identifying textual terms in response to a visual query |
US20230214579A1 (en) * | 2021-12-31 | 2023-07-06 | Microsoft Technology Licensing, Llc | Intelligent character correction and search in documents |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5265242A (en) * | 1985-08-23 | 1993-11-23 | Hiromichi Fujisawa | Document retrieval system for displaying document image data with inputted bibliographic items and character string selected from multiple character candidates |
FR2768825B1 (en) * | 1997-09-22 | 2001-01-26 | Aerospatiale | DEVICE FOR DIGITIZING AND SEARCHING DOCUMENTS |
CA2326901A1 (en) * | 1998-04-01 | 1999-10-07 | William Peterman | System and method for searching electronic documents created with optical character recognition |
-
2001
- 2001-08-24 WO PCT/IL2001/000797 patent/WO2002017166A2/en active IP Right Grant
- 2001-08-24 DE DE60118399T patent/DE60118399T2/en not_active Expired - Lifetime
- 2001-08-24 EP EP01963349A patent/EP1312039B1/en not_active Expired - Lifetime
- 2001-08-24 AT AT01963349T patent/ATE322051T1/en not_active IP Right Cessation
- 2001-08-24 IL IL15458601A patent/IL154586A0/en unknown
- 2001-08-24 AU AU2001284369A patent/AU2001284369A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
WO2002017166A2 (en) | 2002-02-28 |
EP1312039B1 (en) | 2006-03-29 |
WO2002017166A3 (en) | 2002-06-13 |
ATE322051T1 (en) | 2006-04-15 |
DE60118399T2 (en) | 2006-12-07 |
DE60118399D1 (en) | 2006-05-18 |
IL154586A0 (en) | 2003-09-17 |
EP1312039A2 (en) | 2003-05-21 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
AU2001284369A1 (en) | System and method for automatic preparation and searching of scanned documents | |
US7245765B2 (en) | Method and apparatus for capturing paper-based information on a mobile computing device | |
NZ336512A (en) | Image based document processing system comparing recognition results from primary and secondary lists | |
US20020002461A1 (en) | Data processing system for vocalizing web content | |
US5963966A (en) | Automated capture of technical documents for electronic review and distribution | |
CN100372372C (en) | Free text and attribute search of electronic program guide data | |
CN1343337B (en) | Method and device for producing annotation data including phonemes data and decoded word | |
US7450760B2 (en) | System and method for capturing and processing business data | |
CN1691631A (en) | Method for management of vcards | |
US20100100371A1 (en) | Method, System, and Apparatus for Message Generation | |
US20030005045A1 (en) | Device and program for structured document generation data structure of structural document | |
CN1723458A (en) | Method and system for utilizing video content to obtain text keywords or phrases for providing content related links to network-based resources | |
JPH05268459A (en) | Format transmission method | |
CN102915437A (en) | Text information identification method and system | |
CN110737629A (en) | method and system for archiving electronic files | |
GB2307619A (en) | Internet information access system | |
US8467609B2 (en) | Document management device and document management method with identification, classification, search, and save functions | |
CN101651938A (en) | Telephone number recognition system for mobile terminal and application method thereof | |
US20100034460A1 (en) | Document management system and remote document management method with identification, classification, search, and save functions | |
CN101261645B (en) | Method and apparatus for obtaining multiple layer information | |
CN101872344A (en) | Control method for image scanning | |
CN115774805A (en) | File intelligent query method and system based on digital processing | |
US20020069224A1 (en) | Markup language document conversion apparatus and method | |
CN103684991A (en) | Junk mail filtering method based on mail features and content | |
JP3443515B2 (en) | Facsimile electronic mail device |