CA2265060C - Word grouping accuracy value generation - Google Patents

Word grouping accuracy value generation

Info

Publication number
CA2265060C
Authority
CA
Canada
Prior art keywords
word
accuracy
character
value
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CA002265060A
Other languages
English (en)
French (fr)
Other versions
CA2265060A1 (en)
Inventor
Hamadi Jamali
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1998-03-12
Filing date
1999-03-08
Publication date
2007-05-22
Application filed by Canon Inc
Publication of CA2265060A1
Application granted
Publication of CA2265060C
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/12 Detection or correction of errors, e.g. by rescanning the pattern
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/26 Techniques for post-processing, e.g. correcting the recognition result
    • G06V30/262 Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
    • G06V30/274 Syntactic or semantic context, e.g. balancing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Character Discrimination (AREA)
CA002265060A 1998-03-12 1999-03-08 Word grouping accuracy value generation Expired - Fee Related CA2265060C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/041,854 1998-03-12
US09/041,854 US6269188B1 (en) 1998-03-12 1998-03-12 Word grouping accuracy value generation

Publications (2)

Publication Number Publication Date
CA2265060A1 (en) 1999-09-12
CA2265060C (en) 2007-05-22

Family

ID=21918699

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002265060A Expired - Fee Related CA2265060C (en) 1998-03-12 1999-03-08 Word grouping accuracy value generation

Country Status (5)

Country Link
US (1) US6269188B1
EP (1) EP0942389B1
JP (1) JPH11316800A
CA (1) CA2265060C
DE (1) DE69940331D1

Families Citing this family (74)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8352400B2 (en) 1991-12-23 2013-01-08 Hoffberg Steven M Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore
US7966078B2 (en) 1999-02-01 2011-06-21 Steven Hoffberg Network media appliance system and method
WO2000062243A1 (en) * 1999-04-14 2000-10-19 Fujitsu Limited Character string extracting device and method based on basic component in document image
WO2001067378A1 (en) * 2000-03-06 2001-09-13 Iarchives, Inc. System and method for creating a searchable word index of a scanned document including multiple interpretations of a word at a given document location
US7155061B2 (en) * 2000-08-22 2006-12-26 Microsoft Corporation Method and system for searching for words and phrases in active and stored ink word documents
JP4136316B2 (ja) * 2001-01-24 2008-08-20 Fujitsu Limited Character string recognition device
US7133829B2 (en) * 2001-10-31 2006-11-07 Dictaphone Corporation Dynamic insertion of a speech recognition engine within a distributed speech recognition system
US7146321B2 (en) * 2001-10-31 2006-12-05 Dictaphone Corporation Distributed speech recognition system
US6766294B2 (en) 2001-11-30 2004-07-20 Dictaphone Corporation Performance gauge for a distributed speech recognition system
US6785654B2 (en) 2001-11-30 2004-08-31 Dictaphone Corporation Distributed speech recognition system with speech recognition engines offering multiple functionalities
US20030128856A1 (en) * 2002-01-08 2003-07-10 Boor Steven E. Digitally programmable gain amplifier
US7236931B2 (en) 2002-05-01 2007-06-26 Usb Ag, Stamford Branch Systems and methods for automatic acoustic speaker adaptation in computer-assisted transcription systems
US7292975B2 (en) * 2002-05-01 2007-11-06 Nuance Communications, Inc. Systems and methods for evaluating speaker suitability for automatic speech recognition aided transcription
NZ536775A (en) * 2002-05-20 2007-11-30 Tata Infotech Ltd Document structure identifier
US7171061B2 (en) * 2002-07-12 2007-01-30 Xerox Corporation Systems and methods for triage of passages of text output from an OCR system
US7045377B2 (en) * 2003-06-26 2006-05-16 Rj Mears, Llc Method for making a semiconductor device including a superlattice and adjacent semiconductor layer with doped regions defining a semiconductor junction
WO2005009205A2 (en) * 2003-07-09 2005-02-03 Gensym Corporation System and method for self management of health using natural language interface
US8442331B2 (en) 2004-02-15 2013-05-14 Google Inc. Capturing text from rendered documents using supplemental information
US7707039B2 (en) * 2004-02-15 2010-04-27 Exbiblio B.V. Automatic modification of web pages
JP4297798B2 (ja) * 2004-01-29 2009-07-15 Fujitsu Limited Mobile object information management program
US20060104515A1 (en) * 2004-07-19 2006-05-18 King Martin T Automatic modification of WEB pages
US20060136629A1 (en) * 2004-08-18 2006-06-22 King Martin T Scanner having connected and unconnected operational behaviors
US10635723B2 (en) 2004-02-15 2020-04-28 Google Llc Search engines and systems with handheld document data capture devices
US7812860B2 (en) 2004-04-01 2010-10-12 Exbiblio B.V. Handheld device for capturing text from both a document printed on paper and a document displayed on a dynamic display device
US7552630B2 (en) * 2004-02-27 2009-06-30 Akron Special Machinery, Inc. Load wheel drive
US7990556B2 (en) 2004-12-03 2011-08-02 Google Inc. Association of a portable scanner with input/output and storage devices
US7894670B2 (en) 2004-04-01 2011-02-22 Exbiblio B.V. Triggering actions in response to optically or acoustically capturing keywords from a rendered document
US9008447B2 (en) 2004-04-01 2015-04-14 Google Inc. Method and system for character recognition
US8146156B2 (en) 2004-04-01 2012-03-27 Google Inc. Archive of text captures from rendered documents
US20060098900A1 (en) 2004-09-27 2006-05-11 King Martin T Secure data gathering from rendered documents
US20060081714A1 (en) 2004-08-23 2006-04-20 King Martin T Portable scanning device
US9143638B2 (en) 2004-04-01 2015-09-22 Google Inc. Data capture from rendered documents using handheld device
US9116890B2 (en) 2004-04-01 2015-08-25 Google Inc. Triggering actions in response to optically or acoustically capturing keywords from a rendered document
US8081849B2 (en) 2004-12-03 2011-12-20 Google Inc. Portable scanning and memory device
US8713418B2 (en) 2004-04-12 2014-04-29 Google Inc. Adding value to a rendered document
US8874504B2 (en) 2004-12-03 2014-10-28 Google Inc. Processing techniques for visual capture data from a rendered document
US8489624B2 (en) 2004-05-17 2013-07-16 Google, Inc. Processing techniques for text capture from a rendered document
US8620083B2 (en) 2004-12-03 2013-12-31 Google Inc. Method and system for character recognition
US8346620B2 (en) 2004-07-19 2013-01-01 Google Inc. Automatic modification of web pages
US8032372B1 (en) 2005-09-13 2011-10-04 Escription, Inc. Dictation selection
US7734092B2 (en) * 2006-03-07 2010-06-08 Ancestry.Com Operations Inc. Multiple image input for optical character recognition processing systems and methods
US7966557B2 (en) 2006-03-29 2011-06-21 Amazon Technologies, Inc. Generating image-based reflowable files for rendering on various sized displays
EP2067119A2 (en) 2006-09-08 2009-06-10 Exbiblio B.V. Optical scanners, such as hand-held optical scanners
US7810026B1 (en) * 2006-09-29 2010-10-05 Amazon Technologies, Inc. Optimizing typographical content for transmission and display
US8595615B2 (en) * 2007-02-07 2013-11-26 International Business Machines Corporation System and method for automatic stylesheet inference
US8782516B1 (en) 2007-12-21 2014-07-15 Amazon Technologies, Inc. Content style detection
JP2009193356A (ja) * 2008-02-14 2009-08-27 Canon Inc Image processing apparatus, image processing method, program, and storage medium
US8572480B1 (en) 2008-05-30 2013-10-29 Amazon Technologies, Inc. Editing the sequential flow of a page
US20090300126A1 (en) * 2008-05-30 2009-12-03 International Business Machines Corporation Message Handling
US9229911B1 (en) * 2008-09-30 2016-01-05 Amazon Technologies, Inc. Detecting continuation of flow of a page
WO2010096191A2 (en) 2009-02-18 2010-08-26 Exbiblio B.V. Automatically capturing information, such as capturing information using a document-aware device
EP2406767A4 (en) 2009-03-12 2016-03-16 Google Inc AUTOMATIC CONTENT SUPPLY ASSOCIATED WITH CAPTURED INFORMATION, TYPE INFORMATION CAPTURED IN REAL TIME
US8447066B2 (en) 2009-03-12 2013-05-21 Google Inc. Performing actions based on capturing information from rendered documents, such as documents under copyright
US20100274615A1 (en) * 2009-04-22 2010-10-28 Eran Belinsky Extendable Collaborative Correction Framework
US9135277B2 (en) 2009-08-07 2015-09-15 Google Inc. Architecture for responding to a visual query
US8670597B2 (en) 2009-08-07 2014-03-11 Google Inc. Facial recognition with social network aiding
US9087059B2 (en) 2009-08-07 2015-07-21 Google Inc. User interface for presenting search results for multiple regions of a visual query
US8600152B2 (en) * 2009-10-26 2013-12-03 Ancestry.Com Operations Inc. Devices, systems and methods for transcription suggestions and completions
US20110099193A1 (en) * 2009-10-26 2011-04-28 Ancestry.Com Operations Inc. Automatic pedigree corrections
US8805079B2 (en) 2009-12-02 2014-08-12 Google Inc. Identifying matching canonical documents in response to a visual query and in accordance with geographic information
US8977639B2 (en) 2009-12-02 2015-03-10 Google Inc. Actionable search results for visual queries
US9176986B2 (en) 2009-12-02 2015-11-03 Google Inc. Generating a combination of a visual query and matching canonical document
US9405772B2 (en) 2009-12-02 2016-08-02 Google Inc. Actionable search results for street view visual queries
US9183224B2 (en) 2009-12-02 2015-11-10 Google Inc. Identifying matching canonical documents in response to a visual query
US8811742B2 (en) 2009-12-02 2014-08-19 Google Inc. Identifying matching canonical documents consistent with visual query structural information
US9852156B2 (en) 2009-12-03 2017-12-26 Google Inc. Hybrid use of location sensor data and visual query to return local listings for visual query
US9081799B2 (en) 2009-12-04 2015-07-14 Google Inc. Using gestalt information to identify locations in printed information
US9323784B2 (en) 2009-12-09 2016-04-26 Google Inc. Image search using text-based elements within the contents of images
US8499236B1 (en) 2010-01-21 2013-07-30 Amazon Technologies, Inc. Systems and methods for presenting reflowable content on a display
CA2819369C (en) * 2010-12-01 2020-02-25 Google, Inc. Identifying matching canonical documents in response to a visual query
US9383913B2 (en) * 2012-05-30 2016-07-05 Sap Se Touch screen device data filtering
US8935246B2 (en) 2012-08-08 2015-01-13 Google Inc. Identifying textual terms in response to a visual query
RU2634194C1 (ru) * 2016-09-16 2017-10-24 ABBYY Development LLC Verification of optical character recognition results
WO2023059865A1 (en) 2021-10-08 2023-04-13 Ancestry.Com Operations Inc. Image identification, retrieval, transformation, and arrangement

Family Cites Families (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3969698A (en) * 1974-10-08 1976-07-13 International Business Machines Corporation Cluster storage apparatus for post processing error correction of a character recognition machine
US4941125A (en) 1984-08-01 1990-07-10 Smithsonian Institution Information storage and retrieval system
US5265242A (en) 1985-08-23 1993-11-23 Hiromichi Fujisawa Document retrieval system for displaying document image data with inputted bibliographic items and character string selected from multiple character candidates
JP2695844B2 (ja) 1988-06-16 1998-01-14 Toshiba Corporation Document formatting device
DE68913669T2 (de) * 1988-11-23 1994-07-21 Digital Equipment Corp Pronunciation of names by a synthesizer.
US5303361A (en) 1989-01-18 1994-04-12 Lotus Development Corporation Search and retrieval system
JP2816241B2 (ja) 1990-06-20 1998-10-27 Hitachi, Ltd. Image information retrieval device
US5757983A (en) 1990-08-09 1998-05-26 Hitachi, Ltd. Document retrieval method and system
JP3303926B2 (ja) 1991-09-27 2002-07-22 Fuji Xerox Co., Ltd. Structured document classification apparatus and method
US5926565A (en) 1991-10-28 1999-07-20 Froessl; Horst Computer method for processing records with images and multiple fonts
US5875263A (en) 1991-10-28 1999-02-23 Froessl; Horst Non-edit multiple image font processing of records
US5375235A (en) 1991-11-05 1994-12-20 Northern Telecom Limited Method of indexing keywords for searching in a database recorded on an information recording medium
JP2579397B2 (ja) 1991-12-18 1997-02-05 International Business Machines Corporation Method and apparatus for creating a layout model of a document image
US5359667A (en) 1992-08-24 1994-10-25 Unisys Corporation Method for identifying and tracking document characteristics in a document image processing system
US6002798A (en) 1993-01-19 1999-12-14 Canon Kabushiki Kaisha Method and apparatus for creating, indexing and viewing abstracted documents
US5848184A (en) 1993-03-15 1998-12-08 Unisys Corporation Document page analyzer and method
JP3302147B2 (ja) 1993-05-12 2002-07-15 Ricoh Company Limited Document image processing method
WO1995015535A1 (en) * 1993-12-01 1995-06-08 Motorola Inc. Combined dictionary based and likely character string method of handwriting recognition
EP0667594A3 (en) 1994-02-14 1995-08-23 International Business Machines Corporation Image quality analysis method and apparatus
CA2144793C (en) 1994-04-07 1999-01-12 Lawrence Patrick O'gorman Method of thresholding document images
JPH087033A (ja) * 1994-06-16 1996-01-12 Canon Inc Information processing method and apparatus
US5802205A (en) 1994-09-09 1998-09-01 Motorola, Inc. Method and system for lexical processing
US5675665A (en) * 1994-09-30 1997-10-07 Apple Computer, Inc. System and method for word recognition using size and placement models
JP3669016B2 (ja) 1994-09-30 2005-07-06 Hitachi, Ltd. Document information classification device
US5805747A (en) * 1994-10-04 1998-09-08 Science Applications International Corporation Apparatus and method for OCR character and confidence determination using multiple OCR devices
JP3647518B2 (ja) 1994-10-06 2005-05-11 Xerox Corporation Apparatus for highlighting document images using coded word tokens
US5642288A (en) 1994-11-10 1997-06-24 Documagix, Incorporated Intelligent document recognition and handling
JP3375766B2 (ja) * 1994-12-27 2003-02-10 Matsushita Electric Industrial Co., Ltd. Character recognition device
US5617488A (en) * 1995-02-01 1997-04-01 The Research Foundation Of State University Of New York Relaxation word recognizer
US5764799A (en) * 1995-06-26 1998-06-09 Research Foundation Of State University Of New York OCR method and apparatus using image equivalents
US5781879A (en) * 1996-01-26 1998-07-14 Qpl Llc Semantic analysis and modification methodology
US5850480A (en) 1996-05-30 1998-12-15 Scan-Optics, Inc. OCR error correction methods and apparatus utilizing contextual comparison
JP2973944B2 (ja) 1996-06-26 1999-11-08 Fuji Xerox Co., Ltd. Document processing apparatus and document processing method
US5933531A (en) * 1996-08-23 1999-08-03 International Business Machines Corporation Verification and correction method and system for optical character recognition
US5878385A (en) 1996-09-16 1999-03-02 Ergo Linguistic Technologies Method and apparatus for universal parsing of language
US6006226A (en) 1997-09-24 1999-12-21 Ricoh Company Limited Method and system for document image feature extraction
US5999664A (en) 1997-11-14 1999-12-07 Xerox Corporation System for searching a corpus of document images by user specified document layout components

Also Published As

Publication number Publication date
EP0942389A3 (en) 2000-09-20
EP0942389B1 (en) 2009-01-21
US6269188B1 (en) 2001-07-31
JPH11316800A (ja) 1999-11-16
EP0942389A2 (en) 1999-09-15
DE69940331D1 (de) 2009-03-12
CA2265060A1 (en) 1999-09-12

Similar Documents

Publication Publication Date Title
CA2265060C (en) Word grouping accuracy value generation
JP2973944B2 (ja) Document processing apparatus and document processing method
JP3427692B2 (ja) Character recognition method and character recognition apparatus
Kanai et al. Automated evaluation of OCR zoning
US7047238B2 (en) Document retrieval method and document retrieval system
US6272242B1 (en) Character recognition method and apparatus which groups similar character patterns
US5883986A (en) Method and system for automatic transcription correction
EP1952285B1 (en) System and method for searching and matching data having ideogrammatic content
US6917709B2 (en) Automated search on cursive records not having an ASCII index
EP2166488B1 (en) Handwritten word spotter using synthesized typed queries
EP1016033B1 (en) Automatic language identification system for multilingual optical character recognition
Sibun et al. Language determination: Natural language processing from scanned document images
US6385339B1 (en) Collaborative learning system and pattern recognition method
Zramdini et al. Optical font recognition from projection profiles
JP2001167131A (ja) Method for automatic classification of documents using document signatures
US5970171A (en) Apparatus and method of fusing the outputs of multiple intelligent character recognition (ICR) systems to reduce error rate
JP2003524258A (ja) Method and apparatus for processing electronic documents
US20110229036A1 (en) Method and apparatus for text and error profiling of historical documents
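JP6533395B2 (ja) Character search method and system
JP4678712B2 (ja) Language identification device, program, and recording medium
CN117076455A (zh) Structured storage method, medium, and system for insurance policies based on intelligent recognition
JP2906758B2 (ja) Character reading device
JPH096920A (ja) Handwritten character recognition method and apparatus
JP2827066B2 (ja) Post-processing method for character recognition of documents containing mixed digit strings
KR100292352B1 (ko) Editing method for a recognizer using morphological analysis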

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed
Effective date: 2017-03-08