CN102301380B - Selective display of OCR'ed text and corresponding images from publications on a client device - Google Patents

Selective display of OCR'ed text and corresponding images from publications on a client device

Info

Publication number
CN102301380B
CN102301380B CN201080005734.9A CN201080005734A CN102301380B CN 102301380 B CN102301380 B CN 102301380B CN 201080005734 A CN201080005734 A CN 201080005734A CN 102301380 B CN102301380 B CN 102301380B
Authority
CN
China
Prior art keywords
text
document
image segments
client device
computer implemented
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201080005734.9A
Other languages
English (en)
Chinese (zh)
Other versions
CN102301380A (zh)
Inventor
V. Ratnakar
A. Popat
F. Haugen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google LLC
Original Assignee
Google LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google LLC
Priority to CN201410345954.6A
Publication of CN102301380A
Application granted
Publication of CN102301380B
Current legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/22 Character recognition characterised by the type of writing
    • G06V30/224 Character recognition characterised by the type of writing of printed characters having additional code marks or containing code marks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/12 Detection or correction of errors, e.g. by rescanning the pattern
    • G06V30/127 Detection or correction of errors, e.g. by rescanning the pattern with the intervention of an operator
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • G06T11/206 Drawing of charts or graphs
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/14 Display of multiple viewports
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Computer Hardware Design (AREA)
  • Electromagnetism (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Toxicology (AREA)
  • Artificial Intelligence (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)
  • User Interface Of Digital Computer (AREA)
  • Document Processing Apparatus (AREA)
  • Controls And Circuits For Display Device (AREA)
CN201080005734.9A 2009-01-28 2010-01-25 Selective display of OCR'ed text and corresponding images from publications on a client device Expired - Fee Related CN102301380B (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410345954.6A CN104134057B (zh) 2009-01-28 2010-01-25 Selective display of OCR'ed text and corresponding images from publications on a client device

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US14790109P 2009-01-28 2009-01-28
US61/147,901 2009-01-28
US12/366,547 US8373724B2 (en) 2009-01-28 2009-02-05 Selective display of OCR'ed text and corresponding images from publications on a client device
US12/366,547 2009-02-05
PCT/US2010/021965 WO2010088182A1 (en) 2009-01-28 2010-01-25 Selective display of ocr'ed text and corresponding images from publications on a client device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201410345954.6A Division CN104134057B (zh) 2009-01-28 2010-01-25 Selective display of OCR'ed text and corresponding images from publications on a client device

Publications (2)

Publication Number Publication Date
CN102301380A (zh) 2011-12-28
CN102301380B (zh) 2014-08-20

Family

ID=42353827

Family Applications (2)

Application Number Title Priority Date Filing Date
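CN201080005734.9A Expired - Fee Related CN102301380B (zh) 2009-01-28 2010-01-25 Selective display of OCR'ed text and corresponding images from publications on a client device
CN201410345954.6A Active CN104134057B (zh) 2009-01-28 2010-01-25 Selective display of OCR'ed text and corresponding images from publications on a client device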

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201410345954.6A Active CN104134057B (zh) 2009-01-28 2010-01-25 Selective display of OCR'ed text and corresponding images from publications on a client device

Country Status (5)

Country Link
US (4) US8373724B2 (en)
JP (2) JP5324669B2 (en)
KR (1) KR101315472B1 (en)
CN (2) CN102301380B (en)
WO (1) WO2010088182A1 (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8373724B2 (en) * 2009-01-28 2013-02-12 Google Inc. Selective display of OCR'ed text and corresponding images from publications on a client device
US8442813B1 (en) 2009-02-05 2013-05-14 Google Inc. Methods and systems for assessing the quality of automatically generated text
US20120050819A1 (en) * 2010-08-30 2012-03-01 Jiang Hong Approach For Processing Scanned Document Data
US20120050818A1 (en) * 2010-08-31 2012-03-01 Kaoru Watanabe Sending scanned document data through a network to a mobile device
US9083826B2 (en) * 2010-08-31 2015-07-14 Ricoh Company, Ltd. Tracking the processing of electronic document data by network services using trace
US8515930B2 (en) 2010-08-31 2013-08-20 Ricoh Company, Ltd. Merging a scanned document with an existing document on a server
US20120159376A1 (en) * 2010-12-15 2012-06-21 Microsoft Corporation Editing data records associated with static images
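TW201310355A (zh) * 2011-08-19 2013-03-01 Newsoft Technology Corp Method for browsing or executing instructions via images associated with information and instructions, and program product thereof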
US9069374B2 (en) * 2012-01-04 2015-06-30 International Business Machines Corporation Web video occlusion: a method for rendering the videos watched over multiple windows
US10332213B2 (en) 2012-03-01 2019-06-25 Ricoh Company, Ltd. Expense report system with receipt image processing by delegates
US9659327B2 (en) * 2012-03-01 2017-05-23 Ricoh Company, Ltd. Expense report system with receipt image processing
US9245296B2 (en) 2012-03-01 2016-01-26 Ricoh Company Ltd. Expense report system with receipt image processing
JP5983184B2 (ja) * 2012-08-24 2016-08-31 Brother Industries, Ltd. Image processing system, image processing method, image processing apparatus, and image processing program
US9519641B2 (en) * 2012-09-18 2016-12-13 Abbyy Development Llc Photography recognition translation
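KR20140081470A (ko) * 2012-12-21 2014-07-01 Samsung Electronics Co., Ltd. Method for enlarged display of characters, apparatus to which the method is applied, and computer-readable storage medium storing a program for performing the method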
WO2014154457A1 (en) * 2013-03-29 2014-10-02 Alcatel Lucent Systems and methods for context based scanning
JP6525523B2 (ja) * 2013-07-31 2019-06-05 Canon Inc. Information processing apparatus, control method, and program
US9275554B2 (en) 2013-09-24 2016-03-01 Jimmy M Sauz Device, system, and method for enhanced memorization of a document
US9971573B2 (en) 2015-06-18 2018-05-15 The Joan and Irwin Jacobs Technion-Cornell Institute Computing platform and method thereof for searching, executing, and evaluating computational algorithms
US10755590B2 (en) 2015-06-18 2020-08-25 The Joan and Irwin Jacobs Technion-Cornell Institute Method and system for automatically providing graphical user interfaces for computational algorithms described in printed publications
US9864734B2 (en) * 2015-08-12 2018-01-09 International Business Machines Corporation Clickable links within live collaborative web meetings
US10044751B2 (en) * 2015-12-28 2018-08-07 Arbor Networks, Inc. Using recurrent neural networks to defeat DNS denial of service attacks
US9501696B1 (en) 2016-02-09 2016-11-22 William Cabán System and method for metadata extraction, mapping and execution
US10607101B1 (en) 2016-12-14 2020-03-31 Revenue Management Solutions, Llc System and method for patterned artifact removal for bitonal images
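CN108628814A (zh) * 2017-03-20 2018-10-09 Zhuhai Kingsoft Office Software Co., Ltd. Method and device for quickly inserting recognized text
JP6946690B2 (ja) * 2017-03-24 2021-10-06 Casio Computer Co., Ltd. Display device, display method, and program
CN111213156B (zh) * 2017-07-25 2024-05-10 Hewlett-Packard Development Company, L.P. Character recognition sharpness determination
JP6891073B2 (ja) * 2017-08-22 2021-06-18 Canon Inc. Apparatus for setting a file name or the like for a scanned image, control method thereof, and program
CN109981421B (zh) * 2017-12-27 2022-02-01 Joyoung Co., Ltd. Method and device for network configuration of a smart device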
GB201804383D0 (en) 2018-03-19 2018-05-02 Microsoft Technology Licensing Llc Multi-endpoint mixed reality meetings
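CN110969056B (zh) * 2018-09-29 2023-08-08 Hangzhou Hikvision Digital Technology Co., Ltd. Document layout analysis method, device, and storage medium for document images
CN111475999B (zh) * 2019-01-22 2023-04-14 Alibaba Group Holding Ltd. Method and device for generating error prompts
CN110377885B (zh) * 2019-06-14 2023-09-26 Beijing Baidu Netcom Science and Technology Co., Ltd. Method, apparatus, device, and computer storage medium for converting PDF files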
US11403162B2 (en) * 2019-10-17 2022-08-02 Dell Products L.P. System and method for transferring diagnostic data via a framebuffer
US11205084B2 (en) * 2020-02-17 2021-12-21 Wipro Limited Method and system for evaluating an image quality for optical character recognition (OCR)
US11436713B2 (en) 2020-02-19 2022-09-06 International Business Machines Corporation Application error analysis from screenshot
US11842035B2 (en) * 2020-08-04 2023-12-12 Bentley Systems, Incorporated Techniques for labeling, reviewing and correcting label predictions for PandIDS
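CN112131841A (zh) * 2020-08-27 2020-12-25 Beijing Yundong Zhixiao Network Technology Co., Ltd. Document quality assessment method and system
CN115016710B (zh) * 2021-11-12 2023-06-16 Honor Device Co., Ltd. Application recommendation method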
US20240095452A1 (en) * 2022-09-16 2024-03-21 Citrix Systems, Inc. Unicode based estimation of text intelligibility
CN117217876B (zh) * 2023-11-08 2024-03-26 Shenzhen Mingxin Shuzhi Technology Co., Ltd. Order preprocessing method, apparatus, device, and medium based on OCR technology

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5325297A (en) * 1992-06-25 1994-06-28 System Of Multiple-Colored Images For Internationally Listed Estates, Inc. Computer implemented method and system for storing and retrieving textual data and compressed image data
US5889897A (en) * 1997-04-08 1999-03-30 International Patent Holdings Ltd. Methodology for OCR error checking through text image regeneration
US20020102966A1 (en) * 2000-11-06 2002-08-01 Lev Tsvi H. Object identification method for portable devices
US20020191847A1 (en) * 1998-05-06 2002-12-19 Xerox Corporation Portable text capturing method and device therefor
CN101044494A (zh) * 2004-10-20 2007-09-26 Motorola, Inc. Electronic device and method for visual text interpretation

Family Cites Families (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5675672A (en) * 1990-06-26 1997-10-07 Seiko Epson Corporation Two dimensional linker for character string data
JPH0581467A (ja) * 1991-08-29 1993-04-02 Canon Inc Image processing method and apparatus
JPH07249098A (ja) * 1994-03-09 1995-09-26 Toshiba Corp Information processing apparatus and information processing method
US5764799A (en) * 1995-06-26 1998-06-09 Research Foundation Of State University Of New York OCR method and apparatus using image equivalents
US6137906A (en) * 1997-06-27 2000-10-24 Kurzweil Educational Systems, Inc. Closest word algorithm
US6023534A (en) * 1997-08-04 2000-02-08 Xerox Corporation Method of extracting image data from an area generated with a halftone pattern
JP2000112955A (ja) * 1998-09-30 2000-04-21 Toshiba Corp Image display method, image filing apparatus, and recording medium
US6278969B1 (en) 1999-08-18 2001-08-21 International Business Machines Corp. Method and system for improving machine translation accuracy using translation memory
US6587583B1 (en) * 1999-09-17 2003-07-01 Kurzweil Educational Systems, Inc. Compression/decompression algorithm for image documents having text, graphical and color content
GB2359953B (en) * 2000-03-03 2004-02-11 Hewlett Packard Co Improvements relating to image capture systems
US6738518B1 (en) 2000-05-12 2004-05-18 Xerox Corporation Document image decoding using text line column-based heuristic scoring
US6678415B1 (en) 2000-05-12 2004-01-13 Xerox Corporation Document image decoding using an integrated stochastic language model
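JP4613397B2 (ja) * 2000-06-28 2011-01-19 Konica Minolta Business Technologies, Inc. Image recognition apparatus, image recognition method, and computer-readable recording medium storing an image recognition program
JP2002049890A (ja) * 2000-08-01 2002-02-15 Minolta Co Ltd Image recognition apparatus, image recognition method, and computer-readable recording medium storing an image recognition program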
US6957384B2 (en) * 2000-12-27 2005-10-18 Tractmanager, Llc Document management system
JP4421134B2 (ja) * 2001-04-18 2010-02-24 Fujitsu Ltd. Document image retrieval apparatus
JP2002358481A (ja) * 2001-06-01 2002-12-13 Ricoh Elemex Corp Image processing apparatus
US7171061B2 (en) 2002-07-12 2007-01-30 Xerox Corporation Systems and methods for triage of passages of text output from an OCR system
US8533270B2 (en) 2003-06-23 2013-09-10 Microsoft Corporation Advanced spam detection techniques
US8301893B2 (en) * 2003-08-13 2012-10-30 Digimarc Corporation Detecting media areas likely of hosting watermarks
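JP2005107684A (ja) * 2003-09-29 2005-04-21 Fuji Photo Film Co Ltd Image processing method and image input/output apparatus
CN1871608A (zh) * 2003-10-27 2006-11-29 Koninklijke Philips Electronics N.V. Screen-by-screen presentation of search results
JP2005352735A (ja) * 2004-06-10 2005-12-22 Fuji Xerox Co Ltd Document file creation support apparatus, document file creation support method, and program therefor
JP2006031299A (ja) * 2004-07-15 2006-02-02 Hitachi Ltd Character recognition method, and method and system for processing the correction history of character data
CN101432729A (zh) * 2004-08-21 2009-05-13 Co-Exprise, Inc. Method, system, and apparatus for extended enterprise commerce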
US7639387B2 (en) * 2005-08-23 2009-12-29 Ricoh Co., Ltd. Authoring tools using a mixed media environment
US8156427B2 (en) * 2005-08-23 2012-04-10 Ricoh Co. Ltd. User interface for mixed media reality
US7669148B2 (en) * 2005-08-23 2010-02-23 Ricoh Co., Ltd. System and methods for portable device for mixed media system
CN1848109A (zh) * 2005-04-13 2006-10-18 Motorola, Inc. Method and system for editing optical character recognition results
US7760917B2 (en) * 2005-05-09 2010-07-20 Like.Com Computer-implemented method for performing similarity searches
US7809722B2 (en) * 2005-05-09 2010-10-05 Like.Com System and method for enabling search and retrieval from image files based on recognized information
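CN100356392C (zh) * 2005-08-18 2007-12-19 Peking University Founder Group Co., Ltd. Post-processing method for character recognition
KR100714393B1 (ko) * 2005-09-16 2007-05-07 Samsung Electronics Co., Ltd. Host device having a text extraction function and text extraction method thereof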
US7796837B2 (en) * 2005-09-22 2010-09-14 Google Inc. Processing an image map for display on computing device
US8849821B2 (en) * 2005-11-04 2014-09-30 Nokia Corporation Scalable visual search system simplifying access to network and device functionality
US7822596B2 (en) * 2005-12-05 2010-10-26 Microsoft Corporation Flexible display translation
KR20080002084A (ko) * 2006-06-30 2008-01-04 Samsung Electronics Co., Ltd. System for optical character reading and optical character reading method
US7912700B2 (en) 2007-02-08 2011-03-22 Microsoft Corporation Context based word prediction
US8763038B2 (en) * 2009-01-26 2014-06-24 Sony Corporation Capture of stylized TV table data via OCR
US20080267504A1 (en) * 2007-04-24 2008-10-30 Nokia Corporation Method, device and computer program product for integrating code-based and optical character recognition technologies into a mobile visual search
CN101419661B (zh) * 2007-10-26 2011-08-24 International Business Machines Corporation Method and system for displaying an image based on text in the image
US8331677B2 (en) * 2009-01-08 2012-12-11 Microsoft Corporation Combined image and text document
US8373724B2 (en) * 2009-01-28 2013-02-12 Google Inc. Selective display of OCR'ed text and corresponding images from publications on a client device
US8442813B1 (en) 2009-02-05 2013-05-14 Google Inc. Methods and systems for assessing the quality of automatically generated text
US8588528B2 (en) * 2009-06-23 2013-11-19 K-Nfb Reading Technology, Inc. Systems and methods for displaying scanned images with overlaid text
US20110128288A1 (en) * 2009-12-02 2011-06-02 David Petrou Region of Interest Selector for Visual Queries

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5325297A (en) * 1992-06-25 1994-06-28 System Of Multiple-Colored Images For Internationally Listed Estates, Inc. Computer implemented method and system for storing and retrieving textual data and compressed image data
US5889897A (en) * 1997-04-08 1999-03-30 International Patent Holdings Ltd. Methodology for OCR error checking through text image regeneration
US20020191847A1 (en) * 1998-05-06 2002-12-19 Xerox Corporation Portable text capturing method and device therefor
US20020102966A1 (en) * 2000-11-06 2002-08-01 Lev Tsvi H. Object identification method for portable devices
CN101044494A (zh) * 2004-10-20 2007-09-26 Motorola, Inc. Electronic device and method for visual text interpretation

Also Published As

Publication number Publication date
WO2010088182A1 (en) 2010-08-05
JP5324669B2 (ja) 2013-10-23
JP2014032665A (ja) 2014-02-20
JP2012516508A (ja) 2012-07-19
JP6254374B2 (ja) 2017-12-27
US8675012B2 (en) 2014-03-18
CN102301380A (zh) 2011-12-28
US20130002710A1 (en) 2013-01-03
KR20110124255A (ko) 2011-11-16
CN104134057B (zh) 2018-02-13
US20130265325A1 (en) 2013-10-10
US8373724B2 (en) 2013-02-12
US20100188419A1 (en) 2010-07-29
CN104134057A (zh) 2014-11-05
US8482581B2 (en) 2013-07-09
KR101315472B1 (ko) 2013-10-04
US20140125693A1 (en) 2014-05-08
US9280952B2 (en) 2016-03-08

Similar Documents

Publication Publication Date Title
CN102301380B (zh) Selective display of OCR'ed text and corresponding images from publications on a client device
JP4945813B2 (ja) Printed structured document
US10902193B2 (en) Automated generation of web forms using fillable electronic documents
US10372827B2 (en) Translating phrases from image data on a GUI
US9619440B2 (en) Document conversion apparatus
US7715625B2 (en) Image processing device, image processing method, and storage medium storing program therefor
US9614984B2 (en) Electronic document generation system and recording medium
US11243670B2 (en) Information processing system, information processing apparatus, information processing method and non-transitory computer readable medium
US9864750B2 (en) Objectification with deep searchability
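JP2002169637A (ja) Document display mode conversion apparatus, document display mode conversion method, and recording medium
JP2009015610A (ja) Page action activation device, page action activation control method, and page action activation control program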
US9019552B2 (en) Information processing apparatus, system and method for outputting data to a medium
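JP6045393B2 (ja) Information processing system
KR20110074422A (ko) Method and apparatus for generating a detailed-information image file
JP4844827B2 (ja) Information processing system, document creation apparatus, document output apparatus, additional-information processing apparatus, document management apparatus, and program
JP2024115651A (ja) Data processing system and control method thereof
JP6192603B2 (ja) Document processing apparatus and document processing program
JP6175414B2 (ja) Document processing apparatus and document processing program
JP2004280144A (ja) Form creation apparatus, form processing apparatus, program, and storage medium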

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: California, USA

Patentee after: Google Inc.

Address before: California, USA

Patentee before: Google Inc.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140820
