JP4995554B2 - Personal information retrieval method using knowledge bases for optical character recognition correction - Google Patents
- Publication number
- JP4995554B2 (application JP2006329176A)
- Authority
- JP
- Japan
- Prior art keywords
- text
- image
- image segment
- text content
- business card
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/12—Detection or correction of errors, e.g. by rescanning the pattern
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/26—Techniques for post-processing, e.g. correcting the recognition result
- G06V30/262—Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
- G06V30/268—Lexical context
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Character Discrimination (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Description
Claims (4)
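- A method for acquiring personal information, comprising:
acquiring a business card image;
extracting a plurality of text image segments from the business card image;
applying optical character recognition (OCR) to each of the text image segments to generate one or more text content candidates for each text image segment;
a selection step of selecting, based on records returned by a query to at least one database, the text contained in a text image segment from among the text content candidates generated for that text image segment; and
a determination step of, when a record returned while selecting the text contained in one text image segment in the selection step includes a text content candidate generated for another text image segment, determining that included text content candidate to be the text contained in that other text image segment.
- The method of claim 1, further comprising an assignment step of assigning, to each text image segment, a tag indicating the content of its text,
wherein, when a record returned by the query in the selection step contains a tag associated with a text content candidate generated for any text image segment, and that association differs from the association between tag and text image segment made in the assignment step, the determination step corrects the association made in the assignment step with the association contained in the record.
- The method of claim 1, wherein the selection step:
assigns a score to each text content candidate based at least on the returned records; and
selects the most likely one of the text content candidates based on the assigned scores.
- The method of claim 3, further comprising:
extracting logo image segments from the business card image; and
executing, for each logo image segment, a query against at least one image database,
wherein the selection step assigns a score to each text content candidate based on the records returned by the query to the at least one image database for each logo image segment.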
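The selection and determination steps of claims 1 and 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory `RECORDS` list, the `query` and `resolve` helpers, and the scoring rule (score = number of returned records) are all assumptions made for illustration, and the OCR stage is replaced by hand-written candidate lists.

```python
# Hypothetical knowledge base: each record maps a tag to a text value.
RECORDS = [
    {"name": "John Smith", "company": "Xerox", "phone": "555-0100"},
    {"name": "Jane Doe", "company": "Acme", "phone": "555-0199"},
]


def query(text):
    """Return every record in which `text` appears as a field value."""
    return [record for record in RECORDS if text in record.values()]


def resolve(segments):
    """Pick one text per segment.

    segments: dict of segment id -> non-empty list of OCR candidates.
    Returns a dict of segment id -> chosen text (None if nothing matched).
    """
    chosen = {}
    for seg_id, candidates in segments.items():
        if seg_id in chosen:
            continue  # already determined from another segment's records
        # Selection step: score each candidate by how many records it returns
        # (an assumed scoring rule) and keep the highest-scoring one.
        hits = {c: query(c) for c in candidates}
        best = max(candidates, key=lambda c: len(hits[c]))
        if not hits[best]:
            chosen[seg_id] = None
            continue
        chosen[seg_id] = best
        # Determination step: if the returned records contain a candidate of
        # another, still undecided segment, fix that segment's text as well.
        values = {v for record in hits[best] for v in record.values()}
        for other_id, other_candidates in segments.items():
            if other_id in chosen:
                continue
            for candidate in other_candidates:
                if candidate in values:
                    chosen[other_id] = candidate
                    break
    return chosen


if __name__ == "__main__":
    ocr_output = {
        "s1": ["John Srnith", "John Smith"],
        "s2": ["Xerax", "Xerox"],
        "s3": ["555-O100", "555-0100"],
    }
    print(resolve(ocr_output))
```

In this sketch a single successful lookup for the name segment is enough to settle the company and phone segments, which is the effect the determination step describes: later segments need no query of their own when an earlier record already contains one of their candidates.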
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/299,453 US7826665B2 (en) | 2005-12-12 | 2005-12-12 | Personal information retrieval using knowledge bases for optical character recognition correction |
US11/299,453 | 2005-12-12 |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2007164785A (ja) | 2007-06-28 |
JP4995554B2 (ja) | 2012-08-08 |
Family
ID=37853056
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2006329176A Expired - Fee Related JP4995554B2 (ja) | 2005-12-12 | 2006-12-06 | Personal information retrieval method using knowledge bases for optical character recognition correction |
Country Status (3)
Country | Link |
---|---|
US (1) | US7826665B2 (ja) |
EP (1) | EP1796019A1 (ja) |
JP (1) | JP4995554B2 (ja) |
Families Citing this family (55)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070168382A1 (en) * | 2006-01-03 | 2007-07-19 | Michael Tillberg | Document analysis system for integration of paper records into a searchable electronic database |
US8306336B2 (en) * | 2006-05-17 | 2012-11-06 | Qualcomm Incorporated | Line or text-based image processing tools |
US8433729B2 (en) * | 2006-09-29 | 2013-04-30 | Sap Ag | Method and system for automatically generating a communication interface |
US8290270B2 (en) * | 2006-10-13 | 2012-10-16 | Syscom, Inc. | Method and system for converting image text documents in bit-mapped formats to searchable text and for searching the searchable text |
US7949191B1 (en) | 2007-04-04 | 2011-05-24 | A9.Com, Inc. | Method and system for searching for information on a network in response to an image query sent by a user from a mobile communications device |
US7849398B2 (en) | 2007-04-26 | 2010-12-07 | Xerox Corporation | Decision criteria for automated form population |
JP4960817B2 (ja) * | 2007-09-19 | 2012-06-27 | キヤノン株式会社 | Image processing apparatus and image processing method |
JP5242202B2 (ja) * | 2008-03-14 | 2013-07-24 | 京セラ株式会社 | Mobile communication terminal |
US8103132B2 (en) * | 2008-03-31 | 2012-01-24 | International Business Machines Corporation | Fast key-in for machine-printed OCR-based systems |
US20100014124A1 (en) * | 2008-07-21 | 2010-01-21 | Kevin Deal | Portable business card scanner |
US20100104187A1 (en) * | 2008-10-24 | 2010-04-29 | Matt Broadbent | Personal navigation device and related method of adding tags to photos according to content of the photos and geographical information of where photos were taken |
US8655803B2 (en) | 2008-12-17 | 2014-02-18 | Xerox Corporation | Method of feature extraction from noisy documents |
KR200445556Y1 (ko) * | 2009-05-07 | 2009-08-11 | 김광환 | Portable electronic wallet in which business cards are stored and managed |
US8452599B2 (en) * | 2009-06-10 | 2013-05-28 | Toyota Motor Engineering & Manufacturing North America, Inc. | Method and system for extracting messages |
US9135277B2 (en) | 2009-08-07 | 2015-09-15 | Google Inc. | Architecture for responding to a visual query |
US9087059B2 (en) | 2009-08-07 | 2015-07-21 | Google Inc. | User interface for presenting search results for multiple regions of a visual query |
US8670597B2 (en) | 2009-08-07 | 2014-03-11 | Google Inc. | Facial recognition with social network aiding |
US9405772B2 (en) | 2009-12-02 | 2016-08-02 | Google Inc. | Actionable search results for street view visual queries |
US8805079B2 (en) | 2009-12-02 | 2014-08-12 | Google Inc. | Identifying matching canonical documents in response to a visual query and in accordance with geographic information |
US8811742B2 (en) | 2009-12-02 | 2014-08-19 | Google Inc. | Identifying matching canonical documents consistent with visual query structural information |
US9183224B2 (en) * | 2009-12-02 | 2015-11-10 | Google Inc. | Identifying matching canonical documents in response to a visual query |
US8977639B2 (en) | 2009-12-02 | 2015-03-10 | Google Inc. | Actionable search results for visual queries |
US9852156B2 (en) | 2009-12-03 | 2017-12-26 | Google Inc. | Hybrid use of location sensor data and visual query to return local listings for visual query |
US8509534B2 (en) * | 2010-03-10 | 2013-08-13 | Microsoft Corporation | Document page segmentation in optical character recognition |
JP2012008733A (ja) * | 2010-06-23 | 2012-01-12 | King Jim Co Ltd | Card information management device |
US8340425B2 (en) | 2010-08-10 | 2012-12-25 | Xerox Corporation | Optical character recognition with two-pass zoning |
US11610653B2 (en) * | 2010-09-01 | 2023-03-21 | Apixio, Inc. | Systems and methods for improved optical character recognition of health records |
US9874454B2 (en) * | 2011-01-13 | 2018-01-23 | Here Global B.V. | Community-based data for mapping systems |
US9418304B2 (en) | 2011-06-29 | 2016-08-16 | Qualcomm Incorporated | System and method for recognizing text information in object |
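CN102231188A (zh) * | 2011-07-05 | 2011-11-02 | 上海合合信息科技发展有限公司 | Business card recognition method combining character recognition and image matching |
CN102270296A (zh) * | 2011-07-05 | 2011-12-07 | 上海合合信息科技发展有限公司 | Method for exchanging business card information combining character recognition and image matching |
CN102393847B (zh) * | 2011-07-05 | 2013-04-17 | 上海合合信息科技发展有限公司 | Method for determining whether a business card to be added already exists in a contact list |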
US9082035B2 (en) * | 2011-08-29 | 2015-07-14 | Qualcomm Incorporated | Camera OCR with context information |
US11455350B2 (en) * | 2012-02-08 | 2022-09-27 | Thomson Reuters Enterprise Centre Gmbh | System, method, and interfaces for work product management |
JP5246364B1 (ja) * | 2012-05-18 | 2013-07-24 | 富士ゼロックス株式会社 | Information processing system and program |
US8639036B1 (en) | 2012-07-02 | 2014-01-28 | Amazon Technologies, Inc. | Product image information extraction |
US11151515B2 (en) * | 2012-07-31 | 2021-10-19 | Varonis Systems, Inc. | Email distribution list membership governance method and system |
CN104217202B (zh) * | 2013-06-03 | 2019-01-01 | 支付宝(中国)网络技术有限公司 | Information identification method, device and system |
US20150006362A1 (en) * | 2013-06-28 | 2015-01-01 | Google Inc. | Extracting card data using card art |
CN103927352A (zh) * | 2014-04-10 | 2014-07-16 | 江苏唯实科技有限公司 | Chinese business card OCR data correction system using massive associated information from a knowledge base |
RU2604668C2 (ru) * | 2014-06-17 | 2016-12-10 | Общество с ограниченной ответственностью "Аби Девелопмент" | Rendering of a machine-generated image of a document |
US20160125387A1 (en) | 2014-11-03 | 2016-05-05 | Square, Inc. | Background ocr during card data entry |
CN104915664B (zh) * | 2015-05-22 | 2021-02-09 | 腾讯科技(深圳)有限公司 | Method and device for acquiring a contact object identifier |
US20170046668A1 (en) * | 2015-08-16 | 2017-02-16 | Google Inc. | Comparing An Extracted User Name with Stored User Data |
US10157190B2 (en) * | 2016-03-28 | 2018-12-18 | Microsoft Technology Licensing, Llc | Image action based on automatic feature extraction |
US10069955B2 (en) * | 2016-04-29 | 2018-09-04 | Samuel Philip Gerace | Cloud-based contacts management |
JP6325604B2 (ja) * | 2016-06-22 | 2018-05-16 | 株式会社ランドスケイプ | Personal information registration and management system |
CN106683103A (zh) * | 2016-12-30 | 2017-05-17 | 上海云丞聚智能科技有限公司 | Method and device for acquiring questions |
US12079706B2 (en) * | 2019-04-30 | 2024-09-03 | Clari Inc. | Method for capturing and storing contact information from a physical medium using machine learning |
CN110708401A (zh) * | 2019-09-29 | 2020-01-17 | 北京百度网讯科技有限公司 | Method and device for generating a business card |
US11093774B2 (en) | 2019-12-04 | 2021-08-17 | International Business Machines Corporation | Optical character recognition error correction model |
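CN112288548A (zh) * | 2020-11-13 | 2021-01-29 | 北京沃东天骏信息技术有限公司 | Method, device, medium and electronic equipment for extracting key information of a target object |
CN113420564B (zh) * | 2021-06-21 | 2022-11-22 | 国网山东省电力公司物资公司 | Hybrid-matching-based semantic structuring method and system for electric power nameplates |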
US11487798B1 (en) * | 2022-03-30 | 2022-11-01 | Altada Technology Solutions Ltd. | Method for identifying a data segment in a data set |
US12033620B1 (en) * | 2023-09-08 | 2024-07-09 | Google Llc | Systems and methods for analyzing text extracted from images and performing appropriate transformations on the extracted text |
Family Cites Families (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2984862B2 (ja) * | 1991-05-29 | 1999-11-29 | 株式会社日立製作所 | Business card electronic filing device |
US5483052A (en) | 1993-12-07 | 1996-01-09 | Smith, Iii; Herbert J. | System for reading, storing and using bar-encoded data from a coded business card or other printed material |
US5604640A (en) | 1994-01-31 | 1997-02-18 | Motorola | Business card scanner and method of use |
US5754671A (en) * | 1995-04-12 | 1998-05-19 | Lockheed Martin Corporation | Method for improving cursive address recognition in mail pieces using adaptive data base management |
US5850480A (en) | 1996-05-30 | 1998-12-15 | Scan-Optics, Inc. | OCR error correction methods and apparatus utilizing contextual comparison |
GB9809679D0 (en) | 1998-05-06 | 1998-07-01 | Xerox Corp | Portable text capturing method and device therefor |
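JP2000090192A (ja) * | 1998-09-16 | 2000-03-31 | Sharp Corp | Method for correcting character strings of addresses and postal codes |
JP3774331B2 (ja) * | 1999-05-27 | 2006-05-10 | 株式会社Pfu | Method for recognizing a medium bearing personal information, apparatus therefor, and recording medium |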
US20010044324A1 (en) | 1999-12-13 | 2001-11-22 | Peter Carayiannis | Cellular telephone |
US7120302B1 (en) * | 2000-07-31 | 2006-10-10 | Raf Technology, Inc. | Method for improving the accuracy of character recognition processes |
US6823084B2 (en) | 2000-09-22 | 2004-11-23 | Sri International | Method and apparatus for portably recognizing text in an image sequence of scene imagery |
DE10104270A1 (de) | 2001-01-31 | 2002-08-01 | Siemens Ag | Method and system for processing information displayed on information carriers |
US20020131636A1 (en) * | 2001-03-19 | 2002-09-19 | Darwin Hou | Palm office assistants |
US6778979B2 (en) | 2001-08-13 | 2004-08-17 | Xerox Corporation | System for automatically generating queries |
US6922487B2 (en) | 2001-11-02 | 2005-07-26 | Xerox Corporation | Method and apparatus for capturing text images |
US6783060B2 (en) | 2002-05-02 | 2004-08-31 | International Business Machines Corporation | Smart business card system |
US7106905B2 (en) | 2002-08-23 | 2006-09-12 | Hewlett-Packard Development Company, L.P. | Systems and methods for processing text-based electronic documents |
US7493322B2 (en) | 2003-10-15 | 2009-02-17 | Xerox Corporation | System and method for computing a measure of similarity between documents |
JP4597644B2 (ja) * | 2003-11-28 | 2010-12-15 | シャープ株式会社 | Character recognition device, program, and recording medium |
2005
- 2005-12-12: US application US11/299,453, patent US7826665B2, status: Active
2006
- 2006-12-06: JP application JP2006329176A, patent JP4995554B2, status: Expired - Fee Related
- 2006-12-11: EP application EP06125808A, patent EP1796019A1, status: Ceased
Also Published As
Publication number | Publication date |
---|---|
US20070133874A1 (en) | 2007-06-14 |
JP2007164785A (ja) | 2007-06-28 |
EP1796019A1 (en) | 2007-06-13 |
US7826665B2 (en) | 2010-11-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4995554B2 (ja) | Personal information retrieval method using knowledge bases for optical character recognition correction | |
US10073859B2 (en) | System and methods for creation and use of a mixed media environment | |
US9357098B2 (en) | System and methods for use of voice mail and email in a mixed media environment | |
US8195659B2 (en) | Integration and use of mixed media documents | |
US8600989B2 (en) | Method and system for image matching in a mixed media environment | |
US7639387B2 (en) | Authoring tools using a mixed media environment | |
US7885955B2 (en) | Shared document annotation | |
US8335789B2 (en) | Method and system for document fingerprint matching in a mixed media environment | |
US9405751B2 (en) | Database for mixed media document system | |
US7669148B2 (en) | System and methods for portable device for mixed media system | |
US8838591B2 (en) | Embedding hot spots in electronic documents | |
US8521737B2 (en) | Method and system for multi-tier image matching in a mixed media environment | |
US7917554B2 (en) | Visibly-perceptible hot spots in documents | |
US8949287B2 (en) | Embedding hot spots in imaged documents | |
US8332401B2 (en) | Method and system for position-based image matching in a mixed media environment | |
US9171202B2 (en) | Data organization and access for mixed media document system | |
US7551780B2 (en) | System and method for using individualized mixed document | |
EP1917636A1 (en) | Method and system for image matching in a mixed media environment | |
JP2010217996A (ja) | Character recognition device, character recognition program, and character recognition method | |
EP1917637A1 (en) | Data organization and access for mixed media document system | |
US7508978B1 (en) | Detection of grooves in scanned images | |
US10165149B2 (en) | Methods and systems for automatically generating a name for an electronic document | |
JP2005055991A (ja) | Portable information terminal and character line extraction method using the same |
Legal Events
Code | Title | Description |
---|---|---|
A621 | Written request for application examination | JAPANESE INTERMEDIATE CODE: A621; Effective date: 2009-12-04 |
A977 | Report on retrieval | JAPANESE INTERMEDIATE CODE: A971007; Effective date: 2011-12-16 |
A131 | Notification of reasons for refusal | JAPANESE INTERMEDIATE CODE: A131; Effective date: 2011-12-20 |
A521 | Request for written amendment filed | JAPANESE INTERMEDIATE CODE: A523; Effective date: 2012-03-19 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01; Effective date: 2012-04-17 |
A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01 |
A61 | First payment of annual fees (during grant procedure) | JAPANESE INTERMEDIATE CODE: A61; Effective date: 2012-05-10 |
FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 2015-05-18; Year of fee payment: 3 |
R150 | Certificate of patent or registration of utility model | JAPANESE INTERMEDIATE CODE: R150 |
R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
R250 | Receipt of annual fees | JAPANESE INTERMEDIATE CODE: R250 |
LAPS | Cancellation because of no payment of annual fees | |