JP5355625B2 - Method and system for preprocessing an image for optical character recognition - Google Patents
Method and system for preprocessing an image for optical character recognition
- Publication number
- JP5355625B2 (application JP2011129862A)
- Authority
- JP
- Japan
- Prior art keywords
- components
- height
- column
- word
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/414—Extracting the geometrical structure, e.g. layout tree; Block segmentation, e.g. bounding boxes for graphics or text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/28—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
- G06V30/293—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of characters other than Kanji, Hiragana or Katakana
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Input (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/814,448 | 2010-06-12 | | |
| US12/814,448 (US8218875B2) | 2010-06-12 | 2010-06-12 | Method and system for preprocessing an image for optical character recognition |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| JP2012003756A (ja) | 2012-01-05 |
| JP2012003756A5 | 2013-07-18 |
| JP5355625B2 (ja) | 2013-11-27 |
Family
ID=44654616
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2011129862A, Expired - Fee Related, JP5355625B2 (ja) | 2010-06-12 | 2011-06-10 | Method and system for preprocessing an image for optical character recognition |
Country Status (3)
| Country | Link |
|---|---|
| US (2) | US8218875B2 |
| EP (1) | EP2395453A3 |
| JP (1) | JP5355625B2 |
Families Citing this family (20)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8218875B2 (en) | 2010-06-12 | 2012-07-10 | Hussein Khalid Al-Omari | Method and system for preprocessing an image for optical character recognition |
| US8542926B2 (en) * | 2010-11-19 | 2013-09-24 | Microsoft Corporation | Script-agnostic text reflow for document images |
| US9734132B1 (en) * | 2011-12-20 | 2017-08-15 | Amazon Technologies, Inc. | Alignment and reflow of displayed character images |
| JP5994251B2 (ja) * | 2012-01-06 | 2016-09-21 | 富士ゼロックス株式会社 | Image processing apparatus and program |
| EP2836962A4 (en) * | 2012-04-12 | 2016-07-27 | Tata Consultancy Services Ltd | SYSTEM AND METHOD FOR DETECTION AND SEGMENTATION OF CHARACTERISTIC MATTERS FOR OPTICAL CHARACTER RECOGNITION (OCR) |
| EP2662802A1 (en) * | 2012-05-09 | 2013-11-13 | King Abdulaziz City for Science & Technology (KACST) | Method and system for preprocessing an image for optical character recognition |
| US9785240B2 (en) * | 2013-03-18 | 2017-10-10 | Fuji Xerox Co., Ltd. | Systems and methods for content-aware selection |
| JP5986051B2 (ja) * | 2013-05-12 | 2016-09-06 | King Abdulaziz City For Science And Technology (Kacst) | Method for automatically recognizing Arabic text |
| US20160098597A1 (en) * | 2013-06-18 | 2016-04-07 | Abbyy Development Llc | Methods and systems that generate feature symbols with associated parameters in order to convert images to electronic documents |
| US9235755B2 (en) * | 2013-08-15 | 2016-01-12 | Konica Minolta Laboratory U.S.A., Inc. | Removal of underlines and table lines in document images while preserving intersecting character strokes |
| US9292739B1 (en) * | 2013-12-12 | 2016-03-22 | A9.Com, Inc. | Automated recognition of text utilizing multiple images |
| US9288362B2 (en) | 2014-02-03 | 2016-03-15 | King Fahd University Of Petroleum And Minerals | Technique for skew detection of printed arabic documents |
| US9367766B2 (en) * | 2014-07-22 | 2016-06-14 | Adobe Systems Incorporated | Text line detection in images |
| JP2016181111A (ja) * | 2015-03-24 | 2016-10-13 | 富士ゼロックス株式会社 | Image processing apparatus and image processing program |
| CN106156766B (zh) | 2015-03-25 | 2020-02-18 | 阿里巴巴集团控股有限公司 | Method and apparatus for generating a text line classifier |
| US10430649B2 (en) | 2017-07-14 | 2019-10-01 | Adobe Inc. | Text region detection in digital images using image tag filtering |
| US11366968B2 (en) * | 2019-07-29 | 2022-06-21 | Intuit Inc. | Region proposal networks for automated bounding box detection and text segmentation |
| US11270153B2 (en) | 2020-02-19 | 2022-03-08 | Northrop Grumman Systems Corporation | System and method for whole word conversion of text in image |
| JP7528542B2 (ja) * | 2020-06-03 | 2024-08-06 | 株式会社リコー | Image processing apparatus, method, and program |
| FR3155939A1 (fr) * | 2023-11-27 | 2025-05-30 | Orange | Method for analyzing at least one image, and corresponding electronic device and computer program product |
Family Cites Families (26)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5058182A (en) * | 1988-05-02 | 1991-10-15 | The Research Foundation Of State Univ. Of New York | Method and apparatus for handwritten character recognition |
| US5224179A (en) * | 1988-12-20 | 1993-06-29 | At&T Bell Laboratories | Image skeletonization method |
| US5680479A (en) * | 1992-04-24 | 1997-10-21 | Canon Kabushiki Kaisha | Method and apparatus for character recognition |
| JP3253356B2 (ja) * | 1992-07-06 | 2002-02-04 | 株式会社リコー | Method for identifying regions of a document image |
| US5987170A (en) * | 1992-09-28 | 1999-11-16 | Matsushita Electric Industrial Co., Ltd. | Character recognition machine utilizing language processing |
| US5410611A (en) * | 1993-12-17 | 1995-04-25 | Xerox Corporation | Method for identifying word bounding boxes in text |
| CA2166248C (en) * | 1995-12-28 | 2000-01-04 | Abdel Naser Al-Karmi | Optical character recognition of handwritten or cursive text |
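| JPH11232378A (ja) * | 1997-12-09 | 1999-08-27 | Canon Inc | Digital camera, document processing system using the digital camera, computer-readable storage medium, and program code transmitting apparatus |
| JP4323606B2 (ja) * | 1999-03-01 | 2009-09-02 | 理想科学工業株式会社 | Document image skew detection device |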
| US7298903B2 (en) * | 2001-06-28 | 2007-11-20 | Microsoft Corporation | Method and system for separating text and drawings in digital ink |
| US7062090B2 (en) * | 2002-06-28 | 2006-06-13 | Microsoft Corporation | Writing guide for a free-form document editor |
| US20040096102A1 (en) * | 2002-11-18 | 2004-05-20 | Xerox Corporation | Methodology for scanned color document segmentation |
| US7499588B2 (en) * | 2004-05-20 | 2009-03-03 | Microsoft Corporation | Low resolution OCR for camera acquired documents |
| US8139828B2 (en) * | 2005-10-21 | 2012-03-20 | Carestream Health, Inc. | Method for enhanced visualization of medical images |
| JP4757001B2 (ja) * | 2005-11-25 | 2011-08-24 | キヤノン株式会社 | Image processing apparatus and image processing method |
| US7668394B2 (en) * | 2005-12-21 | 2010-02-23 | Lexmark International, Inc. | Background intensity correction of a scan of a document |
| US7724957B2 (en) * | 2006-07-31 | 2010-05-25 | Microsoft Corporation | Two tiered text recognition |
| JP4988842B2 (ja) * | 2007-06-28 | 2012-08-01 | 富士通株式会社 | Table data generation program, table data generation method, and table data generation apparatus |
| US20110043869A1 (en) * | 2007-12-21 | 2011-02-24 | Nec Corporation | Information processing system, its method and program |
| US8027539B2 (en) * | 2008-01-11 | 2011-09-27 | Sharp Laboratories Of America, Inc. | Method and apparatus for determining an orientation of a document including Korean characters |
| US8009928B1 (en) * | 2008-01-23 | 2011-08-30 | A9.Com, Inc. | Method and system for detecting and recognizing text in images |
| US8150160B2 (en) * | 2009-03-26 | 2012-04-03 | King Fahd University Of Petroleum & Minerals | Automatic Arabic text image optical character recognition method |
| TWI394098B (zh) * | 2009-06-03 | 2013-04-21 | Nat Univ Chung Cheng | Shredding Method Based on File Image Texture Feature |
| US8086039B2 (en) * | 2010-02-05 | 2011-12-27 | Palo Alto Research Center Incorporated | Fine-grained visual document fingerprinting for accurate document comparison and retrieval |
| US20110280481A1 (en) * | 2010-05-17 | 2011-11-17 | Microsoft Corporation | User correction of errors arising in a textual document undergoing optical character recognition (ocr) process |
| US8218875B2 (en) | 2010-06-12 | 2012-07-10 | Hussein Khalid Al-Omari | Method and system for preprocessing an image for optical character recognition |
-
2010
- 2010-06-12: US application US12/814,448, patent US8218875B2, not active (Expired - Fee Related)
- 2010-12-28: EP application EP10197110.9A, patent EP2395453A3, not active (Withdrawn)
-
2011
- 2011-06-10: JP application JP2011129862A, patent JP5355625B2, not active (Expired - Fee Related)
-
2012
- 2012-05-09: US application US13/467,873, patent US8548246B2, not active (Expired - Fee Related)
Also Published As
| Publication number | Publication date |
|---|---|
| JP2012003756A (ja) | 2012-01-05 |
| US8548246B2 (en) | 2013-10-01 |
| US20110305387A1 (en) | 2011-12-15 |
| EP2395453A2 (en) | 2011-12-14 |
| US20120219220A1 (en) | 2012-08-30 |
| EP2395453A3 (en) | 2013-08-28 |
| US8218875B2 (en) | 2012-07-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP5355625B2 (ja) | | Method and system for preprocessing an image for optical character recognition |
| JP5355621B2 (ja) | | Method and system for preprocessing an image for optical character recognition |
| US8571270B2 (en) | | Segmentation of a word bitmap into individual characters or glyphs during an OCR process |
| CN113486828A (zh) | | Image processing method, apparatus, device, and storage medium |
| Dongre et al. | | Devnagari document segmentation using histogram approach |
| US20030012438A1 (en) | | Multiple size reductions for image segmentation |
| JPH0721319A (ja) | | Automatic Asian language determination device |
| CN109598185B (zh) | | Image recognition and translation method, apparatus, device, and readable storage medium |
| Shehu et al. | | Character recognition using correlation & hamming distance |
| KR101571681B1 (ko) | | Method for analyzing document structure using homogeneous regions |
| Jindal et al. | | A new method for segmentation of pre-detected Devanagari words from the scene images: Pihu method |
| JP2013097561A (ja) | | Inter-word space detection device, inter-word space detection method, and computer program for inter-word space detection |
| Kshetry | | Image preprocessing and modified adaptive thresholding for improving OCR |
| CN102542269B (zh) | | Method and apparatus for segmenting Western-language words |
| JP6082306B2 (ja) | | Method and system for preprocessing an image for optical character recognition |
| JP3058489B2 (ja) | | Character string extraction method |
| Roy et al. | | An approach towards segmentation of real time handwritten text |
| Siddique et al. | | An absolute Optical Character Recognition system for Bangla script Utilizing a captured image |
| CN117710985B (zh) | | Optical character recognition method, apparatus, and smart terminal |
| JP2004046528A (ja) | | Document orientation estimation method and document orientation estimation program |
| Zaw et al. | | Segmentation Method for Myanmar Character Recognition Using Block based Pixel Count and Aspect Ratio |
| Ajodani et al. | | Line Segmentation in Persian Texts in Double Columns Using Hierarchical Clustering Algorithms |
| Kuhl et al. | | Model-based character recognition in low resolution |
| Deivalakshmi | | A simple system for table extraction irrespective of boundary thickness and removal of detected spurious lines |
| Siddique et al. | | An absolute Optical Character Recognition system for Bangla script from a captured image |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20130530 |
| | A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20130530 |
| | A871 | Explanation of circumstances concerning accelerated examination | Free format text: JAPANESE INTERMEDIATE CODE: A871; Effective date: 20130530 |
| | A975 | Report on accelerated examination | Free format text: JAPANESE INTERMEDIATE CODE: A971005; Effective date: 20130619 |
| | A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20130625 |
| | A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20130722 |
| | TRDD | Decision of grant or rejection written | |
| | A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20130820 |
| | A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20130827 |
| | R150 | Certificate of patent or registration of utility model | Free format text: JAPANESE INTERMEDIATE CODE: R150 |
| | R250 | Receipt of annual fees | Free format text: JAPANESE INTERMEDIATE CODE: R250 |
| | LAPS | Cancellation because of no payment of annual fees | |