CN102236800B - Word recognition of text undergoing an OCR process - Google Patents
- Publication number
- CN102236800B (application CN201110117322.0A)
- Authority
- CN
- China
- Prior art keywords
- word
- confidence level
- segmentation lines
- data element
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Character Discrimination (AREA)
Claims (16)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/772,376 | 2010-05-03 | | |
US12/772,376 (US8401293B2, en) | 2010-05-03 | 2010-05-03 | Word recognition of text undergoing an OCR process |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102236800A (zh) | 2011-11-09 |
CN102236800B (zh) | 2015-12-02 |
Family
ID=44858306
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110117322.0A (CN102236800B, zh; Expired - Fee Related) | 2010-05-03 | 2011-04-29 | Word recognition of text undergoing an OCR process |
Country Status (2)
Country | Link |
---|---|
US (1) | US8401293B2 (zh) |
CN (1) | CN102236800B (zh) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11610653B2 (en) * | 2010-09-01 | 2023-03-21 | Apixio, Inc. | Systems and methods for improved optical character recognition of health records |
US8792748B2 (en) * | 2010-10-12 | 2014-07-29 | International Business Machines Corporation | Deconvolution of digital images |
US9620122B2 (en) * | 2011-12-08 | 2017-04-11 | Lenovo (Singapore) Pte. Ltd | Hybrid speech recognition |
US9105073B2 (en) * | 2012-04-24 | 2015-08-11 | Amadeus S.A.S. | Method and system of producing an interactive version of a plan or the like |
CN103455814B (zh) * | 2012-05-31 | 2017-04-12 | Canon Kabushiki Kaisha | Text line segmentation method and system for document images |
US9049295B1 (en) * | 2012-08-28 | 2015-06-02 | West Corporation | Intelligent interactive voice response system for processing customer communications |
US9098777B2 (en) * | 2012-09-06 | 2015-08-04 | Xerox Corporation | Method and system for evaluating handwritten documents |
CN104077593A (zh) * | 2013-03-27 | 2014-10-01 | Fujitsu Ltd. | Image processing method and apparatus |
US9275554B2 (en) | 2013-09-24 | 2016-03-01 | Jimmy M Sauz | Device, system, and method for enhanced memorization of a document |
CN107092903A (zh) * | 2016-02-18 | 2017-08-25 | Alibaba Group Holding Ltd. | Information recognition method and apparatus |
US10646813B2 (en) * | 2016-09-23 | 2020-05-12 | Lehigh University | Gas separation apparatus and methods using same |
US10062001B2 (en) * | 2016-09-29 | 2018-08-28 | Konica Minolta Laboratory U.S.A., Inc. | Method for line and word segmentation for handwritten text images |
US10607606B2 (en) | 2017-06-19 | 2020-03-31 | Lenovo (Singapore) Pte. Ltd. | Systems and methods for execution of digital assistant |
US10482344B2 (en) | 2018-01-04 | 2019-11-19 | Wipro Limited | System and method for performing optical character recognition |
GB2571530B (en) * | 2018-02-28 | 2020-09-23 | Canon Europa Nv | An image processing method and an image processing system |
JP7338158B2 (ja) * | 2019-01-24 | 2023-09-05 | FUJIFILM Business Innovation Corp. | Information processing apparatus and program |
US11270153B2 (en) | 2020-02-19 | 2022-03-08 | Northrop Grumman Systems Corporation | System and method for whole word conversion of text in image |
KR20210106814A (ko) * | 2020-02-21 | 2021-08-31 | Samsung Electronics Co., Ltd. | Neural network training method and apparatus |
CN111723811A (zh) * | 2020-05-20 | 2020-09-29 | 上海积跬教育科技有限公司 | Method, apparatus, medium, and electronic device for character recognition and processing |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1979529A (zh) * | 2005-12-09 | 2007-06-13 | Canon Kabushiki Kaisha | Optical character recognition |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5577135A (en) | 1994-03-01 | 1996-11-19 | Apple Computer, Inc. | Handwriting signal processing front-end for handwriting recognizers |
US6154579A (en) | 1997-08-11 | 2000-11-28 | At&T Corp. | Confusion matrix based method and system for correcting misrecognized words appearing in documents generated by an optical character recognition technique |
US6108444A (en) | 1997-09-29 | 2000-08-22 | Xerox Corporation | Method of grouping handwritten word segments in handwritten document images |
US6993205B1 (en) | 2000-04-12 | 2006-01-31 | International Business Machines Corporation | Automatic method of detection of incorrectly oriented text blocks using results from character recognition |
US6944340B1 (en) | 2000-08-07 | 2005-09-13 | Canon Kabushiki Kaisha | Method and apparatus for efficient determination of recognition parameters |
US7171061B2 (en) | 2002-07-12 | 2007-01-30 | Xerox Corporation | Systems and methods for triage of passages of text output from an OCR system |
US7499588B2 (en) | 2004-05-20 | 2009-03-03 | Microsoft Corporation | Low resolution OCR for camera acquired documents |
US7724957B2 (en) * | 2006-07-31 | 2010-05-25 | Microsoft Corporation | Two tiered text recognition |
US8611661B2 (en) | 2007-12-26 | 2013-12-17 | Intel Corporation | OCR multi-resolution method and apparatus |
US8571270B2 (en) * | 2010-05-10 | 2013-10-29 | Microsoft Corporation | Segmentation of a word bitmap into individual characters or glyphs during an OCR process |
2010
- 2010-05-03: US application US12/772,376 (patent US8401293B2), not active, Expired - Fee Related
2011
- 2011-04-29: CN application CN201110117322.0A (patent CN102236800B), not active, Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
US8401293B2 (en) | 2013-03-19 |
CN102236800A (zh) | 2011-11-09 |
US20110268360A1 (en) | 2011-11-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
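CN102236800B (zh) | Word recognition of text undergoing an OCR process | |
CN102982330B (zh) | Method and apparatus for recognizing characters in text images | |
CN102289667B (zh) | User correction of errors arising in a text document undergoing an optical character recognition (OCR) process | |
CN105229669B (zh) | Image processing apparatus and image processing method | |
CN102782702B (zh) | Paragraph recognition in an optical character recognition (OCR) process | |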
US8340425B2 (en) | Optical character recognition with two-pass zoning | |
US9189694B2 (en) | Image processing device and image processing method | |
US9152883B2 (en) | System and method for increasing the accuracy of optical character recognition (OCR) | |
US9098759B2 (en) | Image processing apparatus, method, and medium for character recognition | |
CN112508011A (zh) | Neural-network-based OCR recognition method and device | |
US20090317003A1 (en) | Correcting segmentation errors in ocr | |
US11521365B2 (en) | Image processing system, image processing apparatus, image processing method, and storage medium | |
US7406201B2 (en) | Correcting segmentation errors in OCR | |
US11949828B2 (en) | Information processing apparatus, information processing system, and non-transitory computer readable medium for performing preprocessing and character recognition to acquire item and value of image | |
Fateh et al. | Enhancing optical character recognition: Efficient techniques for document layout analysis and text line detection | |
CN102467664B (zh) | Method and apparatus for assisting optical character recognition | |
US20190073571A1 (en) | Method for improving quality of recognition of a single frame | |
Kumar et al. | Line based robust script identification for Indian languages | |
US20210019554A1 (en) | Information processing device and information processing method | |
CN109409370B (zh) | Remote desktop character recognition method and apparatus | |
CN114419626B (zh) | High-precision document recognition method and system based on OCR technology | |
Boiangiu et al. | Efficient solutions for ocr text remote correction in content conversion systems | |
CN113052179B (zh) | Polyphonic character processing method and apparatus, electronic device, and storage medium | |
Bagoriya et al. | Font type identification of hindi printed document | |
CN115100672A (zh) | Text detection and recognition method, apparatus, device, and computer-readable storage medium |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
ASS | Succession or assignment of patent right | Owner name: MICROSOFT TECHNOLOGY LICENSING LLC; Free format text: FORMER OWNER: MICROSOFT CORP.; Effective date: 20150717 |
C41 | Transfer of patent application or patent right or utility model | |
TA01 | Transfer of patent application right | Effective date of registration: 20150717; Address after: Washington State; Applicant after: Microsoft Technology Licensing LLC; Address before: Washington State; Applicant before: Microsoft Corp. |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20151202; Termination date: 20210429 |