DE69718243T2 - System for extracting attached text from a table cell frame - Google Patents

System for extracting attached text from a table cell frame

Info

Publication number
DE69718243T2
Authority
DE
Germany
Prior art keywords
component
unconnected
connected components
components
predetermined threshold
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
DE69718243T
Other languages
German (de)
English (en)
Other versions
DE69718243D1 (de)
Inventor
Wang Shin-Ywan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc
Application granted
Publication of DE69718243D1
Publication of DE69718243T2
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/155 Removing patterns interfering with the pattern to be recognised, such as ruled lines or underlines
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)
  • Processing Or Creating Images (AREA)
DE69718243T 1996-06-17 1997-06-11 System for extracting attached text from a table cell frame Expired - Lifetime DE69718243T2 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/664,675 US6157738A (en) 1996-06-17 1996-06-17 System for extracting attached text

Publications (2)

Publication Number Publication Date
DE69718243D1 (de) 2003-02-13
DE69718243T2 (de) 2003-08-28

Family

ID=24666972

Family Applications (1)

Application Number Title Priority Date Filing Date
DE69718243T Expired - Lifetime DE69718243T2 (de) 1996-06-17 1997-06-11 System for extracting attached text from a table cell frame

Country Status (4)

Country Link
US (1) US6157738A (en)
EP (1) EP0814422B1 (en)
JP (1) JP4077904B2 (en)
DE (1) DE69718243T2 (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6112216A (en) * 1997-12-19 2000-08-29 Microsoft Corporation Method and system for editing a table in a document
US6330357B1 (en) * 1999-04-07 2001-12-11 Raf Technology, Inc. Extracting user data from a scanned image of a pre-printed form
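JP3204259B2 (ja) * 1999-10-06 2001-09-04 インターナショナル・ビジネス・マシーンズ・コーポレーション Character string extraction method, handwritten character string extraction method, character string extraction device, and image processing device
JP3425408B2 (ja) * 2000-05-31 2003-07-14 株式会社東芝 Document reading device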
EP1271403B1 (en) * 2001-06-26 2005-03-09 Nokia Corporation Method and device for character location in images from digital camera
JP2004088585A (ja) * 2002-08-28 2004-03-18 Fuji Xerox Co Ltd Image processing system and method therefor
JP4897520B2 (ja) * 2006-03-20 2012-03-14 株式会社リコー Information distribution system
US20070253615A1 (en) * 2006-04-26 2007-11-01 Yuan-Hsiang Chang Method and system for banknote recognition
US8331680B2 (en) * 2008-06-23 2012-12-11 International Business Machines Corporation Method of gray-level optical segmentation and isolation using incremental connected components
CN102314608A (zh) * 2010-06-30 2012-01-11 汉王科技股份有限公司 Method and device for extracting lines from a text image
US20130163871A1 (en) * 2011-12-22 2013-06-27 General Electric Company System and method for segmenting image data to identify a character-of-interest
US9842281B2 (en) * 2014-06-05 2017-12-12 Xerox Corporation System for automated text and halftone segmentation
US20160055376A1 (en) * 2014-06-21 2016-02-25 iQG DBA iQGATEWAY LLC Method and system for identification and extraction of data from structured documents
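CN104268545B (zh) * 2014-09-15 2017-09-29 同方知网(北京)技术有限公司 Method for table region recognition and content rasterization in electronic fixed-layout documents
JP6173542B1 (ja) * 2016-08-10 2017-08-02 株式会社Pfu Image processing device, image processing method, and program
CN115240214A (zh) * 2021-04-09 2022-10-25 华南理工大学广州学院 Table structure recognition method
CN113221778B (zh) * 2021-05-19 2022-05-10 北京航空航天大学杭州创新研究院 Method and device for detecting and recognizing handwritten tables
CN113901950A (zh) * 2021-11-05 2022-01-07 上海派拉软件股份有限公司 High-accuracy table OCR recognition method and system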

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4377803A (en) * 1980-07-02 1983-03-22 International Business Machines Corporation Algorithm for the segmentation of printed fixed pitch documents
JPS63268081A (ja) * 1987-04-17 1988-11-04 インタ−ナショナル・ビジネス・マシ−ンズ・コ−ポレ−ション Method and apparatus for recognizing characters of a document
US5588072A (en) * 1993-12-22 1996-12-24 Canon Kabushiki Kaisha Method and apparatus for selecting blocks of image data from image data having both horizontally- and vertically-oriented blocks
US5848186A (en) * 1995-08-11 1998-12-08 Canon Kabushiki Kaisha Feature extraction system for identifying text within a table image

Also Published As

Publication number Publication date
EP0814422A3 (en) 1998-01-28
US6157738A (en) 2000-12-05
EP0814422A2 (en) 1997-12-29
EP0814422B1 (en) 2003-01-08
DE69718243D1 (de) 2003-02-13
JPH1083431A (ja) 1998-03-31
JP4077904B2 (ja) 2008-04-23

Similar Documents

Publication Publication Date Title
DE69718243T2 (de) System for extracting attached text from a table cell frame
DE69610882T2 (de) Block selection system in which overlapping blocks are split
DE69332459T2 (de) Method and apparatus for character recognition
DE69724557T2 (de) Document analysis
DE69619606T2 (de) Feature extraction system
DE60120810T2 (de) Method for document recognition and indexing
DE69724755T2 (de) Locating titles and photos in scanned document images
DE69033079T2 (de) Processing of text in an image
DE69525401T2 (de) Method and apparatus for identifying words described in a portable electronic document
DE68922772T2 (de) Method for character string determination.
DE69432585T2 (de) Method and apparatus for selecting text and/or non-text blocks in a stored document
DE3722444C2 (de) Method and apparatus for generating design pattern data
DE68926068T2 (de) Document processing system
DE69425084T2 (de) Method and apparatus for recognizing text lines, words, and spatial features of character cells
DE69429962T2 (de) Image processing apparatus and method
DE60129872T2 (de) Method for extracting titles from digital images
DE69428082T2 (de) Method for detecting financial amounts in binary images
DE69523970T2 (de) Document storage and retrieval system
DE69610478T2 (de) Character recognition system identification of scanned and "real-time" handwritten characters
DE69520123T2 (de) Handwriting recognition system
DE3851867T2 (de) Character recognition device.
DE69605255T2 (de) Apparatus and method for extracting articles from a document
DE69428475T2 (de) Method and apparatus for automatic language recognition
DE69506610T2 (de) Programmable function keys for a networked personal imaging computer
DE69530025T2 (de) Editing scanned image documents using simple interpretations

Legal Events

Date Code Title Description
8364 No opposition during term of opposition