DE69428475T2 - Method and apparatus for automatic language determination - Google Patents

Method and apparatus for automatic language determination

Info

Publication number
DE69428475T2
Authority
DE
Germany
Prior art keywords
script
determining
language
feature
connected component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
DE69428475T
Other languages
German (de)
English (en)
Other versions
DE69428475D1 (de)
Inventor
A. Lawrence Spitz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujifilm Business Innovation Corp
Xerox Corp
Original Assignee
Fuji Xerox Co Ltd
Xerox Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuji Xerox Co Ltd, Xerox Corp
Publication of DE69428475D1
Application granted
Publication of DE69428475T2
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/263 Language identification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/24 Character recognition characterised by the processing or recognition method
    • G06V30/242 Division of the character sequences into groups prior to recognition; Selection of dictionaries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/28 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/287 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)
DE69428475T 1993-04-19 1994-04-18 Method and apparatus for automatic language determination Expired - Lifetime DE69428475T2 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/047,673 US5425110A (en) 1993-04-19 1993-04-19 Method and apparatus for automatic language determination of Asian language documents

Publications (2)

Publication Number Publication Date
DE69428475D1 (de) 2001-11-08
DE69428475T2 (de) 2002-05-08

Family

ID=21950309

Family Applications (1)

Application Number Title Priority Date Filing Date
DE69428475T Expired - Lifetime DE69428475T2 (de) 1993-04-19 1994-04-18 Method and apparatus for automatic language determination

Country Status (6)

Country Link
US (1) US5425110A
EP (1) EP0621541B1
JP (1) JPH0721319A
KR (1) KR960015594B1
DE (1) DE69428475T2
TW (1) TW256905B

Families Citing this family (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5555556A (en) * 1994-09-30 1996-09-10 Xerox Corporation Method and apparatus for document segmentation by background analysis
US5999706A (en) * 1997-04-28 1999-12-07 Pitney Bowes, Inc. Method and apparatus for substituting a 2-byte font character standard in a printer
US5909510A (en) * 1997-05-19 1999-06-01 Xerox Corporation Method and apparatus for document classification from degraded images
US6005986A (en) * 1997-12-03 1999-12-21 The United States Of America As Represented By The National Security Agency Method of identifying the script of a document irrespective of orientation
US8855998B2 (en) 1998-03-25 2014-10-07 International Business Machines Corporation Parsing culturally diverse names
US8812300B2 (en) 1998-03-25 2014-08-19 International Business Machines Corporation Identifying related names
US6963871B1 (en) 1998-03-25 2005-11-08 Language Analysis Systems, Inc. System and method for adaptive multi-cultural searching and matching of personal names
US6292772B1 (en) 1998-12-01 2001-09-18 Justsystem Corporation Method for identifying the language of individual words
US6889147B2 (en) * 2002-09-17 2005-05-03 Hydrogenics Corporation System, computer program product and method for controlling a fuel cell testing device
US7218779B2 (en) * 2003-01-21 2007-05-15 Microsoft Corporation Ink divider and associated application program interface
EP1613972A1 (en) * 2003-04-17 2006-01-11 Hydrogenics Corporation Alarm recovery system and method for fuel cell testing systems
US20040229954A1 (en) * 2003-05-16 2004-11-18 Macdougall Diane Elaine Selective manipulation of triglyceride, HDL and LDL parameters with 6-(5-carboxy-5-methyl-hexyloxy)-2,2-dimethylhexanoic acid monocalcium salt
US7353085B2 (en) * 2003-09-22 2008-04-01 Hydrogenics Corporation Electrolyzer cell stack system
US20070005586A1 (en) * 2004-03-30 2007-01-04 Shaefer Leonard A Jr Parsing culturally diverse names
US7986307B2 (en) * 2005-04-22 2011-07-26 Microsoft Corporation Mechanism for allowing applications to filter out or opt into tablet input
US20060267958A1 (en) * 2005-04-22 2006-11-30 Microsoft Corporation Touch Input Programmatical Interfaces
US7928964B2 (en) * 2005-04-22 2011-04-19 Microsoft Corporation Touch input data handling
US7702699B2 (en) * 2006-05-31 2010-04-20 Oracle America, Inc. Dynamic data stream histograms for large ranges
CN100440250C (zh) * 2007-03-09 2008-12-03 清华大学 Method for recognizing printed Mongolian characters
US9141607B1 (en) * 2007-05-30 2015-09-22 Google Inc. Determining optical character recognition parameters
US8340430B2 (en) * 2007-07-10 2012-12-25 Sharp Laboratories Of America, Inc. Methods and systems for identifying digital image characteristics
EP2120130A1 (en) * 2008-05-11 2009-11-18 Research in Motion Limited Mobile electronic device and associated method enabling identification of previously entered data for transliteration of an input
US8160365B2 (en) * 2008-06-30 2012-04-17 Sharp Laboratories Of America, Inc. Methods and systems for identifying digital image characteristics
US8744171B1 (en) * 2009-04-29 2014-06-03 Google Inc. Text script and orientation recognition
US8326602B2 (en) * 2009-06-05 2012-12-04 Google Inc. Detecting writing systems and languages
US8468011B1 (en) 2009-06-05 2013-06-18 Google Inc. Detecting writing systems and languages
RU2613847C2 (ru) 2013-12-20 2017-03-21 Abbyy Development LLC Detection of Chinese, Japanese, and Korean writing
RU2648638C2 (ru) 2014-01-30 2018-03-26 Abbyy Development LLC Methods and systems for efficient automatic symbol recognition using multiple clusters of symbol patterns
RU2640322C2 (ru) 2014-01-30 2017-12-27 Abbyy Development LLC Methods and systems for efficient automatic symbol recognition
US20150269135A1 (en) * 2014-03-19 2015-09-24 Qualcomm Incorporated Language identification for text in an object image
US9589185B2 (en) 2014-12-10 2017-03-07 Abbyy Development Llc Symbol recognition using decision forests
US20170068868A1 (en) * 2015-09-09 2017-03-09 Google Inc. Enhancing handwriting recognition using pre-filter classification
US10431203B2 (en) 2017-09-05 2019-10-01 International Business Machines Corporation Machine training for native language and fluency identification

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3755780A (en) * 1971-06-28 1973-08-28 Pattern Analysis & Recognition Method for recognizing characters
JPS5837779A (ja) * 1981-08-31 1983-03-05 Ricoh Co Ltd Document processing device
JPS5960574A (ja) * 1982-09-30 1984-04-06 Fujitsu Ltd Character recognition system
US4817186A (en) * 1983-01-07 1989-03-28 International Business Machines Corporation Locating individual images in a field for recognition or the like
US5062143A (en) * 1990-02-23 1991-10-29 Harris Corporation Trigram-based method of language identification
US5181259A (en) * 1990-09-25 1993-01-19 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration General method of pattern classification using the two domain theory
US5253307A (en) * 1991-07-30 1993-10-12 Xerox Corporation Image analysis to obtain typeface information
JPH0540846A (ja) * 1991-08-06 1993-02-19 Oki Electric Ind Co Ltd Method for discriminating Japanese and European text in document images

Also Published As

Publication number Publication date
EP0621541A3 (en) 1995-05-17
DE69428475D1 (de) 2001-11-08
EP0621541A2 (en) 1994-10-26
EP0621541B1 (en) 2001-10-04
JPH0721319A (ja) 1995-01-24
US5425110A (en) 1995-06-13
KR940024627A (ko) 1994-11-18
TW256905B 1995-09-11
KR960015594B1 (ko) 1996-11-18

Similar Documents

Publication Publication Date Title
DE69428475T2 (de) Method and apparatus for automatic language determination
DE69423926T2 (de) Method and apparatus for automatic script determination
DE69425084T2 (de) Method and apparatus for detecting text lines, words, and spatial features of character cells
DE69423254T2 (de) Method and apparatus for automatic language determination of documents
DE69516751T2 (de) Image preprocessing for a character recognition system
DE69519323T2 (de) System for page segmentation and character recognition
DE3881392T2 (de) System and method for automatic segmentation.
DE2801536C2 (de) Character shape coding device
DE69432585T2 (de) Method and apparatus for selecting text and/or non-text blocks in a stored document
DE69523970T2 (de) Document storage and retrieval system
DE69724755T2 (de) Locating titles and photos in scanned document images
DE4311172C2 (de) Method and device for identifying a skew angle of a document image
DE69532847T2 (de) Page analysis system
DE69226846T2 (de) Method for determining word boundaries in text
DE69715076T2 (de) Device for generating a binary image
DE69132206T2 (de) Method and apparatus for image processing
DE3633743C2
EP0040796B1 (de) Method for automatically classifying image and text or graphics areas on print originals
EP1665132B1 (de) Method and system for capturing data from multiple machine-readable documents
DE69728546T2 (de) Automated image quality analysis and enhancement when scanning and reproducing document originals
DE69033042T2 (de) Data processing
DE69421117T2 (de) Apparatus for image information processing and reproduction
DE19953608B4 (de) Device and method for recognizing the font of a text in a document processing system
DE3523042C2
DE69230127T2 (de) Diagram recognition system

Legal Events

Date Code Title Description
8364 No opposition during term of opposition