WO2008116843A3 - Method for word recognition in character sequences - Google Patents

Method for word recognition in character sequences

Publication number
WO2008116843A3
Authority
WO
WIPO (PCT)
Prior art keywords
grams
characters
character
list
gram
Application number
PCT/EP2008/053430
Other languages
German (de)
French (fr)
Other versions
WO2008116843A2 (en)
Inventor
Frank Deinzer
Original Assignee
Frank Deinzer
Priority date
2007-03-26
Filing date
2008-03-20
Publication date
2009-01-29
Application filed by Frank Deinzer
Priority to EP08718135A (EP2132656A2)
Publication of WO2008116843A2
Publication of WO2008116843A3

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/274 - Converting codes to words; Guess-ahead of partial word inputs

Abstract

The method according to the invention recognizes words in sequences of N characters, of which one or more characters may be ambiguous. It uses a memory (15), a display (13), and a processor device (12), the processor device (12) being connected to the memory (15) and the display (13). The memory holds n-grams (character chains of length n) together with a frequency value for each chain, the frequency value of an n-gram being the total number of all n-grams in a language sample used for word recognition. The display (13) shows selected n-grams and/or recognized words. From an examined character sequence, a list L is prepared of all n-grams with N characters that can be formed from the individual characters of the N-character sequence, taking into account the ambiguity of the characters in that sequence. All n-gram combinations with a word probability of zero are removed from the list L of possible n-gram combinations, the word probability p = ∏ p_n (n = 1 to N-1) being determined from the n-grams contained in the character sequence. The words (14) represented by the n-gram combinations remaining in the list L are displayed.
PCT/EP2008/053430 2007-03-26 2008-03-20 Method for word recognition in character sequences WO2008116843A2 (en)
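
The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes bigrams (n = 2), a phone-keypad mapping as the source of ambiguity, and a probability for each bigram equal to its count divided by the total number of bigrams in the sample. The names KEYPAD, SAMPLE, bigram_counts, word_probability, and recognize are hypothetical and do not appear in the patent record.

```python
from collections import Counter
from itertools import product

# Assumption: ambiguity comes from a phone keypad, where one key stands for several letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}


def bigram_counts(sample_words):
    """Count every 2-gram that occurs inside the words of a language sample."""
    counts = Counter()
    for word in sample_words:
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += 1
    return counts


def word_probability(candidate, counts):
    """Multiply the relative frequencies of the N-1 bigrams of an N-character candidate."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    p = 1.0
    for i in range(len(candidate) - 1):
        # A bigram never seen in the sample contributes 0 and zeroes the product.
        p *= counts[candidate[i:i + 2]] / total
    return p


def recognize(keys, counts):
    """Build the candidate list L, drop zero-probability entries, rank the rest."""
    candidates = ("".join(chars) for chars in product(*(KEYPAD[k] for k in keys)))
    scored = [(c, word_probability(c, counts)) for c in candidates]
    survivors = [(c, p) for c, p in scored if p > 0.0]
    return sorted(survivors, key=lambda cp: cp[1], reverse=True)


# Hypothetical, tiny language sample used only for illustration.
SAMPLE = ["home", "good", "gone", "hood", "come", "code"]
COUNTS = bigram_counts(SAMPLE)
RESULT = recognize("4663", COUNTS)
print([word for word, _ in RESULT])
```

Because this sketch checks only bigram statistics, a string such as "gome" survives the filter for the key sequence 4663 even though it is not in the sample; every candidate containing an unseen bigram is removed.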

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP08718135A EP2132656A2 (en) 2007-03-26 2008-03-20 Method for word recognition in character sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102007014405A DE102007014405B4 (en) 2007-03-26 2007-03-26 Method for word recognition in character sequences
DE102007014405.0 2007-03-26

Publications (2)

Publication Number Publication Date
WO2008116843A2 WO2008116843A2 (en) 2008-10-02
WO2008116843A3 WO2008116843A3 (en) 2009-01-29

Family

ID=39736022

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/053430 WO2008116843A2 (en) 2007-03-26 2008-03-20 Method for word recognition in character sequences

Country Status (3)

Country Link
EP (1) EP2132656A2 (en)
DE (1) DE102007014405B4 (en)
WO (1) WO2008116843A2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102008058271A1 (en) 2008-11-20 2010-05-27 Airbus Deutschland Gmbh Supply unit for flexible supply channels
US9424246B2 (en) 2009-03-30 2016-08-23 Touchtype Ltd. System and method for inputting text into electronic devices
US9189472B2 (en) 2009-03-30 2015-11-17 Touchtype Limited System and method for inputting text into small screen devices
GB0917753D0 (en) 2009-10-09 2009-11-25 Touchtype Ltd System and method for inputting text into electronic devices
GB0905457D0 (en) 2009-03-30 2009-05-13 Touchtype Ltd System and method for inputting text into electronic devices
GB201610984D0 (en) 2016-06-23 2016-08-10 Microsoft Technology Licensing Llc Suppression of input images

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0319193A2 (en) * 1987-11-30 1989-06-07 Bernard N. Riskin Method and apparatus for identifying words entered on DTMF pushbuttons
US5952942A (en) * 1996-11-21 1999-09-14 Motorola, Inc. Method and device for input of text messages from a keypad

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2227904C (en) 1995-07-26 2000-11-14 Tegic Communications, Inc. Reduced keyboard disambiguating system
FI974576A (en) 1997-12-19 1999-06-20 Nokia Mobile Phones Ltd A method for writing text to a mobile station and a mobile station
GB2373907B (en) 2001-03-29 2005-04-06 Nec Technologies Predictive text algorithm
US6794966B2 (en) 2002-07-01 2004-09-21 Tyco Electronics Corporation Low noise relay
US7129932B1 (en) * 2003-03-26 2006-10-31 At&T Corp. Keyboard for interacting on small devices
EP1710668A1 (en) 2005-04-04 2006-10-11 Research In Motion Limited Handheld electronic device with text disambiguation employing advanced editing feature

Also Published As

Publication number Publication date
DE102007014405B4 (en) 2010-05-27
WO2008116843A2 (en) 2008-10-02
EP2132656A2 (en) 2009-12-16
DE102007014405A1 (en) 2008-10-09

Similar Documents

Publication Publication Date Title
CN102122298B (en) Method for matching Chinese similarity
WO2008116843A3 (en) Method for word recognition in character sequences
WO2006115598A3 (en) Method and system for generating spelling suggestions
TW200707404A (en) Speech recognition assisted autocompletion of composite characters
WO2006010163A3 (en) User interface and database structure for chinese phrasal stroke and phonetic text input
WO2004059461A3 (en) Electronic dictionary with example sentences
CN103309926A (en) Chinese and English-named entity identification method and system based on conditional random field (CRF)
US20150286628A1 (en) Information extraction system, information extraction method, and information extraction program
JP7102710B2 (en) Information generation program, word extraction program, information processing device, information generation method and word extraction method
Joshi et al. Enhanced version of Punjabi stemmer using synset
CN101882006A (en) Zero-memory simple sub-character splitting input method
CA2554397A1 (en) Handheld electronic device with disambiguation of compound word text input employing separating input
CA2511714A1 (en) Adding interrogative punctuation to an electronic message
CN101436205A (en) Method and apparatus for enquiring unique word by explanation
CN105786802B (en) A kind of transliteration method and device of foreign language
Ziółko et al. Triphone statistics for Polish language
Lehal et al. Automatic Bilingual Legacy-Fonts Identification and Conversion System.
CN101246489A (en) Work input query method with analog phonetic symbol emendation function
Van Driem The creoloid origins of Chinese
Shabnam et al. A faster approach to sort unicode represented bengali words
Senapati et al. A computational approach for corpus based analysis of reduplicated words in Bengali
CN107247708A (en) A kind of Sex criminals method and system
Nay et al. Automatic Generating Vocabulary File in Myanmar Information Retrieval
Lehal et al. A transliteration based word segmentation system for Shahmukhi script
CA2584033A1 (en) Handheld electronic device and method for performing spell checking during text entry and for integrating the output from such spell checking into the output from disambiguation

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 08718135

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

WWE WIPO information: entry into national phase

Ref document number: 2008718135

Country of ref document: EP