WO2005001711A3 - Method, computer device and computer program for assistance in adding vowels to words in Arabic

Info

Publication number
WO2005001711A3
Authority
WO
WIPO (PCT)
Prior art keywords
words
dictionary
vowels
arabic
word
Prior art date
Application number
PCT/FR2004/001603
Other languages
French (fr)
Other versions
WO2005001711A2 (en)
Inventor
Fathi Debili
Original Assignee
Centre Nat Recherche
Ecole Normale Superieure Lettr
Fathi Debili
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-06-25
Filing date
2004-06-24
Publication date
2005-05-26
Application filed by Centre Nat Recherche, Ecole Normale Superieure Lettr, Fathi Debili
Publication of WO2005001711A2
Publication of WO2005001711A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/274 Converting codes to words; Guess-ahead of partial word inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/53 Processing of non-Latin text

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a computer-assisted method of adding vowels to an Arabic text. The method uses a first dictionary (D1) containing words without vowels and a second dictionary (D2) containing groups of one or more words with vowels, each group being stored in a memory element and associated with one of the words without vowels. For a common word without vowels, the method comprises: comparing the character string forming the common word with a character string stored in the first dictionary; and extracting from the second dictionary the group of possible vowelled words corresponding to the word identified in the first dictionary.
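The two-dictionary lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the names `D1`, `D2`, `strip_vowels` and `vowelled_candidates`, the index-based link between the dictionaries, and the sample entries are all assumptions made for the example.

```python
# Illustrative sketch only: names, data layout and sample data are assumptions.

# Arabic vowel and related marks (tanwin, fatha, damma, kasra, shadda, sukun):
# Unicode code points U+064B through U+0652.
VOWEL_MARKS = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_vowels(word: str) -> str:
    """Return the word with its vowel marks removed."""
    return "".join(ch for ch in word if ch not in VOWEL_MARKS)


# Sample unvowelled word: kaf + ta + ba ("ktb").
KTB = "\u0643\u062A\u0628"

# D2: groups of vowelled words; each inner list stands in for one "memory element".
D2 = [
    [
        "\u0643\u064E\u062A\u064E\u0628\u064E",  # kataba
        "\u0643\u064F\u062A\u0650\u0628\u064E",  # kutiba
        "\u0643\u064F\u062A\u064F\u0628",        # kutub
    ],
]

# D1: unvowelled words, each mapped to the index of its group in D2 (assumed linkage).
D1 = {KTB: 0}


def vowelled_candidates(word: str) -> list[str]:
    """Compare the word with the D1 entries and return its D2 group, if any."""
    key = strip_vowels(word)       # tolerate partially vowelled input
    group_index = D1.get(key)      # string comparison against the first dictionary
    if group_index is None:
        return []                  # word not present in D1
    return list(D2[group_index])   # possible vowelled forms from the second dictionary
```

Calling `vowelled_candidates(KTB)` returns the three sample vowelled forms, from which a user or a later disambiguation step would pick one; a word absent from `D1` yields an empty list.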
Application PCT/FR2004/001603, priority date 2003-06-25, filing date 2004-06-24: Method, computer device and computer program for assistance in adding vowels to words in Arabic, WO2005001711A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0307665A FR2856816B1 (en) 2003-06-25 2003-06-25 METHOD, COMPUTER DEVICE AND COMPUTER PROGRAM FOR AIDING VOWELISATION OF WORDS IN ARABIC LANGUAGE
FR03/07665 2003-06-25

Publications (2)

Publication Number Publication Date
WO2005001711A2 (en) 2005-01-06
WO2005001711A3 (en) 2005-05-26

Family

ID=33515391

Family Applications (1)

Application Number Priority Date Filing Date Title
PCT/FR2004/001603 WO2005001711A2 (en) 2003-06-25 2004-06-24 Method, computer device and computer program for assistance in adding vowels to words in Arabic

Country Status (2)

Country Link
FR (1) FR2856816B1 (en)
WO (1) WO2005001711A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113011135B (en) * 2021-03-03 2024-08-23 科大讯飞股份有限公司 Arabic vowel recovery method, apparatus, device and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
US4760528A (en) * 1985-09-18 1988-07-26 Levin Leonid D Method for entering text using abbreviated word forms
US4858170A (en) * 1986-10-24 1989-08-15 Dewick Sr Robert S Shorthand notation and transcribing method

Non-Patent Citations (3)

Title
"ABBREVIATED TYPING FOR WORD PROCESSING", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 21, no. 9, February 1979 (1979-02-01), Armonk, NY, US, pages 3796 - 3797, XP002933207, ISSN: 0018-8689 *
DEBILI ET AL: "Voyellation automatique de l'arabe", COMPUTATIONAL APPROACHES TO SEMITIC LANGUAGES - PROCEEDINGS OF THE WORKSHOP, 16 August 1998 (1998-08-16), Montreal, Quebec, CA, XP002280197, Retrieved from the Internet <URL:http://acl.ldc.upenn.edu/W/W98/W98-1006.pdf> [retrieved on 20040513] *
HOWELL ET AL: "MESSAGE COMPRESSION WITH HUMAN-READABLE ABBREVIATIONS", IBM TECHNICAL DISCLOSURE BULLETIN, vol. 25, no. 2, July 1982 (1982-07-01), Armonk, NY, US, pages 678 - 682, XP000714026 *

Also Published As

Publication number Publication date
FR2856816A1 (en) 2004-12-31
WO2005001711A2 (en) 2005-01-06
FR2856816B1 (en) 2008-07-04

Similar Documents

Publication Publication Date Title
Beeston The Arabic language today
Buckwalter Issues in Arabic orthography and morphology analysis
US8041559B2 (en) System and method for disambiguating non diacritized arabic words in a text
WO2005059672A3 (en) Communication device and method for inputting and predicting text
Bakr et al. A hybrid approach for converting written Egyptian colloquial dialect into diacritized Arabic
WO2004042697A3 (en) Multi-lingual speech recognition with cross-language context modeling
JP2006512629A (en) Systems, methods, program products, and network uses for recognizing words and their parts of speech in one or more natural languages
EP1577793A3 (en) Systems and methods for spell checking
BR0214042A (en) Method for preprocessing a pronunciation dictionary for compression into a data processing device, electronic device for converting a text string input into a sequence of phoneme units, electronic device configured to convert voice information input into a sequence of character units, system comprising first and second electronic devices, and, computer program
WO2004059461A3 (en) Electronic dictionary with example sentences
WO2006124853A3 (en) System and method for censoring randomly generated character strings
WO2005001711A3 (en) Method, computer device and computer program for assistance in adding vowels to words in Arabic
Khorsheed A HMM-based system to diacritize Arabic text
Zayyan et al. Automatic diacritics restoration for modern standard Arabic text
KR100946227B1 (en) Method of Converting Korean Spelling to Roman Spelling Using Computer and Computer Memory Device Recording Computer Program Performing the Method
Bouma et al. Syllabification in Middle Dutch
Wang et al. Rule-based korean grapheme to phoneme conversion using sound patterns
JP2006053866A (en) Detection method of notation variability of katakana character string
JPWO2021130892A5 (en) CONVERSION TABLE GENERATION DEVICE, VOICE DIALOGUE SYSTEM, CONVERSION TABLE GENERATION METHOD AND COMPUTER PROGRAM
Norkevičius et al. Knowledge-based grapheme-to-phoneme conversion of Lithuanian words
Gaup et al. From Xerox to Aspell: A first prototype of a north sámi speller based on twol technology
Hall Muak Sa-aak: Challenges of an extensive phoneme inventory for a contained Latin-based orthography
Lehal et al. A transliteration based word segmentation system for Shahmukhi script
Schrock Unlocking the Ik instrumental case
TW200636634A (en) Pronunciation method of English

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase