WO2005001711A2

WO2005001711A2 - Method, computer device and computer program for assistance in adding vowels to words in Arabic

Info

Publication number: WO2005001711A2
Application number: PCT/FR2004/001603
Authority: WO
Inventors: Fathi Debili
Original assignee: Centre National De La Recherche Scientifique (Cnrs); Ecole Normale Superieure Lettres Et Sciences Humaines
Priority date: 2003-06-25
Filing date: 2004-06-24
Publication date: 2005-01-06
Also published as: WO2005001711A3; FR2856816A1; FR2856816B1

Abstract

The invention relates to a computer-assisted method of adding vowels to an Arabic text. The method uses a first dictionary (D1) containing unvowelized words and a second dictionary (D2) containing groups of one or more vowelized words, each group being stored in a memory element and associated with an unvowelized word of the first dictionary. For a current unvowelized word, the method consists in comparing the string of characters forming the current word with the strings of characters stored in the first dictionary, and extracting from the second dictionary the group of possible vowelized words corresponding to the word identified in the first dictionary.

Description

Method, computer device and computer program for assisting with the vowelization of words in the Arabic language

The invention relates to the vowelization of a text in Arabic, assisted by computer means.

The writing of the Arabic language mainly provides for two types of characters. A first type concerns consonants, which constitute the body of the text. A second type concerns vowels, which, in Arabic script, are attached to consonants by adding vowel signs above or below each consonant.

Generally, texts published in Arabic contain words represented only by their consonants. Only educational books for learning the Arabic language show consonants with their vowel signs.

Referring to FIG. 1a, the word represented in this figure comprises three successive letters 1, 2 and 3, corresponding respectively to the consonants K, T and B. This word, in its context, usually means "he wrote" and is read KATABA. A reader fluent in Arabic will therefore naturally interpret the succession of the three letters of FIG. 1a as the word KATABA, which, when vowelized, has horizontal bars 4 above the letters 1, 2 and 3, as shown in FIG. 1b. These horizontal bars 4, placed above the consonants K, T and B, correspond to the vowel A, so that a reader not versed in Arabic can now deduce without ambiguity from the expression represented in FIG. 1b that it is the word KATABA.

However, referring to FIG. 1c, the uninitiated reader will not know whether the unvowelized word in the figure corresponds:

- to the correct combination of vowels KATABA (bearing the reference A in FIG. 1c),

- to the incorrect combination of vowels KATABO (bearing the reference B in FIG. 1c),

- to the incorrect combination of vowels KOTOBO (bearing the reference C in FIG. 1c), or to any other combination among the 27 possible combinations for these three consonants.
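By way of purely illustrative example, the count of 27 combinations can be reproduced by assuming that each of the three consonants receives one of three short vowels (A, O, I). This restriction to three vowels, the Latin transliteration and the names `all_vowelings` and `SHORT_VOWELS` are assumptions of this sketch and do not appear in the original text.

```python
from itertools import product

CONSONANTS = ["K", "T", "B"]
SHORT_VOWELS = ["A", "O", "I"]  # assumption: only three short vowels are considered


def all_vowelings(consonants, vowels):
    """Return every string obtained by following each consonant with one vowel."""
    return [
        "".join(c + v for c, v in zip(consonants, combo))
        for combo in product(vowels, repeat=len(consonants))
    ]


candidates = all_vowelings(CONSONANTS, SHORT_VOWELS)
print(len(candidates))          # 27
print("KATABA" in candidates)   # True
```

Only one of these 27 strings, KATABA, is the reading described for FIG. 1b; nothing in the consonants alone singles it out.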

Indeed, there are in all 9 possible vowel signs for a consonant: a, o, i, an, oun, in, no vowel associated with the consonant, hamza and chedda.

This difficulty is compounded by the fact that certain unvowelized words can be read according to several possible interpretations. For example, the unvowelized word for "man" reads both as "man" and as "foot", because the word "foot", in Arabic, has the same succession of consonants as the word "man".

In other applications currently envisaged, such as speech synthesis (involving a conversion of written characters into voiced speech signals), the vowelization of words appears necessary, because a simple succession of consonants cannot by itself allow the construction of an exact speech signal.

Moreover, manual vowelization of a complete text, edited electronically, is tedious because the operator must systematically press one key for a consonant and at least two keys to add the vowel sign associated with this consonant (including the "SHIFT" key and another key on the keyboard).

Thus, there is today a real need for automatic vowelization of words in the Arabic language.

There is known for this purpose a computer-assisted process based on cutting words into a plurality of segments such as, in particular, a prefix, a stem and a suffix. Following this example, each type of prefix is stored in a first dictionary, each type of stem is stored in a second dictionary, and each type of suffix is stored in a third dictionary. The same is done for conjugated verbs. Ultimately, this method requires a multiplicity of dictionaries forming databases which are stored in a memory of the aforementioned computer means.

Thus, a word to be vowelized is cut into several segments. Each segment, comprising an identified succession of consonants, is compared with a corresponding succession of consonants in the dictionary specific to this type of segment. Vowelization rules encoded in the form of computer program instructions define the vowelization to be applied to this segment. Finally, the vowelized word is reconstructed by concatenating the different vowelized segments.

This process, although promising, produces numerous errors in practice. By way of illustration, the word "INFORMATION" includes the stem "INFORM-" and the same suffix "-ATION" as the word "PERTURBATION". However, the word "NATION" cannot be cut in the same way into the single letter "N-", on the one hand, and the succession of letters "-ATION", on the other hand. The same problem arises in the Arabic language.
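The segmentation error described above can be illustrated by a minimal sketch of a naive suffix-stripping routine. The function `naive_segment` and the one-entry suffix dictionary are hypothetical and are not the known process itself; they only show how a rule that fits "INFORMATION" mis-cuts "NATION".

```python
SUFFIXES = ["ATION"]  # hypothetical suffix dictionary with a single entry


def naive_segment(word, suffixes=SUFFIXES):
    """Split off the first listed suffix that ends the word.

    No check is made that the remaining part is a real stem,
    which is the weakness illustrated here.
    """
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], suffix
    return word, ""


print(naive_segment("INFORMATION"))   # ('INFORM', 'ATION')
print(naive_segment("PERTURBATION"))  # ('PERTURB', 'ATION')
print(naive_segment("NATION"))        # ('N', 'ATION'), an invalid stem
```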

The present invention improves the situation.

Based on a completely different approach, it proposes for this purpose a method for vowelizing a text in Arabic, assisted by computer means, in which:

a) a first memory area is provided in which a first dictionary comprising unvowelized words is stored,

b) a second memory area is provided in which a second dictionary is stored comprising groups of at least one vowelized word, each group being stored in correspondence with an unvowelized word of said first dictionary,

c) for a current unvowelized word, a character string forming at least said current word is compared with character strings stored in the first memory area, in order to isolate at least one word of the first dictionary comprising the same character string as the current word, and

d) a group of vowelized candidate words, corresponding to said isolated word of the first dictionary, is extracted from the second dictionary.
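Steps a) to d) can be sketched as follows, under stated assumptions: the dictionaries are held as ordinary Python containers, words are written in Latin transliteration, and the entries other than KTB/KATABA are illustrative examples chosen for this sketch. The name `candidate_vowelings` is hypothetical.

```python
# a) first dictionary: unvowelized words (consonant skeletons)
D1 = {"KTB", "DHHB"}

# b) second dictionary: one group of vowelized words per unvowelized word of D1
D2 = {
    "KTB": ["KATABA", "KUTUB"],      # "he wrote"; "books" (illustrative entry)
    "DHHB": ["DHAHABA", "DHAHAB"],   # "he went"; "gold" (illustrative transliteration)
}


def candidate_vowelings(current_word):
    """Return the group of vowelized candidates for an unvowelized word."""
    # c) compare the character string of the current word with those of D1
    if current_word not in D1:
        return []
    # d) extract the corresponding group from D2
    return list(D2[current_word])


print(candidate_vowelings("KTB"))  # ['KATABA', 'KUTUB']
print(candidate_vowelings("XYZ"))  # []
```

The lookup involves no segmentation of the word: the whole character string is matched at once, which is the point of contrast with the prefix/stem/suffix process recalled earlier.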

The present invention also relates to a computer device for assisting with the vowelization of a text in the Arabic language, comprising:

- a first memory area in which is stored a first dictionary comprising unvowelized words,

- a second memory area in which is stored a second dictionary comprising groups of at least one vowelized word, each group being stored in correspondence with an unvowelized word of said first dictionary,

- a memory area in which are stored instructions of a computer routine suitable for: c) comparing, for a current unvowelized word, a character string forming at least said current word with character strings stored in the first memory area, to isolate at least one word of the first dictionary comprising the same character string as the current word, and d) extracting from the second dictionary a group of vowelized candidate words corresponding to said isolated word of the first dictionary.

As such, the present invention also relates to a computer program for assisting with the vowelization of a text in the Arabic language, stored in a memory of a computer device or, in an equivalent manner, on a medium intended to cooperate with a reader of a computer device, comprising:

- a first database arranged according to a first dictionary comprising unvowelized words,

- a second database arranged according to a second dictionary comprising groups of at least one vowelized word, each group of the second database being indexed in correspondence with an unvowelized word of the first database, and

- a computer routine suitable for: c) comparing, for a current unvowelized word, a character string forming at least said current word with character strings stored in the first memory area, to isolate at least one word of the first dictionary comprising the same character string as the current word, and d) extracting from the second dictionary a group of vowelized candidate words corresponding to said isolated word of the first dictionary.

It will thus be understood that vowelization, within the meaning of the invention, relies solely on two dictionaries, one comprising unvowelized words and the other comprising groups of vowelized words. The description of a preferred embodiment and of variants of this embodiment, given below, shows how a vowelized candidate word is selected to replace a current unvowelized word.

Other characteristics and advantages of the invention will appear on examining the detailed description below, and the attached drawings in which:

- FIG. 1a illustrates an unvowelized Arabic word,

- FIG. 1b illustrates the word of FIG. 1a, now vowelized,

- FIG. 1c illustrates the word of FIG. 1a, with several possible vowelings of this word,

- FIG. 2 diagrammatically represents a computer device for implementing the present invention,

- FIG. 3 diagrammatically represents the content of memory areas of a memory of the central unit 24 of FIG. 2,

- FIGS. 4a, 4b and 4c respectively represent a text comprising an unvowelized sentence, a sentence vowelized without case vowels and a sentence vowelized with case vowels,

- FIG. 5 represents a general flow diagram of the method according to a preferred embodiment of the invention,

- FIG. 6 shows a dialog box implemented by a man/machine interface module, to propose possible vowelings of a current word, and

- FIG. 7 shows a dialog box offering possible grammatical labels of a current word.

First of all, reference is made to FIG. 2, in which a computer device conventionally comprises a central unit 24, to which are connected a display screen 21, an input device such as a keyboard 22 or a mouse 23, as well as a communication interface COM, for example with a remote server, via a wide area network of the INTERNET type. The central unit 24 further comprises a reader 25 capable of cooperating with a memory medium such as a CD-ROM, a DVD-ROM, a floppy disk, or any other memory medium. It will thus be understood that a computer program, within the meaning of the invention, can be stored on a memory medium of this type, while updates to the aforementioned dictionaries can be downloaded from the remote server or obtained on another memory medium.

FIG. 3 represents the structure of a memory in which the first and second dictionaries mentioned above are stored. The central unit 24 comprises a memory, for example a permanent memory of the ROM type, in which successions of Arabic characters forming the words of the first and second dictionaries are stored in digital form.

A first memory area D1 stores a first dictionary comprising unvowelized words 31, 32. A second memory area D2 stores a second dictionary comprising groups 3-1, 3-2 of one or more vowelized words 311, 312; 321, 322. Preferably, each group 3-1, 3-2 of the second dictionary D2 is stored in correspondence with an unvowelized word 31, 32 of the first dictionary D1, as illustrated by the correspondence arrows F11, F12, F21, F22 in FIG. 3. The first dictionary D1 contains, for example, the succession of the three consonants K, T, B (word 31) of FIG. 1a and the second dictionary D2 contains the word KATABA 311.

In a preferred embodiment, only the vowelized words which have a meaning are listed in the aforementioned second dictionary. However, as a variant, provision may be made for forming an initial second dictionary comprising all the possible combinations of vowels for a given succession of consonants, the user then deleting from the second dictionary, as it is used, the aberrant combinations which correspond to words having no meaning. In this case, the second dictionary is formed by learning, by eliminating the aberrant combinations from the memory area D2.
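The variant in which the second dictionary starts from all combinations and is pruned by the user can be sketched as follows. The restriction to three short vowels, the transliteration and the names `initial_group` and `reject` are assumptions made for the example only.

```python
from itertools import product


def initial_group(consonants, vowels=("A", "O", "I")):
    """Build every vowel combination for a consonant skeleton (no meaning check)."""
    return {
        "".join(c + v for c, v in zip(consonants, combo))
        for combo in product(vowels, repeat=len(consonants))
    }


# Initial second dictionary: exhaustive, therefore full of meaningless words.
D2 = {"KTB": initial_group("KTB")}


def reject(d2, skeleton, aberrant_word):
    """Remove a combination that the user has flagged as meaningless."""
    d2[skeleton].discard(aberrant_word)


reject(D2, "KTB", "KOTOBO")
print(len(D2["KTB"]))         # 26
print("KATABA" in D2["KTB"])  # True
```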

However, in the preferred embodiment, the second dictionary is initially constructed with vowelized words which have a meaning, so as to offer a pleasant and user-friendly use of the program within the meaning of the invention.

Of course, for a computer program for assisting with vowelization within the meaning of the invention, stored in a memory of a computer device or on a medium capable of cooperating with a reader of a data processing device, the first and second dictionaries are respectively in the form:

- of a first database D1, the structure of which is arranged according to the first dictionary which comprises unvowelized words, and

- of a second database D2, the structure of which is arranged according to the second dictionary which comprises groups of at least one vowelized word.

Each group of the second database D2 is indexed in correspondence with an unvowelized word of the first database D1, as further shown by the correspondence arrows F11 to F22 of FIG. 3.

Reference is now made to FIGS. 4a and 4b, which respectively represent an unvowelized text containing a complete sentence delimited by two points P1 and P2 and a partially vowelized text containing said sentence delimited by the points P1 and P2. It is recalled that Arabic is read from right to left. It will thus be understood that a succession of words can take the form of a complete sentence defined by a character string between two punctuation characters P1 and P2, the different words of this sentence being vowelized according to their position in the sentence, as will be seen later.

It is simply indicated here that the text of FIG. 4b does not systematically include so-called "case" vowels, which are most often assigned at the end of the word. On the other hand, the text of FIG. 4c is fully vowelized and also includes the case vowels, which appear in particular on the last letter 431 of the word 43 (with a horizontal line under this last letter 431, to be compared with the unvowelized last letter 421 of the partially vowelized word 42 in FIG. 4b).

In addition, FIG. 4a shows the unvowelized word, referenced 45, which comprises the succession of characters 1, 2, 3 of FIG. 1a, corresponding to the consonants K, T, B. FIG. 4b likewise shows the vowelized word 451, which corresponds to the word KATABA of FIG. 1b and is vowelized by horizontal lines 4 above the consonants, representative of the vowel "A".

These sentences of FIGS. 4a, 4b and 4c thus appear on the screen 21 of the computer device, and the characters of the texts forming these sentences are conventionally stored in digital form TXT (FIG. 3) in a working memory Z4 (for example of the RAM type) of the central unit 24 of the computer device.

Referring again to FIG. 3, the computer device also comprises a memory area Z3 in which are stored instructions of a computer program PGM which are suitable for:

- comparing, for a current unvowelized word (bearing the reference 45 in FIG. 4a), a character string (in this case the consonants 1, 2 and 3 of FIG. 1a) forming this current word 45 with character strings 31 stored in the first memory area D1, to isolate the word 31 of the first dictionary D1 comprising the same character string as the current word 45, and

- extracting from the second dictionary D2 a group 3-1 of vowelized candidate words 311, 312 which correspond (arrows F11 and F12) to the isolated word 31 of the first dictionary D1.

Reference is now made to FIG. 5 to describe the flow of the computer routine of the PGM program. The aim here is to vowelize a word 45 which appears in a text edited electronically on the screen 21 of FIG. 2. This routine first locates, for example by character recognition, in step 51, the characters (the consonants 1, 2, 3) of the unvowelized word 45. The routine then performs, in step 52, a comparison with the unvowelized words listed in the dictionary D1 to isolate, in step 53, an unvowelized word 31 with the same succession of consonants 1, 2, 3.

In step 54, the program PGM determines, as a function of the memory location of word 31 in the memory area D1, the memory location in the memory area D2 of group 3-1, which comprises the vowelized words 311 and 312 of the second dictionary. In step 55, the program PGM extracts from the memory area D2 the group of candidate words 311 and 312, which comprise the same succession of consonants but are vowelized differently.
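Steps 52 to 55 can be sketched with two parallel arrays, in which the position of a word in D1 gives the position of its group in D2. This layout, the binary search and the names `locate` and `extract_group` are assumptions of the sketch; the text only states that the location in D2 is determined from the location in D1.

```python
from bisect import bisect_left

# D1 is kept sorted so that a binary search can locate a word (steps 52-53).
D1 = ["DHHB", "KTB"]
# D2[i] is the group stored in correspondence with D1[i] (arrows of FIG. 3).
D2 = [
    ["DHAHABA", "DHAHAB"],  # illustrative transliterations
    ["KATABA", "KUTUB"],    # KUTUB is an illustrative entry
]


def locate(word):
    """Return the index of word in D1, or None if it is not listed."""
    i = bisect_left(D1, word)
    return i if i < len(D1) and D1[i] == word else None


def extract_group(word):
    """Step 54: derive the D2 location from the D1 location; step 55: extract."""
    i = locate(word)
    return [] if i is None else D2[i]


print(extract_group("KTB"))  # ['KATABA', 'KUTUB']
```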

In a preferred embodiment, a man/machine interface module is also provided, preferably in the form of computer instructions forming part of the PGM program. FIG. 6 shows a screenshot of the screen 21 presenting, for an electronically edited text 62, a dialog box 61 which is one of the functions of this man/machine interface. For a current unvowelized word 45, selected by a user (by means of an input device such as the mouse 23) and which for this reason appears highlighted in the text 62, the dialog box 61 first indicates the word 31 found in correspondence in the first dictionary D1. Then, the dialog box 61 proposes potential vowelings of this current word 45, which correspond to the vowelized candidate words 312 and 311 of the second dictionary D2, for the same succession of consonants as word 31 of the first dictionary. Thus, in the second frame of the dialog box 61, the man/machine interface offers the user a choice list of candidate words 311 and 312.

Referring again to FIG. 5, in a preferred embodiment, the user chooses, in step 56, a candidate word 311 from the list of candidate words 311, 312 of word group 3-1. In step 57, the chosen vowelized word 311 automatically replaces the unvowelized word 45 in the electronically edited text. The user's choice is further stored, in step 58, in a memory area Z5 of the computer device. Preferably, this memory area Z5 is in correspondence with the memory area D2 in which the second dictionary is stored, so as to enrich the latter. More particularly, the chosen word 311, thus vowelized, is stored with the words preceding and/or following it in part of the edited text. Preferably, the chosen word 311 is stored with the complete sentence in which it appears, with a view to improving the vowelization within the meaning of the present invention by learning, as will be seen below. It is simply indicated here that, if the current word 45 to be vowelized is part of a current succession of words, such as a complete sentence, then following the choice of a word 311 by the user (from the list of candidate words 311, 312), the selected vowelized word 311 and the succession of words which contains it are stored in the aforementioned memory area Z5.
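Steps 56 to 58 can be sketched as follows, assuming that the memory area Z5 is modeled as a simple list of records and that the sentence is a list of transliterated words. The record layout and the name `replace_and_store` are hypothetical.

```python
Z5 = []  # stands in for memory area Z5: validated choices with their context


def replace_and_store(words, position, chosen, memory=Z5):
    """Replace the unvowelized word at `position` and remember the context.

    Step 57: the chosen vowelized word replaces the current word in the text.
    Step 58: the choice is stored together with the succession of words.
    """
    new_words = list(words)
    new_words[position] = chosen
    memory.append(
        {"sentence": tuple(words), "position": position, "chosen": chosen}
    )
    return new_words


sentence = ["DHHB", "KTB"]  # illustrative unvowelized sentence
print(replace_and_store(sentence, 1, "KATABA"))  # ['DHHB', 'KATABA']
print(Z5[0]["chosen"])                           # KATABA
```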

Thus, in the third frame of the dialog box 61 of FIG. 6, the man/machine interface indicates to the user the chosen word 311, which will be inserted in the text 62 to replace the unvowelized word 45 and which is preferably stored with a succession of words preceding and/or following it.

Reference is again made to FIGS. 4a to 4c to describe below a vowelization of the words according to their context.

In FIG. 4a, particular attention is paid to the first word of the sentence following the point P1, bearing in mind that Arabic is read from right to left. This first word of the sentence appears in FIG. 3, where it corresponds to the unvowelized expression 32 of the first dictionary D1. However, this unvowelized word 32 admits two possible vowelings in the second dictionary D2: 321 (meaning the expression "he went") and 322 (meaning the metal "gold").

Generally, in the Arabic language, a word beginning a sentence is a verb. Thus, the word which follows the first point P1 of FIG. 4a is a verb whose vowelized form corresponds, with near certainty, to the conjugated verb 321 of the second dictionary D2 of FIG. 3.

Thus, if the current word is part of a succession of words, a character string forming this succession of words, including the current word, is compared more broadly with character strings stored in the aforementioned area Z5, in correspondence with the second memory area D2, to identify a plurality of words comprising the same character string as this succession of words. This step corresponds, in a broader perspective, to step 51 represented in FIG. 5.

The PGM program can then include instructions for carrying out this comparison "extended to a succession of words". For example, for a complete sentence, a computer routine can be provided to isolate the characters of the complete sentence between the two punctuation marks P1 and P2.
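A routine isolating the characters of the complete sentence between two punctuation marks might look like the following sketch. The set of punctuation characters, the Latin sample text and the name `sentence_around` are assumptions; a real Arabic text would also need Arabic punctuation such as "؟".

```python
import re

SENTENCE_END = re.compile(r"[.!?]")  # assumption: these marks delimit sentences


def sentence_around(text, offset):
    """Return the sentence containing the character at `offset`.

    The sentence is the character string between the punctuation mark
    preceding the offset (P1) and the one following it (P2).
    """
    start = 0
    for match in SENTENCE_END.finditer(text):
        if match.end() <= offset:
            start = match.end()  # P1: last mark before the current word
        else:
            return text[start:match.start()].strip()  # P2 reached
    return text[start:].strip()  # no closing mark: take the rest of the text


text = "A B. DHHB KTB. C D."
print(sentence_around(text, text.index("KTB")))  # DHHB KTB
```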

Then, for the current word to be vowelized, a vowelized word (here the verb 321) is selected from the group of vowelized candidate words extracted from the second dictionary D2, according to the succession of identified words and, in particular, the position of the current word 32 in this succession of identified words. Here, the word 32 begins the sentence and therefore corresponds to the vowelized verb 321.

Advantageously, the current unvowelized word 32 can then be automatically replaced, in the electronically edited text, by the vowelized word 321, automatically selected from the group of candidate words 321 and 322.
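The position-based selection can be sketched as follows, assuming that each candidate carries a grammatical tag and that the only rule applied is "a word beginning a sentence is a verb". The tags, the transliterations and the name `select_by_position` are illustrative assumptions; the text itself obtains this behavior by comparison with stored sentences.

```python
# Candidates for the unvowelized word 32, each with an assumed grammatical tag.
CANDIDATES = {
    "DHHB": [("DHAHABA", "verb"), ("DHAHAB", "noun")],  # "he went"; "gold"
}


def select_by_position(skeleton, position):
    """Pick a vowelized word automatically when the position is decisive.

    Returns None when the rule does not settle the choice, in which case
    the choice list would be offered to the user instead.
    """
    group = CANDIDATES.get(skeleton, [])
    if position == 0:  # the word begins the sentence
        verbs = [word for word, tag in group if tag == "verb"]
        if len(verbs) == 1:
            return verbs[0]
    return None


print(select_by_position("DHHB", 0))  # DHAHABA
print(select_by_position("DHHB", 2))  # None
```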

It will thus be understood that this automatic vowelization is advantageously obtained here by memorizing complete sentences and/or successions of words whose vowelization is validated by the user, as the vowelization assistance software is used, and therefore by learning. Computer learning routines are known per se. For example, routines such as those used by the ViaVoice® software of the company Microsoft® are well suited to the determination of written characters by learning.

However, in the event of uncertainty about the vowelization, the man/machine interface advantageously offers the user a choice list comprising words selected from the candidate words of the second dictionary. This situation is represented in FIG. 6, where two possible vowelings 312 and 311, both consistent with the context of the current word 45, are proposed to the user. Even more advantageously, this list is ranked, according to the context, in order of relevance of the proposed vowelings. In particular, this ranking can be deduced by learning, by analyzing which vowelized form the user prefers and selects most often during use.
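A ranking deduced from the forms most often chosen by the user can be sketched with a simple frequency count. Counting raw selections, without regard to context, is a simplifying assumption of this example, as are the names `record_choice` and `ranked`.

```python
from collections import Counter

choice_counts = Counter()  # (unvowelized word, vowelized word) -> times chosen


def record_choice(skeleton, chosen):
    """Remember that the user validated `chosen` for `skeleton`."""
    choice_counts[(skeleton, chosen)] += 1


def ranked(skeleton, candidates):
    """Order candidates by how often the user chose them, most frequent first.

    sorted() is stable, so candidates never chosen keep their dictionary order.
    """
    return sorted(candidates, key=lambda word: -choice_counts[(skeleton, word)])


record_choice("KTB", "KUTUB")
record_choice("KTB", "KUTUB")
record_choice("KTB", "KATABA")
print(ranked("KTB", ["KATABA", "KUTUB", "KUTIBA"]))
# ['KUTUB', 'KATABA', 'KUTIBA']
```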

Referring to FIG. 7, advantageously, grammatical labels corresponding to each word 311 in each group 3-1 of the second dictionary D2 are stored in a memory area (not shown), so that the man/machine interface, in particular through the dialog box 61 of FIG. 7, furthermore indicates to the user a grammatical label 70 for each of the words selected from the candidate words 311, 312. If necessary, this grammatical label is validated by the user, in frame 71 of the dialog box. This grammatical label corresponds for example to a syntactic description of a word, of the type "common noun, singular, definite, placed as subject in the sentence, etc.". Of course, this grammatical label is defined and validated as a function of the position of the analyzed word 45 in the current sentence.

To this end, a memory area is provided (for example, again in correspondence with the second memory area D2) to further store grammatical labels 70, each corresponding to a vowelized word 311 of the second dictionary.

As shown in FIGS. 6 and 7, the PGM computer program for implementing the invention and the man/machine interface module are compatible with electronic means for editing text in the Arabic language, such as the MICROSOFT WORD® software.

Another possible type of automatic vowelization, called "case" vowelization, is described below. Case vowels are most often assigned to end-of-word consonants, depending on the context of the word in a sentence. For example, the word 42 of FIG. 4b, in its context, admits a vowelization of its last letter 421 with the sound "i", which corresponds to a horizontal bar under this final letter (letter 431 in FIG. 4c).

It is recalled that there are, in the Arabic language, a plurality of possible declensions for a common noun, such as nominative (definite or indefinite), accusative (definite or indefinite), ablative (definite or indefinite), etc. These declensions correspond to end-of-word vowels with the following sounds:

- "O" = definite nominative,
- "OUN" = indefinite nominative,
- "A" = definite accusative,
- "AN" = indefinite accusative,
- "I" = definite ablative,
- "IN" = indefinite ablative, etc.

For example, referring again to FIGS. 4b and 4c, the preposition corresponding to the word 44 is identified in the succession of words in which the word 43 appears. This preposition 44 necessarily leads to an ablative declension of the word 43 which follows, with an automatic case vowelization of the last letter 431 of the word 43 by the sound "i".

Thus, as before, the computer routine of the PGM program includes instructions for comparing the current succession of words of FIG. 4b with successions of words stored beforehand. Where appropriate, the preposition 44 is identified, at a position which immediately precedes the word 42 to be vowelized. A routine of the PGM program then selects, as a function of this comparison, the vowelized word 43 ending with the sound "i", which corresponds to a declension in the ablative brought about by the position of this preposition 44 with respect to word 43. In a preferred embodiment, case vowelization is proposed as an option by the man/machine interface of the PGM program.
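The case vowelization triggered by a preceding preposition can be sketched as follows. The table of endings follows the list of declensions given above; the rule based on a fixed preposition list, the sample words "FI" and "BAYT" and the name `add_case_vowel` are illustrative assumptions, whereas the text obtains the same result by comparison with stored successions of words.

```python
# End-of-word sounds per declension: (case, definite) -> ending.
CASE_ENDINGS = {
    ("nominative", True): "O",
    ("nominative", False): "OUN",
    ("accusative", True): "A",
    ("accusative", False): "AN",
    ("ablative", True): "I",
    ("ablative", False): "IN",
}

PREPOSITIONS = {"FI"}  # hypothetical one-entry list of prepositions


def add_case_vowel(words, position, definite=True):
    """Append the ablative ending when the preceding word is a preposition.

    Words in any other context are returned unchanged: this sketch only
    covers the preposition rule.
    """
    if position > 0 and words[position - 1] in PREPOSITIONS:
        return words[position] + CASE_ENDINGS[("ablative", definite)]
    return words[position]


print(add_case_vowel(["FI", "BAYT"], 1))                  # BAYTI
print(add_case_vowel(["FI", "BAYT"], 1, definite=False))  # BAYTIN
```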

In general, it will be understood that the steps described above, in particular those with reference to FIG. 5, are implemented by the execution of instructions or computer routines of the PGM program, which is therefore intended to be installed in a memory of a machine or of a computer device of the type represented in FIG. 2. Initially, this program, for example stored on CD-ROM, comprises the first and second memory areas D1 and D2 arranged in the form of databases (with, where appropriate, the data of the grammatical labels), which can be loaded and copied into memory (for example permanent memory of the ROM type) of the aforementioned computer device. It will be understood that these databases, once copied to the memory of the device, can then be enriched, in particular by learning. The same applies in particular to said memory area Z5 in correspondence with the second memory area, which is intended to store the successions of words or complete sentences. The database stored in area Z5 (in a memory of the device) is thus enriched as it is used.

Claims


1. Method for vowelizing a text in Arabic, assisted by computer means, in which: a) a first memory area is provided in which a first dictionary comprising unvowelized words is stored, b) a second memory area is provided in which a second dictionary is stored comprising groups of at least one vowelized word, each group being stored in correspondence with an unvowelized word of said first dictionary, c) for a current unvowelized word, a character string forming at least said current word is compared with character strings stored in the first memory area, to isolate at least one word of the first dictionary comprising the same character string as the current word, and d) a group of vowelized candidate words, corresponding to said isolated word of the first dictionary, is extracted from the second dictionary.

2. Method according to claim 1, in which a computer routine is provided capable of carrying out said comparison of the character strings and said extraction of the group of candidate words.

3. Method according to claim 1, in which there is further provided a man/machine interface suitable for proposing to a user a choice list of said candidate words.

4. Method according to claim 1, in which, said current word being part of a succession of words, c1) a character string forming said succession of words comprising the current word is compared with character strings stored in a memory area in correspondence with the second memory area, to identify a plurality of words comprising the same character string as said succession of words, and d2) for said current word, at least one vowelized word is selected from said group of vowelized candidate words as a function of the succession of identified words and of a position of the current word in said succession of identified words.

5. Method according to claim 4, in which said succession of words is a complete sentence defined by a character string between two punctuation characters.

6. Method according to claim 4, in which said current word is automatically replaced, in an electronically edited text, by said vowelized word selected from the group of candidate words.

7. Method according to claim 3 and claim 4, in which the man/machine interface offers a user a choice list comprising words selected from said candidate words.

8. Method according to claim 7, in which grammatical labels are further stored in correspondence with each word in each group of the second dictionary, and in which the man/machine interface further indicates to the user a grammatical label of each of the words selected from said candidate words.

9. Method according to claim 3, in which, said current word being part of a current succession of words, following the choice of a word by said user from the list of candidate words, the chosen word is stored with said succession of words, in a memory area in correspondence with said second memory area.

10. Method according to claim 8 and claim 4, in which the selection of the vowelized word from said group of vowelized candidate words is carried out by learning, by comparing the current succession of words with successions of words stored in said memory area in correspondence with the second memory area.

11. Computer device for assisting with the vowelization of a text in Arabic, comprising:

- a first memory area in which is stored a first dictionary comprising unvowelized words,

- a second memory area in which is stored a second dictionary comprising groups of at least one vowelized word, each group being stored in correspondence with an unvowelized word of said first dictionary,

- a memory area in which are stored instructions of a computer routine suitable for: c) comparing, for a current unvowelized word, a character string forming at least said current word with character strings stored in the first memory area, to isolate at least one word of the first dictionary comprising the same character string as the current word, and d) extracting from the second dictionary a group of vowelized candidate words corresponding to said isolated word of the first dictionary.

12. The computer device as claimed in claim 11, further comprising a man / machine interface suitable for proposing to a user a choice list of said candidate words.

13. The computer device as claimed in claim 11, wherein, said current word being part of a succession of words, said computer routine is arranged for: c) comparing a character string forming said succession of words comprising the current word, with character strings stored in a memory zone corresponding to the second memory zone, to identify a plurality of words comprising the same character string as said succession of words, and d2) for said current word, select from said group of candidate words vowels , at least one vowel word as a function of the succession of identified words and of a position of the current word in said succession of identified words.

14. The computer device as claimed in claim 13, in which said succession of words is a complete sentence defined by a string of characters between two punctuation characters, and in which said computer routine is arranged to isolate the characters of the complete sentence between the two punctuation marks.

15. A computer device according to claim 11, further comprising means for electronically editing text in the Arabic language, and wherein said computer routine is able to cooperate with said text editing means.

16. The computer device according to claim 15 and claim 13, wherein the computer routine is arranged to automatically replace, in an edited text, said current word by said vowelled word selected from the group of candidate words.

17. A computer device according to claim 12 and claim 13, in which the man/machine interface is arranged to propose a choice list comprising words selected from said candidate words.

18. The computer device as claimed in claim 12, in which, said current word being part of a current succession of words, the computer routine furthermore comprises instructions for storing the chosen word with said succession of words, in a memory area corresponding to said second memory area.

19. The computer device as claimed in claim 18 and claim 13, in which the computer routine comprises instructions for comparing the current succession of words with successions of words stored in said memory area in correspondence with the second memory area, and selecting, based on this comparison, at least one vowelled word from said group of candidate vowelled words.
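For illustration only, and not as part of the claims, the learning behaviour of claims 18 and 19 can be sketched in Python as follows. The names `learned`, `store_choice` and `select`, the exact-match comparison of successions, and the transliterated sample words are assumptions of this sketch; the memory area associated with the second dictionary is modelled as a plain dictionary.

```python
# Hypothetical store standing in for the memory area associated with the
# second dictionary. Key: (succession of non-vowelled words, position of
# the current word). Value: the vowelled word the user chose.
learned = {}


def store_choice(succession, position, chosen):
    """Record the word chosen by the user together with its succession."""
    learned[(tuple(succession), position)] = chosen


def select(succession, position, group):
    """Select from `group` using previously stored successions.

    If the same succession and position were stored earlier and the
    stored choice is still among the candidates, return only that word;
    otherwise return the full group so the user can choose.
    """
    chosen = learned.get((tuple(succession), position))
    if chosen in group:
        return [chosen]
    return list(group)
```

An exact match on the whole succession is the simplest possible comparison; a real system could compare shorter sub-successions instead.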

20. The computer device as claimed in claim 17, comprising a memory area for further storing grammatical labels corresponding to each word in each group of the second dictionary, and in which the man/machine interface further indicates to the user a grammatical label of each of the words selected from said candidate words.

21. A computer program for assisting with the vowelling of a text in the Arabic language, stored in a memory of a computer device or on a medium intended to cooperate with a reader of a computer device, comprising:

- a first database arranged according to a first dictionary comprising non-vowelled words,

- a second database arranged according to a second dictionary comprising groups of at least one vowelled word, each group of the second database being indexed in correspondence with a non-vowelled word of the first database, and

- a computer routine suitable for: c) comparing, for a current, non-vowelled word, a character string forming at least said current word with character strings stored in the first memory area, to isolate at least one word from the first dictionary comprising the same character string as the current word, and d) extracting from the second dictionary a group of candidate vowelled words corresponding to said isolated word from the first dictionary.

22. The computer program as claimed in claim 21, intended to be installed in a memory of a computer machine and comprising a man/machine interface module capable of proposing to a user a choice list of said candidate words.

23. The computer program as claimed in claim 21, wherein, said current word being part of a succession of words, the program comprises instructions for: c) comparing a character string forming said succession of words comprising the current word, with character strings stored in a memory area corresponding to the second memory area, to identify a plurality of words comprising the same character string as said succession of words, and d2) for said current word, selecting from said group of candidate vowelled words at least one vowelled word as a function of the succession of identified words and of a position of the current word in said succession of identified words.

24. The computer program as claimed in claim 23, in which said succession of words is a complete sentence defined by a character string between two punctuation characters, and in which the program comprises instructions for isolating the characters of the complete sentence between the two punctuation marks.

25. A computer program according to claim 21, compatible with, and able to cooperate with, a program for editing text in Arabic.

26. The computer program as claimed in claim 25 and claim 23, intended to be installed in a memory of a computer device and comprising instructions for automatically replacing, in an edited text, said current word by said vowelled word selected from the group of candidate words.

27. The computer program as claimed in claim 22 and claim 23, in which the man/machine interface module is arranged to propose a choice list comprising words selected from said candidate words.

28. The computer program according to claim 22, wherein, said current word being part of a current succession of words, the computer program further comprises instructions for storing the chosen word with said succession of words, in a memory area corresponding to said second memory area.

29. The computer program according to claim 28 and claim 23, wherein the computer program comprises instructions for comparing the current succession of words with successions of words stored in said memory area in correspondence with the second memory area, and selecting, according to this comparison, at least one vowelled word from said group of candidate vowelled words.

30. The computer program as claimed in claim 27, comprising a database stored in correspondence with each word of the second dictionary and comprising grammatical labels for each word in each group of the second dictionary, in which the man/machine interface comprises instructions for further indicating to the user a grammatical label for each of the words selected from said candidate words.