FR2803928A1

FR2803928A1 - Processing of natural language text to evaluate the content for marking in an educational context, uses comparison of entered text to set of stored key words to determine score

Info

Publication number: FR2803928A1
Application number: FR0000590A
Authority: FR
Inventors: Bernard Gaston Francois Muller
Original assignee: AURALOG
Current assignee: AURALOG
Priority date: 2000-01-18
Filing date: 2000-01-18
Publication date: 2001-07-20
Anticipated expiration: 2020-01-18
Also published as: FR2803928B1

Abstract

The processing system has an interface to allow the user to introduce their response in text form. The processing makes use of a stored (13) list of key words associated with the question posed, and compares (11) the text with key words to detect coincidence. A score (14) is computed based on the coincidence, and made available for further processing.

Description

The invention relates to a data processing system for evaluating a text.

In the field of education, and in language learning in particular, evaluating the quality and relevance of a text written in natural language in response to an instruction or a question relies on the judgment of a natural person, the marker.

Educational software also exists, in particular for language learning, which allows an individual to learn on a personal computer without the intervention of a human teacher or marker. Such systems can automatically correct the user's answers to the questions put to him, provided that there is a finite and limited number of possible answers to those questions.

The invention aims to provide a data processing system for automatically evaluating the relevance of a text freely written in natural language in response to an instruction. The term "instruction" here means all the indications given to a user of the system for writing his text. The instruction may take the form, for example, of a text (such as a question, a theme, etc.) associated with a document (such as an audio recording, a still image, a video image, etc.). The instruction is presented to the user by interface means such as an acoustic transducer, a display screen, a video screen, etc.

To this end, the subject of the invention is a data processing system for the evaluation of a text in natural language produced by a user in response to an instruction transmitted to said user, comprising:
- first interface means for the introduction by said user of said text in response to said instruction, and
- data processing means for evaluating said text,
characterized in that said data processing means comprise:
* means for storing at least one list of keywords associated with said instruction,
* comparison means for identifying words of said text coinciding with words of said list of stored keywords, and
* calculation means for generating an evaluation datum for said text as a function of the result of said comparison.

Preferably, the system according to the invention further comprises one or more of the following features, considered alone or in combination:
- said storage means contain several lists of keywords, each list grouping together a set of keywords assigned to a concept associated with said instruction;
- said calculation means are adapted to calculate a score that depends on the number of lists of which said text contains at least one keyword;
- said storage means contain a relevance coefficient assigned to each of said keywords, and said calculation means are adapted to calculate said score as a function of the relevance coefficient of at least some of the keywords contained in the text;
- each list is assigned a maximum value, and said calculation means are adapted to calculate, for each list, a weighted value as a function of the relevance coefficient of at least one keyword of said list contained in said text, and to calculate said score as a function of the sum of the weighted values of said lists;
- the system comprises means for spelling and/or grammatical and/or semantic checking of said text, and said calculation means are adapted to generate said evaluation datum as a function of the results of said checking and of said comparison;
- said calculation means are adapted to calculate a quality score that depends on the number of errors detected in said text by said checking means, and a relevance score that depends on the result of said comparison;
- said calculation means are adapted to generate said evaluation datum as a function of said quality and relevance scores;
- the system includes means for generating said instruction and, for the transmission of said instruction to said user, second interface means comprising at least one of the following: alphanumeric display, graphic display, video display, reproduction of audio messages;
- said means for introducing said text comprise at least one of several means comprising a keyboard, handwriting recognition means and voice recognition means.

The invention also relates to the application of a data processing system as defined above to the learning of foreign languages.

Other features and advantages of the invention will emerge from the description which follows, given with reference to the appended drawings, in which:
- Figure 1 is a simplified hardware block diagram of an exemplary embodiment of the data processing system according to the invention, based on a personal computer, and
- Figure 2 is a functional block diagram illustrating the functions implemented in the data processing system according to the invention.

In the exemplary embodiment of Figure 1, the data processing system according to the invention is based on a suitably programmed personal computer (PC) 1. Essentially, the PC 1 is equipped with data processing means 2 (a microprocessor) and memories 3, as well as a number of interfaces. These interfaces include a display screen 4, a keyboard 5, a device 6 for acquiring the data and programs needed to execute the functions described below, and optionally one or more electroacoustic transducers (loudspeakers) 7. The device 6 may be, for example, a floppy disk, CD-ROM or DVD-ROM drive, or another data storage means. It may also be a data exchange device through which the personal computer 1 is connected, over a communications network such as a local area network or the Internet, to a server from which the aforementioned programs, or some of them, are downloaded.

This is merely one exemplary embodiment of the data processing system according to the invention, which could take other forms, for example that of a central computer containing the aforementioned programs, to which the user has access via a terminal.

Reference will also be made to Figure 2, which sets out the functions implemented by the data processing system of Figure 1. When a user has accessed the application concerned on his PC 1, stored for example on a CD-ROM 8 read by the drive 6, he is presented with an instruction inviting him to write a text in natural language.

This instruction consists of indications relating to the subject or theme of the text that the user must write. It may take the form of one or more questions, a text defining the subject or theme to be treated, a still image, a video sequence, or a combination of one or more of these media. The instruction is presented to the user via the interface means of the PC, such as the screen 4 and/or the electroacoustic transducer 7.

In response to the instruction, the user writes a text in natural language and enters it into the PC 1 by means of the keyboard 5. As a variant, the text could be entered into the PC 1 by other interface means not shown in Figure 1, for example orally via a microphone and voice recognition means, or in handwritten form via an electronic tablet and handwriting recognition means.

The text 10 entered into the PC 1 is subjected at 11 to a process that evaluates its relevance and at 12 to a process that evaluates its quality.

The process 11 for evaluating the relevance of the text relies on the storage of a number of keywords which are associated with the instruction and which make it possible to check how well the answer (the text 10) matches the question and/or the reference document (the instruction). Preferably, these keywords are organized into a number of lists corresponding respectively to different key concepts associated with the instruction. Each key concept is thus defined by a list of words that illustrate the concept. The term "list" must be understood in a broad sense, as designating a set of keywords stored in memory with a link that ties them together and distinguishes them from the keywords of other sets or lists.

In addition, a relevance coefficient is preferably associated with each keyword, to reflect its semantic proximity to the corresponding key concept. For example, the concept "house" could be associated with the words "house", "chalet" and "apartment" with a relevance coefficient of maximum value (1, for example), and with the words "hut", "suite", "habitat", "dwelling", "barracks", "castle", etc. with a lower relevance coefficient (between 0 and 1, for example). The words defining a key concept obviously depend on the concept, but they may also depend on the context in which it is used: the question asked, the reference document used, and so on. The lists of keywords and their associated relevance coefficients are drawn up by the designers of the application and stored in memory as indicated above and as illustrated by reference 13 in Figure 2.
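The storage described above (reference 13) can be pictured as a mapping from each key concept to its keywords and their relevance coefficients. The following is a minimal sketch in Python, not the patent's implementation: the names `CONCEPT_LISTS` and `coefficient` are hypothetical, and the values below 1 are illustrative assumptions, since the text only states that they lie between 0 and 1.

```python
# Hypothetical in-memory layout for the keyword lists (reference 13).
# One entry per key concept; each maps a keyword to its relevance coefficient.
# Coefficients below 1.0 are illustrative assumptions, not values from the patent.
CONCEPT_LISTS = {
    "house": {
        "house": 1.0,
        "chalet": 1.0,
        "apartment": 1.0,
        "hut": 0.5,
        "barracks": 0.3,
        "castle": 0.6,
    },
}


def coefficient(concept: str, word: str) -> float:
    """Return the relevance coefficient of `word` for `concept`, or 0.0 if unknown."""
    return CONCEPT_LISTS.get(concept, {}).get(word.lower(), 0.0)
```

A dictionary per concept keeps each list distinct from the others, which is all the broad definition of "list" given above requires.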

Thus, at block 11, the data processing means 2 compare the words of the text 10 with those of the keyword lists 13. From this comparison, the processing means 2 calculate a relevance score which depends on the number of lists, or key concepts, of which at least one keyword is contained in the text 10. The method of calculating the score can be adapted as needed.
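As an illustration of the comparison at block 11, the sketch below counts how many concept lists have at least one keyword in the text and expresses that count as a fraction of all lists. The function names, the simple word tokenizer and the normalization to the range 0 to 1 are assumptions made for the example; the text leaves the exact calculation open.

```python
import re


def matched_concepts(text, concept_lists):
    """Return the names of the concept lists with at least one keyword in `text`."""
    words = set(re.findall(r"\w+", text.lower()))
    return [name for name, keywords in concept_lists.items() if words & set(keywords)]


def relevance_by_count(text, concept_lists):
    """Fraction of concept lists represented in the text (assumed normalization)."""
    if not concept_lists:
        return 0.0
    return len(matched_concepts(text, concept_lists)) / len(concept_lists)


# Illustrative data: three hypothetical key concepts for one instruction.
example_lists = {
    "house": ["house", "chalet"],
    "garden": ["garden", "lawn"],
    "family": ["mother", "father"],
}
example_text = "The chalet has a large garden."
print(matched_concepts(example_text, example_lists))
```

Exact word matching is used here; inflected forms would need a lemmatizer or an extended list, which the patent does not discuss.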

For example, if the text 10 contains several keywords from the same list, it may be decided to retain only the one with the highest relevance coefficient. The maximum value of each list or key concept may be set to 1, and the relevance coefficient of the keywords may lie between 0 and 1. The score assigned to the relevance of the text 10 is then the sum of the relevance coefficients of the keywords retained in each list (in this example, only one per list), divided by the number of lists or key concepts.
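The calculation in this example can be sketched as follows: for each list, keep the highest coefficient among the keywords found in the text, sum these values, and divide by the number of lists. The function name, the tokenizer and the sample coefficients are assumptions made for illustration.

```python
import re


def relevance_score(text, concept_lists):
    """Sum of the best coefficient found per list, divided by the number of lists."""
    if not concept_lists:
        return 0.0
    words = set(re.findall(r"\w+", text.lower()))
    total = 0.0
    for keywords in concept_lists.values():
        # Retain only the keyword with the highest coefficient in this list.
        found = [coef for keyword, coef in keywords.items() if keyword in words]
        total += max(found, default=0.0)
    return total / len(concept_lists)


# Illustrative lists: coefficients are assumed values between 0 and 1.
weighted_lists = {
    "house": {"house": 1.0, "hut": 0.5},
    "garden": {"garden": 1.0, "yard": 0.8},
}
print(relevance_score("A hut and a house with a yard", weighted_lists))
```

In the sample call, "house" (1.0) is retained over "hut" (0.5) for the first list and "yard" (0.8) for the second, so the result is close to 0.9.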

In addition, the keyword lists may include penalizing words, that is, words which should not be used in the text 10 given the instruction and its context, for example false friends. Preferably, these words are assigned negative weighting coefficients, so that when they occur in the text 10 they lower the score produced at 11. This score is designated Note 1 in block 14.
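One way to picture the effect of penalizing words is sketched below. It assumes that penalizing words are held in a separate mapping with negative coefficients, that their sum is added to the per-list total before normalization, and that the result is floored at zero; none of these choices is specified in the text, and all names are hypothetical.

```python
import re


def score_with_penalties(text, concept_lists, penalty_words):
    """Relevance score lowered by penalizing words (negative coefficients)."""
    if not concept_lists:
        return 0.0
    words = set(re.findall(r"\w+", text.lower()))
    base = 0.0
    for keywords in concept_lists.values():
        found = [coef for keyword, coef in keywords.items() if keyword in words]
        base += max(found, default=0.0)
    # Each penalizing word present in the text contributes its negative coefficient.
    penalty = sum(coef for word, coef in penalty_words.items() if word in words)
    # Flooring at zero is an assumption; the patent does not state a lower bound.
    return max(0.0, (base + penalty) / len(concept_lists))


# Illustrative data: "actually" as a false friend of French "actuellement".
time_lists = {"time": {"currently": 1.0, "now": 0.8}}
false_friends = {"actually": -0.5}
print(score_with_penalties("Actually I live in Paris now", time_lists, false_friends))
```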

In parallel with the evaluation of the relevance of the text at 11, its quality is evaluated at 12 by means of a grammar checker 15, a spell checker 16 and, optionally, a semantic checker 17.

A spell checker is software that flags, in any text, all the words that do not appear in a reference dictionary. Ideally, this dictionary contains all the words, with their inflected forms, that exist in the language of the text.
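The definition above reduces to a membership test against a reference dictionary. The sketch below illustrates only that principle, with a tiny hypothetical word set; it is not a model of the commercial checkers the patent relies on.

```python
import re


def unknown_words(text, dictionary):
    """Return the words of `text` (lowercased, in order) absent from `dictionary`."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    return [word for word in re.findall(r"[^\W\d_]+", text.lower())
            if word not in dictionary]


# Hypothetical reference dictionary; a real one would list every inflected form.
reference = {"the", "cat", "sleeps", "on", "sofa"}
print(unknown_words("The cat sleepz on the sofa", reference))
```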

A grammar checker is software that indicates whether a text is grammatically correct and, if it is not, where the errors are and what kind they are. Errors may concern, for example, agreement, sentence construction, compliance with grammar rules, etc.

In practice, a grammar checker incorporates a spell checker, but the two are shown separately in the drawing for the sake of clarity.

Spell checkers and grammar checkers are widely used together with the best-known word processing software and will therefore not be described. The following products may be mentioned for the record:
- CORRECT ENGLISH, from Lernout et Hauspie (Belgium), for English;
- CORRECTEUR 101 and EL CORRECTOR, from Machina Sapiens (Canada), for French and Spanish respectively;
- ERRATA CORRIGE, from Expert Systems (Italy), for Italian.

The grammar checker 15 and the spell checker 16 make it possible to detect errors in the text 10 and to calculate at 18 a second score, designated Note 2, as a function of the number of errors detected.

In the system described in the present application, the grammar checker 15 and the spell checker 16 are used essentially for checking purposes, in order to score the quality of the text 10. Of course, at the choice of the application designer, this software may also be used in its correcting role, by showing the author of the text the spelling and grammatical errors he has made, for example by displaying them in the text concerned.

Optionally, a third score, designated Note 3 in block 19, may be produced by means of the semantic checker 17. A semantic checker is software that checks the semantic consistency of the analyzed text. It makes it possible, for example, to reject sentences that are grammatically correct but absurd, such as "the carrot devours the rabbit". As a variant, the number of errors detected by the semantic checker 17 may serve as a parameter in the calculation of Note 2 at 18, instead of giving rise to a separate score at 19 as shown in Figure 2.

Other parameters, such as the number of words in the text 10, the average sentence length, and the time taken by the user to write his answer (the text 10), may also be taken into account at 20 to calculate a fourth score, designated Note 4, at 21.
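The parameters listed for block 20 are straightforward to measure. The sketch below gathers them, under the assumptions that sentences end with ".", "!" or "?" and that the response time is supplied by the caller; how Note 4 is derived from these values is not specified in the text, so the function stops at the raw measurements.

```python
import re


def text_statistics(text, seconds_taken):
    """Collect the raw parameters mentioned for block 20."""
    words = re.findall(r"\w+", text)
    # Assumption: a sentence is any non-empty run ended by '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    average = len(words) / len(sentences) if sentences else 0.0
    return {
        "word_count": len(words),
        "average_sentence_length": average,
        "seconds_taken": seconds_taken,
    }


stats = text_statistics("I like cats. Dogs are nice too!", seconds_taken=42.0)
print(stats)
```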

Finally, from the scores calculated at 14, 18 and, where applicable, 19 and 21, the processing means 2 calculate at 22 a Final Score, which is the overall score for the quality and relevance of the text 10. This Final Score is communicated to the author of the text 10, for example by display on the screen 4. Naturally, the individual scores calculated at 14, 18 and, where applicable, 19 and 21 may also be displayed to the author of the text 10.

The Final Score of block 22 may be expressed as a number of points out of a maximum value, as a grade on a rating scale, or in any other suitable form. The calculation of the Final Score at 22 may of course apply coefficients to the scores of blocks 14, 18, 19 and 21. Likewise, such coefficients may be applied by the grammar checker 15, the spell checker 16 and the semantic checker 17 according to the severity of the errors detected.
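A weighted combination of the block scores might look like the sketch below. It assumes that each score has already been normalized to the range 0 to 1, that optional scores (Note 3, Note 4) are passed as `None` when absent, and that the result is a weighted average; the weights and names are illustrative, not values from the patent.

```python
def final_score(notes, weights):
    """Weighted average of the available block scores (each assumed in [0, 1])."""
    present = {name: value for name, value in notes.items() if value is not None}
    total_weight = sum(weights[name] for name in present)
    if total_weight == 0:
        return 0.0
    return sum(weights[name] * value for name, value in present.items()) / total_weight


# Illustrative inputs: blocks 14 and 18 present, blocks 19 and 21 not used.
notes = {"relevance": 0.8, "quality": 0.6, "semantic": None, "other": None}
weights = {"relevance": 2.0, "quality": 1.0, "semantic": 1.0, "other": 1.0}
print(final_score(notes, weights))
```

Renormalizing over the scores that are present keeps the result on the same scale whether or not the optional blocks are used.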

For example, to give an overall score out of 20 (assuming that there is no semantic checker 17 and no block 20 taking other parameters into account), the result of the grammar and spell checkers 15, 16 can be scored out of 10 (10 minus the number of errors detected) and the presence of the key concepts out of 10 (if five key concepts, or keyword lists, are defined for an instruction, two points can be awarded for each key concept found in the text, with the relevance coefficient of the words used serving to modulate the award of these points).
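The arithmetic of this example can be written out as below. The sketch assumes that the quality part is floored at zero, and that "modulating" the points means multiplying each concept's share by the best relevance coefficient found for it (0 when the concept is absent); both are interpretations, and the function name is hypothetical.

```python
def overall_score_out_of_20(fault_count, best_coefficients):
    """Quality out of 10 plus relevance out of 10, as in the worked example."""
    # Quality: 10 minus the number of errors, floored at 0 (assumption).
    quality = max(0, 10 - fault_count)
    # Relevance: 10 points shared equally among the concepts (2 each for five),
    # each share scaled by the best coefficient found for that concept.
    if not best_coefficients:
        return float(quality)
    points_per_concept = 10 / len(best_coefficients)
    relevance = sum(points_per_concept * coef for coef in best_coefficients)
    return quality + relevance


# Three errors; five concepts, one matched only by a 0.5 keyword, one missing.
print(overall_score_out_of_20(3, [1.0, 1.0, 0.5, 0.0, 1.0]))  # prints 14.0
```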

The processing system described may be used, for example, to implement multimedia software for learning foreign languages. The user studies a document (for example a photo or a text displayed on the screen 4) and responds to a question or indication relating to this document (for example "describe the photo", "summarize the text", etc.). The instruction may include directives, for example as to the maximum number of words the text should contain. The user enters the text into the PC 1, for example by means of the keyboard 5, and once he has definitively validated it, he is given a final score as described above with reference to Figure 2.

It goes without saying that the embodiment described is only an example and that it could be modified, in particular by substitution of technical equivalents, without thereby departing from the scope of the invention.

Claims

1. Data processing system for the evaluation of a text in natural language developed by a user in response to an instruction transmitted to said user, comprising - first interface means for the introduction by said user of said text in response to said instruction, and - data processing means for evaluating said text, characterized in that said data processing means (2) comprise * means (8) for storing at least one list (13) of keywords associated with said instruction, * comparison means (2) for identifying words of said text (10) coinciding with words from said list of stored keywords, and * calculation means (2) for generating data (14,22) for evaluating said text as a function of the result of said comparison.

2. System according to claim 1, characterized in that said storage means (8) contain several lists of keywords (13), each list grouping together a set of keywords assigned to a concept associated with said instruction.

3. System according to claim 2, characterized in that said calculation means (2) are adapted to calculate a score (14) depending on the number of lists of which said text contains at least one keyword.

4. System according to claim 3, characterized in that said storage means (8) contain a relevance coefficient assigned to each of said keywords and in that said calculation means (2) are adapted to calculate said score (14) as a function of the relevance coefficient of at least part of the keywords contained in said text (10).

5. System according to claim 4, characterized in that each list is assigned a maximum value and in that said calculation means (2) are adapted to * calculate for each list a weighted value as a function of the relevance coefficient of at least one keyword from said list contained in said text, and * calculate said score (14) as a function of the sum of the weighted values of said lists.

6. System according to any one of claims 1 to 5, characterized in that it comprises means (15, 16, 17) for verifying the spelling and/or grammar and/or semantics of said text and in that said calculation means (2) are adapted to generate said evaluation data (22) as a function of said verification and said comparison.

7. System according to claim 6, characterized in that said calculation means (2) are adapted to calculate a quality score (18, 19) as a function of the number of errors detected in said text by said verification means and a relevance score (14) as a function of the result of said comparison.

8. System according to claim 7, characterized in that said calculation means (2) are adapted to generate said evaluation data (22) as a function of said quality score (18, 19) and said relevance score (14).

9. System according to any one of claims 1 to 8, characterized in that it comprises means (8) for generating said instruction and, for the transmission of said instruction to said user, second interface means (4, 7) comprising at least one of the following means: * alphanumeric display, * graphic display, * video display, * reproduction of audio messages.

10. System according to any one of claims 1 to 9, characterized in that said means (5) for introducing said text comprise at least one of the following: a keyboard, handwriting recognition means, voice recognition means.

11. System according to any one of claims 1 to 10, characterized in that it is applied to the learning of foreign languages.