FR2803928A1 - Processing of natural language text to evaluate the content for marking in an educational context, uses comparison of entered text to set of stored key words to determine score

Info

Publication number
FR2803928A1
Authority
FR
France
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
FR0000590A
Other languages
French (fr)
Other versions
FR2803928B1 (en
Inventor
Bernard Gaston Francois Muller
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AURALOG
Original Assignee
AURALOG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2000-01-18
Filing date
2000-01-18
Publication date
2001-07-20
Application filed by AURALOG
Priority to FR0000590A
Publication of FR2803928A1
Application granted
Publication of FR2803928B1
Expired - Fee Related

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 Teaching not covered by other main groups of this subclass
    • G09B19/06 Foreign languages

Abstract

The processing system has an interface through which the user enters a response in text form. The system uses a stored list (13) of keywords associated with the question posed and compares (11) the text with the keywords to detect matches. A score (14) is computed from the matches and made available for further processing.

Description

The invention relates to a data processing system for the evaluation of a text.

In the field of education, in particular language learning, the quality and relevance of a text written in natural language in response to an instruction or a question are evaluated by a natural person, the marker.

In addition, there is educational software, in particular for language learning, which allows an individual to follow a course of learning on a personal computer without the intervention of a human teacher or marker. Such systems can automatically mark the user's answers to the questions put to him, provided that there is a finite and limited number of possible answers to those questions.

The invention aims to provide a data processing system for automatically evaluating the relevance of a text freely written in natural language in response to an instruction. The term "instruction" is understood here to mean all the indications given to a user of the system for drafting his text. This instruction can be presented, for example, in the form of a text (such as a question, a theme, etc.) associated with a document (such as an audio recording, a still image, a video image, etc.). This instruction is presented to the user by interface means such as an acoustic transducer, a display screen, a video screen, etc.

To this end, the subject of the invention is a data processing system for the evaluation of a text in natural language produced by a user in response to an instruction transmitted to said user, comprising:
- first interface means for the introduction by said user of said text in response to said instruction, and
- data processing means for evaluating said text,
characterized in that said data processing means comprise:
* means for storing at least one list of keywords associated with said instruction,
* comparison means for identifying words of said text coinciding with words from said list of stored keywords, and
* calculation means for generating data for evaluating said text as a function of the result of said comparison.

Preferably, the system according to the invention further comprises one or more of the following characteristics, considered alone or in combination:
- said storage means contain several lists of keywords, each list grouping together a set of keywords assigned to a concept associated with said instruction;
- said calculation means are adapted to calculate a score depending on the number of lists of which said text contains at least one keyword;
- said storage means contain a relevance coefficient assigned to each of said keywords, and said calculation means are adapted to calculate said score as a function of the relevance coefficient of at least part of the keywords contained in the text;
- each list is assigned a maximum value, and said calculation means are adapted to:
* calculate for each list a weighted value as a function of the relevance coefficient of at least one keyword from said list contained in said text,
* calculate said score as a function of the sum of the weighted values of said lists;
- the system comprises means for orthographic and/or grammatical and/or semantic verification of said text, and said calculation means are adapted to generate said evaluation data as a function of the results of said verification and of said comparison;
- said calculation means are adapted to calculate a quality score depending on the number of errors detected in said text by said verification means and a relevance score depending on the result of said comparison;
- said calculation means are adapted to generate said evaluation data as a function of said quality and relevance scores;
- the system includes means for generating said instruction and, for the transmission of said instruction to said user, second interface means comprising at least one of the following means: alphanumeric display, graphic display, video display, reproduction of audio messages;
- said means for introducing said text comprise at least one of a keyboard, handwriting recognition means, and voice recognition means.

The invention also relates to the application of a data processing system as defined above to the learning of foreign languages.

Other characteristics and advantages of the invention will emerge from the description which follows, made with reference to the appended drawings, in which:
- FIG. 1 is a simplified hardware block diagram of an exemplary embodiment of the data processing system according to the invention, based on a personal computer, and
- FIG. 2 is a functional block diagram illustrating the functions implemented in the data processing system according to the invention.

According to the exemplary embodiment of FIG. 1, the data processing system according to the invention is based on a suitably programmed personal computer (PC) 1. Essentially, the PC 1 is equipped with data processing means 2 (microprocessor) and memories 3, as well as a number of interfaces. These interfaces include a display screen 4, a keyboard 5, a device 6 for acquiring the data and programs necessary for the execution of the functions described below, and optionally one or more electroacoustic transducers (loudspeakers) 7. The device 6 can be, for example, a floppy disk drive, a CD-ROM or DVD-ROM drive, or another data storage device. It may also be a data exchange device by means of which the personal computer 1 is connected, via a communications network such as a local network or the Internet, to a server from which the aforementioned programs, or some of them, are downloaded.

This is a simple embodiment of the data processing system according to the invention and it could take other forms, for example that of a central computer containing the aforementioned programs and to which the user has access via a terminal.

Reference will also be made to FIG. 2, which sets out the functions implemented by the data processing system of FIG. 1. When a user has opened the application concerned on his PC 1, stored for example on a CD-ROM 8 read by the drive 6, he is presented with an instruction inviting him to write a text in natural language.

This instruction consists of indications relating to the subject or theme of the text that the user must prepare. It can take the form of one or more questions, a text defining the subject or theme to be treated, a still image, a video sequence, or a combination of one or more of these media. The instruction is presented to the user via the interface means of the PC, such as the screen 4 and/or the electroacoustic transducer 7.

In response to the instruction, the user writes a text in natural language and enters it into PC 1 by means of the keyboard 5. As a variant, the text could be entered into PC 1 by other interface means not shown in FIG. 1, for example orally via a microphone and voice recognition means, or in handwritten form via a graphics tablet and handwriting recognition means.

The text 10 entered into the PC 1 is subjected at 11 to an evaluation of its relevance and at 12 to an evaluation of its quality.

The process 11 for evaluating the relevance of the text relies on a stored set of keywords which are associated with the instruction and which make it possible to check that the answer (text 10) matches the question and/or the reference document (the instruction). Preferably, these keywords are organized into a number of lists corresponding respectively to different key concepts associated with the instruction. Each key concept is thus defined by a list of words which illustrate the concept. The term "list" must be understood in a broad sense as designating a set of keywords stored in memory with a link connecting them and distinguishing them from the keywords of other sets or lists.

In addition, a relevance coefficient is preferably associated with each keyword, to reflect its semantic proximity to the corresponding key concept. For example, the key concept "house" could be associated with the words "house", "chalet", "apartment" with a relevance coefficient of maximum value (1, for example), and with the words "hut", "suite", "habitat", "habitation", "barracks", "castle", etc. with a lower relevance coefficient (between 0 and 1, for example). The words defining a key concept obviously depend on the concept, but they can also depend on the context in which it is used: the question asked, the reference document used, etc. The lists of keywords and their associated relevance coefficients are drawn up by the designers of the application and stored in memory as indicated above and as illustrated by reference 13 in FIG. 2.
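
One possible in-memory layout for these keyword lists is sketched below. It is only an illustration: the dictionary name `KEYWORD_LISTS`, the helper `coefficient`, and the value 0.5 for the lower-relevance words are assumptions, since the description only requires a coefficient between 0 and 1.

```python
# Hypothetical storage layout: one dictionary per key concept,
# mapping each keyword to its relevance coefficient.
# The 0.5 values are illustrative; the text only says "between 0 and 1".
KEYWORD_LISTS = {
    "house": {
        "house": 1.0,
        "chalet": 1.0,
        "apartment": 1.0,
        "hut": 0.5,
        "castle": 0.5,
    },
}


def coefficient(concept: str, word: str) -> float:
    """Return the relevance coefficient of `word` for `concept` (0.0 if absent)."""
    return KEYWORD_LISTS.get(concept, {}).get(word.lower(), 0.0)
```

A nested dictionary keeps each keyword tied to its own list, which matches the broad sense of "list" given above: a set of keywords linked together and kept distinct from the other sets.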

Thus, at block 11, the data processing means 2 compare the words of the text 10 with those of the keyword lists 13. From this comparison, the processing means 2 calculate a relevance score which is a function of the number of lists, or key concepts, of which at least one keyword is contained in the text 10. The method of calculating the score can be adapted as needed.
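
The comparison step can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the regular-expression tokenizer, and the choice to express the score as the fraction of lists matched are all assumptions (the description only says the score is a function of the number of lists matched).

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and extract runs of letters (a simplifying assumption)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def concepts_found(text: str, keyword_lists: dict[str, set[str]]) -> set[str]:
    """Return the key concepts for which the text contains at least one keyword."""
    words = set(tokenize(text))
    return {concept for concept, keywords in keyword_lists.items() if words & keywords}


def relevance_score(text: str, keyword_lists: dict[str, set[str]]) -> float:
    """Fraction of keyword lists matched by the text (assumed normalisation)."""
    if not keyword_lists:
        return 0.0
    return len(concepts_found(text, keyword_lists)) / len(keyword_lists)
```

With two hypothetical lists, `{"house": {"house", "chalet"}, "garden": {"garden", "lawn"}}`, the text "The chalet is old." matches only the "house" list, so this sketch returns 0.5.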

For example, if text 10 contains several keywords from the same list, only the one with the highest relevance coefficient may be retained. The maximum value of each list or key concept can be fixed at 1, with the relevance coefficients of the keywords lying between 0 and 1. The relevance score of text 10 is then the sum of the relevance coefficients of the keywords retained in each list (in this example, only one per list), expressed relative to the number of lists or key concepts.
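
The weighted variant described above can be sketched like this, assuming the text has already been split into a set of lowercase words. The function name `weighted_relevance` and the sample lists are hypothetical.

```python
def weighted_relevance(words: set[str], lists: dict[str, dict[str, float]]) -> float:
    """Sum the best coefficient matched in each list, divided by the number of lists."""
    if not lists:
        return 0.0
    total = 0.0
    for keywords in lists.values():
        # Keep only the highest coefficient among the keywords of this list
        # that occur in the text; a list with no match contributes 0.
        matched = [coef for word, coef in keywords.items() if word in words]
        total += max(matched, default=0.0)
    return total / len(lists)
```

For instance, with lists `{"house": {"house": 1.0, "hut": 0.5}, "garden": {"garden": 1.0, "lawn": 0.5}}` and the words `{"hut", "house", "lawn"}`, the "house" list contributes 1.0 (the better of its two matches) and the "garden" list 0.5, giving (1.0 + 0.5) / 2 = 0.75.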

In addition, penalizing words can be included in the keyword lists, that is to say words which should not be used in the text 10 given the instruction and its context, for example false friends. Preferably, these words are assigned negative weighting coefficients and therefore, when they occur in text 10, reduce the score computed at 11. This score is designated Note 1 in block 14.
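
A sketch of how negative coefficients might lower the relevance score is given below. The function name, the sample penalty word, and the decision to clamp the result at zero are assumptions; the description does not say whether the score may become negative.

```python
def penalized_score(words: set[str], base_score: float, penalties: dict[str, float]) -> float:
    """Add the (negative) coefficients of penalizing words found in the text."""
    penalty = sum(coef for word, coef in penalties.items() if word in words)
    # Clamping at zero is an assumption made for this sketch.
    return max(0.0, base_score + penalty)
```

For example, with a hypothetical penalty table `{"library": -0.25}`, a text containing "library" and a base score of 0.75 ends up at 0.5.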

In parallel with the evaluation of the relevance of the text at 11, the quality of the text is evaluated at 12 by means of a grammar checker 15, a spell checker 16 and, optionally, a semantic corrector 17.

A spell checker is software that flags, in any text, all the words that do not appear in a reference dictionary. Ideally, this dictionary contains all the words existing in the language of the text, together with their inflected forms.
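
Reduced to its simplest form, this definition amounts to a dictionary lookup. The sketch below is a toy illustration with a hypothetical function name and a tiny dictionary; it is not how the commercial products work internally.

```python
import re


def unknown_words(text: str, dictionary: set[str]) -> list[str]:
    """Return, in order of appearance, the words of the text absent from the dictionary."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [token for token in tokens if token not in dictionary]
```

With the dictionary `{"the", "house", "is", "big"}`, the text "The hous is big" yields `["hous"]`, so one spelling error would be counted.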

A grammar checker is software that indicates whether a text is grammatically correct and, where appropriate, where the errors lie and what kind of errors they are. Errors can relate, for example, to agreement, sentence construction, compliance with grammar rules, etc.

In practice, a grammar checker incorporates a spell checker, but they have been shown separately in the drawing for reasons of clarity.

Spell and grammar checkers are widely used in combination with the most popular word processing software and will therefore not be described. The following products are mentioned for the record:
- CORRECT ENGLISH, from the company Lernout and Hauspie (Belgium), for the English language;
- CORRECTEUR 101 and EL CORRECTOR, from the company Machina Sapiens (Canada), for the French and Spanish languages respectively;
- ERRATA CORRIGE, from Expert Systems (Italy), for the Italian language.

The grammar checker 15 and the spell checker 16 make it possible to detect errors in the text 10 and to calculate at 18 a second score, designated Note 2, as a function of the number of errors detected.

In the system described in the present application, the grammar checker 15 and the spell checker 16 are used essentially for verification, in order to score the quality of the text 10. Of course, this software can also, at the choice of the application designer, be used in its proofreading role by showing the author of the text the spelling and grammatical errors he has made, for example by displaying them in the text under consideration.

Optionally, a third score, designated Note 3 in block 19, can be produced using the semantic corrector 17. A semantic corrector is software that checks the semantic consistency of the analyzed text. It makes it possible, for example, to reject sentences that are grammatically correct but absurd, such as "the carrot devours the rabbit". As a variant, the number of errors detected by the semantic corrector 17 may serve as a parameter for calculating Note 2 at 18, instead of giving rise to a separate score at 19 as shown in FIG. 2.

Other parameters, such as the number of words in text 10, the average length of sentences, and the time taken by the user to formulate his answer (text 10), can also be taken into account at 20 to calculate a fourth score, designated Note 4 at 21.
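
The text-derived parameters mentioned here could be measured as in the sketch below. The function name and the naive sentence splitter (on ".", "!" and "?") are assumptions, and the description does not specify how these measurements are converted into Note 4, so that step is deliberately left out.

```python
import re


def text_statistics(text: str) -> dict[str, float]:
    """Compute word count, sentence count, and average sentence length in words."""
    # Naive sentence splitting on terminal punctuation (an assumption of this sketch).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)
    average = len(words) / len(sentences) if sentences else 0.0
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "avg_sentence_length": average,
    }
```

For the text "The cat sleeps. The dog barks loudly!", this gives 7 words, 2 sentences, and an average of 3.5 words per sentence.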

Finally, from the scores calculated at 14, 18 and possibly 19 and 21, the processing means 2 calculate at 22 a final score, which is the overall score for the quality and relevance of the text 10. This final score is communicated to the author of the text 10, for example by display on the screen 4. Naturally, the individual scores calculated at 14, 18 and possibly 19 and 21 can also be displayed to the author of the text 10.

The Final Note of block 22 can be presented as a number of points out of a maximum value, as a grade on a rating scale, or in any other suitable form. The calculation of the final score at 22 can of course apply coefficients to the scores of blocks 14, 18, 19 and 21. Likewise, coefficients can be applied by the grammar checker 15, the spell checker 16 and the semantic corrector 17 according to the severity of the errors detected.
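
One way to apply such coefficients is a weighted average over whichever partial scores are available. This is only a sketch under assumptions: the description says coefficients may be applied but does not prescribe a formula, and the names `final_score`, `note1`, `note2`, etc. are hypothetical.

```python
def final_score(notes: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of the partial scores that are actually present."""
    # Optional scores (for example Note 3 or Note 4) are simply skipped when absent.
    used = {name: weight for name, weight in weights.items() if name in notes}
    total_weight = sum(used.values())
    if total_weight == 0:
        return 0.0
    return sum(notes[name] * weight for name, weight in used.items()) / total_weight
```

With `notes = {"note1": 0.5, "note2": 1.0}` and equal weights for `note1`, `note2` and `note3`, the missing `note3` is ignored and the result is 0.75.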

As an example, to give an overall score out of 20 (assuming that there is no semantic corrector 17 and no evaluation 20 of other parameters), the result of the grammar and spell checkers 15, 16 can be scored out of 10 (10 minus the number of errors detected) and the presence of key concepts out of 10 (if five key concepts or lists of keywords are defined for an instruction, two points can be awarded for each key concept found in the text, with the relevance coefficient of the words used modulating the allocation of these points).
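
This worked example can be written out as a short calculation. Two details are assumptions of the sketch rather than statements of the description: the quality part is floored at zero when more than ten errors are found, and "modulating" is interpreted as multiplying the two points per concept by the best relevance coefficient matched in that concept's list.

```python
def quality_points(error_count: int) -> float:
    """Quality part out of 10: ten minus the number of errors, floored at zero (assumed)."""
    return float(max(0, 10 - error_count))


def relevance_points(best_coefficients: list[float], points_per_concept: float = 2.0) -> float:
    """Relevance part: points per concept scaled by the best coefficient matched."""
    return points_per_concept * sum(best_coefficients)


def overall_score(error_count: int, best_coefficients: list[float]) -> float:
    """Overall score out of 20 when five key concepts are defined."""
    return quality_points(error_count) + relevance_points(best_coefficients)
```

For a text with 3 errors and best coefficients of 1.0, 1.0, 0.5, 0.0 and 0.0 across the five concepts, the quality part is 7, the relevance part is 2 × 2.5 = 5, and the overall score is 12 out of 20.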

The processing system described can be used, for example, to implement multimedia software for learning foreign languages. The user studies a document (for example a photo or a text displayed on screen 4) and responds to a question or prompt relating to this document (for example: "describe the photo", "summarize the text", etc.). The instruction may include constraints, for example on the maximum number of words that the text may contain. The user enters his text into the PC 1, for example by means of the keyboard 5, and once he has finally validated this text, the text is assigned a final score as described above with reference to FIG. 2.

It goes without saying that the embodiment described is only an example and it could be modified, in particular by substitution of technical equivalents, without departing from the scope of the invention.

Claims (11)

1. Data processing system for the evaluation of a text in natural language developed by a user in response to an instruction transmitted to said user, comprising - first interface means for the introduction by said user of said text in response to said instruction, and - data processing means for evaluating said text, characterized in that said data processing means (2) comprise * means (8) for storing at least one list (13) of keywords associated with said instruction, * comparison means (2) for identifying words of said text (10) coinciding with words from said list of stored keywords, and * calculation means (2) for generating data (14,22) for evaluating said text as a function of the result of said comparison.
2. System according to claim 1, characterized in that said storage means (8) contain several lists of keywords (13), each list grouping together a set of keywords assigned to a concept associated with said instruction.
3. System according to claim 2, characterized in that said calculation means (2) are adapted to calculate a score (14) depending on the number of lists of which said text contains at least one keyword.
4. System according to claim 3, characterized in that said storage means (8) contain a relevance coefficient assigned to each of said keywords and in that said calculation means (2) are adapted to calculate said score (14) as a function of the relevance coefficient of at least part of the keywords contained in said text (10).
5. System according to claim 4, characterized in that each list is assigned a maximum value and in that said calculation means (2) are adapted to * calculate for each list a weighted value as a function of the relevance coefficient of at least one keyword from said list contained in said text, * calculate said score (14) as a function of the sum of the weighted values of said lists.
6. System according to any one of claims 1 to 5, characterized in that it comprises means (15, 16, 17) for orthographic and/or grammatical and/or semantic verification of said text and in that said calculation means (2) are adapted to generate said evaluation data (22) as a function of the results of said verification and of said comparison.
7. System according to claim 6, characterized in that said calculation means (2) are adapted to calculate a quality score (18, 19) as a function of the number of errors detected in said text by said verification means and a relevance score (14) depending on the result of said comparison.
8. System according to claim 7, characterized in that said calculation means (2) are adapted to generate said evaluation data (22) as a function of said quality (18, 19) and relevance (14) scores.
9. System according to any one of claims 1 to 8, characterized in that it comprises means (8) for generating said instruction and, for the transmission of said instruction to said user, second interface means (4, 7) comprising at least one of the means for * alphanumeric display, * graphic display, * video display, * reproduction of audio messages.
10. System according to any one of claims 1 to 9, characterized in that said means (5) for introducing said text comprise at least one of several means comprising a keyboard, means for recognizing writing, voice recognition means.
11. System according to any one of claims 1 to 10, characterized in that it is applied to the learning of foreign languages.
FR0000590A 2000-01-18 2000-01-18 Data processing system for text evaluation Expired - Fee Related FR2803928B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
FR0000590A FR2803928B1 (en) 2000-01-18 2000-01-18 Data processing system for text evaluation

Publications (2)

Publication Number Publication Date
FR2803928A1 (en) 2001-07-20
FR2803928B1 (en) 2002-11-29

Family

ID=8846017

Family Applications (1)

Application Number Title Priority Date Filing Date
FR0000590A Expired - Fee Related FR2803928B1 (en) 2000-01-18 2000-01-18 Data processing system for text evaluation

Country Status (1)

Country Link
FR (1) FR2803928B1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7281219B2 (en) 2003-12-05 2007-10-09 International Business Machines Corporation Blended learning experience tool and method
US7371070B2 (en) 2003-12-05 2008-05-13 International Business Machines Corporation Operationalizing a learning solution
US7914288B2 (en) 2004-10-07 2011-03-29 International Bussiness Machines Corporation On demand learning
FR3030809A1 (en) * 2014-12-22 2016-06-24 Shortedition Method for automatically analyzing the literary quality of a text
US9472114B2 (en) 2003-05-01 2016-10-18 International Business Machines Corporation Computer-implemented method, system and program product for providing an educational program

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4730270A (en) * 1984-07-31 1988-03-08 Hitachi, Ltd. Interactive foreign language translating method and apparatus
WO1997008604A2 (en) * 1995-08-16 1997-03-06 Syracuse University Multilingual document retrieval system and method using semantic vector matching
US5727950A (en) * 1996-05-22 1998-03-17 Netsage Corporation Agent based instruction system and method
US5820386A (en) * 1994-08-18 1998-10-13 Sheppard, Ii; Charles Bradford Interactive educational apparatus and method

Also Published As

Publication number Publication date
FR2803928B1 (en) 2002-11-29

Legal Events

Date Code Title Description
PLFP Fee payment

Year of fee payment: 17

ST Notification of lapse

Effective date: 20170929