GB2465585A

GB2465585A - Method and system for vocabulary learning by study word selection

Info

Publication number: GB2465585A
Application number: GB0821296A
Authority: GB
Inventors: Patrick Rene Tschorn; Philip Glenny Edmonds
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2008-11-21
Filing date: 2008-11-21
Publication date: 2010-05-26
Also published as: CN101739849A; JP5384298B2; JP2010122676A; GB0821296D0

Abstract

A computer-implemented vocabulary learning method is provided in which, following the selection of a text for communication to the learner, the text comprising a plurality of words, for each of at least some of the words in the text: (a) a status is assigned to the word in dependence upon first, second and third measures for the word, where the assigned status is selected from a plurality of predetermined statuses, where each status is associated with at least one corresponding recommended action to be taken by the learner, where the first measure is dependent on the learner, where the second measure is independent of the text, and where the third measure is dependent on the other words of the text; and (b) the assigned status is communicated to the learner, at least if requested. The statuses are preferably "Mastered" if the learner has already learnt the word, "Study" if the word is high in a context-sensitive ranking, and "Ignore" for all other words. The statuses can then be reported to the learner. The context-sensitive ranking may be based upon a context-sensitive utility value. A general utility value may also be given to the words.

Description

Method and system for effective vocabulary learning by study word selection

Technical field

The present invention relates to a computer-implemented learning method and apparatus.

Background

Acquiring the amount of vocabulary necessary to communicate effectively in a foreign or second language usually takes many years and poses a significant challenge for language learners.

Extensive reading is a method in which language skills such as grammar or vocabulary are learned by reading large amounts of authentic text at an appropriate level of difficulty.

To read with minimal disturbance from unknown vocabulary, it is estimated that language users need a vocabulary of 15000-20000 words (as described by I. Nation in Learning Vocabulary in Another Language published by Cambridge University Press, 2001).

The time learners can devote to acquiring new vocabulary is limited. It is therefore desirable to use the available time as effectively as possible.

Words or phrases in a language (or in a subject area such as physics) differ with respect to their utility. Generally, highly frequent words have a high utility while rare words have a low utility, but utility can also be dependent on a specific subject area and curriculum.

In this document, when we use the term "word", we mean word, phrase, term, or other unit of vocabulary. Vocabulary acquisition (or learning) refers to the process in people of learning vocabulary units and their associated pronunciations, written forms, concepts, or other relationships. Thus it can refer to the standard task of language learning or to a more general educational task of learning from language material, such as a physics or geography textbook or video.

Learners should not spend time on studying low utility words before they have mastered high utility words. For example, if a learner lacks knowledge of certain basic words, the acquisition of more advanced words is made more difficult as the learner is likely to have problems understanding the context in which the more advanced words are presented. Such gaps in knowledge make it hard to connect new knowledge to existing knowledge; the result is high cognitive load and often frustration.

Cognitive load, a term from the field of psychology, refers to the aggregate demand on the human mind while performing a task such as language learning or reasoning and problem solving in general. If cognitive load is too high, it becomes hard or even impossible to concentrate on and carry out a task. For example, trying to remember the meaning of a large number of unfamiliar words during reading can be so demanding that the text becomes incomprehensible. It is thus beneficial to reduce unnecessary cognitive load.

It is hard for a learner to judge whether a particular word has high or low utility.

Moreover, a certain word may have high relative utility in current language material, for example, because it is a key word in the language material and so must be understood before the language material itself can be understood. Thus utility, to some extent, depends on the context of the current language material. In addition, the learner's current knowledge is continuously changing as a result of learning. Therefore, a word's utility can change with respect to a learner and other factors over time.

An ideal system for vocabulary acquisition would thus indicate which words have a high utility in the context of language material and a learner's current knowledge, and recommend possible learning actions to the learner, in a timely fashion.

Traditionally, the process of vocabulary acquisition is managed through a teacher and a curriculum or by the learners themselves. It is an obvious burden to manage the task oneself.

When the process is managed by a teacher or curriculum, there is very limited scope for diagnosing and correcting individual weaknesses. A learner may have knowledge gaps that, if not addressed, will hamper his future progress.

Ranked vocabulary lists are an important source of information for managing vocabulary acquisition, since frequency correlates with utility. The prior art includes systems that use vocabulary lists ranked by frequency (GB2352543A, US7108512, US20070269778A1, WO06121542A2, US20060063139A1, US6726486, JP2000047559A2). Some of these systems do not include a reading component or any language material and are based entirely on flash card activities or quizzes (GB2352543A, US7108512, US20060063139A1, US6726486, JP2000047559A2).

Of the abovementioned systems that do include a reading component and a ranked vocabulary list, none gives a clear recommendation as to whether a particular word should be studied now or left for later.

US20070269778A1, for example, shows a difficulty level and rarity score for words that the learner selects in the reading component; however, no actual recommendation is given as to whether it is the right time for a given individual learner to study a particular word.

WO06121542A2 creates a prioritized target vocabulary learning list which may be used by other learning components. The method described in WO06121542A2 does not analyze the learner's current reading material when creating a prioritized target vocabulary learning list. WO06121542A2 does not use the target word list to give a clearly indicated recommendation to a particular learner whether a particular word should be studied.

Some prior art systems give clear recommendations, but do not do so in a context-sensitive manner.

The "Follow You!" system (Chi-Chiang Shei. FollowYou! An automatic language lesson generation system. CALL, 14(2), 2001) is based on a ranked vocabulary list.

The learner's knowledge is represented by a pointer to a rank in the list; the system assumes that the learner knows all words with a higher rank. When the learner opens a text in "Follow You!", the system selects as words for study a number of words that occur in the text and have a rank below the learner's current knowledge rank. The only property of the text that is taken into account when study words are selected is whether a word occurs or does not occur. "Follow You!" does not take into account contextual properties of the imported text such as local frequencies or semantic field membership when selecting study words. Words that do not occur in the current textual material and that have a higher rank than the lowest-ranking practiced word will subsequently be assumed known. This approach leads to the formation of gaps in the learner's vocabulary knowledge. "Follow You!" displays the study words in a list in its user interface, which despite being a form of recommendation, does not take into account contextual properties of the current text, as noted above. Moreover, the recommendations cannot be updated in a timely manner, for example, right after a learner demonstrates knowledge of a word.

The REAP system (Heilman, M. & Eskenazi, M. (2006). Language Learning: Challenges for Intelligent Tutoring Systems. Proceedings of the Workshop of Intelligent Tutoring Systems for Ill-Defined Domains. 8th International Conference on Intelligent Tutoring Systems) selects appropriate reading material for an individual learner based on a learner profile and a curriculum profile. The curriculum profile is a frequency-ranked vocabulary list. The learner profile keeps track of active and passive vocabulary by counting how often the learner has demonstrated knowledge of a particular word and how often the learner has been exposed to a particular word.

Based on these two profiles, REAP retrieves short texts that contain 2-4 study words (i.e., words that the learner has not yet mastered) as well as a certain percentage of words that the system believes the learner has mastered (J. Brown and M. Eskenazi. (2005). Student, text and curriculum modeling for reader-specific document retrieval. Proceedings of the IASTED International Conference on Human-Computer Interaction 2005. Phoenix, AZ.). REAP does not employ a context-sensitive measure of utility; that is, study words are not chosen on the basis of the retrieved text; they are chosen beforehand. REAP selects study words by giving priority to the most frequent and the least frequent words within the given curriculum profile (frequency-ranked vocabulary list) that the learner has not yet demonstrated sufficient knowledge of. When the retrieved text is presented to the learner, study words are highlighted, which is a form of recommendation. The status of words cannot be changed immediately, since the profiles are updated only after a learner finishes reading a retrieved text and performs word exercises (J. Brown and M. Eskenazi. (2004). Retrieval of authentic documents for reader-specific lexical practice. Proceedings of InSTIL/ICALL Symposium 2004. Venice, Italy).

Therefore, although the Follow You! system and the REAP system take into account the learner's current knowledge state (at the time a text is first accessed), neither uses the context of the current text in determining the utility of words. Nor can the systems give immediate or timely recommendations of learning actions to keep up with the continuously changing knowledge state of the learner.

What is needed is a method of determining the current utility of words dependent on the current language material and the learner's current and changing knowledge state.

Summary

According to a first aspect of the present invention there is provided a computer-implemented vocabulary learning method comprising, following selection of a text for communication to the learner, the text comprising a plurality of words: for each of at least some of the words in the text, (a) assigning a status to the word in dependence upon first, second and third measures for the word, where the assigned status is selected from a plurality of predetermined statuses, where each status is associated with at least one corresponding recommended action to be taken by the learner, where the first measure is dependent on the learner, where the second measure is independent of the text, and where the third measure is dependent on the other words of the text; and (b) communicating the assigned status to the learner, at least if requested.

Step (a) may comprise: determining a context-sensitive utility value for the word in dependence upon the second and third measures; and assigning the status in dependence upon the context-sensitive utility value.

The method may comprise determining the context-sensitive utility value also in dependence upon the first measure.

The method may comprise assigning a "study" status to a predetermined number of words having the highest determined context-sensitive utility values, a "study" status indicating that a learning action is recommended.

The method may comprise assigning a "study" status to words having a context-sensitive utility value higher than a predetermined threshold, a "study" status indicating that a learning action is recommended.

The method may comprise assigning an "ignore" status to those words not assigned a "study" status, an "ignore" status indicating that a learning action is not yet recommended.

The second measure may be independent of the learner.

The second measure may be a function of the general utility of the word.

The general utility of the word may be based upon any one or more of: word frequency; concreteness; word difficulty; word field membership; personal study list membership; imageability; and importance to a subject area or curriculum.

The second measure may be zero for all words of a certain type, for example nouns, and the general utility of the word otherwise.

The third measure may be a function of the utility of the word within the context of the text.

The utility of the word within the context of the text may be determined in dependence upon the frequency of the word within the text.

The utility of the word within the context of the text may be determined in dependence upon the relative frequency of the word within the text compared to its overall frequency within a predetermined scope of words, where the predetermined scope comprises for example all the words in a sample of the language concerned.

The third measure may be a function of the local importance of the word within the text.

Each word may be associated with a tag placing it as part of a group of words relating to a common concept, and wherein the third measure is a function of the frequency of the word's associated tag within the text.

The method may comprise assigning the status in such a way as to limit the number and/or density of words assigned to a "study" status, a "study" status indicating that a learning action is recommended.

The first measure may be a function of the learner's existing knowledge of the word.

The first measure may provide an indication of the extent or degree to which the learner has mastered the word.

The first measure may be related to the number of times that the learner has seen or read the word.

The first measure may effectively apply a distribution to the extent or degree to which the learner has mastered the word, for example a Gaussian distribution, that is arranged to focus the selection of study words onto a narrower range of potentially useful words.

The method may comprise determining from the first measure a Boolean indication of whether or not the learner has mastered the word.

The method may comprise, for at least some of the words of the text for which the first measure is determined to place those words within a predetermined category of words, assigning the status in dependence upon the first measure but not the second or third measures.

The predetermined category of words may include those words that are considered already to be learned by the learner.

The method may comprise assigning a "mastered" status to those words determined to be in the predetermined category, the "mastered" status indicating for example that no learning action is required or that a revision exercise be suggested.

The method may comprise re-assigning the status of at least some of the words when it is signalled that one or more of the first to third measures has changed or might have changed.

Communicating the text to the learner may comprise displaying the text on a display, providing the text in printed form, and/or emitting an audio representation of the text.

According to a second aspect of the present invention there is provided a vocabulary learning apparatus comprising means for processing each of at least some of the words in a text selected for communication to the learner by: (a) assigning a status to the word in dependence upon first, second and third measures for the word, where the assigned status is selected from a plurality of predetermined statuses, where each status is associated with at least one corresponding recommended action to be taken by the learner, where the first measure is dependent on the learner, where the second measure is independent of the text, and where the third measure is dependent on the other words of the text; and (b) communicating the assigned status to the learner, at least if requested.

According to a third aspect of the present invention there is provided a program for controlling an apparatus to perform a method according to the first aspect of the present invention or which, when loaded into an apparatus, causes the apparatus to become an apparatus according to the second aspect of the present invention. The program may be carried on a carrier medium. The carrier medium may be a storage medium. The carrier medium may be a transmission medium.

According to a fourth aspect of the present invention there is provided an apparatus programmed by a program according to the third aspect of the present invention.

According to a fifth aspect of the present invention there is provided a storage medium containing a program according to the third aspect of the present invention.

An embodiment of the present invention provides a learning system that can immediately determine the context-specific utility of words in language material that a learner is currently attending to, upon any changes in the learning system, so as to indicate recommended learning actions to the learner.

Recall that in this document, when we use the term "word", we mean word, phrase, term, or other unit of vocabulary.

In the preferred embodiment, a language learning device such as an electronic book reading device that includes a learner model and a vocabulary model, e.g. as is disclosed in the method and apparatus of GB0702298A0, is modified to include the present system.

The system combines information from a model of the learner, a model of vocabulary utility, and an analysis of the current language material. The system can immediately re-determine word utility when changes occur to the learner model, the vocabulary model, or the current language material. Changes can result from learner actions or internal system processes.

The system can indicate recommendations to a learner about possible learning actions such as whether a particular word should be studied now or later, dependent on the utility of words.

The system is context-sensitive because it takes into account the particular properties of the language material currently being studied and the learner's current (estimated) knowledge state. These properties of the language material include but are not limited to local word frequencies, local word importance, and semantic relations between words.

A learner model for a particular learner tracks, for each word in a list of words, whether the learner has mastered the particular word, and can be updated every time the learner performs an action in the system, such as reading the word in context.

A vocabulary model associates each word with its general utility to all learners.

An embodiment of the invention has one or more of the following advantages.

An advantage of the system is that it reduces cognitive load and makes vocabulary acquisition more efficient and effective.

A further advantage of the system is that it can enable clearly indicated recommendations, i.e. to clearly show the status of any or every word in the language material to the learner.

A further advantage of the method is that it can immediately reflect changes in the knowledge sources it consults. In particular, changes in the learner model can be taken into account immediately to adapt the selection of study words (and in general the status of any particular word).

A further advantage of the method is that it can address gaps in a particular learner's vocabulary knowledge.

A further advantage of the method is that it works on any kind of language material (spoken or written) and any subject matter such as for example physics or history, or indeed any domain that includes vocabulary with which learners have to familiarize themselves.

Brief description of the drawings

Figure 1 is a block diagram of the preferred embodiment.

Figure 2 is a flow chart of the word status determination process.

Figure 3 shows a page of an example text being displayed in a text-reading interface.

Figure 4 is an excerpt of the word frequencies in the example text.

Figure 5 is an example learner model.

Figure 6 is an excerpt of a vocabulary model.

Figure 7 shows an excerpt of the ranking of the unmastered words according to context-sensitive utility.

Figure 8 shows a page of the example text with highlighted study words.

Figure 9 shows an alternative way of indicating study words.

Figure 10 shows a way of indicating word status IGNORE.

Figure 11 shows a way of indicating word status MASTERED.

Figure 12 shows an excerpt of the ranking of unmastered words according to a context-sensitive utility measure that takes semantic field membership into account.

Detailed description

A preferred embodiment of the present invention provides timely word status determination and notification within a reading-based device for language learning and in particular vocabulary learning.

Figure 1 is a block diagram of the components of the preferred embodiment.

A device for language learning and in particular vocabulary learning has a text-reading interface 100. The text-reading interface displays the current text 130. The device contains a learner model 110 that stores a particular learner's vocabulary knowledge.

The device contains a vocabulary model 120 that associates words with general utility values. The word status determination component 140 has access to the learner model 110, the vocabulary model 120, the current text 130, and optionally to further information sources 150. The word status determination component 140 determines and reports word status to a word status notification component 160 that controls indication of word status in the text-reading interface 100.

Those skilled in the art will appreciate that a device for language learning may include further components and that the components may communicate to each other in ways not explicitly shown in Figure 1. Those skilled in the art will appreciate that the components illustrated in Figure 1 may be implemented as separate components or several or all of them may be combined into a single component.

Figure 3 shows an example of a text-reading interface 100 that will be referred to in the following text.

The function of the components shown in Figure 1 will now be described in greater detail.

The text-reading interface 100 displays electronic text and provides user controls for a variety of possible user actions including, but not limited to, moving between pages and selecting words. When text is loaded into the text-reading interface 100, an event can be signaled.

The learner model 110 stores an estimate of a learner's degree of mastery of words.

Learner modeling is well known in the prior art, and any suitable learner model can be employed in the preferred embodiment. Since a learner's actual mastery of a word cannot be directly observed, conventional learner models estimate degree of mastery of particular words based on evidence gained from learner interactions with the system.

In the preferred embodiment, the learner model maintains probabilities that a particular learner has mastered a particular word. Numerical values such as probabilities can be converted into Boolean (i.e., True/False) values by the application of a threshold. For example, probabilities greater than 0.8 could be converted into True (i.e., the learner has mastered the word) and other probabilities into False. Figure 5 shows an example of a learner model, represented as a table, at a certain point in time.
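By way of illustration only, the conversion of mastery probabilities into Boolean values might be sketched in Python as follows. The dictionary contents, the function name `has_mastered`, and the treatment of words absent from the model are assumptions made for this sketch; only the example threshold of 0.8 comes from the description above.

```python
from typing import Dict

# Hypothetical learner model: word -> estimated probability of mastery.
# The entries are invented for illustration and are not taken from Figure 5.
learner_model: Dict[str, float] = {
    "say": 0.35,
    "man": 0.10,
    "house": 0.93,
    "year": 0.80,
}

MASTERY_THRESHOLD = 0.8  # example threshold mentioned in the description


def has_mastered(word: str, model: Dict[str, float],
                 threshold: float = MASTERY_THRESHOLD) -> bool:
    """Return True only if the mastery probability is greater than the threshold."""
    # Assumption: a word missing from the model is treated as unmastered.
    return model.get(word, 0.0) > threshold
```

Under these assumptions a probability exactly equal to the threshold counts as not mastered, because the description speaks of probabilities "greater than 0.8".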

The learner model 110 can be queried by the word status determination component 140. Given a particular word, the learner model 110 will return the learner's estimated degree of mastery of the word. Upon any change in its internal state, an event can be signaled. The state of the learner model can change, for example, when a learner reads a word in the context of the current text, or when the learner explicitly demonstrates knowledge of the word by giving a correct answer in a vocabulary test.

The vocabulary model 120 associates words with a numerical value of general utility.

The model can contain any list of words deemed useful for learning. In the preferred embodiment, the model includes only open-class words such as nouns, verbs, and adjectives. However, the model could be designed to support a particular language curriculum, for example, to learn the words of animals and plants, in which case only such words would be included. Or, for example, in learning the subject area of physics, the model could contain terminology relevant to motion, gravity, and so on. The vocabulary model 120 may determine the utility value for each word by consulting frequency-ranked vocabulary lists or vocabulary lists that are ranked by any combination of word frequency, concreteness, word difficulty, word field membership, personal study list membership, imageability, importance to a subject (e.g., to physics) or to a curriculum, or other factors not limited by this list. The vocabulary model 120 may be chosen from a plurality of vocabulary models in dependence of a function of the learner such as an area of interest or a course the learner is enrolled in.

The word status determination component 140 can query the vocabulary model 120 for the general utility value of any given word. Upon any change in internal state of the vocabulary model 120, an event can be signaled. The state might change for example, if the curriculum on which the model is based is changed by a teacher. Figure 6 shows an example of a vocabulary model, represented as a table.

The word status determination component 140 may optionally augment the general utility value by taking into account further factors such as for example semantic fields by consulting further information sources 150. Details are provided below.

The word status determination component 140 determines word status by performing a word status determination process that uses the learner model 110, the vocabulary model 120 and an analysis of the current text 130 to determine the status of a word in the text 130. The process is described in detail below.

In the preferred embodiment word status values include, but are not limited to, MASTERED, STUDY, and IGNORE, corresponding to words that the learner has already mastered, should actively study, or should ignore, respectively. A further possible status value is ASSUMED_KNOWN, corresponding to words that the learner is assumed to know without having actively studied them through the reading-based language learning device. After determining word status, component 140 reports the status to the word status notification component 160.

The word status notification component 160 provides a means of notifying the learner of the status of individual words displayed in the text-reading interface 100, thereby giving clear recommendations. In the preferred embodiment, a word having the status STUDY leads to the recommendation that the learner should study the word now, and is indicated by highlighting the word by any visual means, for example using reverse video, bold typeface, or a different text colour. A word with another status (IGNORE or MASTERED) leads to a recommendation that the learner should not study the word, and is indicated by lack of highlighting. Figure 8 shows the example page in the text-reading interface with visual highlighting.
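As a rough illustration of such notification, the following Python sketch marks words having the status STUDY with square brackets, standing in for whatever visual highlighting a real interface would use. The function name `highlight_study_words` and the bracket convention are hypothetical. The sketch matches surface forms only and performs no normalization to root or citation forms, which is a simplifying assumption.

```python
import re
from typing import Dict


def highlight_study_words(text: str, statuses: Dict[str, str]) -> str:
    """Wrap every word whose status is STUDY in square brackets."""

    def mark(match: "re.Match[str]") -> str:
        token = match.group(0)
        # Words with any other status (or no status) are left unhighlighted.
        if statuses.get(token.lower()) == "STUDY":
            return "[" + token + "]"
        return token

    return re.sub(r"[A-Za-z]+", mark, text)


example_statuses = {"man": "STUDY", "say": "STUDY", "the": "MASTERED", "so": "IGNORE"}
highlighted = highlight_study_words("The man did say so.", example_statuses)
```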

Other types of recommended learning actions can include reviewing a word, putting a word on a list for later study or revision, associating personal notes or drawings with the word, and looking up information about the word (e.g. a definition or personal notes).

Other types of indication include pop-up information boxes (see variation below), sounds (see variation below), change of mouse cursor shape, and displaying word status in a designated area on the screen.

Figure 2 is a flow chart of the word status determination process performed by component 140.

The word status determination process is triggered when an event is received in step 200. An event can be that a text 130 is loaded into the text-reading interface, that a change in the learner model 110 occurred, that a change in the vocabulary model 120 occurred, that the learner selected a word in the text-reading interface 100 or any other change of state. Those skilled in the art will appreciate that event signalling is only one possible method of starting the process. Other methods might include direct calls to the function by other components or message passing mechanisms.
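One possible arrangement of the event signalling described above is sketched below in Python. It is a minimal publish/subscribe sketch under stated assumptions: the class names `EventSource` and `WordStatusDetermination`, the method names, and the event names are all hypothetical, and the stand-in for component 140 merely records which events triggered it rather than performing the full process.

```python
from typing import Callable, List


class EventSource:
    """Minimal publish/subscribe helper (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def signal(self, event_name: str) -> None:
        # Notify every subscribed listener of the event.
        for listener in self._listeners:
            listener(event_name)


class WordStatusDetermination:
    """Stand-in for component 140 that records each triggering event."""

    def __init__(self) -> None:
        self.runs: List[str] = []

    def on_event(self, event_name: str) -> None:
        # Step 200: receipt of any event starts the determination process.
        self.runs.append(event_name)


text_interface = EventSource()        # stands in for interface 100
learner_model_events = EventSource()  # stands in for learner model 110
component_140 = WordStatusDetermination()

text_interface.subscribe(component_140.on_event)
learner_model_events.subscribe(component_140.on_event)

text_interface.signal("text_loaded")
learner_model_events.signal("learner_model_changed")
```

As the description notes, direct function calls or message passing would serve equally well; the observer arrangement is chosen here only because it keeps the components decoupled.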

The second step 210 obtains the words that occur in the current text 130.

Step 220 assigns the status MASTERED to all words in the current text that the learner has mastered according to the learner model 110.

Step 225 obtains word information from the current text 130. Word frequencies in the current text 130 can indicate, for example, how many opportunities there are to learn a specific word. The step can perform a linguistic analysis of the text to segment the text into vocabulary units such as words. (Recall that by the term "word", we mean word, phrase, term, or other unit of vocabulary in a text.) The step can also perform a linguistic analysis that converts words in the text into a normalized representation, for example, root forms or citation forms, that coincide with the forms used in the vocabulary model 120 and the learner model 110. The linguistic analysis can include stemming, part-of-speech tagging, named entity recognition, phrase identification, or other processes not limited by this list. Finally, in the preferred embodiment, a frequency analysis is performed by computing the number of times each unique word occurs in the text. Figure 4 shows an example of the word frequencies that are obtained from a text. Other types of word information can be gathered. For example, the importance of a word within the current text can be computed by comparing the relative frequency of a word in the text to its frequency in the language in general, using a well-known statistical means such as mutual information or t-scores.
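The frequency analysis of step 225 might be sketched in Python as follows. The tokenizing regular expression and the small `CITATION_FORMS` table are hypothetical stand-ins for a real linguistic analysis such as stemming or part-of-speech tagging. The function `relative_frequency_ratio` is a deliberately simple substitute for the mutual information or t-score measures mentioned above, not an implementation of either.

```python
import re
from collections import Counter
from typing import Dict

# Toy normalization table standing in for real stemming or lemmatization.
CITATION_FORMS: Dict[str, str] = {
    "said": "say", "says": "say", "men": "man", "years": "year",
}


def word_frequencies(text: str) -> Counter:
    """Count how often each normalized word occurs in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(CITATION_FORMS.get(token, token) for token in tokens)


def relative_frequency_ratio(word: str, local_counts: Counter,
                             general_rel_freq: Dict[str, float]) -> float:
    """Ratio of the word's relative frequency in the text to its relative
    frequency in the language in general (values above 1 suggest local importance)."""
    total = sum(local_counts.values())
    local_rel = local_counts[word] / total
    return local_rel / general_rel_freq[word]


freqs = word_frequencies("The man said the men say a year.")
# "said" and "say" are counted together, as are "man" and "men".
```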

Step 230 ranks the remaining words by their context-sensitive utility value, which is calculated as a function of at least two other values: the general utility and the frequency in the text 130. In the preferred embodiment, the function is defined as follows:

$$\mathrm{CSU}(w) = k \times \mathrm{GU}(w) \times \mathrm{Freq}(w)$$

where GU(w) is the general utility of a word w according to the vocabulary model 120, Freq(w) is the frequency of that word in the text 130, as computed in step 225, and k is a constant. Figure 7 is an example of a ranking produced by step 230.
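The ranking of step 230 can be illustrated with a short Python sketch. The function names and all numerical values below are hypothetical and are not the values of Figures 4 to 7; they are chosen only so that a word with lower general utility but higher local frequency ("man") outranks a word with higher general utility ("year").

```python
from typing import Dict, List, Set, Tuple


def context_sensitive_utility(general_utility: float, frequency: int,
                              k: float = 1.0) -> float:
    """CSU(w) = k x GU(w) x Freq(w), as in the preferred embodiment."""
    return k * general_utility * frequency


def rank_by_csu(freqs: Dict[str, int], general_utility: Dict[str, float],
                mastered: Set[str], k: float = 1.0) -> List[Tuple[str, float]]:
    """Rank unmastered words that appear in the vocabulary model by CSU."""
    scored = [
        (word, context_sensitive_utility(general_utility[word], freq, k))
        for word, freq in freqs.items()
        if word in general_utility and word not in mastered
    ]
    # Highest CSU first; ties are broken alphabetically (an assumption).
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Invented example data.
example_freqs = {"say": 12, "man": 10, "year": 3, "house": 4, "the": 50}
example_gu = {"say": 0.9, "year": 0.95, "man": 0.7, "house": 0.6}
example_mastered = {"house"}

ranking = rank_by_csu(example_freqs, example_gu, example_mastered)
```

In this sketch "the" is skipped because it is absent from the vocabulary model, consistent with a model that includes only open-class words.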

More generally, the function is defined as follows:

$$\mathrm{CSU}(w) = k \times f(w)^{a} \times g(w)^{b} \times h(w)^{c}$$

where f is a function over the learner model (first measure), g is a function over general utility (second measure), and h is a function over the current text analysis (third measure); a, b, and c are constant weighting factors, and k is a normalizing constant. For example, in one variation, g could assign a value of zero to all words that are not nouns, and the general utility value otherwise. In one variation, f could return the direct estimate that the learner has mastered the word w. In this case, the context-sensitive utility value takes into account the degree of mastery rather than a simple Boolean mastered/unmastered distinction. In another variation, the f function could apply a distribution to the mastery value, such as, for example, a Gaussian distribution. This would have the effect of focusing the CSU values onto a shorter range of potentially useful words. In another variation, the f function could include the number of times that the learner has seen (or read) the word w. In another variation, the h function could return the local importance of the word w in the current text. Constants a, b, c, and k are calibrated by experimentation.
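A minimal Python sketch of the generalized function is given below. The function names are hypothetical, and the Gaussian weighting centred on a mastery probability of 0.5 with a spread of 0.2 is purely an assumption for illustration; the description does not specify the parameters of the distribution, and the constants would in practice be calibrated by experimentation.

```python
import math


def gaussian_mastery_weight(mastery: float, mean: float = 0.5,
                            sigma: float = 0.2) -> float:
    """One possible f: weight a mastery probability with an unnormalized Gaussian.

    The mean and sigma are invented values, not taken from the description.
    """
    return math.exp(-((mastery - mean) ** 2) / (2 * sigma ** 2))


def general_csu(f_value: float, g_value: float, h_value: float,
                a: float = 1.0, b: float = 1.0, c: float = 1.0,
                k: float = 1.0) -> float:
    """CSU(w) = k x f(w)^a x g(w)^b x h(w)^c."""
    return k * (f_value ** a) * (g_value ** b) * (h_value ** c)


# Setting a = 0 removes the learner-dependent factor, which reduces the
# general form to k x g(w) x h(w), the shape used in the preferred embodiment.
reduced = general_csu(0.3, 0.5, 4.0, a=0.0)
```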

Step 240 assigns the status STUDY to a predetermined number of top-ranking words in the ranked list or to all words with a context-sensitive utility value above a predetermined threshold.

Step 250 assigns the status IGNORE to the remainder of the ranked list.

Step 260 finally reports the word status assignments to the word status notification component 160.
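Steps 240 and 250 can be sketched as a single pass over the ranked list. The function below is an illustrative assumption rather than a definitive implementation; it supports either selection rule (a predetermined number of top-ranking words, or a threshold on the context-sensitive utility value).

```python
def assign_statuses(ranking, top_n=None, threshold=None):
    """Assign STUDY or IGNORE to each (word, score) pair in a ranked list.

    `ranking` must already be sorted from highest to lowest score.
    A word is marked STUDY if it is among the first `top_n` entries or
    if its score exceeds `threshold`; every other word is marked IGNORE.
    """
    statuses = {}
    for position, (word, score) in enumerate(ranking):
        in_top_n = top_n is not None and position < top_n
        above_threshold = threshold is not None and score > threshold
        statuses[word] = "STUDY" if (in_top_n or above_threshold) else "IGNORE"
    return statuses

example_ranking = [("say", 4.5), ("man", 2.4), ("year", 1.6)]  # illustrative scores
by_count = assign_statuses(example_ranking, top_n=2)
by_threshold = assign_statuses(example_ranking, threshold=2.0)
```

In this example both rules mark "say" and "man" as STUDY and "year" as IGNORE.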

The word status determination process will now be illustrated by means of an example.

Figure 3 shows one page of a longer example text (130) being displayed in a text-reading interface 100. Figure 7 is an example of a ranking produced by step 230 by using the example text of Figure 3, the example learner model of Figure 5 and the example vocabulary model of Figure 6. The top 20 ranked words are shown.

Figure 7 shows that "say" is the word with the highest context-sensitive utility value in the example text. "man" has a lower general utility than for example "year", "time", and "make" but occurs more frequently in the text 130 and thus has a higher context-sensitive utility.

In the preferred embodiment, when any system event occurs, for example, a learner selects a word in the text-reading interface 100, or the learner model 110 is updated as a result of a re-estimation that the learner has mastered a word, the word status determination process is performed again, and thus indications of word status in the text-reading interface can change immediately and dynamically.

Those skilled in the art will appreciate that optimizations of the preferred embodiment are possible, for example, to recalculate context-sensitive utility values only for words whose estimate of degree of mastery has changed in the learner model, or to calculate the context-sensitive utility value only for the word selected by the learner in the text-reading interface.
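One such optimization, recalculating values only for words whose mastery estimate has changed, might look like the following sketch. The cache structure and the `score_fn` callback are assumptions introduced for illustration.

```python
def update_scores(cached_scores, changed_words, score_fn):
    """Recompute the CSU only for words whose mastery estimate changed.

    `cached_scores` maps word -> previously computed CSU; `score_fn` is a
    hypothetical callback that recomputes the CSU of a single word.
    A new dictionary is returned and the cache itself is left untouched.
    """
    updated = dict(cached_scores)
    for word in changed_words:
        if word in updated:
            updated[word] = score_fn(word)
    return updated

cached = {"say": 4.5, "man": 2.4}
refreshed = update_scores(cached, {"man"}, lambda word: 0.0)
```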

Several variations of the preferred embodiment are permitted.

In one variation of the preferred embodiment, the context-sensitive utility function of Step 230 takes into account semantic field membership in the text 130. A further source of knowledge 150 assigns to each word in the text 130 a semantic tag.

Examples of semantic tags include "emotion", "food", "home", "money", "sports", "movement", "numbers", "materials", "communication", "social life", "time", "technology", and "parts of the body". The frequencies of each type of semantic tag can then be taken into account by the context-sensitive utility function that is used in Step 230 to rank the candidate study words. In this embodiment, the h function (above) takes into account the frequency of the semantic tag of the word in the current text 130. Figure 12 shows an excerpt of a ranking of unmastered words according to this context-sensitive utility measure.
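As a rough sketch of this variation, the h function could be computed from tag counts as shown below. The `tag_of` mapping stands in for the further source of knowledge 150, and its contents are invented for the example.

```python
from collections import Counter

def semantic_tag_frequencies(words, tag_of):
    """Count how often each semantic tag occurs among the words of a text."""
    return Counter(tag_of[w] for w in words if w in tag_of)

def h_semantic(word, tag_freqs, tag_of):
    """One possible h: the frequency in the text of the word's semantic tag.

    Words without a tag receive zero (an assumption of this sketch).
    """
    return tag_freqs.get(tag_of.get(word), 0)

tag_of = {"bread": "food", "cheese": "food", "coin": "money"}  # illustrative tags
words = ["bread", "cheese", "bread", "coin"]
tag_freqs = semantic_tag_frequencies(words, tag_of)
```

Here "cheese" occurs only once, but its tag "food" occurs three times, so it is scored more highly than its own frequency alone would suggest.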

In another variation of the preferred embodiment, recommendations are indicated by means of pop-up information boxes when the learner selects a word in the text-reading interface 100. Figure 9 shows an example of a pop-up that indicates that the word "father" has the status STUDY and should be studied now (in this example the pop-up not only includes the suggestion "STUDY", but also a brief explanation of the word to enable an immediate learning opportunity; in other examples the user could be linked to a learning task or directed to some other learning resource). Figure 10 shows an example of a pop-up that indicates that the word "superior" has the status IGNORE and should not be studied until later (this is indicated by the suggestion "SKIP" in the pop-up). Figure 11 shows an example of a pop-up that indicates that the word "be" has the status MASTERED, and could be studied now to restore memory of the word (this is indicated by the suggestion "REMEMBER?" in the pop-up). In general, any information or recommended learning action can be associated with such pop-ups or other user interface means.

In another variation of the preferred embodiment, Step 220 assigns either the status MASTERED or the status ASSUMED_KNOWN to all the words in the current text that the learner has mastered according to the learner model 110. The status MASTERED is assigned to a word if the learner has previously carried out learning actions for that word using the language learning device. If the learner has not previously carried out learning actions for that word using the language learning device, the word status ASSUMED_KNOWN is assigned. Keeping a record of which words the learner has carried out learning actions for can be made the responsibility of the learner model 110 or another component of the system.
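The distinction made in this variation can be sketched as a simple lookup. The two sets passed to the function are assumptions about how the learner model 110 (or another component) might expose its records.

```python
def known_word_status(word, mastered_words, studied_words):
    """Return the status of a word the learner is estimated to have mastered.

    `mastered_words` holds words mastered according to the learner model;
    `studied_words` holds words for which learning actions were carried out
    on the device. Returns None for words that are not mastered, which are
    left to the ranking steps.
    """
    if word not in mastered_words:
        return None
    return "MASTERED" if word in studied_words else "ASSUMED_KNOWN"

mastered = {"be", "the"}
studied = {"be"}
```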

In another variation of the preferred embodiment, the learner can control the maximum density of word occurrences assigned the status STUDY in step 240, rather than using the predetermined number of words. The learner can control the density by, for example, selecting a percentage between 0% and 100% using, for example, a slide control.
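One way to honour a maximum density is to walk down the ranked list and stop before the STUDY occurrences would exceed the learner's chosen share of the text. This greedy stopping rule is an assumption of the sketch; the variation above does not prescribe a particular procedure.

```python
def select_by_density(ranked_words, text_freq, total_tokens, max_density):
    """Pick top-ranked words until their occurrences would exceed the density.

    `max_density` is a fraction between 0.0 and 1.0 (for example, the value
    of a slide control); density is measured as STUDY word occurrences
    divided by the total number of word occurrences in the text.
    """
    budget = max_density * total_tokens
    study, used = [], 0
    for word in ranked_words:
        occurrences = text_freq[word]
        if used + occurrences > budget:
            break
        study.append(word)
        used += occurrences
    return study

freq = {"say": 5, "man": 4, "year": 2}  # illustrative frequencies
selected = select_by_density(["say", "man", "year"], freq, total_tokens=100, max_density=0.1)
```

With a 10% limit on a 100-word text, "say" and "man" (nine occurrences in total) are selected and "year" is left out.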

In another variation of the preferred embodiment, the language material is in audio or video form. The learner can navigate through the material and interrupt playback. This is analogous to navigating through textual material and selecting a word in the text-reading interface 100. When a word is selected in the audio or video material, a learning action is recommended according to the word status determined by the word status determination component 140. The word status notification component 160 may indicate study words by inserting sound or visual signals into the audio or video material.

In another variation of the preferred embodiment, the subject to be learned is not language per se. The textual material is for example a textbook or an encyclopaedia entry about a subject area to be learned such as physics, geography, or history. The textual material may be in the learner's first language. The vocabulary model 120 assigns general utility to words with respect to the subject area and may contain words that are relevant only to that subject area. The context-sensitive utility measure can use a further information source 150 that defines prerequisite relations between the words in the vocabulary model 120. This enables the word status determination component 140 to assign the status STUDY only to words whose prerequisite words have been mastered according to the learner model 110.
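The prerequisite check in this variation can be sketched as follows. The `prerequisites` mapping stands in for the further information source 150, and the physics terms are invented for the example.

```python
def eligible_for_study(word, prerequisites, mastered):
    """True if every prerequisite of `word` has been mastered by the learner.

    Words with no recorded prerequisites are always eligible.
    """
    return all(p in mastered for p in prerequisites.get(word, ()))

prerequisites = {
    "velocity": ["distance", "time"],
    "acceleration": ["velocity"],
}
mastered = {"distance", "time"}
```

Under these assumptions "velocity" may be assigned the status STUDY, whereas "acceleration" may not until "velocity" has itself been mastered.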

It will be appreciated that operation of one or more of the above-described components can be controlled by a program operating on the device or apparatus. Such an operating program can be stored on a computer-readable medium, or could, for example, be embodied in a signal such as a downloadable data signal provided from an Internet website. The appended claims are to be interpreted as covering an operating program by itself, or as a record on a carrier, or as a signal, or in any other form.

Claims

1. A computer-implemented vocabulary learning method comprising, following selection of a text for communication to the learner, the text comprising a plurality of words: for each of at least some of the words in the text, (a) assigning a status to the word in dependence upon first, second and third measures for the word, where the assigned status is selected from a plurality of predetermined statuses, where each status is associated with at least one corresponding recommended action to be taken by the learner, where the first measure is dependent on the learner, where the second measure is independent of the text, and where the third measure is dependent on the other words of the text; and (b) communicating the assigned status to the learner, at least if requested.
2. A method as claimed in claim 1, wherein step (a) comprises: determining a context-sensitive utility value for the word in dependence upon the second and third measures; and assigning the status in dependence upon the context-sensitive utility value.
3. A method as claimed in claim 2, comprising determining the context-sensitive utility value also in dependence upon the first measure.
4. A method as claimed in claim 2 or 3, comprising assigning a "study" status to a predetermined number of words having the highest determined context-sensitive utility values, a "study" status indicating that a learning action is recommended.
5. A method as claimed in claim 2, 3 or 4, comprising assigning a "study" status to words having a context-sensitive utility value higher than a predetermined threshold, a "study" status indicating that a learning action is recommended.
6. A method as claimed in claim 4 or 5, comprising assigning an "ignore" status to those words not assigned a "study" status, an "ignore" status indicating that a learning action is not yet recommended.
7. A method as claimed in any preceding claim, wherein the second measure is independent of the learner.
8. A method as claimed in any preceding claim, wherein the second measure is a function of the general utility of the word.
9. A method as claimed in claim 8, wherein the general utility of the word is based upon any one or more of: word frequency; concreteness; word difficulty; word field membership; personal study list membership; imageability; and importance to a subject area or curriculum.
10. A method as claimed in claim 8 or 9, wherein the second measure is zero for all words of a certain type, for example nouns, and the general utility of the word otherwise.
11. A method as claimed in any preceding claim, wherein the third measure is a function of the utility of the word within the context of the text.
12. A method as claimed in claim 11, wherein the utility of the word within the context of the text is determined in dependence upon the frequency of the word within the text.
13. A method as claimed in claim 11 or 12, wherein the utility of the word within the context of the text is determined in dependence upon the relative frequency of the word within the text compared to its overall frequency within a predetermined scope of words, where the predetermined scope comprises for example all the words in a sample of the language concerned.
14. A method as claimed in claim 11, 12 or 13, wherein the third measure is a function of the local importance of the word within the text.
15. A method as claimed in any preceding claim, wherein each word is associated with a tag placing it as part of a group of words relating to a common concept, and wherein the third measure is a function of the frequency of the word's associated tag within the text.
16. A method as claimed in any preceding claim, comprising assigning the status in such a way as to limit the number and/or density of words assigned to a "study" status, a "study" status indicating that a learning action is recommended.
17. A method as claimed in any preceding claim, wherein the first measure is a function of the learner's existing knowledge of the word.
18. A method as claimed in claim 17, wherein the first measure provides an indication of the extent or degree to which the learner has mastered the word.
19. A method as claimed in claim 17 or 18, wherein the first measure is related to the number of times that the learner has seen or read the word.
20. A method as claimed in claim 17, 18 or 19, wherein the first measure effectively applies a distribution to the extent or degree to which the learner has mastered the word, for example a Gaussian distribution, that is arranged to focus the selection of study words onto a narrower range of potentially useful words.
21. A method as claimed in any one of claims 17 to 20, comprising determining from the first measure a Boolean indication of whether or not the learner has mastered the word.
22. A method as claimed in any preceding claim, comprising, for at least some of the words of the text for which the first measure is determined to place those words within a predetermined category of words, assigning the status in dependence upon the first measure but not the second or third measures.
23. A method as claimed in claim 22, when dependent on claim 17, wherein the predetermined category of words includes those words that are considered already to be learned by the learner.
24. A method as claimed in claim 23, comprising assigning a "mastered" status to those words determined to be in the predetermined category, the "mastered" status indicating for example that no learning action is required or that a revision exercise be suggested.
25. A method as claimed in any preceding claim, comprising re-assigning the status of at least some of the words when it is signalled that one or more of the first to third measures has changed or might have changed.
26. A method as claimed in any preceding claim, wherein communicating the text to the learner comprises displaying the text on a display, providing the text in printed form, and/or emitting an audio representation of the text.
27. A method substantially as hereinbefore described with reference to the accompanying drawings.
28. A vocabulary learning apparatus comprising means for processing each of at least some of the words in a text selected for communication to the learner by: (a) assigning a status to the word in dependence upon first, second and third measures for the word, where the assigned status is selected from a plurality of predetermined statuses, where each status is associated with at least one corresponding recommended action to be taken by the learner, where the first measure is dependent on the learner, where the second measure is independent of the text, and where the third measure is dependent on the other words of the text; and (b) communicating the assigned status to the learner, at least if requested.
29. A program for controlling an apparatus to perform a method as claimed in any one of claims 1 to 27.
30. A storage medium containing a program as claimed in claim 29.