WO2020074067A1

WO2020074067A1 - Automatic language proficiency level determination

Info

Publication number: WO2020074067A1
Application number: PCT/EP2018/077467
Authority: WO
Inventors: Hannes LESKELÄ; Charles Li
Original assignee: Signum International Ag
Priority date: 2018-10-09
Filing date: 2018-10-09
Publication date: 2020-04-16

Abstract

There is provided a computerized system and method, and a computer program product, for determining a language proficiency level, LPL, based on an input text string, the method comprising: sending a first signal (S1) to a user device, the first signal (S1) comprising or being indicative of a first text string (Question); receiving a second signal (S2) from the user device, wherein the second signal (S2) is generated by a user interacting with the user device, the second signal (S2) comprising or being indicative of a second text string (Answer); processing the second text string (Answer) to derive at least one characteristic (C); determining, using a machine learning, ML, model: a first prediction (P1) based on a selection of the at least one characteristic (C); and a second prediction (P2) based on the raw data of the second text string (Answer); and determining a language proficiency level, LPL, to which the second text string (Answer) belongs, based on the first prediction (P1) and the second prediction (P2).

Description

AUTOMATIC LANGUAGE PROFICIENCY LEVEL DETERMINATION

TECHNICAL FIELD

The present disclosure relates to a computerized system, computerized method and computer program product for automatically determining a language proficiency level of a foreign language learner based on an input text string.

Specifically, in embodiments, the present disclosure relates to performing such automatic determination of a language level using a machine learning model specifically adapted for the task, and further to the updating of such an adapted machine learning model.

BACKGROUND

Students wishing to learn a foreign language are likely to start with varying degrees of proficiency. In order to offer a foreign language student appropriate exercises and content to optimize the student’s learning progress, an understanding of the student’s present proficiency level of the foreign language to be studied first needs to be determined. This includes the student’s understanding of the language and how well the student uses the language. The proficiency level of the foreign language to be studied may hereinafter also be referred to as a student’s language knowledge or proficiency level, or a user’s language proficiency level (LPL_USER).

A student of a foreign language, especially a student who is fairly new to studying the foreign language, is likely to be discouraged if the student is assigned a language proficiency level that is far from their actual language proficiency level, and may disengage completely, or feel the need to change level before having the possibility to properly assimilate the teaching material and make progress.

A student’s language proficiency level may be determined through assessment and assignment by a language teacher, or, if applicable, be based on the level of a foreign language education or class that the student is currently in.

Currently, to determine the level of language proficiency of a student and thereby be able to adapt the language education to the student’s previous proficiency level, each student undertakes a lengthy multiple-choice test, for example in a digital online or offline environment, the test often taking around 40 minutes. This solution works well in the sense that it accurately assesses the students’ level of knowledge or proficiency. However, since it takes so long, it is only applicable for classic “old school” courses. A student wanting to find an online foreign language course via a computer or smart device application or web-based interface, for self-studies or distance learning, for example, is not interested in spending 40 minutes undertaking a test, and then waiting even longer for a result, before being granted access to the language learning content. This is likely to disengage the student before the assessment is complete. If the student is to attend a regular course with a teacher and scheduled classes, the test may be more acceptable, but it is still not desirable that it takes so long to perform and evaluate.

A student is also likely to be discouraged by too much writing, so the assessment should be made as short as possible. With a shorter test, however, the language level assessment risks being inaccurate, which leads back to the problem of discouraging the student because the assigned language proficiency level is far from their actual language proficiency level and the content presented is therefore not relevant.

Another problem is that the test results may sometimes be inaccurate, as it is very easy to improve the result if the test is taken more than once by the same student.

Some present solutions provide slight adaptation of the test questions by presenting harder subsequent questions to a student getting the previous questions right, or easier subsequent questions to a student getting the previous questions wrong. By this slightly less static approach, the chance that the test result will be accurate increases and the risk that a student re-taking the test obtains an improved result is somewhat reduced. However, the accuracy problem is not entirely solved and the time issue is not at all addressed by this adaptation.

There is clearly a need for a solution to these problems.

SUMMARY

Embodiments of the present disclosure solve, or at least ameliorate, the identified problems.

Advantageously, embodiments presented herein enable a two-step prediction, or assessment, of the language proficiency level of a user based on different aspects of the content of a text string representing a user input answer to one or more questions prompted to the user. Embodiments presented herein obtain this aim by the use of an adapted and specifically trained machine learning model. The two-step prediction takes into consideration different aspects of text, writing and language characteristics, as well as analysis of the entire answer in raw text format. The machine learning model may comprise a combination of two different types of machine learning models of which each is configured to perform one of the two steps of the two step prediction. The machine learning model may in these cases for example comprise a regression or classification based model in combination with a neural network (NN) based model.

A further advantage that the inventors have realized is that, by using the two-step prediction, only one or a few open questions need to be prompted to the user, and each question can solicit a broad range of answers from the user, while the system and method are still able to perform reliable prediction and language proficiency level determination based on the answer. As only one or a handful of questions need be asked, the time that the user has to spend on the test is radically shorter than in the case of a lengthy questionnaire wherein each question is sometimes used for assessing only one or maybe two aspects of the user’s language use. In other words, embodiments herein enable a user-friendly language proficiency level determination that is fast, automatic, accurate and reliable.

According to a first aspect, there is provided a computerized system for determining a language proficiency level based on an input text string, the system comprising a back-end node, in turn comprising: at least one processor; a first interface configured to enable communication between the at least one processor and one or more user devices comprised in a front-end node; and a memory configured to store a current ML module and being configured to communicate with the processor. One or more of the at least one processor is configured to: send a first signal to a user device, via the first interface, wherein the first signal comprises or is indicative of a first text string; receive, in response to sending the first signal, a second signal from the user device, via the first interface, wherein the second signal is generated by a user interacting with the user device, and wherein the second signal comprises or is indicative of a second text string. One or more of the at least one processor is further configured to process the second text string, to derive at least one characteristic of the second text string, the at least one characteristic comprising: the average length of words; the use of advanced words; the average length of sentences; the number of characters; the number of words; the lexical diversity; the number of long words; N-grams; grammatical errors; and/or misspelled words. One or more of the at least one processor is further configured to determine, using a current machine learning (ML) model: a first prediction based on a selection of the at least one characteristic; and a second prediction based on the raw data of the second text string, and to determine a language proficiency level, LPL, to which the second text string belongs, based on the first prediction and the second prediction. 
According to another aspect, there is provided a method for determining a language proficiency level, LPL, based on an input text string, the method comprising: sending, by a processor, a first signal to a user device, via a first interface, wherein the first signal comprises or is indicative of a first text string; in response to sending the first signal receiving, in the processor, a second signal from the user device, via the first interface, wherein the second signal is generated by a user interacting with the user device, and wherein the second signal comprises or is indicative of a second text string. The method further comprises processing the second text string, by a processor, to derive at least one characteristic of the second text string, the at least one characteristic comprising: the average length of words; the use of advanced words; the average length of sentences; the number of characters; the number of words; the lexical diversity; the number of long words; N-grams; grammatical errors; or misspelled words. The method further comprises determining, using a machine learning, ML, model: a first prediction based on a selection of the at least one characteristic; and a second prediction based on the raw data of the second text string. Thereafter, the method comprises determining, using a processor, a language proficiency level, LPL, to which the second text string belongs, based on the first prediction and the second prediction.

According to a further aspect there is provided a computer program loadable into a memory communicatively connected or coupled to at least one data processor, comprising software for executing the method according to any of the embodiments presented herein when the program is run on the at least one data processor.

According to yet another aspect there is provided a processor-readable medium, having a program recorded thereon, where the program is to make at least one data processor execute the method according to any of the embodiments presented herein when the program is loaded into the at least one data processor.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is now to be explained more closely by means of preferred embodiments, which are disclosed as examples, and with reference to the attached drawings.

Fig. 1 shows a schematic overview of a system according to one or more embodiments;

Fig. 2 is a flow chart of a computerized method for automatically determining a language proficiency level, according to one or more embodiments;

Fig. 3 is a flow chart of one or more optional method embodiments;

Fig. 4 is a flow chart of a computerized method for training a machine learning model for automatically determining a language proficiency level, according to one or more embodiments.

DETAILED DESCRIPTION

Introduction

In the present context, when we refer to a user we mean a person who wishes to study a foreign language using any type of education, from classroom based education to taking online classes, app-based self-studies, or any combination of these. A user may therefore for example also be referred to as a student, or a foreign language learner.

The inventors have realized that since existing solutions on how to assess and determine the language proficiency level (LPL) are such tedious processes, it is discouraging for the users and creates a major barrier of access to the educational content. The assessment may, for example, typically include a structured set of questions, often multiple choice questions, which can take up to 40 minutes to answer. After answering, an assessment is made before the educational content/material/class is accessible to the user.

The inventors have realized that in order to obtain a faster assessment or determination of a language proficiency level (LPL) related to the answers to one or more questions or assignments in a test situation, the assessment or determination must be automated. Attempts have been made to automatically assess the language proficiency level of students, LPL_USER, by evaluating a user translation of a test sentence selected from a corpus, by comparing the user translation with high-quality reference translations that are usually done by skilled translators (see “Automatic Measuring of English Language Proficiency using MT Evaluation Technology”, by Keiji Yasuda, Fumiaki Sugaya, Eiichiro Sumita, Genichiro Kikui and Toshiyuki Takezawa, Anthology: W04-1708, Volume: Proceedings of the Workshop on eLearning for Computational Linguistics and Computational Linguistics for eLearning, Year: 2004). The comparison is based on a limited set of metadata features, whereafter mathematical formulas are used for calculating the score of each sentence. This method, however, although reducing the time the student has to spend on performing the test compared to the multiple-choice questions (depending of course on the length of the text to be translated), is not very engaging for the student, and the accuracy of the result can still be improved. For example, a translation comprising the phrase “Me I am” may score equally high as the correct phrase “I am me”, as only a limited set of metadata features are employed, and the scoring of the mathematical algorithm is static, based on the identified metadata features. The inventors have realized that this problem, and others, can be solved or at least ameliorated through embodiments of the present disclosure by a computerized system, method and computer program product for determining a language proficiency level of an input answer to an open question that can solicit a broad range of answers from the user.
This engages the user much more than the multiple-choice questions or the translation, as it is similar to a chat situation, and thereby a much more natural way to assess language proficiency or knowledge level. It is similar to an oral test performed with a live tutor, as the system mimics a way a student would be assessed by a real human teacher. Through a few rounds of conversation, a teacher would be able to guess the student’s language proficiency level, and the system interaction, based on embodiments presented herein, performs in a similar way. As the answers may vary greatly in length, content and quality, there is no use in trying to measure them by comparison to a template and based on a set limited number of metadata features alone, as this would not render a reliable result. Instead, in embodiments presented herein, the system and method are configured to classify the answer by a two-step prediction of the language proficiency level of a text string representing a user input answer to a question prompted by the system. The text string representing a user input answer is hereinafter also referred to as a second text string Answer. The second text string Answer, i.e. the input of the user represented in text form, is hereinafter to be understood as the text form representation of the user input response to a single question, or the combined user input response to two or more consecutive questions prompted by the system. In other words, the second text string Answer may comprise the answer to more than one question prompted by the system, and the machine learning model of embodiments presented herein advantageously processes the entire second text string Answer at once in each of the two steps of the two-step prediction.

The two-step prediction may comprise determining a first language proficiency level prediction, referred to herein as a first prediction P1, based on input metadata, for example using a complex regression-based machine learning (ML) model, or regression model. The metadata features, below also referred to as characteristics C, are retrieved by processing the input text string. The metadata features may for instance comprise a selection of: the average length of words, the use of advanced words, the average length of sentences, the number of characters, the number of words, the lexical diversity, the number of long words (for instance > 10 characters), N-grams, grammatical errors, misspelled words, etc., describing characteristics of the text string.
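As a non-limiting illustration of how some of the characteristics C might be derived from the second text string Answer, a minimal Python sketch follows. The function name `derive_characteristics`, the dictionary keys, and the naive tokenisation (ASCII letters and apostrophes as words; `.`, `!` and `?` as sentence ends) are assumptions made for illustration and are not taken from the disclosure. Characteristics that need external resources (advanced-word lists, N-gram tables, grammar and spell checking) are left out.

```python
import re

# "Long" is taken as more than 10 characters, following the example in the text.
LONG_WORD_MIN_CHARS = 11


def derive_characteristics(answer: str) -> dict:
    """Derive a subset of the characteristics C from a raw answer string."""
    # Assumed tokenisation: words are runs of ASCII letters/apostrophes,
    # sentences are separated by '.', '!' or '?'.
    words = re.findall(r"[A-Za-z']+", answer)
    sentences = [s for s in re.split(r"[.!?]+", answer) if s.strip()]
    n_words = len(words)
    n_sentences = len(sentences)
    unique_words = {w.lower() for w in words}
    return {
        "num_characters": len(answer),
        "num_words": n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / n_sentences if n_sentences else 0.0,
        # Lexical diversity approximated as the type/token ratio (an assumption).
        "lexical_diversity": len(unique_words) / n_words if n_words else 0.0,
        "num_long_words": sum(1 for w in words if len(w) >= LONG_WORD_MIN_CHARS),
    }
```

A selection of the resulting values could then serve as the input vector for the model producing the first prediction P1.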

The two-step prediction may further comprise a step of determining a second language proficiency level prediction, referred to herein as a second prediction P2, based on the text string representing the user input, by validating the proficiency level of each sentence as a whole. This is achieved by evaluating such abstract concepts as vocabulary, grammar and overall language understanding through identification of complex correlations and relations obtained by previous, and possibly continuous, training of the neural network (NN) based ML model, or NN model, using large database(s) of labeled and unlabeled training data. The prediction is further described in connection with step 240 of Fig. 2. The two determined predictions are then used in combination to determine an LPL to which the input answer belongs, with high accuracy.

As a non-limiting example, the large database(s) of labeled and unlabeled training data may have been gathered during many years of experience in teaching language to learners, storing their test results and determined language levels (based on a relevant level system, such as for example the current English Live levels), and labeling of the training data content based on the determined language levels. The method may further comprise generating and/or continuously training, and the system may further be configured to generate and/or continuously train, an ML model configured to perform the above predictions based on the user input text data. In a non-limiting example, the number of predetermined LPLs is 16, but any other suitable number of levels, both lower and higher, is of course equally applicable, depending on the detail level required, preferences and system setup. Depending on the number of levels, the accuracy of the assessment, the resulting determined LPL, will of course vary. For example, a low number of LPLs renders high accuracy when it comes to selecting the “correct” LPL, but the granularity is low, and vice versa. An LPL may hereinafter also be referred to as simply a level.

In embodiments wherein an NN model is used for the second prediction, the fact that the NN model can infer the prediction without handcrafted features/characteristics, such as for instance the at least one characteristic C herein, also makes the solution more robust compared to existing solutions not using an NN model. Advantageously, the embodiments presented herein combine both types of models to render an even more reliable determination of LPL. During assessment, the learner may be prompted by the system to input, in text form or orally, answers to one or more questions posed by the system. In different embodiments, the one or more questions may be in text form or in audio form.

The one or more questions are advantageously broad, open questions that can solicit a broad range of answers from the user. The machine learning model employed by the system and method according to embodiments presented herein is adapted/trained to take as input text strings representing such differing free text answers as are input by different users and, from the text strings, derive metadata associated with sentences and words of the answer, as well as semantic information. The metadata may in some non-limiting examples comprise a selection of the number of words used in the answer, the number of sentences used, and the difficulty of words/sentences. From this derived information, the machine learning model is further adapted and/or trained to determine an LPL of the input answer/text string. Thereby, the test mimics an oral test taken before a live tutor, since the system is able to find complex relations and make an assessment of various combined parameters relevant to the language proficiency of the user from a single text string. Also, the assessment, or determination of an LPL, is fast, thereby solving the problem of the previous solutions taking up to 40 minutes as there are a great number of questions to be answered in order to provide information on an equally great number of separate relevant aspects of language proficiency.

In a non-limiting example, a user will be asked three questions via an app. The combined three responses form the text input to the service provided by the model. The service may in this non-limiting example return an LPL in the form of an integer between 1 and 16.

While the model may perform well in the aggregate, there will be times when one or both steps of the two-step prediction might be off, particularly if the user has not entered much text or oral input to evaluate. These instances will at times be predictable, particularly when the user has entered very little text. In order to minimize the risk of making erroneous assessments of the LPL based on an inaccurate prediction, the solution may in some embodiments further comprise determining a confidence value associated with the prediction, wherein the confidence value is indicative of the probability that the LPL is accurate within an acceptable predetermined deviation. The confidence value may for instance be expressed as a number, in a non-limiting example being a number between 0 and 1, or according to any other suitable interval or representation. In some embodiments, wherein the confidence value is below a preset threshold, the system-determined LPL may be discarded and the user may for example be prompted to input a new answer, or enabled to select an LPL, via an interactive menu or the like in a user interface of the user device, or another input device connected or communicatively coupled to the user device or system.
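The handling of a low confidence value can be illustrated by the following Python sketch. It is a sketch under stated assumptions only: the threshold value 0.6, the `Assessment` type, and the action labels are hypothetical, as the disclosure only specifies "a preset threshold" and leaves open how the confidence value itself is computed.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical value; the text only refers to "a preset threshold".
CONFIDENCE_THRESHOLD = 0.6


@dataclass
class Assessment:
    lpl: Optional[int]  # determined level, or None if the result was discarded
    action: str         # "accept" or "ask_again_or_self_select"


def gate_by_confidence(lpl: int, confidence: float,
                       threshold: float = CONFIDENCE_THRESHOLD) -> Assessment:
    """Keep the determined LPL only if its confidence reaches the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence is assumed to lie in [0, 1]")
    if confidence < threshold:
        # Discard the result; the caller may prompt for a new answer or
        # let the user select a level from a menu.
        return Assessment(lpl=None, action="ask_again_or_self_select")
    return Assessment(lpl=lpl, action="accept")
```

For example, `gate_by_confidence(7, 0.2)` discards the level, while `gate_by_confidence(7, 0.9)` accepts it.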

As stated herein, a student is likely to be discouraged by too much writing. Therefore, as the inventors have realized, the amount of text or oral answer to be input should be kept at a minimum, while still being sufficient to perform an adequate assessment or prediction. In other words, the ML model must work on short inputs as well as the lengthy inputs.

In the present context, a system or system unit obtaining information, data or the like is to be understood as the system or system unit receiving it through a pushing action and/or retrieving it through a pulling action from another system unit or unit external to the system, but communicably connected to the system or system unit in question.

System architecture

Below, embodiments of the inventive system are described in more detail, with reference to Fig. 1.

Fig. 1 shows a system 100, according to embodiments of the invention, for determining a language proficiency level based on an input text string, the system 100 comprising a back-end node 101 comprising: a processor 110; a first interface 120 configured to enable communication between the processor 110 and one or more user devices 150_1...n comprised in a front-end node 102; and a memory 140 configured to store a current ML module and being configured to communicate with the processor 110. The processor 110 may be configured to send a first signal S1 to a user device 150_i, via the first interface 120, wherein the first signal S1 comprises or is indicative of a first text string Question. In response to sending the first signal S1, the processor 110 may further be configured to receive a second signal S2 from each of the one or more user devices 150_1...n, via the first interface 120, wherein the second signal S2 is generated by a user 155_i interacting with the user device 150_i, and wherein the second signal S2 comprises or is indicative of a second text string Answer. In some embodiments, the first interface 120 may be configured to enable communication between the processor 110 and one or more user devices 150_1...n via a wireless network 180.

In some embodiments, a second processor 125 may be configured to send a first signal S1 to a user device 150_i, via the first interface 120. In these embodiments, the second processor 125 may further, in response to sending the first signal S1, be configured to receive a second signal S2 from each of the one or more user devices 150_1...n, via the first interface 120, wherein the second signal S2 is generated by a user 155_i interacting with the user device 150_i, and wherein the second signal S2 comprises or is indicative of a second text string Answer. Such a second processor 125 may be comprised in the back-end node 101 of the system 100, as illustrated by the dashed contouring of the unit 125 in Fig. 1, and the processor 110 and the second processor 125 may be implemented as separate processing units, or as a common processing unit. Alternatively, the second processor 125 may be external to the system 100 and communicably coupled to the one or more user devices 150_i and the first interface 120. The second processor 125 may further be configured to forward the received second text string Answer to the processor 110 for further processing. The processor 110 is further configured to process the second text string Answer to derive at least one characteristic C of the second text string, the at least one characteristic C comprising: the average length of words; the use of advanced words; the average length of sentences; the number of characters; the number of words; the lexical diversity; the number of long words; N-grams; grammatical errors; or misspelled words. This at least one characteristic C, as well as the raw data text string, is then used as input into a machine learning model configured to classify the text string into one of a number of selectable language proficiency levels based on the at least one characteristic C and the raw data text string.
This is enabled by the system 100 being configured to determine, using a current machine learning (ML) model: a first prediction P1 based on a selection of the at least one characteristic C; and a second prediction P2 based on the raw data of the second text string Answer. Thereafter, the processor 110 is further configured to determine a language proficiency level, LPL, to which the second text string Answer belongs, based on both the first prediction P1 and the second prediction P2. The processor 110 may in one or more embodiments be configured to select the language proficiency level, LPL, from a set of N predetermined language proficiency levels, LPLs. In some embodiments, the processor 110 may be configured to determine the language proficiency level, LPL, based on a predetermined algorithm configured to minimize the residual error or errors of the resulting language proficiency level, LPL, determination, or to minimize the total residual error by minimizing the individual error of each of the predictions P1, P2.
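The disclosure does not specify the predetermined algorithm used to combine the predictions. Purely as an illustrative assumption, one possibility is a convex blend of P1 and P2 whose weight is fitted by least squares on previously labeled answers, followed by rounding to one of N levels. The function names `fit_blend_weight` and `determine_lpl` are hypothetical, and N = 16 follows the non-limiting example given earlier.

```python
def fit_blend_weight(p1s, p2s, targets):
    """Return w in [0, 1] minimising sum((w*p1 + (1-w)*p2 - y)**2).

    Closed form: with d = p1 - p2 and r = y - p2, w = sum(d*r) / sum(d*d),
    clipped to [0, 1]. This is one assumed choice of residual-minimising rule.
    """
    num = sum((a - b) * (y - b) for a, b, y in zip(p1s, p2s, targets))
    den = sum((a - b) ** 2 for a, b in zip(p1s, p2s))
    if den == 0:
        return 0.5  # the two predictions always agree; any weight is equivalent
    return min(1.0, max(0.0, num / den))


def determine_lpl(p1, p2, w, n_levels=16):
    """Blend the two predictions and select one of n_levels integer levels."""
    blended = w * p1 + (1 - w) * p2
    return min(n_levels, max(1, round(blended)))
```

With historical predictions `[4, 10]` and `[6, 12]` against labeled levels `[5, 11]`, the fitted weight is 0.5, and `determine_lpl(4, 6, 0.5)` selects level 5.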

In some embodiments, the current ML model is configured to determine the second prediction P2 by performing semantic analysis on the raw data of the second text string Answer.

In one or more embodiments, the current ML model comprises a first machine learning model being configured to determine the first prediction P1, and a second machine learning model being configured to determine the second prediction P2. The first ML model may be a regression model, or a classification model. The second ML model may be an NN model. According to these embodiments, the system 100 is configured to determine the first prediction P1 using the regression model or classification model, and to determine the second prediction P2 using the NN model.

After determining the LPL, the processor 110 may further be configured to estimate a user language proficiency level LPL_USER associated with the user 155_i, based on the determined language proficiency level LPL. The processor 110 may further be configured to communicate the user language proficiency level LPL_USER to the user device 150_i, via the first interface 120. Thereby, the user 155_i is made aware of the test result.

In embodiments wherein a second processor 125 is employed, the second processor 125 may be configured to perform all or part of the communication with the one or more user devices 150_1...n, while the processor 110 may be configured to perform all processing of the received second text string Answer, predictions using the machine learning model, and the determination based on the predictions. Of course, the processor 110 may in these embodiments also be configured to perform all or part of the communication with the one or more user devices 150_1...n according to any of the embodiments herein, instead of or as a complement to the second processor 125.

According to some embodiments, the processor 110 may be further configured to determine a confidence level indicative of the accuracy of the determined language proficiency level LPL or user language proficiency level LPL_USER. Thereby, the reliability of the resulting LPL is improved. If the confidence level is below a predetermined threshold, the processor 110 may be configured to discard the result and optionally prompt the user to provide a new answer, via the first interface 120, or enable the user 155_i to select an LPL, via an interactive menu or the like in a user interface of the user device, or another input device connected or communicatively coupled to the user device or system. This may for example be the case if the answer provided by the user is too short for the system to be able to make proper predictions.

The system 100 may further comprise a second interface 130 configured to enable communication between the processor 110 and one or more data sources 160, wherein the one or more data sources 160 are configured to receive, store, and send training data for a machine learning, ML, model. In such embodiments, the processor 110 may be configured to send the language proficiency level LPL or user language proficiency level LPL_USER, together with the associated first text string Question and second text string Answer, to one or more data sources 160, via the second interface 130, to be included in the training data. Alternatively, the processor 110 may be configured to send the language proficiency level LPL or user language proficiency level LPL_USER to one or more data sources 160, and the second processor 125 may be configured to send the first text string Question and/or second text string Answer to the same one or more data sources 160, or a different selection of one or more data sources 160, for storage. In some embodiments, the second interface 130 may be configured to enable communication between the processor 110 and one or more data sources 160 via a wireless network 170. The stored LPLs, LPL_USERs, Questions and/or Answers may, once stored in a data source 160, in some embodiments be comprised in the training data for future use in training of a new ML model, or re-training of a current ML model, for use in a system or method described herein.

In some embodiments, the processor 110 may be configured to update the current ML model by: checking if there is new training data in one or more data source 160 and, if there is new training data in the one or more data source 160, obtaining training data from the one or more data source 160 that comprises the new training data. Thereafter, the processor 110 may be configured to obtain the current machine learning model from the memory 140; train the current machine learning model based on the obtained training data; and store the trained machine learning model as the current machine learning model in the memory 140.

The processor 110 may be configured to train a new ML model, instead of retraining the current ML model, based on training data from one or more data source 160. In such embodiments, the processor 110 may be configured to, after training, store the trained ML model in the memory 140 as the current ML model, which means that it is the newly trained model that will next be used in LPL determination according to any of the embodiments presented herein.

In one or more embodiments, the processor 110 may be configured to perform any or all of the method steps being performed by a processor in the method embodiments described in connection with Figs. 2, 3 and 4.

Method embodiments

Fig. 2 shows a method according to one or more embodiments for determining a language proficiency level, LPL, the method comprising:

In step 210: sending, by a processor, a first signal S1 to a user device, via a first interface, wherein the first signal S1 comprises or is indicative of a first text string Question.

In step 220: if a second signal S2 has been generated by the user device in response to the first signal S1, receiving, in the processor, the second signal S2 from the user device, via the first interface. The second signal S2 may be generated by a user interacting with the user device. In one or more embodiments, the second signal S2 comprises or is indicative of a second text string Answer.

The processor used to perform steps 210 and 220 may for example be the processor 110 or the second processor 125.

In step 230: processing, by the processor 110, the second text string Answer to derive at least one characteristic C.

In one or more embodiments, the processing of step 230 comprises comparing the second text string Answer to training data stored in the memory 140 and/or accessible via the data source 160 to determine at least one characteristic C of the second text string Answer.

The at least one characteristic C of the second text string Answer may also be referred to as metadata features of the text string and may for instance comprise a selection of: the average length of words, the use of advanced words, the average length of sentences, the number of characters, the number of words, the lexical diversity, the number of long words (for instance > 10 characters), N-grams, grammatical errors, misspelled words, etc.
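
As a non-limiting illustration, a few of the metadata features listed above may be derived from the second text string Answer as sketched below. The tokenization rules, the function name and the dictionary keys are assumptions made for illustration; features such as N-grams, grammatical errors and misspelled words would require additional resources and are omitted.

```python
import re

# "Long word" taken as more than 10 characters, following the example above.
LONG_WORD_MIN_CHARS = 11


def extract_characteristics(answer: str) -> dict:
    """Derive simple metadata features (characteristics C) from a text string."""
    # Words: runs of letters, optionally with one internal apostrophe.
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", answer)
    # Sentences: non-empty chunks between ., ! or ? (a deliberately crude rule).
    sentences = [s for s in re.split(r"[.!?]+", answer) if s.strip()]
    n_words = len(words)
    unique_words = {w.lower() for w in words}
    return {
        "num_characters": len(answer),
        "num_words": n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "lexical_diversity": len(unique_words) / n_words if n_words else 0.0,
        "num_long_words": sum(1 for w in words if len(w) >= LONG_WORD_MIN_CHARS),
    }
```

A selection of the resulting values may then serve as input to the model producing the first prediction P1.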

In step 240: determining, using a current ML model, a first prediction P1 based on a selection of one or more of the at least one characteristic C, and a second prediction P2 based on the raw data of the second text string Answer.

The current ML model may in one or more embodiments comprise a combination of a first ML model of a first type and a second ML model of a second type. In these embodiments, step 240 comprises determining the first prediction (P1) by the first ML model and determining the second prediction (P2) by the second ML model.

In one or more embodiments, the first ML model is a regression model or a classification model and the second ML model is an NN model. In these embodiments, step 240 comprises determining the first prediction P1 by the regression model, based on a selection of one or more of the at least one characteristic C, and determining the second prediction P2 by the NN model based on the raw data of the second text string Answer. Of course, the predictions are only referred to as “first” and “second” to show that there are two different predictions being made by the current ML model, not that there is a certain order in which the predictions must be performed. The determination of P1 and P2 may be made in any consecutive order, or in parallel, as the order does not affect the result of the method.

In embodiments wherein the first ML model is a regression model, the regression model may for example be a random forest model, a linear regression model, a logistic regression model, a Bayesian ridge regression model, a support vector machine (SVM) model or a combination of a suitable selection of the alternatives above. Alternatively, or in combination with any of these embodiments, the first ML model may comprise a classification model.

The use of a regression model is advantageous compared to using only classification model techniques, as the regression model provides a continuous scale of values instead of a more limited amount of discrete values/levels, which makes it easier to determine with precision how accurately the language proficiency level (LPL) is determined in step 250.

The regression model may in these embodiments be configured to use a set of metadata features, herein referred to as at least one characteristic C, derived from the second text string Answer as input, and to process the at least one characteristic C based on a set of logic rules and conditions, for example comprising conditions of the type “if this, then that”, to determine a first prediction. The set of logic rules and conditions to be comprised in the regression model may have been determined during training of the regression model, wherein the training was performed using training data from one or more of the data sources 160. In some embodiments, the selection of one or more of the determined at least one characteristic C of the second text string Answer may comprise selecting a predetermined subset of the determined characteristics C, which will provide the most accurate basis for language level assessment.

As a further advantage, use of a regression model also enables a more granular continuous monitoring of improvements of the user’s language proficiency, in the case that data of the user’s performance is available to the system over time, for example if the user, after the LPL has been determined, accesses language learning content provided by the provider of the LPL determination test, and/or via the same interface, webpage and/or mobile application. Of course, a classification model could still be used to monitor progress in the user’s language proficiency over time, but increased granularity enables more precisely following a user’s progress. For example, using a regression model for such follow-up enables creating continuous plots of a user’s language proficiency development and finding connections between this development and the type of content that the user is exposed to/presented with. Thereby, the system can learn what type of content each user should be fed in the future to maximize the individual user’s language proficiency development.

In embodiments wherein the second ML model is an NN model, the NN model may for example be a convolutional NN (CNN) model, a recurrent NN (RNN) model, a long short-term memory (LSTM) network model, or a recursive NN model. The NN model may be configured to, and the method step 240 may comprise, performing semantic analysis on the raw data of the second text string Answer to determine information indicative of the semantic content of the second text string Answer. Performing semantic analysis may comprise evaluating such abstract concepts as vocabulary, grammar and overall language understanding and possibly accent through identification of complex correlations and/or relations. The complex correlations and/or relations may have been obtained by previous, and possibly continuous, training of the NN model, using a large amount of labeled and unlabeled training data.

The first prediction P1, which may be performed by a regression model, and the second prediction P2, which may be performed by an NN model, may be represented as values on a common scale, or values on different scales. In the case of different scales, this is of course taken into account and compensated for in the subsequent determination of an LPL based on the two predictions P1, P2 in step 250.
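
As a non-limiting illustration, compensation for different scales may be sketched as a linear mapping of each prediction onto a common scale before step 250. The disclosure does not prescribe how the compensation is done; the linear mapping, the example ranges and the function name below are assumptions.

```python
def rescale(value: float, src_min: float, src_max: float,
            dst_min: float = 1.0, dst_max: float = 16.0) -> float:
    """Linearly map a prediction from its own scale onto a common scale."""
    if src_max <= src_min:
        raise ValueError("source scale must have src_max > src_min")
    fraction = (value - src_min) / (src_max - src_min)
    return dst_min + fraction * (dst_max - dst_min)


# Example: P1 already on a 1 to 16 scale, P2 produced on a 0 to 1 scale.
p1_common = rescale(12.0, 1.0, 16.0)
p2_common = rescale(0.5, 0.0, 1.0)
```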

In step 250: determining, using the processor 110, an LPL to which the second text string Answer belongs, based on the first prediction and the second prediction.

In one or more embodiments, the LPL is selected from a set of N predetermined LPLs.

In one or more embodiments, the determination of step 250 is made based on a predetermined algorithm configured to minimize the residual error or errors of the resulting LPL determination, or minimize the total residual error by minimizing the error of each individual prediction P1, P2. The two-step prediction, including predictions based on characteristics of the text string and the raw data of the text string, respectively, performed independently of each other, is what provides the high accuracy of the LPL determination in step 250. Using an ML model comprising both a regression model and an NN model to perform the two independent predictions is even more advantageous. Just using a metadata based regression model mimics how a human assesses how the language is used in the input second text string Answer in terms of word usage, length of sentences etc. However, one cannot rely solely on metadata, since that would be prone to an erroneous prediction if the input consisted of, for instance, a dictionary. This is why it is so advantageous to combine the regression model with an NN model to effectively gauge the semantics and meaning of the text, as well as latent features picked up by the network that are very hard to decipher for a human brain and almost impossible to transform into a useful rule for information extraction.
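
As a non-limiting illustration, one possible algorithm for combining the two predictions so as to minimize the residual error is a weighted average whose weight is fitted by least squares on labeled examples. The disclosure does not specify the predetermined algorithm; the blending approach, the function names and the sample values below are assumptions, and both predictions are assumed to be on a common scale.

```python
from typing import Sequence


def fit_blend_weight(p1: Sequence[float], p2: Sequence[float],
                     target: Sequence[float]) -> float:
    """Find w in [0, 1] minimizing sum((w*p1 + (1-w)*p2 - target)**2).

    Closed form: w = sum((p1-p2)*(target-p2)) / sum((p1-p2)**2).
    """
    numerator = sum((a - b) * (t - b) for a, b, t in zip(p1, p2, target))
    denominator = sum((a - b) ** 2 for a, b in zip(p1, p2))
    if denominator == 0:
        return 0.5  # predictions identical everywhere; any weight is equivalent
    return min(1.0, max(0.0, numerator / denominator))


def combine(p1: float, p2: float, weight: float) -> float:
    """Blend the first and second prediction into a single score."""
    return weight * p1 + (1.0 - weight) * p2


# Hypothetical labeled examples: (P1, P2, known LPL).
w = fit_blend_weight([10.0, 12.0, 6.0], [14.0, 8.0, 10.0], [12.0, 10.0, 8.0])
score = combine(10.0, 14.0, w)
```

The resulting continuous score may subsequently be mapped to one of the predetermined LPLs.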

The processor used to perform steps 230 to 250 may for example be the processor 110.

The first text string Question sent in step 210 may advantageously comprise one or more open questions that can solicit a broad range of answers from the user. As described herein, the machine learning model is further advantageously configured or trained to find complex relations and make an assessment of various combined parameters relevant to the language proficiency of the user from a single text string. A further advantage, and solution to an identified problem of previously existing solutions, is thereby that the input needed is a text string representing the answer to just a single, or possibly a low number of, open questions. Thereby, the time that the user has to spend on the test is radically shorter than in the case of a lengthy questionnaire wherein each question is sometimes used for assessing only one or maybe two aspects of the user’s language use. In other words, embodiments herein enable a user friendly language proficiency level determination that is fast, automatic, accurate and reliable, as the method described herein mimics the situation of an oral test taken before a live tutor both in the way the questions are asked, the complex assessment performed by such a live tutor, and how fast the result is determined.

The set of N predetermined LPLs may comprise a suitable number of integers or decimal numbers. In a non-limiting example, the number of LPL levels in the set may be based on current English Live levels, wherein N=16. Of course, any other suitable number of levels is equally applicable.
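
As a non-limiting illustration, selecting an LPL from a set of N predetermined levels may be sketched as choosing the level closest to a continuous score. The nearest-level rule, the tie-breaking towards the lower level and the function name are assumptions; the default set uses N=16 as in the example above.

```python
from typing import Iterable


def select_lpl(score: float, levels: Iterable[float] = range(1, 17)) -> float:
    """Return the predetermined level closest to the score.

    Ties are broken towards the lower level (an arbitrary choice).
    """
    return min(levels, key=lambda level: (abs(level - score), level))
```

Scores outside the range of the set are thereby clamped to the lowest or highest level.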

Fig. 3 shows a number of optional method steps that may in some embodiments follow after step 240 of Fig. 2.

In some embodiments, as illustrated in Fig. 3, the method may comprise a step 310 of estimating a language proficiency level LPLUSER of the user, based at least on the determined LPL. In other words, the method may comprise estimating a proficiency level for the user who has input the second text string, i.e. the Answer, in response to the text string Question. The LPLUSER may be set to the same value as the determined LPL, exemplified by the determined LPL being for instance 12 on a scale from 1 to 16, and the LPLUSER correspondingly being estimated to 12 on a scale from 1 to 16. Alternatively, the determined LPL may be set according to a first scale, having levels from 1 to N, expressed as integers or with one or more decimals, while the LPLUSER is set according to a different scale, for instance having a lower number of selectable levels. In a non-limiting example, the determined LPL may be set on a scale from 1 to 16, while the LPLUSER is set according to only 4 levels, wherein, as an example, LPL from 1 to 4 corresponds to LPLUSER = 1, LPL from 5 to 8 corresponds to LPLUSER = 2, LPL from 9 to 12 corresponds to LPLUSER = 3 and LPL from 13 to 16 corresponds to LPLUSER = 4. In a further non-limiting example, the LPLUSER levels 1 to 4 may correspond to four levels of classes or content that may suit foreign language students with different language proficiency. Of course, any number of levels for both the determined LPL and LPLUSER may be employed according to different circumstances.
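
As a non-limiting illustration, the example mapping from a 16-level LPL scale to a 4-level LPLUSER scale may be sketched as below. The function name is hypothetical, the sketch assumes that the number of LPL levels is an integer multiple of the number of LPLUSER levels, and the handling of decimal LPL values between two bands (rounded up to the higher band) is an assumption not stated in the text.

```python
import math


def lpl_to_user_level(lpl: float, n_lpl: int = 16, n_user: int = 4) -> int:
    """Map an LPL on a 1..n_lpl scale to an LPLUSER on a 1..n_user scale."""
    if n_lpl % n_user != 0:
        raise ValueError("n_lpl must be a multiple of n_user in this sketch")
    if not 1 <= lpl <= n_lpl:
        raise ValueError("lpl is outside the 1..n_lpl scale")
    band_width = n_lpl // n_user  # 4 LPL levels per LPLUSER level by default
    return min(n_user, math.ceil(lpl / band_width))
```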

In different embodiments, a user device 150 may comprise a graphical user interface via which the text string Question may be displayed to the user in graphical form, and/or a speaker via which the text string Question may be output as audio, possibly after being processed by text to speech functionality comprised in a front-end processing device. The user device may in different embodiments comprise one or more input devices, such as a text input device in the form of e.g. a keyboard or a touch display, and/or an audio input device comprising a microphone and speech to text functionality. None of these components are shown in the figures.

As shown in Fig. 3, the method may in some embodiments comprise a step 320, following step 310, of communicating the LPLUSER to the user device 150, via the first interface 120.

As further shown in Fig. 3, the method may in some embodiments comprise a step 330 of determining a confidence level indicative of the accuracy of the determined LPL or estimated LPLUSER.

The method according to any of the embodiments described herein may further comprise sending the LPL and/or LPLUSER, together with the associated first text string Question and second text string Answer, to the data source 160 to be included in the training data. That way, the more the system 100 is used, the more training data will be available to the ML model to train on and use for determination of LPL.

As further shown in Fig. 3, the method may in some embodiments comprise a step 340 of updating, by the processor, the current ML model, by: checking if there is new training data in one or more data source; if there is new training data in the one or more data source, obtaining training data from the one or more data source; obtaining a current machine learning model from the memory; training the current machine learning model based on the obtained training data; and storing the trained machine learning model as the current machine learning model in the memory. Thereby, the machine learning model may be iteratively, possibly continuously, updated and the current ML model version, i.e. the most recently updated or trained ML model, which is based on the largest amount of training data and thereby being the most reliable, is always available to the system 100. Also, the test material is not static, but improves with more usage from the users.

In some embodiments, the method comprises training a new ML model based on training data from one or more data source, instead of retraining the current ML model. In such embodiments, the new ML model may after training be stored in the memory 140 as the current ML model that will be used in LPL determination according to embodiments presented herein. The processor used to perform the method steps of Fig. 3 may for example be the processor 110.

In Fig. 4, details of some embodiments of the method for updating the current ML model in step 340 are illustrated. As seen in Fig. 4, the method step 340 may comprise the following sub-sequence of steps:

Step 400: checking if there is new training data in one or more data source 160 and, if there is new training data in one or more data source 160, continuing the method with step 410.

Step 410: obtaining, by the processor 110, training data from the one or more data source 160.

Step 420: obtaining, by the processor 110, the current machine learning model stored in the memory 140.

In different embodiments, step 420 may be performed before, after or in parallel with step 410.

Step 430: training the current machine learning model based on the obtained training data.

Step 440: storing, and thereby replacing, the trained machine learning model as the current machine learning model in the memory 140.

The processor used to perform the method steps of Fig. 4 may for example be the processor 110.
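
As a non-limiting illustration, the sub-sequence of steps 400 to 440 may be sketched as follows. The classes standing in for the data source 160 and the memory 140, the toy training function and all names are hypothetical; an actual training routine would depend on the ML model used.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str, float]  # (Question, Answer, LPL)


class DataSource:
    """Hypothetical in-memory stand-in for a data source 160."""

    def __init__(self) -> None:
        self._new: List[Example] = []

    def add(self, example: Example) -> None:
        self._new.append(example)

    def has_new_data(self) -> bool:
        return bool(self._new)

    def fetch_new(self) -> List[Example]:
        batch, self._new = self._new, []
        return batch


class Memory:
    """Hypothetical stand-in for the memory 140 holding the current ML model."""

    def __init__(self, model: dict) -> None:
        self.current_model = model


def toy_train(model: dict, data: List[Example]) -> dict:
    """Placeholder for real training: only counts the examples seen."""
    return {"n_examples": model["n_examples"] + len(data)}


def update_current_model(source: DataSource, memory: Memory,
                         train: Callable[[dict, List[Example]], dict]) -> bool:
    if not source.has_new_data():       # step 400: check for new training data
        return False
    data = source.fetch_new()           # step 410: obtain training data
    model = memory.current_model        # step 420: obtain the current model
    trained = train(model, data)        # step 430: train on the obtained data
    memory.current_model = trained      # step 440: store, replacing the current model
    return True
```

The function returns whether an update took place, so that a caller may for example schedule the check periodically.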

All of the process steps, as well as any sub-sequence of steps, described with reference to Fig. 2, 3 or 4 above may be controlled by means of a programmed data processor.

Moreover, although the embodiments of the invention described above with reference to the drawings comprise a data processor and processes performed in at least one processor, the invention thus also extends to computer programs, particularly computer programs on or in a carrier, adapted for putting the invention into practice. The program may be in the form of source code, object code, a code intermediate source and object code such as in partially compiled form, or in any other form suitable for use in the implementation of the process according to the invention. The program may either be a part of an operating system, or be a separate application. The carrier may be any entity or device capable of carrying the program. For example, the carrier may comprise a storage medium, such as a Flash memory, a ROM (Read Only Memory), an EPROM (Erasable Programmable Read-Only Memory), an EEPROM (Electrically Erasable Programmable Read-only Memory), or a magnetic recording medium, for example a floppy disc or hard disc. Further, the carrier may be a transmissible carrier such as an electrical or optical signal which may be conveyed via electrical or optical cable or by radio or by other means. When the program is embodied in a signal which may be conveyed directly by a cable or other device or means, the carrier may be constituted by such cable or device or means. Alternatively, the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted for performing, or for use in the performance of, the relevant processes.

Program code, which, when run in the processor 110, causes the system 100 to perform the method according to any of the method embodiments herein may already be pre-stored in an internal memory 140 of the system 100. The processor 110 is in such embodiments communicably connected to the memory 140.

In one or more embodiments, there may be provided a computer program loadable into a memory communicatively connected or coupled to at least one data processor, e.g. the processor 110, comprising software for executing the method according to any of the embodiments herein when the program is run on the at least one processor 110.

In one or more further embodiments, there may be provided a processor-readable medium, having a program recorded thereon, where the program is to make at least one data processor, e.g. the processor 110, execute the method according to any of the embodiments herein when the program is loaded into the at least one data processor.

The invention is not restricted to the described embodiments in the figures, but may be varied freely within the scope of the claims.

Claims

1. A computerized system (100) for determining a language proficiency level based on an input text string, the system (100) comprising:

a back-end node (101) comprising:

- at least one processor (110, 125);

- a first interface (120) configured to enable communication between the at least one processor (110, 125) and one or more user devices (150i_...n) comprised in a front-end node (102); and

- a memory (140) configured to store a current ML model and being configured to communicate with the processor (110),

wherein one or more of the at least one processor (110, 125) is configured to:

- send a first signal (S1) to a user device (150i), via the first interface (120), wherein the first signal (S1) comprises or is indicative of a first text string (Question); and

- receive, in response to sending the first signal (S1), a second signal (S2) from the user device (150i_...n), via the first interface (120), wherein the second signal (S2) is generated by a user (155i) interacting with the user device (150i), and wherein the second signal (S2) comprises or is indicative of a second text string (Answer),

wherein one or more of the at least one processor (110, 125) is configured to:

- process the second text string (Answer) to derive at least one characteristic (C) of the second text string, the at least one characteristic (C) comprising: the average length of words; the use of advanced words; the average length of sentences; the number of characters; the number of words; the lexical diversity; the number of long words; N-grams; grammatical errors; or misspelled words;

- determine, using a current machine learning, ML, model:

• a first prediction (P1) based on a selection of the at least one characteristic (C);

• a second prediction (P2) based on the raw data of the second text string (Answer); and

- determine a language proficiency level, LPL, to which the second text string (Answer) belongs, based on the first prediction (P1) and the second prediction (P2).

2. The system (100) of claim 1, wherein the current machine learning, ML, model, is configured to determine the second prediction (P2) by performing semantic analysis on the raw data of the second text string (Answer).

3. The system (100) of claim 1 or 2, wherein the current machine learning, ML, model comprises a first machine learning model being configured to determine the first prediction (P1), and a second machine learning model being configured to determine the second prediction (P2).

4. The system (100) of claim 3, wherein the first machine learning model is a regression model and the second machine learning model is a neural network, NN, model.

5. The system (100) of any of the preceding claims, wherein the processor (110) is further configured to estimate a user language proficiency level (LPLUSER) associated with the user (155i), based on the determined language proficiency level (LPL).

6. The system (100) of claim 5, wherein the processor (110) is further configured to communicate the user language proficiency level (LPLUSER) to the user device (150i), via the first interface (120).

7. The system (100) of any of the preceding claims, wherein the processor (110) is further configured to determine a confidence level indicative of the accuracy of the determined language proficiency level (LPL) or user language proficiency level (LPLUSER).

8. The system (100) of any of the preceding claims, wherein the processor (110) is configured to select the language proficiency level, LPL, from a set of N predetermined language proficiency levels, LPLs.

9. The system (100) of any of the preceding claims, wherein the processor (110) is configured to determine the language proficiency level, LPL, based on a predetermined algorithm configured to minimize the residual errors of the resulting language proficiency level, LPL, determination, or to minimize each individual prediction error (P1, P2).

10. The system (100) of any of the preceding claims, further comprising a second interface (130) configured to enable communication between the processor (110) and one or more data source (160), wherein the one or more data source (160) is configured to receive, store, and send training data for a machine learning, ML, model.

11. The system (100) of claim 9, wherein the processor (110) is further configured to send the language proficiency level (LPL) or user language proficiency level (LPLUSER), together with the associated first text string (Question) and second text string (Answer), to the data source (160), via the second interface (130), to be included in the training data.

12. The system (100) of any of the preceding claims, wherein the second processor (125) is configured to send the first text string (Question) or the second text string (Answer) to at least one data source (160), to be included in the training data.

13. The system (100) of any of the claims 11 or 12, wherein the processor (110) is further configured to update the current ML model by:

checking if there is new training data in one or more data source (160);

if there is new training data in the one or more data source (160), obtaining training data from the one or more data source (160);

obtaining the current machine learning model from the memory (140);

training the current machine learning model based on the obtained training data; and

storing the trained machine learning model as the current machine learning model in the memory (140).

14. A method for determining a language proficiency level, LPL, based on an input text string, the method comprising:

sending, by a processor, a first signal (S1) to a user device, via a first interface, wherein the first signal (S1) comprises or is indicative of a first text string (Question);

in response to sending the first signal (S1), receiving, in the processor, a second signal (S2) from the user device, via the first interface, wherein the second signal (S2) is generated by a user interacting with the user device, and wherein the second signal (S2) comprises or is indicative of a second text string (Answer);

processing the second text string (Answer), by a processor, to derive at least one characteristic (C) of the second text string, the at least one characteristic (C) comprising: the average length of words; the use of advanced words; the average length of sentences; the number of characters; the number of words; the lexical diversity; the number of long words; N-grams; grammatical errors; or misspelled words;

determining, using a machine learning, ML, model:

- a first prediction (P1) based on a selection of the at least one characteristic (C); and

- a second prediction (P2) based on the raw data of the second text string (Answer); and

determining, by a processor, a language proficiency level, LPL, to which the second text string (Answer) belongs, based on the first prediction (P1) and the second prediction (P2).

15. The method of claim 14, wherein the determining the second prediction (P2), using the current machine learning, ML, model, includes performing semantic analysis on the raw data of the second text string (Answer).

16. The method of claim 14 or 15, wherein the current machine learning, ML, model comprises a first machine learning model of a first type and a second machine learning model of a second type, wherein determining the first prediction (P1) is done using the first machine learning model and determining the second prediction (P2) is done using the second machine learning model.

17. The method of claim 16, wherein the first machine learning model is a regression model and the second machine learning model is a neural network, NN, model.

18. The method of any of the claims 14 to 17, comprising selecting the language proficiency level, LPL, from a set of N predetermined language proficiency levels, LPLs.

19. The method of any of the claims 14 to 18, comprising determining the language proficiency level, LPL, based on a predetermined algorithm configured to minimize the residual errors of the resulting language proficiency level, LPL, determination, or minimize each individual prediction error (P1, P2).

20. The method of any of the claims 14 to 19, further comprising the step of estimating a language proficiency level (LPLUSER) associated with the user, based at least on the determined language proficiency level (LPL).

21. The method of claim 20, further comprising the step of communicating the user language proficiency level (LPLUSER) to the user device via the first interface.

22. The method of any of the claims 14 to 21, further comprising determining a confidence level indicative of the accuracy of the determined language proficiency level (LPL) or user language proficiency level (LPLUSER).

23. The method of any of the claims 14 to 22, further comprising sending the language proficiency level (LPL) or user language proficiency level (LPLUSER), together with the associated first text string (Question) and second text string (Answer), to at least one data source, to be included in the training data.

24. The method of any of the claims 12 to 20, further comprising the step of updating, by the processor, the current ML model, by:

checking if there is new training data in one or more data source;

if there is new training data in the one or more data source, obtaining training data from the one or more data source;

obtaining a current machine learning model from the memory;

training the current machine learning model based on the obtained training data; and

storing the trained machine learning model as the current machine learning model in the memory.

25. A computer program loadable into a memory communicatively connected or coupled to at least one data processor, comprising software for executing the method according to any of the method claims 14 to 24 when the program is run on the at least one data processor.

26. A processor-readable medium, having a program recorded thereon, where the program is to make at least one data processor execute the method according to any of the method claims 14 to 24 when the program is loaded into the at least one data processor.