WO2024095819A1 - Proficiency level determination device, proficiency level determination method, and program - Google Patents

Proficiency level determination device, proficiency level determination method, and program

Info

Publication number
WO2024095819A1
Authority
WO
WIPO (PCT)
Prior art keywords
level
proficiency
proficiency level
question
assessment
Application number
PCT/JP2023/038290
Other languages
French (fr)
Japanese (ja)
Inventor
淳 渡辺
倫也 上田
Original Assignee
株式会社Z会
Application filed by 株式会社Z会
Publication of WO2024095819A1

Classifications

    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00: Teaching not covered by other main groups of this subclass

Definitions

  • the present invention relates to a proficiency level assessment device, a proficiency level assessment method, and a program for assessing the proficiency level of an answerer.
  • Patent Document 1 discloses an academic ability estimation model generation device that can generate an academic ability estimation model that accurately estimates current academic ability without requiring comprehensive learning data.
  • the academic ability estimation model generation device of Patent Document 1 includes a decision tree generation unit that generates a decision tree using correct/incorrect information as teacher data indicating whether multiple solvers who answered a group of predetermined problems answered each question correctly or incorrectly, a pruning unit that deletes the leaf node that is the end of the generated decision tree when the entropy of the classification result indicated by the leaf node is equal to or less than a predetermined value, and a category generation unit that sets each of the new ends of the decision tree after the leaf node deletion as a category to which one of the solvers belongs.
  • the present invention aims to provide a proficiency level assessment device that allows users to know the accuracy of the assessment level, which is the result of the proficiency level assessment made by a classifier.
  • the proficiency level assessment device of the present invention includes a classifier learning unit, a label distribution generation unit for each question and assessment level, a proficiency level assessment unit, and an end assessment unit.
  • the classifier learning unit uses answer data by multiple solvers, to which proficiency levels have been assigned as labels, as learning data, and learns each classifier dedicated to each problem that judges the proficiency level.
  • the problem/judgment level label distribution generation unit uses answer data by multiple solvers, to which proficiency levels have been assigned as labels, that does not include learning data, as test data, and generates a distribution of labels assigned to the test data (hereinafter, label distribution) for each proficiency level (hereinafter, judgment level) judged based on the test data by each classifier dedicated to each problem.
  • the proficiency level judgment unit judges the proficiency level using the dedicated classifier based on answer data for a specific problem by a specific solver to which no label has been assigned.
  • the completion judgment unit executes an end judgment of the proficiency level judgment for a specific solver based on an index indicating the degree of variation in the label distribution corresponding to the judgment level for the specific problem.
  • the proficiency level assessment device of the present invention makes it possible to know the accuracy of the assessment level, which is the result of the proficiency level assessment made by the classifier.
  • FIG. 1 is a block diagram showing the functional configuration of the proficiency level determination device according to the first embodiment.
  • FIG. 2 is a flowchart showing the classifier learning operation of the proficiency level determination device according to the first embodiment.
  • FIG. 3 is a flowchart showing the label distribution generation operation of the proficiency level determination device according to the first embodiment.
  • FIG. 4 is a diagram showing an example of a label distribution generated for each question and for each judgment level.
  • FIG. 5 is a flowchart showing the end determination operation of the proficiency level determination device according to the first embodiment.
  • FIG. 6 is a diagram explaining an example in which cumulative variance is used for end determination.
  • FIG. 7 is a diagram explaining an example in which a confidence interval is used for end determination.
  • FIG. 8 is a flowchart showing the question order optimization operation of the proficiency level determination device according to the first embodiment.
  • FIG. 9 is a diagram showing an example of a question order determination operation.
  • FIG. 10 is a diagram showing an example of the functional configuration of a computer.
  • the proficiency level determination device 1 of the present embodiment includes a determiner learning unit 100, a determiner storage unit 105, a label distribution generation unit for each question and determination level 110, a label distribution storage unit 115, a proficiency level determination unit 120, a completion determination unit 125, and a question order optimization unit 130.
  • the determiner learning operation of the proficiency level determination device 1 will be described below with reference to FIG. 2.
  • the classifier learning unit 100 learns each classifier that is dedicated to each question and judges the proficiency level using answer data by multiple solvers to which proficiency levels are assigned as labels (S100).
  • the classifier storage unit 105 stores each learned classifier (S105).
  • the answer data may be, for example, answers to English composition questions or English reading comprehension questions.
  • the answer may be an answer to a question in a subject other than English.
  • the answer may be an answer to a question in mathematics, Japanese, social studies, or science.
  • the mastery level can be, for example, five levels from level 1 (low evaluation) to level 5 (high evaluation). Alternatively, it may be 13 levels from A+ to A-, B+ to B-, C+ to C-, D+ to D-, and F, or it may be 10 levels from 1 to 10, or it may be a system with a full score of 0 to 100 points. In the following embodiment, an example of five levels from level 1 (low evaluation) to level 5 (high evaluation) will be described.
  • the evaluation of the proficiency level may be an evaluation of the entire subject, or an evaluation of a specific field of the subject. For example, when an English composition question is asked, the evaluation may be made on the overall English composition ability as the proficiency level, or, if the question range is determined (e.g., the use of specific grammar rules is determined), only the English composition ability in the corresponding question range may be evaluated.
  • Labeling may be performed, for example, by a teacher or a corrector judging the proficiency level for each piece of answer sheet data and assigning the label.
  • <Learning data> This is a large amount of answer data by multiple solvers, labeled with their proficiency levels. It is the same kind of data as the test data described below, but because the two serve different purposes, data that overlaps with the test data must not be used. In practice, it is sufficient to divide the large amount of labeled answer data at a specified ratio, and use one portion as learning data and the other as test data.
  • the classifier is a model trained by supervised learning using the above-mentioned labeled answer data (proficiency level) as training data, and is trained as a classifier dedicated to each problem as described above. For example, if a total of X English composition questions, from problem 1 to X, are given, classifier 1 dedicated to problem 1, classifier 2 dedicated to problem 2, ..., classifier X dedicated to problem X are trained.
  • n = 1, ..., N
  • the judgment level (for example, 3 on a 5-point scale)
  • classifier m (m = 1, ..., M) takes as input the answer sheet by respondent Q to English composition question m, judges respondent Q's proficiency level in English composition ability related to participial constructions, and outputs the judged level (for example, 4 on a 5-point scale).
  • the label distribution generation unit 110 for each question and judgment level uses answer data that does not include learning data among answer data by multiple solvers to which proficiency levels are assigned as labels as test data, and generates a distribution of labels assigned to the test data (hereinafter also referred to as a label distribution) for each proficiency level (hereinafter referred to as a judgment level) judged based on the test data by each judger dedicated to each question (S110).
  • the label distribution storage unit 115 stores the generated label distribution (S115).
  • <Test data> This is a large amount of answer data by multiple solvers, labeled with their proficiency levels. It is the same kind of data as the training data described above, but because the two serve different purposes, data that overlaps with the training data must not be used.
  • the accuracy of each classifier can be evaluated by evaluating the difference between the judgment level output by each classifier trained using the training data and the label assigned to the test data.
  • the label distribution is generated for each question and each judgment level.
  • An example of the label distribution is shown in FIG. 4.
  • the label distribution for judgment level 3 is a distribution of 4 answers labeled with a level 1 label, 18 answers labeled with a level 2 label, 100 answers labeled with a level 3 label, 17 answers labeled with a level 4 label, and 4 answers labeled with a level 5 label.
  • in other words, of the 143 answers that classifier 1 judged to be level 3, the 22 answers labeled level 1 or 2 were overrated, the 100 answers labeled level 3 were judged appropriately, and the 21 answers labeled level 4 or 5 were underrated.
  • the values in the above label distribution are scattered, and the smaller the scatter, the higher the performance of the classifier.
  • the label distribution for level 3 of classifier 2 for question 2 has 135 answers (out of a total of 160) labeled with level 3, and therefore has a smaller variance than the label distribution for level 3 of classifier 1.
  • the mastery level determination unit 120 determines the mastery level by a dedicated determiner based on the answer data of a predetermined problem by a predetermined solver to which no label has been assigned (S120).
  • the completion determination unit 125 performs a completion determination of the proficiency level determination for a given solver based on an index indicating the degree of variability in the label distribution corresponding to the determination level for a given question (S125).
  • the degree of variability in the label distribution is, for example, the variance of the label distribution.
  • the completion determination unit 125 can determine that the skill level determination is completed when the variance of the label distribution is lower than a predetermined threshold value.
  • for example, suppose that for solver R the judgment level for problem 1 is 3 and the judgment level for problem 2 is 2.
  • the variance values representing the degree of variation in the corresponding label distributions are 0.47 and 0.28, respectively (see the shaded areas in FIG. 4). If the threshold value set for the variance is, for example, 0.3, then classifier 2 satisfies the condition, the judgment level 2 output by classifier 2 is output as solver R's proficiency level judgment result, and the proficiency level judgment is terminated. In this case, solver R does not need to solve problem 3 in order to have his/her proficiency level judged, which reduces the burden on solver R.
  • the end judgment unit 125 can judge that the proficiency level judgment is ended when the cumulative variance of the multiple label distributions corresponding to the judgment levels in multiple questions is lower than a predetermined threshold.
  • if the threshold is set to 0.15, the proficiency level assessment of answerer R is determined to end at question 3. If the threshold is set to 0.10, the proficiency level assessment of answerer R continues after question 3. The cumulative average value can be used as the proficiency level assessment result of answerer R, both when the assessment ends at question 2 and when it ends at question 3 (the numerical values for these two cases are not reproduced here).
  • the 95% confidence interval calculated using the cumulative average value and cumulative variance obtained from the label distributions corresponding to the determination level 3 of question 1 and the determination level 2 of question 2 is [1.98, 3.18], and since two integer values (level 2, level 3) are included in the 95% confidence interval, it can be considered that the determination level is not yet determined to be 2 or 3.
  • the end determination unit 125 does not determine that the proficiency level determination is completed.
  • the 95% confidence interval calculated using the cumulative average value and cumulative variance obtained from the label distributions corresponding to each determination level of questions 1 to 3 is [2.30, 3.10], and since only one integer value (level 3) is included in the 95% confidence interval, it can be considered that the determination level is determined to be 3.
  • the completion determination unit 125 may determine that the mastery level determination is completed.
  • the end judgment unit 125 may correct the average value of the label distribution corresponding to the minimum or maximum judgment level in a given problem and the index indicating the degree of variation so as to approximate the label distribution corresponding to other judgment levels.
  • the minimum or maximum judgment level refers to judgment levels 1 and 5, for example, when judgment levels are 1 to 5.
  • the label distributions of judgment levels 1 and 5 are distributed only on one side and do not have a so-called bell curve shape.
  • the label distribution of judgment level 1 has distributions for labels 2 and 3, but labels 0 and -1 are not set, so there is no left side of the distribution.
  • the same is true for the label distribution of judgment level 5, where labels 6 and 7 are not set, so there is no right side of the distribution.
  • the termination judgment unit 125 may generate a pseudo distribution on one side that does not exist and synthesize it to correct the average value of the relevant label distribution and an index indicating the degree of variation so that they approximate the label distribution corresponding to the other judgment level.
  • labels with integer values smaller than the minimum judgment level and labels with integer values larger than the maximum judgment level can be prepared as dummy labels and assigned manually.
  • the question order optimization unit 130 sets the question with the smallest mean square error of the label distribution corresponding to the judgment level of a given question for a given solver as the next question to be asked to the given solver (S130).
  • if the judgment level of classifier 1 for problem 1 of solver R is 3, it is highly likely that the judgment level of solver R will ultimately be concluded to be 3.
  • the mean square error of the label distribution at judgment level 3 for each of the remaining problems is 0.19, 0.39, and 0.57, respectively (see the shaded areas in the figure). Since the mean square error for problem 2 is the smallest, it can be said that problem 2 is the best suited to judging solvers belonging to judgment level 3.
  • the question order optimization unit 130 therefore sets question 2, which has the smallest mean square error of the label distribution corresponding to judgment level 3 (the level output by classifier 1 for question 1 of solver R), as the next question to be given to solver R after question 1.
  • the questions will be given in the order of question 1 → question 2 → question 3 → question 4.
  • the next question to be asked may be the question with the smallest mean square error of the label distribution of the judgment level closest to the cumulative average value of previously answered questions.
  • the completion determination unit 125 of the first embodiment may be omitted, and the mastery level determination unit 120 may perform the final output (in this modified example, the mastery level determination unit 120 is referred to as the mastery level determination unit 120A).
  • the mastery level determination unit 120A determines the mastery level using a dedicated determiner based on the answer data of a predetermined problem by a predetermined solver to which no label has been assigned, and generates and outputs a statistical index of the label distribution corresponding to the determination level.
  • the statistical indicators of the label distribution are, for example, the cumulative average value and cumulative variance described above.
  • although the end determination unit 125 is omitted in this modified example, the proficiency level determination unit 120A outputs, for example, the cumulative average value and cumulative variance as the statistical indicators of the label distribution, so the evaluator can refer to the output cumulative variance value and decide whether to end or continue the proficiency level determination. If the evaluator decides to end the proficiency level determination, he or she can determine the proficiency level of the answerer based on the cumulative average value.
  • the device of the present invention has, as a single hardware entity, an input section to which a keyboard or the like can be connected, an output section to which a liquid crystal display or the like can be connected, a communication section to which a communication device (e.g., a communication cable) capable of communicating with the outside of the hardware entity can be connected, a CPU (Central Processing Unit, which may include cache memory, registers, etc.), memories such as RAM and ROM, an external storage device such as a hard disk, and a bus connecting the input section, output section, communication section, CPU, RAM, ROM, and external storage device so that data can be exchanged between them.
  • the hardware entity may also be provided with a device (drive) capable of reading and writing recording media such as a CD-ROM.
  • a physical entity equipped with such hardware resources is, for example, a general-purpose computer.
  • the external storage device of the hardware entity stores the programs required to realize the above-mentioned functions and the data required in the processing of these programs (not limited to an external storage device, for example the programs may be stored in a ROM, which is a read-only storage device). Data obtained by the processing of these programs is stored appropriately in the RAM or the external storage device.
  • each program stored in the external storage device (or ROM, etc.) and the data required to process each program are loaded into memory as needed, and are interpreted, executed, and processed by the CPU as appropriate. As a result, the CPU realizes the specified functions (each of the components referred to above as ... unit, ... means, etc.).
  • the program describing the processing contents can be recorded on a computer-readable recording medium.
  • Examples of computer-readable recording media include magnetic recording devices, optical disks, magneto-optical recording media, and semiconductor memories.
  • hard disk drives, flexible disks, magnetic tapes, etc. can be used as magnetic recording devices; DVDs (Digital Versatile Discs), DVD-RAMs (Random Access Memory), CD-ROMs (Compact Disc Read Only Memory), and CD-Rs (Recordable)/RWs (ReWritable) can be used as optical disks; MOs (Magneto-Optical discs) can be used as magneto-optical recording media; and EEP-ROMs (Electrically Erasable and Programmable-Read Only Memory) can be used as semiconductor memories.
  • the program may be distributed, for example, by selling, transferring, or lending portable recording media such as DVDs or CD-ROMs on which the program is recorded. Furthermore, the program may be distributed by storing the program in a storage device of a server computer and transferring the program from the server computer to other computers via a network.
  • a computer that executes such a program first stores, for example, the program recorded on a portable recording medium or the program transferred from a server computer in its own storage device. Then, when executing a process, the computer reads the program stored in its own storage device and executes the process according to the read program. As another execution form of the program, the computer may read the program directly from the portable recording medium and execute the process according to the program, or may execute the process according to the received program each time a program is transferred from the server computer to the computer.
  • the above-mentioned process may also be executed by a so-called ASP (Application Service Provider) type service that does not transfer the program from the server computer to the computer, but realizes the processing function only by issuing an execution instruction and obtaining the results.
  • the program in this form includes information used for processing by an electronic computer that is equivalent to a program (such as data that is not a direct command to the computer but has properties that specify the processing of the computer).
  • a hardware entity is configured by executing a specific program on a computer, but at least a portion of the processing content may be realized by hardware.
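The confidence-interval end determination and the question order selection described in the definitions above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the intervals are taken as given from the text (how the interval is derived from the cumulative average and cumulative variance is not shown here), and the assignment of the errors 0.19, 0.39, and 0.57 to questions 2, 3, and 4 is an assumption based on the stated question order.

```python
import math


def integer_levels_in(interval):
    """Return the integer proficiency levels that lie inside a closed interval."""
    low, high = interval
    return list(range(math.ceil(low), math.floor(high) + 1))


def finished_by_interval(interval):
    """End the determination only when exactly one integer level is in the interval."""
    return len(integer_levels_in(interval)) == 1


def next_question(mse_by_question, already_asked):
    """Pick the not-yet-asked question with the smallest mean square error."""
    candidates = {q: e for q, e in mse_by_question.items() if q not in already_asked}
    return min(candidates, key=candidates.get)


# Intervals quoted in the text: after questions 1-2 and after questions 1-3.
print(integer_levels_in((1.98, 3.18)))     # [2, 3]
print(finished_by_interval((1.98, 3.18)))  # False
print(finished_by_interval((2.30, 3.10)))  # True

# Assumed mapping of the quoted errors at judgment level 3 to questions 2-4.
mse_level3 = {2: 0.19, 3: 0.39, 4: 0.57}
print(next_question(mse_level3, already_asked={1}))  # 2
```

Counting integers in the interval keeps the stopping rule independent of how the interval itself is computed, so a different interval formula can be substituted without changing the rule.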

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Physics & Mathematics (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Provided is a proficiency level determination device with which it is possible to know the accuracy of a determined level, which is the results of a proficiency level determination by a determinator. This proficiency level determination device includes: a determinator training unit that trains determinators which determine a proficiency level exclusive to each problem using, as training data, answer data from a plurality of respondents with proficiency levels attached thereto as labels; a label distribution per problem/determination level generation unit that generates a label distribution for each determination level determined on the basis of test data by means of the determinators exclusive to each problem; a proficiency level determination unit that determines a proficiency level by means of the exclusive determinator on the basis of answer data by a prescribed respondent for a prescribed problem, the answer data not having a label attached thereto; and a termination determination unit that executes a termination determination for proficiency level determination for the prescribed respondent on the basis of an indicator indicating the degree of variation in the label distribution corresponding to the determined level for the prescribed problem.

Description

Proficiency level determination device, proficiency level determination method, and program
The present invention relates to a proficiency level determination device, a proficiency level determination method, and a program for determining the proficiency level of an answerer.
For example, Patent Document 1 discloses an academic ability estimation model generation device that can generate an academic ability estimation model that accurately estimates current academic ability without requiring comprehensive learning data. The academic ability estimation model generation device of Patent Document 1 includes a decision tree generation unit that generates a decision tree using, as training data, correct/incorrect information indicating whether each of multiple solvers who answered a predetermined group of problems answered each problem correctly or incorrectly; a pruning unit that deletes a leaf node at the end of the generated decision tree when the entropy of the classification result indicated by that leaf node is equal to or less than a predetermined value; and a category generation unit that sets each of the new ends of the decision tree after the leaf node deletion as a category to which one of the solvers belongs.
Patent Document 1: Japanese Patent No. 7065927
Conventionally, there has been a problem that it is unclear how reliable an academic ability estimate produced by an academic ability estimation model is.
The present invention therefore aims to provide a proficiency level determination device that makes it possible to know the accuracy of the judgment level, which is the result of the proficiency level determination made by a classifier.
The proficiency level determination device of the present invention includes a classifier learning unit, a label distribution generation unit for each question and judgment level, a proficiency level judgment unit, and an end judgment unit.
The classifier learning unit uses, as learning data, answer data by multiple solvers to which proficiency levels have been assigned as labels, and trains one classifier dedicated to each problem to judge the proficiency level. The label distribution generation unit for each question and judgment level uses, as test data, labeled answer data by multiple solvers that is not included in the learning data and, for each problem and for each proficiency level judged from the test data by the classifier dedicated to that problem (hereinafter, judgment level), generates the distribution of the labels assigned to the test data (hereinafter, label distribution). The proficiency level judgment unit uses the dedicated classifier to judge the proficiency level from unlabeled answer data for a specific problem by a specific solver. The end judgment unit decides whether to end the proficiency level judgment for that solver based on an index indicating the degree of variation in the label distribution corresponding to the judgment level for the specific problem.
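As a rough illustration of the label distribution and the variation index described above, the following sketch groups test labels by question and judgment level and computes a variance. It is only a sketch under stated assumptions: the function names and the record layout are hypothetical, and the population variance is one possible choice of index. The counts are those quoted in the definitions above for judgment level 3 of question 1, and 0.3 is the example threshold from the same passage.

```python
from collections import Counter, defaultdict


def label_distributions(test_records):
    """Count human-assigned labels per (question, judgment level).

    test_records: iterable of (question_id, judged_level, true_label), where
    judged_level is the output of that question's dedicated classifier on a
    held-out test answer (hypothetical record layout).
    """
    dists = defaultdict(Counter)
    for question_id, judged_level, true_label in test_records:
        dists[(question_id, judged_level)][true_label] += 1
    return dists


def variance(dist):
    """Population variance of the labels in one label distribution."""
    total = sum(dist.values())
    mean = sum(label * count for label, count in dist.items()) / total
    return sum(count * (label - mean) ** 2 for label, count in dist.items()) / total


def is_finished(dist, threshold):
    """End the determination when the variance is below the threshold."""
    return variance(dist) < threshold


# Label counts quoted for question 1, judgment level 3 (143 test answers).
counts = {1: 4, 2: 18, 3: 100, 4: 17, 5: 4}
records = [(1, 3, label) for label, n in counts.items() for _ in range(n)]
dist_q1_l3 = label_distributions(records)[(1, 3)]

print(round(variance(dist_q1_l3), 2))          # 0.47
print(is_finished(dist_q1_l3, threshold=0.3))  # False
```

With these counts the computed variance rounds to 0.47, the value given for question 1 in the text, so the determination would not end on question 1 alone at a threshold of 0.3.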
The proficiency level determination device of the present invention makes it possible to know the accuracy of the judgment level, which is the result of the proficiency level determination made by the classifier.
FIG. 1 is a block diagram showing the functional configuration of the proficiency level determination device according to the first embodiment.
FIG. 2 is a flowchart showing the classifier learning operation of the proficiency level determination device according to the first embodiment.
FIG. 3 is a flowchart showing the label distribution generation operation of the proficiency level determination device according to the first embodiment.
FIG. 4 is a diagram showing an example of a label distribution generated for each question and for each judgment level.
FIG. 5 is a flowchart showing the end determination operation of the proficiency level determination device according to the first embodiment.
FIG. 6 is a diagram explaining an example in which cumulative variance is used for end determination.
FIG. 7 is a diagram explaining an example in which a confidence interval is used for end determination.
FIG. 8 is a flowchart showing the question order optimization operation of the proficiency level determination device according to the first embodiment.
FIG. 9 is a diagram showing an example of a question order determination operation.
FIG. 10 is a diagram showing an example of the functional configuration of a computer.
An embodiment of the present invention is described in detail below. Components having the same function are given the same reference numbers, and duplicate explanations are omitted.
The functional configuration of the proficiency level determination device of the first embodiment is described below with reference to FIG. 1. As shown in the figure, the proficiency level determination device 1 of the present embodiment includes a classifier learning unit 100, a classifier storage unit 105, a label distribution generation unit 110 for each question and judgment level, a label distribution storage unit 115, a proficiency level determination unit 120, an end determination unit 125, and a question order optimization unit 130. The classifier learning operation of the proficiency level determination device 1 is described below with reference to FIG. 2.
<Classifier learning operation>
The classifier learning unit 100 uses, as learning data, answer data by multiple solvers to which proficiency levels have been assigned as labels, and trains one classifier dedicated to each question to judge the proficiency level (S100). The classifier storage unit 105 stores each trained classifier (S105).
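The structure of steps S100 and S105, one dedicated classifier per question kept in a store, can be sketched as below. This is a minimal sketch under stated assumptions: `MajorityLevelClassifier` is a deliberately trivial, hypothetical stand-in (it ignores the answer text), since the source does not specify the model, and the record layout and function names are illustrative only.

```python
from collections import Counter, defaultdict


class MajorityLevelClassifier:
    """Toy stand-in for a real model: always predicts the most common label."""

    def fit(self, answers, labels):
        self.level_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, answer):
        return self.level_


def train_classifiers(labeled_answers, make_classifier=MajorityLevelClassifier):
    """Train one classifier per question (S100) and return the store (S105).

    labeled_answers: iterable of (question_id, answer, proficiency_level).
    """
    by_question = defaultdict(lambda: ([], []))
    for question_id, answer, level in labeled_answers:
        answers, labels = by_question[question_id]
        answers.append(answer)
        labels.append(level)
    return {q: make_classifier().fit(a, l) for q, (a, l) in by_question.items()}


training_data = [
    (1, "answer text A", 3), (1, "answer text B", 3), (1, "answer text C", 4),
    (2, "answer text D", 2), (2, "answer text E", 2),
]
classifiers = train_classifiers(training_data)
print(sorted(classifiers))                    # [1, 2]
print(classifiers[1].predict("new answer"))   # 3
```

Passing the model as `make_classifier` keeps the per-question bookkeeping separate from whatever supervised learner is actually used.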
≪Answer data≫
The answer data may be, for example, answers to English composition questions or English reading comprehension questions. The answers may also be for a subject other than English, for example answers to questions in mathematics, Japanese, social studies, or science.
<Proficiency level>
The proficiency level can be expressed, for example, on five levels from level 1 (low evaluation) to level 5 (high evaluation). Alternatively, it may use 13 levels (A+ to A-, B+ to B-, C+ to C-, D+ to D-, and F), 10 levels from 1 to 10, or a score from 0 to 100 points. The following embodiment is described using the five-level example from level 1 (low evaluation) to level 5 (high evaluation).
<Evaluation target of the proficiency level>
The proficiency level may evaluate the subject as a whole or a specific field within the subject. For example, when English composition questions are given, the proficiency level may evaluate English composition ability in general. If the question range is fixed (for example, the use of specific grammar rules is required), only the English composition ability within that range may be evaluated.
<Label assignment>
Labels may be assigned, for example, by having a teacher or a marker judge the proficiency level for each piece of answer data and attach it as the label.
<Learning data>
The learning data is a large amount of answer data from multiple solvers, labeled with proficiency levels. It has the same kind of content as the test data described below, but because the two serve different purposes, the learning data must not overlap with the test data. In practice, it is sufficient to split the large amount of labeled answer data at a predetermined ratio and use one part as learning data and the other as test data.
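The split described above can be sketched as follows. This is a minimal illustration only: the record layout, the 80/20 ratio, and the function name `split_labeled_answers` are assumptions made here, not details given in the specification.

```python
import random

def split_labeled_answers(records, train_ratio=0.8, seed=0):
    """Shuffle labeled answer records and split them into two disjoint lists.

    The ratio and the fixed seed are illustrative assumptions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical records: (solver_id, question_id, answer_text, label)
records = [(i, 1, f"answer {i}", 1 + i % 5) for i in range(100)]
train, test = split_labeled_answers(records)
print(len(train), len(test))
```

Because every record lands in exactly one of the two lists, the learning data and the test data cannot overlap.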
<Classifier learning>
A classifier is a model trained by supervised learning using the labeled (proficiency level) answer data described above as learning data and, as stated earlier, each classifier is trained as dedicated to one question. For example, if X English composition questions, questions 1 to X, are given, then classifier 1 dedicated to question 1, classifier 2 dedicated to question 2, ..., and classifier X dedicated to question X are each trained.
<Judgment level>
The proficiency level judged by a classifier is also called the "judgment level."
<Example of the judgment operation of a classifier>
For example, if N free English composition questions are given, classifier n (n = 1, ..., N) takes as input solver P's answer to English composition question n, judges solver P's proficiency level in English composition ability in general, and outputs the judgment level (for example, 3 on a five-level scale).
For example, if M short English composition questions using participial constructions are given, classifier m (m = 1, ..., M) takes as input solver Q's answer to English composition question m, judges solver Q's proficiency level in English composition ability related to participial constructions, and outputs the judgment level (for example, 4 on a five-level scale).
The label distribution generation operation of the proficiency level determination device 1 is described below with reference to FIGS. 3 and 4.
<Label distribution generation operation>
The per-question, per-judgment-level label distribution generation unit 110 takes, as test data, answer data that does not include the learning data from among the answer data by multiple solvers labeled with proficiency levels. For each question, and for each proficiency level judged from the test data by the classifier dedicated to that question (hereinafter, judgment level), it generates the distribution of the labels assigned to the test data (hereinafter also called the label distribution) (S110). The label distribution storage unit 115 stores the generated label distributions (S115).
<Test data>
The test data is a large amount of answer data from multiple solvers, labeled with proficiency levels. It has the same kind of content as the learning data described above, but because the two serve different purposes, the test data must not overlap with the learning data. The accuracy of each classifier can be evaluated by comparing the judgment level output by the classifier trained on the learning data with the label assigned to the test data.
<Label distribution>
As described above, a label distribution is generated for each question and each judgment level. FIG. 4 shows examples of label distributions. For example, when classifier 1 judges the proficiency level of the answer data of multiple solvers for question 1 in the test data, the label distribution for judgment level 3 consists of 4 answers labeled level 1, 18 answers labeled level 2, 100 answers labeled level 3, 17 answers labeled level 4, and 4 answers labeled level 5. In this example, classifier 1 has overrated the 22 answers labeled level 1 or 2, has judged appropriately the 100 answers labeled level 3 (out of 143 in total), and has underrated the 21 answers labeled level 4 or 5. The label distribution above is spread out, and a smaller spread indicates a better-performing classifier. For example, in the level 3 label distribution of classifier 2 for question 2, 135 answers (out of 160 in total) carry the level 3 label, so its spread is smaller than that of the level 3 label distribution of classifier 1.
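As a rough illustration of how a label distribution and its statistics could be computed, the sketch below groups human-assigned labels by the level the classifier judged and then takes the mean and population variance. The function names and the input format (pairs of judged level and label) are assumptions made for this example; the counts are the ones quoted above for classifier 1 at judgment level 3.

```python
from collections import Counter

def label_distributions(judged_and_labels):
    """Group test-data labels by judged level for ONE question.

    judged_and_labels: iterable of (judged_level, human_label) pairs.
    Returns {judged_level: Counter({label: count})}.
    """
    dists = {}
    for judged, label in judged_and_labels:
        dists.setdefault(judged, Counter())[label] += 1
    return dists

def mean_and_variance(dist):
    """Mean and population variance of a {label: count} distribution."""
    n = sum(dist.values())
    mean = sum(label * c for label, c in dist.items()) / n
    var = sum(c * (label - mean) ** 2 for label, c in dist.items()) / n
    return mean, var

# Counts quoted in the text for classifier 1, judgment level 3 (143 answers).
counts = {1: 4, 2: 18, 3: 100, 4: 17, 5: 4}
pairs = [(3, label) for label, c in counts.items() for _ in range(c)]
dist = label_distributions(pairs)[3]
mean, var = mean_and_variance(dist)
print(round(mean, 2), round(var, 2))  # prints: 2.99 0.47
```

With these counts the population variance comes out at about 0.47, which agrees with the variance the text later quotes for judgment level 3 of question 1.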
The end determination operation of the proficiency level determination device 1 is described below with reference to FIGS. 5, 6, and 7.
<End determination operation>
The proficiency level determination unit 120 determines the proficiency level with the dedicated classifier, based on unlabeled answer data of a given solver for a given question (S120).
The end determination unit 125 determines whether to end the proficiency level determination for the given solver, based on an index indicating the degree of variation of the label distribution corresponding to the judgment level for the given question (S125). The degree of variation of the label distribution is, for example, the variance of the label distribution.
<Example 1 of end determination>
For example, the end determination unit 125 can determine that the proficiency level determination is complete when the variance of the label distribution is lower than a predetermined threshold.
For example, suppose that a given solver R solves question 1 and then question 2, and that classifier 1 and classifier 2 each judge solver R's proficiency level from the (unlabeled) answer data, giving judgment level 3 for question 1 and judgment level 2 for question 2. The variances of the corresponding label distributions are then 0.47 and 0.28, respectively (see the shaded cells in FIG. 4). If the threshold set for the variance is, for example, 0.3, classifier 2 satisfies the condition, so the judgment level 2 output by classifier 2 is output as solver R's proficiency level determination result and the proficiency level determination ends. In this case solver R does not need to solve question 3 for the proficiency level determination, which reduces the burden on solver R.
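The stopping rule of this example can be sketched as a simple loop over the judgments in the order the questions were answered. The function name and the tuple layout are assumptions made for illustration; the values are those given in the example.

```python
def first_confident_judgment(judgments, threshold):
    """Return (question, level) for the first judgment whose
    label-distribution variance is below the threshold, else None.

    judgments: list of (question, judged_level, variance) in answer order.
    """
    for question, level, variance in judgments:
        if variance < threshold:
            return question, level
    return None

# Solver R: question 1 judged level 3 (variance 0.47),
# question 2 judged level 2 (variance 0.28).
judgments = [(1, 3, 0.47), (2, 2, 0.28)]
result = first_confident_judgment(judgments, threshold=0.3)
print(result)  # prints: (2, 2)
```

If no judgment meets the threshold, the function returns `None`, which corresponds to asking the solver a further question.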
<Example 2 of end determination>
Hereinafter, suppose that solver R answers questions 1, ..., K, and let $X_k$ denote the label distribution corresponding to the judgment level that classifier k outputs for the answer data of question k (k = 1, ..., K). The mean $E((X_1 + \dots + X_K)/K)$ and the variance $V((X_1 + \dots + X_K)/K)$ of the average $(X_1 + \dots + X_K)/K$ of $X_1, \dots, X_K$ are called the cumulative average value and the cumulative variance, respectively.
For example, the end determination unit 125 can also determine that the proficiency level determination is complete when the cumulative variance of the label distributions corresponding to the judgment levels for multiple questions is lower than a predetermined threshold. For example, as shown in FIG. 6, suppose that classifier 1 gives judgment level 3 for solver R's answer data for question 1 (corresponding label distribution $X_1$ with mean $E(X_1) = 2.99$ and variance $V(X_1) = 0.47$), classifier 2 gives judgment level 2 for question 2 (label distribution $X_2$ with $E(X_2) = 2.16$ and $V(X_2) = 0.28$), and classifier 3 gives judgment level 3 for question 3 (label distribution $X_3$ with $E(X_3) = 2.95$ and $V(X_3) = 0.39$). Treating the label distributions as independent, the cumulative variance up to question 2 is

$$V\left(\frac{X_1 + X_2}{2}\right) = \frac{V(X_1) + V(X_2)}{2^2} = \frac{0.47 + 0.28}{4} \approx 0.19$$

and the cumulative variance up to question 3 is

$$V\left(\frac{X_1 + X_2 + X_3}{3}\right) = \frac{V(X_1) + V(X_2) + V(X_3)}{3^2} = \frac{0.47 + 0.28 + 0.39}{9} \approx 0.13$$

Therefore, with a threshold of 0.20, the proficiency level determination for solver R ends at question 2. With a threshold of 0.15, it ends at question 3. With a threshold of 0.10, it continues beyond question 3. The cumulative average value can be used as solver R's proficiency level determination result. For example, if the proficiency level determination ends at question 2,

$$E\left(\frac{X_1 + X_2}{2}\right) = \frac{E(X_1) + E(X_2)}{2} = \frac{2.99 + 2.16}{2} \approx 2.58$$

and if it ends at question 3,

$$E\left(\frac{X_1 + X_2 + X_3}{3}\right) = \frac{E(X_1) + E(X_2) + E(X_3)}{3} = \frac{2.99 + 2.16 + 2.95}{3} = 2.70$$
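The cumulative average value and cumulative variance used in this example can be computed as in the sketch below. It assumes the label distributions of different questions are independent, so that the variance of their average is the sum of the variances divided by K squared; that assumption, and the function names, are introduced here for illustration. Under this assumption the three thresholds quoted in the example (0.20, 0.15, 0.10) give the stated stopping points.

```python
def cumulative_stats(means, variances):
    """Mean and variance of the average of K label distributions,
    assuming the distributions are independent (an assumption)."""
    k = len(means)
    cum_mean = sum(means) / k
    cum_var = sum(variances) / k ** 2
    return cum_mean, cum_var

def stop_index(means, variances, threshold):
    """Number of questions after which the cumulative variance first
    drops below the threshold, or None if it never does."""
    for k in range(1, len(means) + 1):
        _, cum_var = cumulative_stats(means[:k], variances[:k])
        if cum_var < threshold:
            return k
    return None

# Values quoted for solver R (questions 1 to 3).
means = [2.99, 2.16, 2.95]
variances = [0.47, 0.28, 0.39]

for threshold in (0.20, 0.15, 0.10):
    print(threshold, stop_index(means, variances, threshold))
```

For the thresholds 0.20, 0.15, and 0.10 the loop reports stopping after question 2, after question 3, and not at all within the three questions, respectively.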
<Example 3 of end determination>
As another example, the end determination unit 125 can determine that the proficiency level determination is complete when a predetermined confidence interval (a T% confidence interval, for example T = 95), calculated from the cumulative average value and the cumulative variance obtained from the label distributions corresponding to the judgment levels for multiple questions, contains exactly one integer value. For example, as shown in FIG. 7, the 95% confidence interval calculated from the cumulative average value and cumulative variance for judgment level 3 of question 1 and judgment level 2 of question 2 is [1.98, 3.18]. This interval contains two integer values (level 2 and level 3), so the judgment level can be regarded as not yet settled between 2 and 3, and the end determination unit 125 does not determine that the proficiency level determination is complete. In contrast, the 95% confidence interval calculated from the cumulative average value and cumulative variance for the judgment levels of questions 1 to 3 is [2.30, 3.10]. This interval contains only one integer value (level 3), so the judgment level can be regarded as settled at 3, and the end determination unit 125 may determine that the proficiency level determination is complete.
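The integer-count test can be sketched as follows. The specification does not state how the confidence interval is derived from the cumulative average value and cumulative variance. The form used here, mean ± z·sqrt(cumulative variance / K) with z = 1.96, is an assumption chosen only because it reproduces the quoted intervals to within rounding; the cumulative variance itself is computed under an independence assumption, and all function names are illustrative.

```python
import math

def confidence_interval(means, variances, z=1.96):
    """Interval around the cumulative average value.

    ASSUMPTION: half-width = z * sqrt(cum_var / K), with
    cum_var = sum(variances) / K**2 (independent distributions).
    The source does not give this formula.
    """
    k = len(means)
    cum_mean = sum(means) / k
    cum_var = sum(variances) / k ** 2
    half = z * math.sqrt(cum_var / k)
    return cum_mean - half, cum_mean + half

def integers_inside(lo, hi):
    """All integers n with lo <= n <= hi."""
    return list(range(math.ceil(lo), math.floor(hi) + 1))

means = [2.99, 2.16, 2.95]
variances = [0.47, 0.28, 0.39]

for k in (2, 3):
    lo, hi = confidence_interval(means[:k], variances[:k])
    levels = integers_inside(lo, hi)
    # The determination ends only when exactly one level remains.
    print(k, levels, len(levels) == 1)
```

After two questions both level 2 and level 3 lie inside the interval, so the determination continues; after three questions only level 3 remains.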
<Example 4 of end determination>
The end determination unit 125 may correct the mean of the label distribution corresponding to the minimum or maximum judgment level for a given question, and the index indicating its degree of variation, so that they approximate those of the label distributions corresponding to the other judgment levels. The minimum and maximum judgment levels are, for example, judgment levels 1 and 5 when the judgment levels run from 1 to 5. The label distributions of judgment levels 1 and 5 extend on one side only and do not have a so-called bell-curve shape. For example, the label distribution of judgment level 1 has mass at labels 2, 3, and so on, but because labels such as 0 and -1 are not defined, the left side of the distribution does not exist. The same applies to the label distribution of judgment level 5: because labels such as 6 and 7 are not defined, the right side of the distribution does not exist.
For example, the end determination unit 125 may generate the missing side of the distribution artificially and combine it with the existing side, thereby correcting the mean of that label distribution and the index indicating its degree of variation so that they approximate those of the label distributions corresponding to the other judgment levels.
Alternatively, labels with integer values smaller than the minimum judgment level and labels with integer values larger than the maximum judgment level may be prepared as dummy labels and assigned manually.
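One way to generate the missing side artificially is to reflect the existing side about the judged level. The specification does not say how the pseudo distribution is produced, so the reflection used below is only one possible reading, and the counts are hypothetical values made up for illustration.

```python
def mirror_one_sided(dist, level):
    """Reflect a one-sided {label: count} distribution about `level`.

    Intended for the minimum or maximum judgment level, where one side
    of the distribution cannot exist. Reflection is an ASSUMPTION.
    """
    mirrored = dict(dist)
    for label, count in dist.items():
        if label != level:
            reflected = 2 * level - label
            mirrored[reflected] = mirrored.get(reflected, 0) + count
    return mirrored

def mean_and_variance(dist):
    """Mean and population variance of a {label: count} distribution."""
    n = sum(dist.values())
    mean = sum(label * c for label, c in dist.items()) / n
    var = sum(c * (label - mean) ** 2 for label, c in dist.items()) / n
    return mean, var

# Hypothetical label counts for judgment level 1 (no labels below 1 exist).
level1 = {1: 90, 2: 25, 3: 5}
raw_mean, raw_var = mean_and_variance(level1)
sym = mirror_one_sided(level1, level=1)
sym_mean, sym_var = mean_and_variance(sym)
print(sym_mean, round(sym_var, 2))  # prints: 1.0 0.6
```

After reflection the mean sits at the judged level, as it roughly does for the interior judgment levels, and the variance reflects spread on both sides.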
The question order optimization operation of the proficiency level determination device 1 is described below with reference to FIGS. 8 and 9.
<Question order optimization operation>
The question order optimization unit 130 sets, as the next question to give to a given solver, the question whose label distribution for the judgment level of that solver on a given question has the smallest mean square error (S130).
For example, if classifier 1 gives judgment level 3 for solver R on question 1, solver R's judgment level is likely to be concluded as 3 in the end. In this case, as shown for example in FIG. 9, the mean square errors of the label distributions at judgment level 3 for questions 2 to 4 are 0.19, 0.39, and 0.57, respectively (see the shaded cells in the figure). Because question 2 has the smallest mean square error, question 2 can be said to be the best at judging solvers who belong to judgment level 3.
In this case, therefore, the question order optimization unit 130 sets question 2, whose label distribution for judgment level 3 (the level classifier 1 judged for solver R on question 1) has the smallest mean square error, as the question to give solver R next after question 1. In the example in the figure, the questions are given in the order question 1, question 2, question 3, question 4.
From the second question onward, the next question may instead be set to the question whose label distribution has the smallest mean square error at the judgment level closest to the cumulative average value of the questions answered so far.
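The selection rule can be sketched as a lookup in a table of per-question, per-level error values. The table layout, the function name, and the choice to keep the level fixed at 3 across iterations are assumptions for illustration; the error values are the ones quoted for questions 2 to 4 at judgment level 3.

```python
def next_question(error_table, level, asked):
    """Pick the unasked question with the smallest error at `level`.

    error_table: {question: {judgment_level: error_value}}
    asked: set of questions already given to the solver.
    Returns None when no candidate remains.
    """
    candidates = {
        q: by_level[level]
        for q, by_level in error_table.items()
        if q not in asked and level in by_level
    }
    if not candidates:
        return None
    return min(candidates, key=candidates.get)

# Error values quoted in the text for judgment level 3.
error_table = {2: {3: 0.19}, 3: {3: 0.39}, 4: {3: 0.57}}

asked = {1}          # solver R has answered question 1 (judged level 3)
order = [1]
while (q := next_question(error_table, level=3, asked=asked)) is not None:
    order.append(q)
    asked.add(q)
print(order)  # prints: [1, 2, 3, 4]
```

To follow the variant based on the cumulative average value, `level` could be recomputed before each call as the judgment level nearest to that average.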
<Modification 1>
The end determination unit 125 of the first embodiment may be omitted so that the proficiency level determination unit 120 produces the final output (in this modification it is referred to as the proficiency level determination unit 120A). In this case, the proficiency level determination unit 120A determines the proficiency level with the dedicated classifier, based on unlabeled answer data of a given solver for a given question, and generates and outputs statistical indices of the label distribution corresponding to the judgment level.
The statistical indices of the label distribution are, for example, the cumulative average value and the cumulative variance described above. Although the end determination unit 125 is omitted in this modification, the proficiency level determination unit 120A outputs, for example, the cumulative average value and the cumulative variance as statistical indices of the label distribution, so an evaluator can refer to the output cumulative variance and decide whether to end or continue the proficiency level determination. When the evaluator decides to end the proficiency level determination, the evaluator can determine the solver's proficiency level based on the cumulative average value.
<Additional notes>
The device of the present invention has, for example as a single hardware entity, an input unit to which a keyboard or the like can be connected, an output unit to which a liquid crystal display or the like can be connected, a communication unit to which a communication device (for example, a communication cable) capable of communicating with the outside of the hardware entity can be connected, a CPU (Central Processing Unit, which may include cache memory, registers, and the like), RAM and ROM as memory, an external storage device such as a hard disk, and a bus that connects the input unit, output unit, communication unit, CPU, RAM, ROM, and external storage device so that data can be exchanged among them. If necessary, the hardware entity may also be provided with a device (drive) that can read from and write to a recording medium such as a CD-ROM. A general-purpose computer is one example of a physical entity equipped with such hardware resources.
The external storage device of the hardware entity stores the programs required to realize the functions described above and the data required for processing by those programs (storage is not limited to the external storage device; for example, the programs may be stored in ROM, which is a read-only storage device). Data obtained by the processing of these programs is stored as appropriate in RAM, the external storage device, or the like.
In the hardware entity, each program stored in the external storage device (or ROM, etc.) and the data required for processing by each program are loaded into memory as needed and are interpreted, executed, and processed by the CPU as appropriate. As a result, the CPU realizes the predetermined functions (the components referred to above as "... unit", "... means", and so on).
The present invention is not limited to the embodiment described above and may be modified as appropriate without departing from the spirit of the present invention. The processes described in the above embodiment need not be executed in chronological order as written; they may be executed in parallel or individually, depending on the processing capacity of the device executing them or as necessary.
As already stated, when the processing functions of the hardware entity (the device of the present invention) described in the above embodiment are realized by a computer, the processing content of the functions that the hardware entity should have is described by a program. Executing this program on a computer realizes the processing functions of the hardware entity on the computer.
The various processes described above can be carried out by loading a program that executes each step of the above method into the recording unit 10020 of the computer shown in FIG. 10 and causing the control unit 10010, the input unit 10030, the output unit 10040, and so on to operate.
The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, a hard disk drive, flexible disk, magnetic tape, or the like can be used as the magnetic recording device; a DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), CD-R (Recordable)/RW (ReWritable), or the like as the optical disc; an MO (Magneto-Optical disc) or the like as the magneto-optical recording medium; and an EEP-ROM (Electrically Erasable and Programmable-Read Only Memory) or the like as the semiconductor memory.
The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to other computers via a network.
A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage device. When executing a process, the computer reads the program stored in its own storage device and executes the process according to the program it has read. As another form of execution, the computer may read the program directly from the portable recording medium and execute the process according to that program, or it may execute the process according to the received program each time a program is transferred to it from the server computer. The processes described above may also be executed by a so-called ASP (Application Service Provider) type service, in which the program is not transferred from the server computer to the computer and the processing functions are realized only through execution instructions and retrieval of results. The program in this form includes information that is provided for processing by an electronic computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the computer's processing).
In this form, the hardware entity is configured by executing a predetermined program on a computer, but at least part of this processing content may instead be realized in hardware.

Claims (10)

  1.  A proficiency level determination device comprising:
     a classifier learning unit that uses, as learning data, answer data by a plurality of solvers to which proficiency levels are assigned as labels, and trains classifiers each of which is dedicated to one question and judges the proficiency level;
     a per-question, per-judgment-level label distribution generation unit that uses, as test data, answer data that does not include the learning data from among the answer data by a plurality of solvers to which proficiency levels are assigned as labels, and that generates, for each question and for each proficiency level judged from the test data by the classifier dedicated to that question (hereinafter, judgment level), a distribution of the labels assigned to the test data (hereinafter, label distribution);
     a proficiency level determination unit that determines a proficiency level with the dedicated classifier, based on unlabeled answer data of a given solver for a given question; and
     an end determination unit that determines whether to end the proficiency level determination for the given solver, based on an index indicating the degree of variation of the label distribution corresponding to the judgment level for the given question.
  2.  The proficiency level determination device according to claim 1, wherein
     the degree of variation of the label distribution is the variance of the label distribution.
  3.  The proficiency level determination device according to claim 1, wherein,
     when solver R answers questions 1, ..., K and $X_k$ denotes the label distribution corresponding to the judgment level of classifier k for the answer data of question k (k = 1, ..., K), the mean $E((X_1 + \dots + X_K)/K)$ and the variance $V((X_1 + \dots + X_K)/K)$ of the average $(X_1 + \dots + X_K)/K$ of $X_1, \dots, X_K$ are called the cumulative average value and the cumulative variance, respectively, and
     the end determination unit
     determines that the proficiency level determination is complete when the cumulative variance of the plurality of label distributions corresponding to the judgment levels for a plurality of questions is lower than a predetermined threshold.
  4.  The proficiency level determination device according to claim 1, wherein,
      when a solver R answers questions 1, ..., K, and $X_k$ denotes the label distribution corresponding to the determination level output by classifier k for the answer data of question k (k = 1, ..., K), the mean $E((X_1 + \dots + X_K)/K)$ and the variance $V((X_1 + \dots + X_K)/K)$ of the average $(X_1 + \dots + X_K)/K$ of $X_1, \dots, X_K$ are called the cumulative mean and the cumulative variance, respectively, and
      the end determination unit determines that the proficiency level determination is to end when a predetermined confidence interval, calculated from the cumulative mean and the cumulative variance obtained from the label distributions corresponding to the determination levels for a plurality of questions, contains exactly one integer value.
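The interval-based stopping rule of claim 4 can be sketched as follows. The claim only says "a predetermined confidence interval". The normal approximation (mean plus or minus z times the standard deviation) and the default `z=1.96` used here are assumptions made for illustration, as are the function names.

```python
import math


def confidence_interval(cum_mean, cum_var, z=1.96):
    """Interval cum_mean +/- z * sqrt(cum_var); the normal form is an assumption."""
    half_width = z * math.sqrt(cum_var)
    return cum_mean - half_width, cum_mean + half_width


def integers_in_interval(lo, hi):
    """Count the integers n with lo <= n <= hi."""
    return max(0, math.floor(hi) - math.ceil(lo) + 1)


def should_end_by_interval(cum_mean, cum_var, z=1.96):
    """End the determination when exactly one integer level lies in the interval."""
    lo, hi = confidence_interval(cum_mean, cum_var, z)
    return integers_in_interval(lo, hi) == 1
```

When the rule fires, the single integer inside the interval is a natural candidate for the reported proficiency level. An interval that contains no integer at all does not end the determination in this sketch.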
  5.  The proficiency level determination device according to claim 1, wherein,
      when a solver R answers questions 1, ..., K, and $X_k$ denotes the label distribution corresponding to the determination level output by classifier k for the answer data of question k (k = 1, ..., K), the mean $E((X_1 + \dots + X_K)/K)$ and the variance $V((X_1 + \dots + X_K)/K)$ of the average $(X_1 + \dots + X_K)/K$ of $X_1, \dots, X_K$ are called the cumulative mean and the cumulative variance, respectively, and
      the device includes a question order optimization unit that sets, as the next question to present to the given solver, either the question whose label distribution corresponding to the determination level for the given solver has the smallest mean squared error, or the question whose label distribution for the determination level closest to the cumulative mean has the smallest mean squared error.
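The second selection rule of claim 5 (choose the question whose label distribution, at the determination level closest to the cumulative mean, has the smallest mean squared error) can be sketched as below. The claim text here does not define the reference value for the mean squared error. This sketch assumes it is measured between the test-data labels and the determination level itself. Skipping questions that were already answered is also an assumption, and `label_table`, `mse`, and `next_question` are hypothetical names.

```python
def mse(labels, level):
    """Mean squared error between test-data labels and a determination level."""
    return sum((x - level) ** 2 for x in labels) / len(labels)


def next_question(label_table, cum_mean, answered):
    """Pick the unanswered question with the smallest MSE.

    label_table[q][level] is the list of test-data labels forming the
    label distribution of question q at that determination level.
    """
    best_q, best_err = None, float("inf")
    for q, by_level in label_table.items():
        if q in answered:
            continue  # assumption: a question is not presented twice
        # Determination level closest to the solver's cumulative mean.
        level = min(by_level, key=lambda j: abs(j - cum_mean))
        err = mse(by_level[level], level)
        if err < best_err:
            best_q, best_err = q, err
    return best_q
```

Choosing the question with the lowest error favors classifiers that are most reliable near the solver's current estimated level.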
  6.  The proficiency level determination device according to claim 1, wherein
      the end determination unit corrects the mean of the label distribution corresponding to the minimum or maximum determination level for a given question, and the index indicating its degree of variation, so that they approximate those of the label distributions corresponding to the other determination levels.
  7.  A proficiency level determination device comprising:
      a classifier learning unit that, using as learning data answer data from a plurality of solvers to which proficiency levels have been assigned as labels, trains classifiers, each of which is dedicated to one question and determines a proficiency level;
      a per-question, per-determination-level label distribution generation unit that, using as test data the answer data, among the answer data from the plurality of solvers to which proficiency levels have been assigned as labels, that does not include the learning data, generates, for each question and for each proficiency level determined from the test data by the classifier dedicated to that question (hereinafter, determination level), the distribution of the labels assigned to the test data (hereinafter, label distribution); and
      a proficiency level determination unit that determines a proficiency level with the dedicated classifier, based on answer data for a given question from a given solver to which no label has been assigned, and generates and outputs a statistical index of the label distribution corresponding to the determination level.
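The label distribution generation recited in claim 7 amounts to grouping the true labels of held-out test data by question and by the level the question's dedicated classifier predicts. A minimal sketch follows, assuming each classifier is any callable that maps an answer to a level. The data layout and the name `build_label_distributions` are hypothetical, and the classifier training step is not shown.

```python
from collections import defaultdict


def build_label_distributions(classifiers, test_data):
    """Group test-data labels by (question, determination level).

    classifiers: {question_id: callable(answer) -> determination level}
    test_data:   iterable of (question_id, answer, label) tuples that were
                 not used to train the classifiers.
    Returns {question_id: {determination_level: [labels, ...]}}.
    """
    table = defaultdict(lambda: defaultdict(list))
    for question_id, answer, label in test_data:
        level = classifiers[question_id](answer)  # dedicated classifier
        table[question_id][level].append(label)
    # Convert to plain dicts so missing keys raise instead of being created.
    return {q: dict(levels) for q, levels in table.items()}
```

For example, with a toy classifier that outputs level 3 for scores of 50 or more and level 2 otherwise, the test rows `("q1", 80, 3)`, `("q1", 60, 4)`, `("q1", 30, 2)` yield `{"q1": {3: [3, 4], 2: [2]}}`.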
  8.  The proficiency level determination device according to claim 7, wherein,
      when a solver R answers questions 1, ..., K, and $X_k$ denotes the label distribution corresponding to the determination level output by classifier k for the answer data of question k (k = 1, ..., K), the mean $E((X_1 + \dots + X_K)/K)$ and the variance $V((X_1 + \dots + X_K)/K)$ of the average $(X_1 + \dots + X_K)/K$ of $X_1, \dots, X_K$ are called the cumulative mean and the cumulative variance, respectively, and
      the proficiency level determination unit generates and outputs the cumulative mean and the cumulative variance as the statistical indices of the label distribution corresponding to the determination level.
  9.  A proficiency level determination method executed by a proficiency level determination device, the method comprising:
      a step of training, using as learning data answer data from a plurality of solvers to which proficiency levels have been assigned as labels, classifiers, each of which is dedicated to one question and determines a proficiency level;
      a step of generating, using as test data the answer data, among the answer data from the plurality of solvers to which proficiency levels have been assigned as labels, that does not include the learning data, for each question and for each proficiency level determined from the test data by the classifier dedicated to that question (hereinafter, determination level), the distribution of the labels assigned to the test data (hereinafter, label distribution);
      a step of determining a proficiency level with the dedicated classifier, based on answer data for a given question from a given solver to which no label has been assigned; and
      a step of performing an end determination of the proficiency level determination for the given solver, based on an index indicating the degree of variation in the label distribution corresponding to the determination level for the given question.
  10.  A program that causes a computer to function as the proficiency level determination device according to any one of claims 1 to 8.
PCT/JP2023/038290 2022-11-04 2023-10-24 Proficiency level determination device, proficiency level determination method, and program WO2024095819A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-177007 2022-11-04
JP2022177007A JP7339414B1 (en) 2022-11-04 2022-11-04 Proficiency level determination device, proficiency level determination method, and program

Publications (1)

Publication Number Publication Date
WO2024095819A1 (en) 2024-05-10

Family

ID=87882224

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/038290 WO2024095819A1 (en) 2022-11-04 2023-10-24 Proficiency level determination device, proficiency level determination method, and program

Country Status (2)

Country Link
JP (1) JP7339414B1 (en)
WO (1) WO2024095819A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005215023A (en) * 2004-01-27 2005-08-11 Recruit Management Solutions Co Ltd Test implementation system and test implementation method
JP2016126029A (en) * 2014-12-26 2016-07-11 公益財団法人 日本英語検定協会 Computing for estimating capability value of multiple examinees on basis of item response theory
US20180151084A1 (en) * 2016-11-30 2018-05-31 Electronics And Telecommunications Research Institute Apparatus and method for providing personalized adaptive e-learning
JP2020076805A (en) * 2018-11-05 2020-05-21 日本電信電話株式会社 Learning support device, learning support method and program

Also Published As

Publication number Publication date
JP7339414B1 (en) 2023-09-05
JP2024067159A (en) 2024-05-17
