CN112116187A

CN112116187A - Method for dynamically optimizing expression evaluation questions

Info

Publication number: CN112116187A
Application number: CN202010254255.6A
Authority: CN
Inventors: 马徐骏; 刘嘉; 詹晨; 孟磊; 王浩宇; 褚东宇; 汤大业; 王磊
Original assignee: Shanghai Mien Network Technology Co ltd
Current assignee: Shanghai Mien Network Technology Co ltd
Priority date: 2020-04-02
Filing date: 2020-04-02
Publication date: 2020-12-22

Abstract

The invention belongs to the technical field of statistical learning and relates to a method for dynamically adjusting and automatically optimizing question parameters in expression evaluation, in particular to a method for dynamically optimizing expression evaluation questions. By applying a statistical learning method, the method improves the accuracy of evaluation and dynamically adjusts its difficulty as the number of samples increases. The specific steps are: 1) for a question q_j, d represents the difficulty of the question, b represents the word stock of the question, w represents the weight parameter of the word stock, and l represents the discrimination of the question; 2) optimizing the parameter d; 3) optimizing the parameter w. The method uses statistical learning to extract useful information from the continuously growing sample data and optimize the question parameters, so that the question model is more robust and responds sensitively to different expression evaluation inputs. Because the learning model is unsupervised, no additional labeling is needed during learning, which reduces expert workload and labor cost.

Description

Method for dynamically optimizing expression evaluation questions

Technical Field

The invention belongs to the technical field of statistical learning, and particularly relates to a method for dynamically optimizing expression evaluation questions.

Background

The rapid expression test is a new form of spoken-language test. Compared with the traditional Chinese language test, it has the advantages of wide applicability, varied question sources, fast testing and objective evaluation. However, each attribute of a question is currently specified by experts from their own experience at initialization, and these judgments may be inaccurate. To address this problem, the invention provides a method that adjusts the question parameters without supervision as the number of samples grows, and thereby improves the accuracy with which questions are assessed.

Disclosure of Invention

The invention aims to provide a method for dynamically optimizing expression evaluation questions, so as to solve the problems described in the background.

To achieve this purpose, the invention provides the following technical solution:

A method for dynamically optimizing expression evaluation questions applies a statistical learning method to improve the accuracy of evaluation and dynamically adjust its difficulty as the number of samples increases, and comprises the following specific steps:

1) for a question q_j, d represents the difficulty of the question, b represents the word stock of the question, w represents the weight parameter of the word stock, and l represents the discrimination of the question; all of its samples form a sample set S = [s₁, s₂, s₃, ..., s_n], where s_i = {h_i, sc_i} denotes the i-th sample, h_i is its hit vector, and sc_i is its score;

2) optimizing the parameter d;

3) optimizing the parameter w.
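The quantities named in step 1) can be pictured as simple records. The following Python sketch is illustrative only: the class names, the binary form of the hit vector, and the rule that a word-stock entry is "hit" when it occurs in the response are all assumptions, not details given in this document. Only the score formula sc = w·h^T is taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Question:
    d: float          # difficulty of the question
    b: List[str]      # word stock of the question
    w: List[float]    # weight parameter, one weight per word-stock entry
    l: float          # discrimination of the question


@dataclass
class Sample:
    h: List[int]      # hit vector (assumed binary: 1 = entry was hit)
    sc: float         # score of the sample


def hit_vector(question: Question, response_words: List[str]) -> List[int]:
    """Assumed hit rule: an entry of b is hit if it occurs in the response."""
    spoken = set(response_words)
    return [1 if word in spoken else 0 for word in question.b]


def score(w: List[float], h: List[int]) -> float:
    """sc = w . h^T, i.e. the weighted sum of the hits."""
    return sum(w_k * h_k for w_k, h_k in zip(w, h))


# Hypothetical question and response, for illustration only.
q = Question(d=0.5, b=["river", "bridge", "boat"], w=[1.0, 2.0, 0.5], l=1.0)
h = hit_vector(q, ["a", "boat", "under", "the", "bridge"])
s = Sample(h=h, sc=score(q.w, h))
```

Under these assumptions, the sample set S is simply a list of such `Sample` records collected for one question.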

As a preferred aspect of the present invention, the specific steps of the optimization of the parameter d are as follows:

a) extracting the set of all questions of the same type, Q = {q₁, q₂, q₃, ..., q_n}, and extracting the set of their average scores, avgs = {avg₁, avg₂, avg₃, ..., avg_n};

b) adjusting the difficulty of question q_j according to the position of its average score avg_j within the set avgs: a higher average score indicates lower difficulty, and a lower average score indicates higher difficulty.

As a preferred aspect of the present invention, the specific steps of the optimization of the parameter w are as follows:

a) sorting the sample set S by sample score sc to obtain the sorted sample set S′;

b) generating a normally distributed random score set sc′, using the mean avg and standard deviation sd of the score set sc of all samples as parameters;

c) fitting sc to sc′ by gradient descent on the weight parameter w.
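Steps a) and b) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `normal_target`, the fixed random seed, the use of the population standard deviation, and the idea of sorting the random draws so that they pair by rank with the sorted sample set S′ are not specified in this document.

```python
import random
import statistics
from typing import List


def normal_target(scores: List[float], seed: int = 0) -> List[float]:
    """Draw len(scores) values from N(avg, sd) and return them in ascending order.

    Sorting the draws is an assumption: it lets the i-th lowest target be
    paired with the i-th lowest sample of the sorted set S'.
    """
    avg = statistics.mean(scores)
    sd = statistics.pstdev(scores)   # population standard deviation (assumed)
    rng = random.Random(seed)        # fixed seed only for reproducibility
    return sorted(rng.gauss(avg, sd) for _ in scores)


# Hypothetical sample scores, for illustration only.
scores = [52.0, 61.0, 58.0, 90.0, 47.0, 63.0]
order = sorted(range(len(scores)), key=lambda i: scores[i])  # indices of S'
target = normal_target(scores)

# Pair each sample, in rank order, with the target value of the same rank.
pairs = [(scores[i], t) for i, t in zip(order, target)]
```

Pairing by rank keeps the ordering of the samples while asking the fitted scores to follow a normal shape; other pairings are possible.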

As a preferred aspect of the present invention, sc_i and sc in step 1) are obtained respectively as:

sc = w·h^T.

As a preferred aspect of the present invention, the difficulty adjustment formula for question q_j in step 2) is as follows:

where the near function maps a value to the nearest of the three numbers -1, 0 and 1, the avg function computes the mean of a set, and the sd function computes the standard deviation of a set.
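The following Python sketch shows one possible adjustment rule built from the near, avg and sd functions described above. The rule itself (standardize avg_j against avgs, map the result with near, and move d by a fixed step in the opposite direction), the step size `step`, and the use of the population standard deviation are assumptions for illustration and should not be read as the formula of the invention.

```python
import statistics
from typing import List


def near(x: float) -> int:
    """Map x to the nearest of -1, 0 and 1; values beyond +/-1 are clamped."""
    return max(-1, min(1, round(x)))


def adjust_difficulty(d: float, avg_j: float, avgs: List[float],
                      step: float = 0.1) -> float:
    """Assumed rule: a higher-than-typical average score lowers the difficulty."""
    mu = statistics.mean(avgs)       # avg(avgs)
    sigma = statistics.pstdev(avgs)  # sd(avgs), population form (assumed)
    if sigma == 0:
        return d                     # all questions score alike: no change
    z = (avg_j - mu) / sigma
    return d - step * near(z)


# Hypothetical average scores of four questions of the same type.
avgs = [60.0, 70.0, 80.0, 90.0]
easier = adjust_difficulty(0.5, 90.0, avgs)     # high average score
harder = adjust_difficulty(0.5, 60.0, avgs)     # low average score
unchanged = adjust_difficulty(0.5, 70.0, avgs)  # close to the mean
```

The sign convention matches the text: a question whose average score sits high in avgs is treated as easier, and one whose average sits low is treated as harder.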

As a preferred aspect of the present invention, the loss function in step 3) is formulated as follows:

J(w) = L₂(w·h^T - sc′) + w²,

where the L₂ function denotes the L2 distance of a vector from the origin.
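Gradient descent on this loss needs its gradient with respect to w. The derivation below is a sketch under two assumptions that the text does not state: H denotes the matrix whose rows are the hit vectors h_i, so that the scores of all samples are Hw with w written as a column vector, and w² is read as the squared L2 norm of w.

```latex
J(w) = \lVert Hw - sc' \rVert_2 + \lVert w \rVert_2^2

\nabla_w J(w) = \frac{H^{\top}\,(Hw - sc')}{\lVert Hw - sc' \rVert_2} + 2w,
\qquad Hw \neq sc'
```

The first term pulls the scores toward the normally distributed target sc′, and the second term keeps the weights small. At Hw = sc′ the first term is not differentiable, so an implementation would typically skip it there.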

As a preferred aspect of the present invention, the specific operation steps of the gradient descent method in step 3) are as follows:

a) selecting one distribution curve of the weight parameter w;

b) obtaining, from the selected distribution curve, the score set sc₁ of all samples under that curve;

c) generating a normally distributed random score set sc′₁, using the mean avg₁ and standard deviation sd₁ of the score set sc₁ of all samples as parameters;

d) selecting another distribution curve of the weight parameter w;

e) obtaining, from the selected distribution curve, the score set sc₂ of all samples under that curve;

f) generating a normally distributed random score set sc′₂, using the mean avg₂ and standard deviation sd₂ of the score set sc₂ of all samples as parameters;

g) repeating the above six steps to obtain the sample score sets sc₁, sc₂, sc₃, ..., sc_n, the means avg₁, avg₂, avg₃, ..., avg_n and the standard deviations sd₁, sd₂, sd₃, ..., sd_n, and generating the normally distributed random score set sc′_n from the obtained data.
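The repeated steps above can be sketched as a loop in Python. This is an illustration under several assumptions rather than the procedure of the invention: each "distribution curve" is interpreted as the current candidate value of w, the random target is reordered to follow the ranking of the current scores, w² in the loss is read as the squared L2 norm of w, and the names `fit_weights`, `rounds` and `lr` are hypothetical.

```python
import math
import random
import statistics
from typing import List, Tuple


def scores_for(w: List[float], H: List[List[int]]) -> List[float]:
    """sc = w . h^T for every hit vector h in H."""
    return [sum(w_k * h_k for w_k, h_k in zip(w, h)) for h in H]


def rank_matched_target(sc: List[float], rng: random.Random) -> List[float]:
    """Normal draws with the mean and sd of sc, reordered to follow sc's ranking."""
    avg, sd = statistics.mean(sc), statistics.pstdev(sc)
    draws = sorted(rng.gauss(avg, sd) for _ in sc)
    order = sorted(range(len(sc)), key=lambda i: sc[i])
    target = [0.0] * len(sc)
    for rank, i in enumerate(order):
        target[i] = draws[rank]
    return target


def fit_weights(w: List[float], H: List[List[int]], rounds: int = 50,
                lr: float = 0.01, seed: int = 0
                ) -> Tuple[List[float], List[Tuple[float, float]]]:
    rng = random.Random(seed)
    w = list(w)
    history = []                                  # (avg_t, sd_t) per round
    for _ in range(rounds):
        sc = scores_for(w, H)                     # scores under the current w
        target = rank_matched_target(sc, rng)     # sc'_t from avg_t and sd_t
        residual = [p - t for p, t in zip(sc, target)]
        norm = math.sqrt(sum(r * r for r in residual))
        grad = [2.0 * w_k for w_k in w]           # gradient of the w^2 term
        if norm > 0:                              # gradient of the L2 distance
            for h, r in zip(H, residual):
                for k, h_k in enumerate(h):
                    grad[k] += r * h_k / norm
        w = [w_k - lr * g for w_k, g in zip(w, grad)]   # next candidate w
        history.append((statistics.mean(sc), statistics.pstdev(sc)))
    return w, history


# Hypothetical hit vectors of four samples over a three-entry word stock.
H = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [1, 1, 1]]
w_fit, history = fit_weights([1.0, 2.0, 0.5], H)
```

Regenerating the target in every round follows the text, which recomputes the mean and standard deviation for each new choice of w; a fixed target drawn once would be a simpler alternative.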

The invention has the advantages that:

1. the method uses statistical learning to extract useful information from the continuously growing sample data and optimize the question parameters, so that the question model is more robust and responds sensitively to different expression evaluation inputs;

2. because the learning model is unsupervised, no additional labeling is needed during learning, which reduces expert workload and labor cost.

Drawings

FIG. 1 is a schematic diagram of difficulty parameter optimization in the present invention;

FIG. 2 is a schematic diagram of the optimization of the word stock weight vector parameter in the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the following embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The technical scheme provided by the invention is as follows:

2) optimizing the parameter d;

3) optimizing the parameter w.

b) adjusting the difficulty of question q_j according to the position of its average score avg_j within the set avgs: a higher average score indicates lower difficulty, and a lower average score indicates higher difficulty.

sc = w·h^T.

a) selecting one distribution curve of the weight parameters w;

b) obtaining, from the selected distribution curve, the score set sc₁ of all samples under that curve;

d) Selecting another distribution curve of the weight parameter w;

g) repeating the above six steps to obtain the sample score sets sc₁, sc₂, sc₃, ..., sc_n, the means avg₁, avg₂, avg₃, ..., avg_n and the standard deviations sd₁, sd₂, sd₃, ..., sd_n, and generating the normally distributed random score set sc′_n from the obtained data.

The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and the preferred embodiments of the present invention are described in the above embodiments and the description, and are not intended to limit the present invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. A method for dynamically optimizing an expression evaluation question, characterized in that: by applying a statistical learning method, the accuracy of evaluation is improved and the difficulty of evaluation is dynamically adjusted as the number of samples increases, the method specifically comprising the following steps:

1) for a question q_j, d represents the difficulty of the question, b represents the word stock of the question, w represents the weight parameter of the word stock, and l represents the discrimination of the question; all of its samples form a sample set S = [s₁, s₂, s₃, ..., s_n], where s_i = {h_i, sc_i} denotes the i-th sample, h_i is its hit vector, and sc_i is its score;

2) optimizing the parameter d;

3) optimizing the parameter w.

2. The method for dynamically optimizing an expression evaluation question according to claim 1, wherein: the specific steps of the optimization of the parameter d are as follows:

3. The method for dynamically optimizing an expression evaluation question according to claim 1, wherein: the specific steps of the optimization of the parameter w are as follows:

4. The method for dynamically optimizing an expression evaluation question according to claim 1, wherein: sc_i and sc in step 1) are obtained respectively as:

sc = w·h^T.

5. The method for dynamically optimizing an expression evaluation question according to claim 1, wherein: the difficulty adjustment formula for question q_j in step 2) is as follows:

6. The method for dynamically optimizing an expression evaluation question according to claim 1, wherein: the loss function in step 3) is as follows:

J(w) = L₂(w·h^T - sc′) + w²,

7. The method for dynamically optimizing an expression evaluation question according to claim 1, wherein: the specific operation steps of the gradient descent method in step 3) are as follows:

a) selecting one distribution curve of the weight parameter w;

d) selecting another distribution curve of the weight parameter w;

f) generating a normally distributed random score set sc′₂, using the mean avg₂ and standard deviation sd₂ of the score set sc₂ of all samples as parameters;