CN114333461B - Automatic subjective question scoring method and system

Info

Publication number: CN114333461B
Application number: CN202011069722.4A
Authority: CN
Inventors: 董黎明; 李仁传; 冯斌; 江凌; 王洪大
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-09-30
Filing date: 2020-09-30
Publication date: 2024-03-12
Anticipated expiration: 2040-09-30
Also published as: CN114333461A

Abstract

The invention discloses an automatic scoring method and system for subjective questions. The method comprises: acquiring an answer set to be scored and a standard answer; forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer, wherein the reference scoring answer set comprises n subjective question answers randomly selected from the answer set to be scored together with the standard answer, and the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed; obtaining a score set comprising the scores obtained by manually scoring the n subjective question answers and the score corresponding to the standard answer; for any subjective question answer in the automatic scoring answer set, calculating the similarity between that answer and each answer in the reference scoring answer set; sorting the similarities in descending order to obtain a sorting result; and automatically scoring that answer by using the sorting result and the score set to obtain an automatic score.

Description

Automatic subjective question scoring method and system

Technical Field

The invention relates to the field of computers, in particular to an automatic subjective question scoring method and system.

Background

Subjective questions (including term explanation, short-answer and essay questions, etc.) are questions that an examinee answers by exercising their own initiative according to their understanding of the question. An answer to a subjective question generally contains several points and varies in length from tens to hundreds of words.

The subjective question answers given by examinees in an examination (the answers to be evaluated) need to be scored either manually or automatically by a computer according to a given method. Scoring subjective questions is currently a difficult problem in the marking of all kinds of examinations and occupies most of the marking time; automating it by means of information technology is therefore an important way to improve marking speed and efficiency.

Disclosure of Invention

In view of this, the embodiment of the invention provides a subjective question automatic scoring method and a subjective question automatic scoring system, so as to realize automatic scoring of subjective questions.

In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:

an automatic scoring method for subjective questions, comprising:

acquiring an answer set to be scored and a standard answer; the answer set to be scored comprises at least one subjective question answer to be scored;

forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer; wherein the reference scoring answer set includes: n subjective question answers randomly selected from the answer set to be scored, and the standard answer; n is a positive integer; the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed;

obtaining a score set, wherein the score set comprises scores obtained by manually scoring the n subjective question answers and scores corresponding to the standard answers;

calculating the similarity of any subjective question answer in the automatic scoring answer set and each subjective question answer in the reference scoring answer set; the answer of any subjective question is expressed as a subjective question answer x;

sorting the similarity in a descending order to obtain a sorting result;

and automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring scores of the subjective question answers x.

Optionally, the automatically scoring the subjective question answer x by using the ranking result and the score set, and obtaining an automatic scoring value of the subjective question answer x includes:

taking the scores in the score set corresponding to the first r positions of the sorting result as the input of an automatic scoring model, and automatically scoring the subjective question answer x by the automatic scoring model to obtain the automatic scoring score; r is a positive integer less than or equal to n.

Optionally, the automatic scoring model calculates the automatic scoring score of the subjective question answer x according to the following formula:

$$c_x = \sum_{i=1}^{r} b_i \, c_i$$

wherein c_x represents the automatic scoring score of the subjective question answer x, i represents the i-th position among the first r positions, c_i represents the score in the score set corresponding to the i-th position, and b_i is the weight corresponding to the i-th position.

Optionally, bi has a default value of 1/r.

Optionally, the method further comprises: randomly extracting m subjective question answers in the automatic scoring answer set; obtaining the manual rechecking scores of the m subjective question answers after manual rechecking; m is a positive integer; respectively calculating the score differences between the automatic score values and the manual review scores of the m subjective question answers; the subjective question answers with the score differences larger than a preset threshold are negative samples; and updating the weights corresponding to the first r positions by using the manual rechecking values of the negative samples.

Optionally, the sorting result corresponding to the negative sample is a target sorting result, and the manual review score corresponding to the negative sample is denoted f; updating the weights corresponding to the first r positions by using the manual review scores of the negative samples comprises: executing at least one round of weight updating operation until an update stop condition is met; each round of weight updating operation at least comprises: determining a first target score and a second target score in a target score set, wherein the target score set includes the scores in the score set corresponding to the first r positions of the target sorting result, each score in the target score set is initially marked as valid, the first target score is the score marked as valid that has the smallest difference from f, and the second target score is the score marked as valid that has the largest difference from f; updating the weights of the positions corresponding to the first target score and the second target score respectively; marking the first target score and the second target score as invalid; recalculating the automatic scoring score of the negative sample using the updated weights, the recalculated automatic scoring score being denoted f'; judging whether f and f' meet the update stop condition, and if so, stopping; if not, judging whether the number of scores marked as valid in the target score set is less than 2; if not, returning to the step of determining a first target score and a second target score in the target score set; if so, marking all the scores in the target score set as valid, and then returning to the step of determining the first target score and the second target score in the target score set.

Optionally, the position corresponding to the first target score is denoted abb_min, and its weight before the update is denoted b_min; the position corresponding to the second target score is denoted abb_max, and its weight before the update is denoted b_max; updating the weights of the positions corresponding to the first target score and the second target score respectively comprises: taking the sum a·b_min + b·b_max as the updated weight of position abb_min; taking c·b_max as the updated weight of position abb_max; a, b and c are update coefficients.

An automatic subjective question scoring system comprising:

an acquisition unit configured to:

a similarity calculation system for:

sorting the similarity in a descending order to obtain a sorting result;

an automatic scoring system for: and automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring scores of the subjective question answers x.

Optionally, the automatic scoring system includes: an automatic scoring model for: taking the scores corresponding to the first r positions in the score set in the sorting result as input, and automatically scoring the subjective question answers x to obtain automatic scoring scores; r is a positive integer less than or equal to n.

Optionally, the method further comprises: an updating system for: randomly extracting m subjective question answers in the automatic scoring answer set; obtaining the manual rechecking scores of the m subjective question answers after manual rechecking; m is a positive integer; respectively calculating the score differences between the automatic score values and the manual review scores of the m subjective question answers; the subjective question answers with the score differences larger than a preset threshold are negative samples; and updating the weights corresponding to the first r positions by using the manual rechecking values of the negative samples.

Therefore, in the embodiment of the invention, part of subjective question answers are extracted for manual scoring, and the obtained manual scoring values are used as reference data. For any subjective question answer, calculating the similarity between the subjective question answer and the standard answer and the answer subjected to manual scoring, and automatically scoring based on the similarity and the manual scoring score to obtain an automatic scoring score, so that the subjective question is automatically scored, and a high-quality automatic scoring result is obtained.

Drawings

FIG. 1 is an exemplary architecture of an automatic subjective question scoring system according to an embodiment of the present invention;

FIG. 2 is an exemplary flow chart of a method for automatically scoring subjective questions according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the main operations involved in the manual scoring stage according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of main operations involved in a similarity calculation stage according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of the main operations involved in the automatic scoring stage according to an embodiment of the present invention;

FIG. 6 is another exemplary flow chart of a method for automatically scoring subjective questions according to an embodiment of the present invention;

FIG. 7a is an exemplary flow of a weight update operation provided by an embodiment of the present invention;

FIG. 7b is another exemplary flow of a weight update operation provided by an embodiment of the present invention;

FIG. 8 is an exemplary flow of a prior art automatic scoring method.

Detailed Description

The invention provides an automatic subjective question scoring system and method, which are used for realizing automatic subjective question scoring.

Referring to fig. 1, an exemplary structure of the automatic subjective question scoring system includes: an acquisition unit 1, a similarity calculation system 2 and an automatic scoring system 3.

Wherein the automatic scoring system 3 may further comprise: an automatic scoring model.

The system may further comprise an updating system 4 for updating the weights in the automatic scoring model.

The acquisition unit 1, the similarity calculation system 2, the automatic scoring system 3, and the updating system 4 may be installed in the same device, or may be deployed in separate devices, respectively.

FIG. 2 illustrates an exemplary flow of the subjective question automatic scoring method performed by the subjective question automatic scoring system described above, including:

S1: Obtain an answer set to be scored and a standard answer.

The answer set S to be scored comprises at least one subjective question answer to be scored.

The subjective question answers and the standard answers are in text form.

S2: Form a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer.

The reference scoring answer set comprises n subjective question answers randomly selected from the answer set to be scored, together with the standard answer; n is a positive integer.

There may be one or more standard answers, and the score of a standard answer is the full score of the subjective question, for example 20 points.

Expressed in set notation, the reference scoring answer set may include a subset s of the answers to be evaluated and p standard answers, where the subset s has n elements, denoted s_1, s_2, …, s_n.

The automatic scoring answer set X may include the subjective question answers remaining after the n subjective question answers are removed from the answer set S to be scored (i.e., after the answer subset s is removed).

S3: a set of scores is obtained.

The score set comprises scores (which can be called as manual score scores) obtained by manually scoring n subjective question answers, and scores corresponding to standard answers.

After the n subjective question answers in the answer subset s are manually scored, they form an evaluated answer set Y (y_1, y_2, …, y_n), whose elements carry the same indices as those of s; the reference scoring answer set then includes the evaluated answer set Y and the p standard answers.

The score set may be represented by C, where c_1, c_2, …, c_n are the manual scoring scores corresponding to y_1, y_2, …, y_n respectively, and c_{n+1}, …, c_{n+p} are the scores of the standard answers (i.e., the full score).

Steps S1-S3 may be collectively referred to as the manual scoring stage, and may be performed by the aforementioned acquisition unit 1.

The main operations involved in the manual scoring stage can be seen in fig. 3.
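The manual scoring stage can be illustrated with a minimal sketch. The function names `split_answer_set` and `build_score_set`, the `seed` parameter, and the list-based data layout are assumptions made for illustration and do not come from the embodiment; the manual scores themselves are supplied by human graders.

```python
import random


def split_answer_set(answers, standard_answers, n, seed=None):
    """Split the answer set S into a reference set and an automatic set.

    Hypothetical helper: n answers are drawn at random for manual scoring,
    the standard answers are appended to them, and the rest of S becomes
    the automatic scoring answer set X.
    """
    if not 1 <= n <= len(answers):
        raise ValueError("n must be between 1 and the number of answers")
    rng = random.Random(seed)
    picked = set(rng.sample(range(len(answers)), n))
    manually_scored = [answers[i] for i in sorted(picked)]
    automatic_set = [a for i, a in enumerate(answers) if i not in picked]
    reference_set = manually_scored + list(standard_answers)
    return reference_set, automatic_set


def build_score_set(manual_scores, full_score, p):
    """Score set C: c_1..c_n are manual scores, c_{n+1}..c_{n+p} the full score."""
    return list(manual_scores) + [full_score] * p


answers = [f"answer {i}" for i in range(8)]
reference_set, automatic_set = split_answer_set(answers, ["standard"], n=3, seed=0)
score_set = build_score_set([12, 15, 18], full_score=20, p=1)
```

Keeping the reference set and the score set in the same order means that position i in one corresponds to position i in the other, which is what the later ranking step relies on.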

S4: for any subjective question answer in the automatic scoring answer set, calculating the similarity between any subjective question answer and each subjective question answer in the reference scoring answer set;

any subjective question answer is expressed as a subjective question answer x (answer to be evaluated x);

For example, if the reference scoring answer set contains 10 answers, 10 similarity values are obtained, one between the subjective question answer x and each of those 10 answers.

S5: sorting the similarity in a descending order to obtain a sorting result;

steps S4 and S5 may be collectively referred to as a similarity calculation stage, and may be performed by the similarity calculation system 2 described above.

Specifically, the functions of the similarity calculation system may be implemented using various machine learning models, such as a TF-IDF (Term Frequency-Inverse Document Frequency) model, which are not described in detail here.

Assuming that the reference scoring answer set contains n+1 answers (one of which is the standard answer), the main operations involved in the similarity calculation stage can be seen in FIG. 4.
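As an illustration of the similarity calculation stage, the following sketch ranks reference answers by TF-IDF cosine similarity using only the Python standard library. The whitespace tokenizer, the smoothed IDF formula and the function names are assumptions; the embodiment names TF-IDF only as one possible model, and Chinese text would need a proper word segmenter in place of `tokenize`.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization, adequate only for this toy example.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed inverse document frequency, so that no term gets weight zero.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({t: (k / len(toks)) * idf[t] for t, k in counts.items()})
    return vectors


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_references(answer_x, reference_answers):
    """S4 and S5: similarity of x to every reference answer, sorted descending."""
    vectors = tfidf_vectors(list(reference_answers) + [answer_x])
    x_vec = vectors[-1]
    sims = [(i, cosine(x_vec, vec)) for i, vec in enumerate(vectors[:-1])]
    return sorted(sims, key=lambda pair: pair[1], reverse=True)


refs = [
    "the cell membrane controls transport",
    "plants use sunlight to make sugar",
    "mitochondria release energy",
]
ranking = rank_references("plants make sugar from sunlight", refs)
```

Each entry of `ranking` is a pair of (index into the reference set, similarity), so the index can be used directly to look up the corresponding score in the score set.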

S6: and automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring scores (cx) of the subjective question answers x.

Step S6 may be performed by the automatic scoring system 3 described previously.

In one example, the automatic scoring system 3 may specifically be an automatic scoring model: the scores in the score set corresponding to the first r positions of the ranking result are used as the input of the automatic scoring model, and the model automatically scores the subjective question answer x to obtain the automatic scoring score.

r is a positive integer less than or equal to n.

Assuming n=10 and r=5, the scores corresponding to the first 5 positions in the ranking result are taken as input to the automatic scoring model. The main operations involved in the automatic scoring stage can be seen in fig. 5.

An automatic scoring model is described below.

In other embodiments of the present invention, the automatic scoring model calculates the automatic scoring score of the subjective question answer x according to the following formula:

$$c_x = \sum_{i=1}^{r} b_i \, c_i$$

wherein c_x represents the automatic scoring score of the subjective question answer x, i represents the i-th position among the first r positions, c_i represents the score in the score set corresponding to the i-th position, and b_i is the weight corresponding to the i-th position.

The default value of b_i is 1/r, i.e., the automatic score is the average of the scores of the r most similar evaluated answers. For example, if the scores of the 5 most similar evaluated subjective question answers are {18, 20, 18, 16, 20}, the subjective question answer x is scored (18 + 20 + 18 + 16 + 20) / 5 = 18.4 under the default weights of the model.
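The weighted sum over the r most similar reference answers can be sketched as follows. The function names and the representation of the ranking as (index, similarity) pairs are assumptions made for illustration.

```python
def top_r_scores(ranking, score_set, r):
    """Scores in the score set for the first r positions of the ranking.

    Assumption: ranking is a list of (reference_index, similarity) pairs
    already sorted in descending order of similarity.
    """
    return [score_set[idx] for idx, _sim in ranking[:r]]


def weighted_score(top_scores, weights=None):
    """c_x = sum of b_i * c_i over the first r positions; b_i defaults to 1/r."""
    r = len(top_scores)
    if r == 0:
        raise ValueError("at least one score is required")
    if weights is None:
        weights = [1.0 / r] * r
    if len(weights) != r:
        raise ValueError("one weight per position is required")
    return sum(b * c for b, c in zip(weights, top_scores))


c_x = weighted_score([18, 20, 18, 16, 20])  # approximately 18.4
```

Passing an explicit `weights` list corresponds to giving different weights to reference answers at different similarity ranks.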

Of course, in actual situations, other parameter models can be selected, and different weight values can be given to the evaluated answers with different similarity ranks.

In a general scoring system, for negative samples (the automatic scoring score and the manual scoring score have a large difference), the negative samples are discarded only after the manual scoring correction, and the value of the negative samples is ignored.

The update of the automatic scoring model may be performed using a negative sample, see fig. 6, which illustratively includes the steps of:

S7: Randomly extract m subjective question answers from the automatic scoring answer set.

m is a positive integer.

S8: Obtain the manual review scores of the m subjective question answers after manual review.

That is, a grader manually reviews the m subjective question answers to obtain their scores.

S9: Calculate, for each of the m subjective question answers, the difference between its automatic scoring score and its manual review score.

The subjective question answers whose score difference is larger than a preset threshold are negative samples.

Those skilled in the art can flexibly design the preset threshold, for example 0.5, 1 or 2.
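A minimal sketch of steps S7 to S9 is given below. The helper names are hypothetical, and treating the "score difference" as an absolute difference is an assumption, since the text does not say whether the sign matters.

```python
import random


def sample_for_review(num_automatic, m, seed=None):
    """S7: pick m distinct indices from the automatic scoring answer set."""
    return sorted(random.Random(seed).sample(range(num_automatic), m))


def find_negative_samples(auto_scores, review_scores, threshold):
    """S9: positions whose |automatic - manual review| exceeds the threshold."""
    return [
        i
        for i, (auto, review) in enumerate(zip(auto_scores, review_scores))
        if abs(auto - review) > threshold
    ]


picked = sample_for_review(num_automatic=20, m=3, seed=1)
negatives = find_negative_samples([18.4, 12.0, 15.0], [18.0, 15.0, 15.5], threshold=1)
```

Here only the second reviewed answer differs from its manual review score by more than 1 point, so it is the only negative sample.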

S10: Update the weights corresponding to the first r positions by using the manual review scores of the negative samples.

That is, the weight values of the respective components in the automatic scoring model (i.e., the b_i values in the formula above) are back-calculated.

The model updating is an optional process, that is, whether the model updating is performed or not can be determined according to actual requirements in actual application, and if the scoring result can meet the use requirements, the model updating is not performed.

An exemplary manner of updating the weights corresponding to the first r positions using the manual review score of the negative sample is described below.

For convenience of description, the sorting result corresponding to the negative sample can be taken as a target sorting result, and the manual rechecking value corresponding to the negative sample is represented by f;

at least one round of weight update operation may be performed until an update stop condition is satisfied to update weights corresponding to the first r locations.

Referring to fig. 7a, each round of weight update operation at least includes:

s701: a first target score and a second target score are determined in the set of target scores.

The target score set includes: the first r positions of the target ranking result are the scores corresponding to the score sets.

The first target score is the score which has the smallest difference with f and is marked as effective, and the second target score is the score which has the largest difference with f and is marked as effective. Initially, each score in the target score set is marked as valid.

Assuming that the most similar r subjective questions to the negative sample answer y ₁ ,y ₂ ,…,y _r Its corresponding score is c ₁ ,c ₂ ,…,c _r Initially, c ₁ ,c ₂ ,…,c _r Are marked as valid.

If c ₃ Minimum difference from f, c ₅ Maximum difference from f, c ₃ For the first target score, c ₅ Is the second target score.

S702: and respectively updating the weights of the positions corresponding to the first target score and the second target score.

In one example, let the position corresponding to the first target score be denoted abb_min and its weight before the update be denoted b_min; let the position corresponding to the second target score be denoted abb_max and its weight before the update be denoted b_max.

Then a·b_min + b·b_max is taken as the updated weight of position abb_min, and c·b_max is taken as the updated weight of position abb_max, where a, b and c are update coefficients.

The values of a, b and c can be flexibly designed; for example, a may be 1, b may be 0.5 and c may be 0.5.

In the previous example, c_3 is the first target score and c_5 is the second target score, and the weights of the corresponding positions before updating are 0.2 and 0.2.

With a = 1, b = 0.5 and c = 0.5, the new weight of the position corresponding to c_3 is 1 × 0.2 + 0.5 × 0.2 = 0.3, and the new weight of the position corresponding to c_5 is 0.5 × 0.2 = 0.1.
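The single update of S702 can be written as a small function. The name `update_pair` and the zero-based indices are assumptions; the coefficients default to the example values a = 1, b = 0.5, c = 0.5.

```python
def update_pair(weights, i_min, i_max, a=1.0, b=0.5, c=0.5):
    """Return new weights after one S702 update.

    i_min is the position of the first target score (closest to f),
    i_max the position of the second target score (farthest from f).
    """
    b_min, b_max = weights[i_min], weights[i_max]
    updated = list(weights)
    updated[i_min] = a * b_min + b * b_max  # closest score gains weight
    updated[i_max] = c * b_max              # farthest score loses weight
    return updated


# c_3 and c_5 of the running example are zero-based positions 2 and 4.
new_weights = update_pair([0.2] * 5, i_min=2, i_max=4)
```

With a = 1 and b + c = 1, the weight removed from the farthest score is exactly the weight added to the closest one, so the weights still sum to 1 after the update; other coefficient choices do not preserve the sum in general.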

S703: the first target score and the second target score are marked as invalid.

Following the previous example, c_3 and c_5 among c_1, c_2, …, c_r are marked as invalid.

S704: the updated weights are used to recalculate the automatic scoring scores for the negative samples.

The recalculated automatic scoring score is denoted as f'.

S705: judging whether f and f' meet the update stop condition, if so, stopping, and if not, entering S706;

in one example, the update stop condition may include: f' is within d% of f.

The person skilled in the art can flexibly design the value of d, for example 5, 10, etc.

S706: Judge whether the number of scores marked as valid in the target score set is less than 2; if not, return to S701; if so, go to S707.

If fewer than 2 scores remain valid, the scores with the smallest and the largest difference cannot both be determined, so the marks must be reset before S701 is executed again.

S707: all scores in the target score set are marked as valid and the process returns to S701.

Alternatively, referring to fig. 7b, each round of weight update operation at least includes:

s71: a first target score and a second target score are determined in the set of target scores.

The first target score is the score with the smallest difference value with f, and the second target score is the score with the largest difference value with f.

Assuming that the most similar r subjective questions to the negative sample answer y ₁ ,y ₂ ,…,y _r Its corresponding score is c ₁ ,c ₂ ,…,c _r . If c ₃ Minimum difference from f, c ₅ Maximum difference from f, c ₃ For the first target score, c ₅ Is the second target score.

S72: and respectively updating the weights of the positions corresponding to the first target score and the second target score.

S73: the first target score and the second target score are deleted from the target score set.

Following the previous example, then at c ₁ ,c ₂ ,…,c _r Delete c in ₃ And c ₅ 。

S74: the updated weights are used to recalculate the automatic scoring scores for the negative samples.

The recalculated automatic scoring score is denoted as f'.

S75: judging whether f and f' meet the update stop condition, if so, stopping, and if not, entering S76;

in one example, the update stop condition may include: f' is within d% of f.

S76: judging whether the target score set is empty or only one element is left, if not, returning to the step S71; if yes, go to S77;

S77: Reset the target score set so that it again includes the scores in the score set corresponding to the first r positions of the target ranking result, and return to S71.

It should be noted that the main flow of a conventional automatic scoring method is shown in FIG. 8, and that its necessary precondition is that "word vectors associated with the category or knowledge point to which a test question belongs are trained in advance".

On this basis, after receiving a student answer text and a reference answer text, the existing method calculates the word weights in the two texts and determines the final word representation sequence of each text; that is, the student answer text and the reference answer text are each output as a word sequence with corresponding weights.

Then, a WMD (Word Mover's Distance) model is used to calculate the distance between the weight data of the student answer text and the word weight data in the precondition, and the distance is converted into a corresponding text similarity; finally, the text similarity is converted into the corresponding automatic score.

1. A corpus is needed as a basis. In the method proposed in paper 1, the precondition of "pre-trained word vectors related to the category or knowledge point to which the test question belongs" requires a high-quality corpus as a basis. In general, the establishment and maintenance of a high-quality corpus are complex, not only a large amount of high-quality text data is needed, but also the text data needs to be segmented, weighted and the like, each process needs manual intervention, and Chinese segmentation is still a very difficult bottleneck problem in academia due to the specificity of Chinese; meanwhile, the language content of different professions, industries or fields has larger difference, and even for the same text, different word segmentation modes exist. The assignment of weights to words after word segmentation is also a task with very strong subjectivity and specialty, which requires a lot of manpower and has high professional demands for personnel.

2. The importance of subjective judgment of the scoring staff cannot be reflected. The emphasis of the existing method is basically to compare the answer to be evaluated with the standard answer through various algorithms and the existing data base, adjust the comparison result, and read the comparison result from different angles to calculate the final score. In the process, the importance of manual scoring is ignored, because the diversity and complexity of languages can only measure the similarity or classification from certain aspects, and the subjective judgment of people cannot be completely replaced, so that the premise of acquiring a high-quality automatic scoring result is that the intervention of manual judgment is needed.

3. Updating the scoring model is difficult. In the implementation process of the method, although a link for manually checking the automatic scoring result exists, the link only binarizes the scoring result, namely whether the result is reasonable or not is judged, and the reasonable result is regarded as a positive sample (the difference between the automatic scoring result and the manual scoring result is smaller) and is added into a corpus; and for negative samples (the automatic score and the manual score have a larger difference), the negative samples are discarded only after the manual score correction, and the value of the negative samples is ignored.

Compared with the method, the automatic scoring method provided by the embodiment of the invention has the following advantages:

1. Independence from a corpus. The similarity calculation considers only whether words are identical and does not involve semantic-level similarity, so no corpus support is needed.

2. Automatic scoring is implemented on the basis of manual scoring, and the importance of subjective judgment of scoring personnel is fully reflected.

3. Updating and correcting the automatic scoring model based on the negative sample: after the automatic scoring is finished, the scoring model can be updated according to the situation, the deviation of the automatic scoring is corrected in a manual rechecking mode, and the automatic scoring model is updated and corrected on the basis of a negative sample, so that the accuracy of the model is improved.

An automatic subjective question scoring system is described below. Referring to FIG. 1, the system illustratively includes:

an acquisition unit 1 for:

forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer; wherein the reference scoring answer set comprises n subjective question answers randomly selected from the answer set to be scored, together with the standard answer; n is a positive integer; the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed;

obtaining a score set, wherein the score set comprises scores obtained by manually scoring n subjective question answers and scores corresponding to standard answers;

a similarity calculation system 2 for:

for any subjective question answer in the automatic scoring answer set, calculating the similarity between any subjective question answer and each subjective question answer in the reference scoring answer set; any subjective question answer is expressed as a subjective question answer x;

sorting the similarity in a descending order to obtain a sorting result;

an automatic scoring system 3 for: and automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring values of the subjective question answers x.

Specific details are described in the foregoing description, and are not repeated here.
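How the similarity calculation system and the automatic scoring system fit together can be sketched as below. The class and function names are hypothetical, the word-overlap (Jaccard) similarity is a stand-in chosen only to keep the example short and is not the model of the embodiment, and the scoring uses the default weights 1/r.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of word sets; a placeholder similarity measure."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


@dataclass
class SimilarityCalculationSystem:
    similarity: Callable[[str, str], float] = word_overlap

    def rank(self, answer_x: str, reference_answers: Sequence[str]) -> List[int]:
        """Positions of the reference answers, most similar first."""
        sims = [self.similarity(answer_x, ref) for ref in reference_answers]
        return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)


@dataclass
class AutomaticScoringSystem:
    r: int

    def score(self, ranking: Sequence[int], score_set: Sequence[float]) -> float:
        """Average of the scores at the first r ranked positions."""
        top = [score_set[i] for i in ranking[: self.r]]
        return sum(top) / len(top)


def score_all(automatic_set, reference_answers, score_set, r):
    similarity_system = SimilarityCalculationSystem()
    scoring_system = AutomaticScoringSystem(r=r)
    return [
        scoring_system.score(similarity_system.rank(x, reference_answers), score_set)
        for x in automatic_set
    ]


reference_answers = [
    "water boils at one hundred degrees",
    "ice melts at zero degrees",
    "water boils at one hundred degrees celsius at sea level",  # standard answer
]
score_set = [14, 6, 20]
auto_scores = score_all(
    ["water boils at one hundred degrees celsius"], reference_answers, score_set, r=2
)
```

Because the similarity function is passed in as a field, a different similarity model can be substituted without changing the scoring system, which mirrors the separation of the units in FIG. 1.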

In other embodiments of the present invention, the automatic scoring system includes:

an automatic scoring model for: taking the scores corresponding to the first r positions in the score set in the sorting result as input, and automatically scoring the subjective question answers x to obtain automatic scoring scores; r is a positive integer less than or equal to n.

In other embodiments of the present invention, the automatic scoring model calculates the automatic scoring score of the subjective question answer x according to the following formula:

$$c_x = \sum_{i=1}^{r} b_i \, c_i$$

wherein c_x represents the automatic scoring score of the subjective question answer x, i represents the i-th position among the first r positions, c_i represents the score in the score set corresponding to the i-th position, and b_i is the weight corresponding to the i-th position. The default value of b_i is 1/r.

In other embodiments of the present invention, the automatic subjective question scoring system may further include:

an updating system 4 for:

randomly extracting m subjective question answers in the automatic scoring answer set;

obtaining manual review scores of the m subjective question answers after manual review; m is a positive integer;

calculating, for each of the m subjective question answers, the score difference between its automatic score and its manual review score; subjective question answers whose score difference is larger than a preset threshold are negative samples;

and updating the weights corresponding to the first r positions by using the manual review scores of the negative samples.

The sorting result corresponding to a negative sample is the target sorting result; the manual review score corresponding to the negative sample is denoted as f.
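The spot-check that identifies negative samples can be sketched as below. The helper name `find_negative_samples`, the dictionary layout, and the use of an absolute difference are assumptions; the text only states that the score difference must exceed a preset threshold.

```python
import random


def find_negative_samples(auto_scores, review_fn, m, threshold, seed=None):
    """Sample m automatically scored answers and flag large disagreements.

    auto_scores: dict mapping answer id -> automatic score.
    review_fn: callable returning the manual review score for an answer id.
    Returns a dict of negative-sample id -> manual review score f.
    (Hypothetical helper; absolute difference is an assumed reading.)
    """
    rng = random.Random(seed)
    sampled = rng.sample(sorted(auto_scores), m)
    negatives = {}
    for answer_id in sampled:
        f = review_fn(answer_id)
        if abs(auto_scores[answer_id] - f) > threshold:
            negatives[answer_id] = f
    return negatives


auto = {"a1": 7.0, "a2": 3.0, "a3": 9.0}
manual = {"a1": 7.5, "a2": 6.0, "a3": 9.0}
negatives = find_negative_samples(auto, manual.get, m=3, threshold=1.0, seed=1)
```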

In other embodiments of the present invention, in terms of updating the weights corresponding to the first r positions by using the manual review score of the negative sample, the updating system 4 may be specifically configured to: execute at least one round of weight updating operation until the update stop condition is met.

Wherein, each round of weight updating operation at least comprises:

determining a first target score and a second target score in the target score set; the target score set includes: the scores in the score set corresponding to the first r positions of the target sorting result; the first target score is the score with the smallest difference from f, and the second target score is the score with the largest difference from f;

updating the weights of the positions corresponding to the first target score and the second target score respectively;

deleting the first target score and the second target score from the target score set;

recalculating the automatic scoring values of the negative samples using the updated weights; the recalculated automatic scoring score is denoted as f';

judging whether f and f' meet update stop conditions, if so, stopping;

if not, judging whether the target score set is empty or has only one element left;

if not, returning to the step of determining a first target score and a second target score in the target score set;

if so, resetting the target score set to comprise the scores in the score set corresponding to the first r positions of the target sorting result, and returning to the step of determining the first target score and the second target score in the target score set.
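One possible reading of a full update loop for a single negative sample is sketched below. Several details are assumptions rather than statements from the text: the stop condition is taken to be |f - f'| <= `tol`, the target score set is refilled once fewer than two scores remain, the update coefficients default to a = 1, b = 0.5, c = 0.5, and a `max_rounds` cap is added so the loop always terminates.

```python
def update_weights(top_scores, weights, f, a=1.0, b=0.5, c=0.5,
                   tol=0.5, max_rounds=100):
    """Adjust position weights so the recomputed score f' approaches f.

    top_scores: reference scores at the first r positions (target ranking).
    weights: current weight per position.
    Returns (new_weights, f_prime). Hypothetical sketch only.
    """
    r = len(top_scores)
    if r < 2 or len(weights) != r:
        raise ValueError("need at least two positions and one weight each")
    weights = list(weights)      # work on a copy
    target = list(range(r))      # target score set, kept as position indices
    f_prime = sum(w * s for w, s in zip(weights, top_scores))
    for _ in range(max_rounds):
        # First target score: closest to f. Second: farthest from f.
        i_min = min(target, key=lambda i: abs(top_scores[i] - f))
        i_max = max((i for i in target if i != i_min),
                    key=lambda i: abs(top_scores[i] - f))
        # Update both weights from their pre-update values.
        b_min, b_max = weights[i_min], weights[i_max]
        weights[i_min] = a * b_min + b * b_max
        weights[i_max] = c * b_max
        # Delete both scores from the target score set.
        target = [i for i in target if i not in (i_min, i_max)]
        # Recompute the automatic score and test the (assumed) stop condition.
        f_prime = sum(w * s for w, s in zip(weights, top_scores))
        if abs(f - f_prime) <= tol:
            break
        # Refill the set when no pair can be formed any more.
        if len(target) < 2:
            target = list(range(r))
    return weights, f_prime


new_weights, f_prime = update_weights([8, 6, 2], [1 / 3, 1 / 3, 1 / 3], f=7.5)
```

In this example the weight gradually moves from the score farthest from the reviewer's mark to the score closest to it, which pulls f' toward f.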

Alternatively, each round of weight update operation includes at least:

determining a first target score and a second target score in the target score set; the target score set includes: the scores in the score set corresponding to the first r positions of the target sorting result, each score in the target score set being initially marked as valid; the first target score is the score marked as valid that has the smallest difference from f, and the second target score is the score marked as valid that has the largest difference from f;

marking the first target score and the second target score as invalid;

judging whether f and f' meet the updating stop condition, if so, stopping;

if not, judging whether the number of scores marked as valid in the target score set is not less than 2;

if so, returning to the step of determining a first target score and a second target score in the target score set;

if not, marking all the scores in the target score set as valid, and returning to the step of determining the first target score and the second target score in the target score set.
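The valid/invalid bookkeeping of this alternative can be sketched as two small helpers. The names `pick_and_invalidate` and `reset_if_exhausted` are hypothetical, and the flags are stored in a plain list of booleans as an assumption; nothing is deleted from the target score set in this variant.

```python
def pick_and_invalidate(scores, valid, f):
    """Pick the valid scores closest to and farthest from f, then mark
    both positions invalid. Returns (i_min, i_max). Hypothetical helper."""
    candidates = [i for i, ok in enumerate(valid) if ok]
    if len(candidates) < 2:
        raise ValueError("need at least two valid scores")
    i_min = min(candidates, key=lambda i: abs(scores[i] - f))
    i_max = max((i for i in candidates if i != i_min),
                key=lambda i: abs(scores[i] - f))
    valid[i_min] = False
    valid[i_max] = False
    return i_min, i_max


def reset_if_exhausted(valid):
    """Mark every score valid again once fewer than two valid ones remain."""
    if sum(valid) < 2:
        valid[:] = [True] * len(valid)


scores = [8, 6, 2, 7]
valid = [True] * len(scores)
first_pair = pick_and_invalidate(scores, valid, f=7.5)
reset_if_exhausted(valid)   # two valid scores remain, so nothing changes
second_pair = pick_and_invalidate(scores, valid, f=7.5)
reset_if_exhausted(valid)   # none remain, so all are marked valid again
```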

In other embodiments of the present invention, the position corresponding to the first target score is denoted by the subscript min, and its weight before the update is denoted $b_{\min}$; the position corresponding to the second target score is denoted by the subscript max, and its weight before the update is denoted $b_{\max}$.

In terms of updating the weights of the positions corresponding to the first target score and the second target score respectively, the updating system 4 may be specifically configured to:

take $a \cdot b_{\min} + b \cdot b_{\max}$ as the updated weight of the position with subscript min;

take $c \cdot b_{\max}$ as the updated weight of the position with subscript max;

where a, b and c are update coefficients.
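The pairwise update rule can be written as a one-line helper. The function name `update_pair` and the coefficient values used in the example are assumptions; the text does not give concrete values for a, b and c.

```python
def update_pair(b_min, b_max, a, b, c):
    """Return the updated weights for the min and max positions.

    New min-position weight: a * b_min + b * b_max.
    New max-position weight: c * b_max.
    (Hypothetical helper; both results use the pre-update weights.)
    """
    return a * b_min + b * b_max, c * b_max


# Illustrative coefficients only: a = 1, b = 0.5, c = 0.5.
new_min, new_max = update_pair(0.25, 0.25, a=1.0, b=0.5, c=0.5)
```

As an observation that is not made in the text: whenever a = 1 and b + c = 1, the two weights keep the same total, so the update only shifts weight from the max position to the min position.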

In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.

Those skilled in the art will further appreciate that the elements and model steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both, and that, to clearly illustrate this interchangeability of hardware and software, the elements and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or model described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An automatic subjective question scoring method, comprising:

forming a subjective question answer set with reference scores and an automatic score answer set based on the answer set to be scored and the standard answer; wherein, the subjective question answer set of the reference score comprises: randomly selecting n subjective question answers from the answer set to be scored and the standard answer; n is a positive integer; the automatic scoring answer set comprises the remaining subjective question answers after the n subjective question answers are removed in the answer set to be scored;

sorting the similarity in a descending order to obtain a sorting result;

automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring values of the subjective question answers x; wherein, the automatically scoring the subjective question answer x by using the sorting result and the score set, and obtaining the automatic scoring value of the subjective question answer x includes: taking the corresponding scores of the first r positions in the score set in the sorting result as the input of an automatic scoring model, and automatically scoring the subjective question answers x by the automatic scoring model to obtain automatic scoring scores; r is a positive integer less than or equal to n;

wherein the method further comprises: randomly extracting m subjective question answers in the automatic scoring answer set;

obtaining manual review scores of the m subjective question answers after manual review; m is a positive integer;

respectively calculating the score differences between the automatic score values and the manual review scores of the m subjective question answers; the subjective question answers with the score differences larger than a preset threshold are negative samples;

updating the weights corresponding to the first r positions by using the manual review scores of the negative samples;

the sorting result corresponding to the negative sample is a target sorting result;

the manual review score corresponding to the negative sample is expressed as f;

the updating of the weights corresponding to the first r positions by using the manual review scores of the negative samples comprises: executing at least one round of weight updating operation until an update stop condition is met;

each round of weight update operation at least comprises:

marking the first target score and the second target score as invalid;

judging whether f and f' meet the updating stop condition, if so, stopping;

if yes, marking all the scores in the target score set as valid, and returning to execute the step of determining the first target score and the second target score in the target score set.

2. The method according to claim 1, wherein the automatic scoring model calculates the automatic scoring value of the subjective question answer x according to the following formula:

$$c_x = \sum_{i=1}^{r} b_i c_i$$

3. The method of claim 1, wherein,

the position corresponding to the first target score is denoted by the subscript min, and its weight before the update is denoted $b_{\min}$;

the position corresponding to the second target score is denoted by the subscript max, and its weight before the update is denoted $b_{\max}$;

the updating of the weights of the positions corresponding to the first target score and the second target score respectively comprises:

taking $c \cdot b_{\max}$ as the updated weight of the position with subscript max;

wherein a, b and c are update coefficients.

4. An automatic subjective question scoring system, comprising:

an acquisition unit configured to:

a similarity calculation system for:

sorting the similarity in a descending order to obtain a sorting result;

an automatic scoring system for: automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring values of the subjective question answers x; wherein the automatic scoring system comprises: an automatic scoring model for: taking the scores corresponding to the first r positions in the score set in the sorting result as input, and automatically scoring the subjective question answers x to obtain automatic scoring scores; r is a positive integer less than or equal to n;

an updating system for:

the manual review score corresponding to the negative sample is expressed as f;

each round of weight update operation at least comprises:

marking the first target score and the second target score as invalid;

judging whether f and f' meet the updating stop condition, if so, stopping;

5. The system of claim 4, wherein the automatic scoring model calculates the automatic scoring value of the subjective question answer x according to the following formula: