CN114333461B - Automatic subjective question scoring method and system - Google Patents

Publication number
CN114333461B
Authority
CN
China
Prior art keywords
score
answer
subjective question
scoring
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011069722.4A
Other languages
Chinese (zh)
Other versions
CN114333461A (en)
Inventor
董黎明
李仁传
冯斌
江凌
王洪大
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202011069722.4A
Publication of CN114333461A
Application granted
Publication of CN114333461B
Legal status: Active
Anticipated expiration

Landscapes

  • Electrically Operated Instructional Devices (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an automatic scoring method and system for subjective questions. The method comprises: acquiring an answer set to be scored and a standard answer; forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer, wherein the reference scoring answer set comprises the standard answer and n subjective question answers randomly selected from the answer set to be scored, and the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed; obtaining a score set, wherein the score set comprises the scores obtained by manually scoring the n subjective question answers and the score corresponding to the standard answer; for any subjective question answer in the automatic scoring answer set, calculating the similarity between that answer and each subjective question answer in the reference scoring answer set; sorting the similarities in descending order to obtain a sorting result; and automatically scoring that answer by using the sorting result and the score set to obtain an automatic score.

Description

Automatic subjective question scoring method and system
Technical Field
The invention relates to the field of computers, in particular to an automatic subjective question scoring method and system.
Background
Subjective questions (including term definition, short-answer and essay questions) are questions that examinees answer by exercising their own initiative according to their understanding of the question. An answer to a subjective question generally contains several key points and varies in length from tens to hundreds of words.
The subjective question answers given by examinees in an examination (the subjective question answers to be evaluated) need to be scored, either manually or automatically by a computer, according to a given method. Scoring subjective questions is currently a difficult problem in the marking of all kinds of examinations and takes up most of the marking time; automating it by means of information technology is an important way to improve marking speed and efficiency.
Disclosure of Invention
In view of this, the embodiment of the invention provides a subjective question automatic scoring method and a subjective question automatic scoring system, so as to realize automatic scoring of subjective questions.
In order to achieve the above object, the embodiment of the present invention provides the following technical solutions:
an automatic scoring method for subjective questions, comprising:
acquiring an answer set to be scored and a standard answer; the answer set to be scored comprises at least one subjective question answer to be scored;
forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer; wherein the reference scoring answer set comprises: the standard answer and n subjective question answers randomly selected from the answer set to be scored; n is a positive integer; the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed;
obtaining a score set, wherein the score set comprises scores obtained by manually scoring the n subjective question answers and scores corresponding to the standard answers;
for any subjective question answer in the automatic scoring answer set, calculating the similarity between that subjective question answer and each subjective question answer in the reference scoring answer set; said any subjective question answer is denoted subjective question answer x;
sorting the similarity in a descending order to obtain a sorting result;
and automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring scores of the subjective question answers x.
Optionally, the automatically scoring the subjective question answer x by using the ranking result and the score set, and obtaining an automatic scoring value of the subjective question answer x includes:
taking the scores in the score set that correspond to the first r positions of the sorting result as the input of an automatic scoring model, and automatically scoring the subjective question answer x by the automatic scoring model to obtain the automatic scoring value; r is a positive integer less than or equal to n.
Optionally, the automatic scoring model calculates the automatic scoring value of the subjective question answer x according to the following formula:
$$c_x = \sum_{i=1}^{r} b_i c_i$$
wherein c_x represents the automatic scoring value of the subjective question answer x, i represents the i-th of the first r positions, c_i represents the score in the score set corresponding to the i-th position, and b_i is the weight corresponding to the i-th position.
Optionally, b_i has a default value of 1/r.
Optionally, the method further comprises: randomly extracting m subjective question answers in the automatic scoring answer set; obtaining the manual rechecking scores of the m subjective question answers after manual rechecking; m is a positive integer; respectively calculating the score differences between the automatic score values and the manual review scores of the m subjective question answers; the subjective question answers with the score differences larger than a preset threshold are negative samples; and updating the weights corresponding to the first r positions by using the manual rechecking values of the negative samples.
Optionally, the sorting result corresponding to the negative sample is a target sorting result, and the manual review score corresponding to the negative sample is denoted f; the updating the weights corresponding to the first r positions by using the manual review scores of the negative samples comprises: executing at least one round of weight updating operation until an update stop condition is met; each round of weight updating operation at least comprises: determining a first target score and a second target score in a target score set, wherein the target score set comprises the scores in the score set that correspond to the first r positions of the target sorting result, and each score in the target score set is initially marked as valid; the first target score is the score that is marked as valid and has the smallest difference from f, and the second target score is the score that is marked as valid and has the largest difference from f; updating the weights of the positions corresponding to the first target score and the second target score respectively; marking the first target score and the second target score as invalid; recalculating the automatic scoring value of the negative sample using the updated weights, the recalculated automatic scoring value being denoted f'; judging whether f and f' meet the update stop condition, and if so, stopping; if not, judging whether the number of scores marked as valid in the target score set is less than 2; if not, returning to the step of determining a first target score and a second target score in the target score set; if so, marking all scores in the target score set as valid and returning to the step of determining the first target score and the second target score in the target score set.
Optionally, the weight, before the update, of the position corresponding to the first target score is denoted b_min, and the weight, before the update, of the position corresponding to the second target score is denoted b_max; the updating the weights of the positions corresponding to the first target score and the second target score respectively comprises: taking the sum of a·b_min and b·b_max as the updated weight of the position corresponding to the first target score; and taking c·b_max as the updated weight of the position corresponding to the second target score; a, b and c are update coefficients.
An automatic subjective question scoring system comprising:
an acquisition unit configured to:
acquiring an answer set to be scored and a standard answer; the answer set to be scored comprises at least one subjective question answer to be scored;
forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer; wherein the reference scoring answer set comprises: the standard answer and n subjective question answers randomly selected from the answer set to be scored; n is a positive integer; the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed;
obtaining a score set, wherein the score set comprises scores obtained by manually scoring the n subjective question answers and scores corresponding to the standard answers;
a similarity calculation system for:
for any subjective question answer in the automatic scoring answer set, calculating the similarity between that subjective question answer and each subjective question answer in the reference scoring answer set; said any subjective question answer is denoted subjective question answer x;
sorting the similarity in a descending order to obtain a sorting result;
an automatic scoring system for: and automatically scoring the subjective question answers x by using the sorting result and the score set to obtain automatic scoring scores of the subjective question answers x.
Optionally, the automatic scoring system includes: an automatic scoring model for: taking the scores in the score set that correspond to the first r positions of the sorting result as input, and automatically scoring the subjective question answer x to obtain the automatic scoring value; r is a positive integer less than or equal to n.
Optionally, the automatic scoring model calculates the automatic scoring value of the subjective question answer x according to the following formula:
$$c_x = \sum_{i=1}^{r} b_i c_i$$
wherein c_x represents the automatic scoring value of the subjective question answer x, i represents the i-th of the first r positions, c_i represents the score in the score set corresponding to the i-th position, and b_i is the weight corresponding to the i-th position.
Optionally, the method further comprises: an updating system for: randomly extracting m subjective question answers in the automatic scoring answer set; obtaining the manual rechecking scores of the m subjective question answers after manual rechecking; m is a positive integer; respectively calculating the score differences between the automatic score values and the manual review scores of the m subjective question answers; the subjective question answers with the score differences larger than a preset threshold are negative samples; and updating the weights corresponding to the first r positions by using the manual rechecking values of the negative samples.
Therefore, in the embodiment of the invention, part of subjective question answers are extracted for manual scoring, and the obtained manual scoring values are used as reference data. For any subjective question answer, calculating the similarity between the subjective question answer and the standard answer and the answer subjected to manual scoring, and automatically scoring based on the similarity and the manual scoring score to obtain an automatic scoring score, so that the subjective question is automatically scored, and a high-quality automatic scoring result is obtained.
Drawings
FIG. 1 is an exemplary architecture of an automatic subjective question scoring system according to an embodiment of the present invention;
FIG. 2 is an exemplary flow chart of a method for automatically scoring subjective questions according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the main operations involved in the manual scoring stage according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of main operations involved in a similarity calculation stage according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the main operations involved in the automatic scoring stage according to an embodiment of the present invention;
FIG. 6 is another exemplary flow chart of a method for automatically scoring subjective questions according to an embodiment of the present invention;
FIG. 7a is an exemplary flow of a weight update operation provided by an embodiment of the present invention;
FIG. 7b is another exemplary flow of a weight update operation provided by an embodiment of the present invention;
FIG. 8 is an exemplary flow of a prior art automatic scoring method.
Detailed Description
The invention provides an automatic subjective question scoring system and method, which are used for realizing automatic subjective question scoring.
Referring to fig. 1, an exemplary structure of the automatic subjective question scoring system includes: an acquisition unit 1, a similarity calculation system 2 and an automatic scoring system 3.
Wherein the automatic scoring system 3 may further comprise: an automatic scoring model.
The system may further comprise an updating system 4 for updating the weights in the automatic scoring model.
The acquisition unit 1, the similarity calculation system 2, the automatic scoring system 3, and the updating system 4 may be installed in the same device, or may be deployed in separate devices, respectively.
FIG. 2 illustrates an exemplary flow of the subjective question automatic scoring method performed by the subjective question automatic scoring system described above, including:
S1: Obtain an answer set to be scored and a standard answer.
The answer set S to be scored comprises at least one subjective question answer to be scored.
The subjective question answers and the standard answers are in text form.
S2: Form a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer.
The reference scoring answer set comprises the standard answer and n subjective question answers randomly selected from the answer set to be scored; n is a positive integer.
There may be one or more standard answers, and the score of a standard answer is the full score of the subjective question, for example 20 points.
In other words, the reference scoring answer set may comprise a subset s of the answers to be evaluated and p standard answers, where the subset s has n elements (i.e., n answers to be evaluated), denoted s_1, s_2, …, s_n.
The automatic scoring answer set X may comprise the subjective question answers remaining after the n subjective question answers (i.e., the subset s) are removed from the answer set S to be scored.
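The split described in S2 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: the function name `split_answers`, the `seed` parameter and the sample data are hypothetical and do not appear in the original text.

```python
import random


def split_answers(answers, n, seed=None):
    """Split the answer set S into (subset s for manual scoring, automatic set X).

    n answers are drawn at random without replacement; the rest form X.
    """
    if not 1 <= n <= len(answers):
        raise ValueError("n must be between 1 and the number of answers")
    rng = random.Random(seed)  # hypothetical: a seed makes the draw repeatable
    chosen = set(rng.sample(range(len(answers)), n))
    subset_s = [a for i, a in enumerate(answers) if i in chosen]
    auto_set_x = [a for i, a in enumerate(answers) if i not in chosen]
    return subset_s, auto_set_x


# Illustrative data only.
answers = [f"answer {k}" for k in range(1, 9)]
s, X = split_answers(answers, n=3, seed=0)
```

The standard answers are not part of this split; they are added to the reference scoring answer set alongside the subset s.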
S3: a set of scores is obtained.
The score set comprises scores (which can be called as manual score scores) obtained by manually scoring n subjective question answers, and scores corresponding to standard answers.
After manual scoring, the n subjective question answers in the subset s become the evaluated answer set Y = (y_1, y_2, …, y_n), and the indices of the elements in the two sets are consistent; the reference scoring answer set then comprises the evaluated answer set Y and the p standard answers.
The score set may be denoted C, where c_1, c_2, …, c_n are the manual scoring scores corresponding to y_1, y_2, …, y_n, respectively, and c_{n+1}, …, c_{n+p} are the scores of the standard answers (i.e., the full score).
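The layout of the score set C can be illustrated with a short Python sketch. The helper name `build_score_set` and the sample scores are assumptions made for illustration; the only point taken from the text is that the n manual scores come first and the p standard answers each carry the full score.

```python
def build_score_set(manual_scores, p, full_score):
    """Return C = [c_1..c_n, c_{n+1}..c_{n+p}]: manual scores, then p full scores."""
    if p < 1:
        raise ValueError("at least one standard answer is expected")
    return list(manual_scores) + [full_score] * p


# Illustrative data: three manually scored answers, two standard answers, full score 20.
C = build_score_set([18, 12, 15], p=2, full_score=20)
```

Keeping the scores in the same order as the reference scoring answer set means a position in the similarity ranking can be mapped directly to its score.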
Steps S1-S3 may be collectively referred to as the manual scoring stage, and may be performed by the aforementioned acquisition unit 1.
The main operations involved in the manual scoring stage can be seen in fig. 3.
S4: For any subjective question answer in the automatic scoring answer set, calculate the similarity between that answer and each subjective question answer in the reference scoring answer set.
The subjective question answer in question is denoted subjective question answer x (the answer to be evaluated x).
For example, if the reference scoring answer set contains 10 subjective question answers, the similarity between the subjective question answer x and each of those 10 answers is obtained.
S5: sorting the similarity in a descending order to obtain a sorting result;
Steps S4 and S5 may be collectively referred to as the similarity calculation stage, and may be performed by the similarity calculation system 2 described above.
Specifically, the functions of the similarity calculation system may be implemented using various machine learning models, such as a TF-IDF (Term Frequency-Inverse Document Frequency) model, which is not described in detail herein.
Assuming that the reference scoring answer set contains n+1 subjective question answers (one of which is a standard answer), the main operations involved in the similarity calculation stage can be seen in FIG. 4.
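The similarity calculation stage (S4 and S5) can be sketched with a small pure-Python TF-IDF and cosine similarity. This is only one possible realisation: the text names TF-IDF as an example but does not fix the exact weighting, so the smoothed IDF formula, the whitespace tokenisation (Chinese text would need a word segmenter), and all function names and sample sentences below are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per whitespace-tokenised text."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Assumed weighting: relative term frequency times a smoothed IDF.
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_references(answer_x, reference_answers):
    """Return (reference index, similarity) pairs in descending order of similarity."""
    vectors = tfidf_vectors(reference_answers + [answer_x])
    x_vec = vectors[-1]
    sims = [(i, cosine(x_vec, vec)) for i, vec in enumerate(vectors[:-1])]
    return sorted(sims, key=lambda pair: pair[1], reverse=True)


# Illustrative data only.
references = [
    "photosynthesis converts light energy into chemical energy",
    "the mitochondria is the powerhouse of the cell",
    "plants use light energy to make glucose",
]
ranking = rank_references("plants convert light energy into glucose", references)
```

The sorted list plays the role of the sorting result: its first r entries identify the reference answers whose scores feed the automatic scoring model.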
S6: Automatically score the subjective question answer x by using the sorting result and the score set to obtain the automatic scoring value (c_x) of the subjective question answer x.
Step S6 may be performed by the automatic scoring system 3 described previously.
In one example, the automatic scoring system 3 may specifically be an automatic scoring model: the scores in the score set that correspond to the first r positions of the sorting result are used as the input of the automatic scoring model, and the automatic scoring model automatically scores the subjective question answer x to obtain an automatic scoring value.
r is a positive integer less than or equal to n.
Assuming n=10 and r=5, the scores corresponding to the first 5 positions in the ranking result are taken as input to the automatic scoring model. The main operations involved in the automatic scoring stage can be seen in fig. 5.
Therefore, in the embodiment of the invention, part of subjective question answers are extracted for manual scoring, and the obtained manual scoring values are used as reference data. For any subjective question answer, calculating the similarity between the subjective question answer and the standard answer and the answer subjected to manual scoring, and automatically scoring based on the similarity and the manual scoring score to obtain an automatic scoring score, so that the subjective question is automatically scored, and a high-quality automatic scoring result is obtained.
An automatic scoring model is described below.
In other embodiments of the present invention, the automatic scoring model calculates the automatic scoring value of the subjective question answer x according to the following formula:
$$c_x = \sum_{i=1}^{r} b_i c_i \qquad (1)$$
wherein c_x represents the automatic scoring value of the subjective question answer x, i represents the i-th of the first r positions, c_i represents the score in the score set corresponding to the i-th position, and b_i is the weight corresponding to the i-th position.
The default value of b_i is 1/r, i.e., the average of the scores of the r most similar evaluated answers is taken. For example, if the scores of the 5 most similar evaluated subjective question answers are {18, 20, 18, 16, 20}, the subjective question answer x is scored (18 + 20 + 18 + 16 + 20) / 5 = 18.4 under the model's default weights.
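The scoring model can be sketched in Python as a weighted sum of the scores at the first r positions, which is the reading consistent with the default weight 1/r and the 18.4 example above. The function names `top_r_scores` and `auto_score` and the sample ranking are hypothetical.

```python
def top_r_scores(ranking, score_set, r):
    """Scores of the r reference answers ranked most similar to answer x."""
    return [score_set[index] for index, _similarity in ranking[:r]]


def auto_score(scores, weights=None):
    """Weighted sum of the top-r scores; weights default to 1/r each."""
    r = len(scores)
    if r == 0:
        raise ValueError("at least one score is required")
    if weights is None:
        weights = [1.0 / r] * r
    if len(weights) != r:
        raise ValueError("one weight per position is required")
    return sum(b_i * c_i for b_i, c_i in zip(weights, scores))


# Illustrative ranking: (index into the score set, similarity), already sorted.
ranking = [(3, 0.91), (0, 0.80), (4, 0.62), (1, 0.40), (2, 0.05)]
score_set = [20, 16, 12, 18, 18]
top = top_r_scores(ranking, score_set, r=3)
c_x = auto_score(top)

# The example from the text: five scores with default weights give about 18.4.
example = auto_score([18, 20, 18, 16, 20])
```

Because the result is computed in floating point, comparisons against 18.4 are best made with a small tolerance.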
Of course, in actual situations, other parameter models can be selected, and different weight values can be given to the evaluated answers with different similarity ranks.
In a typical scoring system, negative samples (answers whose automatic score differs greatly from the manual score) are simply discarded after the score has been corrected manually, and their value is ignored.
In the embodiment of the present invention, negative samples may be used to update the automatic scoring model; see FIG. 6, which illustratively includes the following steps:
S7: Randomly extract m subjective question answers from the automatic scoring answer set.
m is a positive integer.
S8: Obtain the manual review scores of the m subjective question answers after manual review.
That is, a person manually rechecks the m subjective question answers to obtain their scores.
S9: Calculate, for each of the m subjective question answers, the score difference between its automatic scoring value and its manual review score.
The subjective question answers whose score differences are larger than a preset threshold are negative samples.
Those skilled in the art can flexibly design the preset threshold, for example 0.5, 1 or 2.
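Steps S7 to S9 can be sketched as follows. The sketch assumes the score difference is taken as an absolute value, which the text does not state explicitly; the function name and the sample scores are hypothetical.

```python
def find_negative_samples(auto_scores, review_scores, threshold):
    """Indices of sampled answers whose automatic and review scores differ by more than threshold.

    Assumption: the difference is the absolute value |automatic - review|.
    """
    if len(auto_scores) != len(review_scores):
        raise ValueError("one review score per automatic score is required")
    return [
        i
        for i, (auto, review) in enumerate(zip(auto_scores, review_scores))
        if abs(auto - review) > threshold
    ]


# Illustrative data: m = 3 sampled answers, preset threshold 1.
negatives = find_negative_samples([18.4, 12.0, 15.5], [18, 15, 15], threshold=1)
```

Only the answer at index 1 exceeds the threshold here (difference 3), so it alone is treated as a negative sample.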
S10: Update the weights corresponding to the first r positions by using the manual review scores of the negative samples.
That is, the weight values of the components in the automatic scoring model (i.e., the b_i values in equation 1) are back-calculated.
The model updating is an optional process, that is, whether the model updating is performed or not can be determined according to actual requirements in actual application, and if the scoring result can meet the use requirements, the model updating is not performed.
An exemplary manner of updating the weights corresponding to the first r positions using the manual review score of the negative sample is described below.
For convenience of description, the sorting result corresponding to the negative sample is taken as the target sorting result, and the manual review score corresponding to the negative sample is denoted f.
At least one round of weight update operation may be performed, until an update stop condition is satisfied, to update the weights corresponding to the first r positions.
Referring to fig. 7a, each round of weight update operation at least includes:
S701: Determine a first target score and a second target score in the target score set.
The target score set comprises the scores in the score set that correspond to the first r positions of the target sorting result.
The first target score is the score that is marked as valid and has the smallest difference from f, and the second target score is the score that is marked as valid and has the largest difference from f. Initially, each score in the target score set is marked as valid.
Assume that the r subjective question answers most similar to the negative sample are y_1, y_2, …, y_r, with corresponding scores c_1, c_2, …, c_r. Initially, c_1, c_2, …, c_r are all marked as valid.
If c_3 has the smallest difference from f and c_5 has the largest difference from f, then c_3 is the first target score and c_5 is the second target score.
S702: Update the weights of the positions corresponding to the first target score and the second target score respectively.
In one example, let the weight, before the update, of the position corresponding to the first target score be denoted b_min, and let the weight, before the update, of the position corresponding to the second target score be denoted b_max.
Then the sum of a·b_min and b·b_max is taken as the updated weight of the position corresponding to the first target score, and c·b_max is taken as the updated weight of the position corresponding to the second target score; a, b and c are update coefficients.
The values of a, b and c can be flexibly designed; for example, a may be 1, b may be 0.5, and c may be 0.5.
In the previous example, c_3 is the first target score and c_5 is the second target score, and the weights of their positions before the update are 0.2 and 0.2, respectively.
With a = 1, b = 0.5 and c = 0.5, the new weight of the position corresponding to c_3 is 1 × 0.2 + 0.5 × 0.2 = 0.3, and the new weight of the position corresponding to c_5 is 0.5 × 0.2 = 0.1.
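The single update step of S702 can be checked with a short Python sketch that reproduces the numbers of the example above (weights 0.2 and 0.2 becoming 0.3 and 0.1). The function name `update_pair` and the zero-based indices are illustrative assumptions.

```python
def update_pair(weights, idx_min, idx_max, a=1.0, b=0.5, c=0.5):
    """Return new weights after one S702 update.

    idx_min: position of the first target score (smallest difference from f).
    idx_max: position of the second target score (largest difference from f).
    """
    b_min, b_max = weights[idx_min], weights[idx_max]  # weights before the update
    updated = list(weights)
    updated[idx_min] = a * b_min + b * b_max
    updated[idx_max] = c * b_max
    return updated


# c_3 and c_5 from the example are positions 2 and 4 when counting from zero.
new_weights = update_pair([0.2] * 5, idx_min=2, idx_max=4)
```

With a = 1 and b + c = 1, the update moves weight from the position furthest from f to the position closest to f while leaving the total weight unchanged.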
S703: the first target score and the second target score are marked as invalid.
Following the previous example, c_3 and c_5 among c_1, c_2, …, c_r are marked as invalid.
S704: the updated weights are used to recalculate the automatic scoring scores for the negative samples.
The recalculated automatic scoring score is denoted as f'.
S705: Judge whether f and f' meet the update stop condition; if so, stop; if not, proceed to S706.
In one example, the update stop condition may include: f' is within d% of f.
Those skilled in the art can flexibly design the value of d, for example 5 or 10.
S706: Judge whether the number of scores marked as valid in the target score set is less than 2; if not, return to S701; if so, proceed to S707.
If the number is less than 2, the scores with the smallest and largest differences cannot both be determined, so the marks must be reset before the operation is executed again.
S707: Mark all scores in the target score set as valid and return to S701.
Alternatively, referring to fig. 7b, each round of weight update operation at least includes:
S71: Determine a first target score and a second target score in the target score set.
The target score set comprises the scores in the score set that correspond to the first r positions of the target sorting result.
The first target score is the score with the smallest difference from f, and the second target score is the score with the largest difference from f.
Assume that the r subjective question answers most similar to the negative sample are y_1, y_2, …, y_r, with corresponding scores c_1, c_2, …, c_r. If c_3 has the smallest difference from f and c_5 has the largest difference from f, then c_3 is the first target score and c_5 is the second target score.
S72: Update the weights of the positions corresponding to the first target score and the second target score respectively.
In one example, let the weight, before the update, of the position corresponding to the first target score be denoted b_min, and let the weight, before the update, of the position corresponding to the second target score be denoted b_max.
Then the sum of a·b_min and b·b_max is taken as the updated weight of the position corresponding to the first target score, and c·b_max is taken as the updated weight of the position corresponding to the second target score; a, b and c are update coefficients.
The values of a, b and c can be flexibly designed; for example, a may be 1, b may be 0.5, and c may be 0.5.
In the previous example, c_3 is the first target score and c_5 is the second target score, and the weights of their positions before the update are 0.2 and 0.2, respectively.
With a = 1, b = 0.5 and c = 0.5, the new weight of the position corresponding to c_3 is 1 × 0.2 + 0.5 × 0.2 = 0.3, and the new weight of the position corresponding to c_5 is 0.5 × 0.2 = 0.1.
S73: Delete the first target score and the second target score from the target score set.
Following the previous example, c_3 and c_5 are deleted from c_1, c_2, …, c_r.
S74: the updated weights are used to recalculate the automatic scoring scores for the negative samples.
The recalculated automatic scoring score is denoted as f'.
S75: Judge whether f and f' meet the update stop condition; if so, stop; if not, proceed to S76.
In one example, the update stop condition may include: f' is within d% of f.
Those skilled in the art can flexibly design the value of d, for example 5 or 10.
S76: Judge whether the target score set is empty or has only one element left; if not, return to S71; if so, proceed to S77.
S77: Reset the target score set so that it comprises the scores in the score set that correspond to the first r positions of the target sorting result, and return to S71.
It should be noted that the main flow of a conventional automatic scoring method is shown in FIG. 8; a necessary precondition of that method is that "word vectors related to the category or knowledge point to which the test question belongs are trained in advance".
On this basis, after receiving a student answer text and a reference answer text, the existing method calculates the word weights in the two texts and determines the final word representation sequences of the two texts, i.e., the student answer text and the reference answer text are each output as a word sequence with corresponding weights.
A WMD (Word Mover's Distance) model is then used to calculate the distance between the weight data of the student answer text and the word weight data from the precondition, and the distance is converted into a corresponding text similarity; finally, the text similarity is converted into a corresponding automatic score.
The existing method has the following shortcomings:
1. A corpus is needed as a basis. In the method proposed in paper 1, the precondition of "pre-trained word vectors related to the category or knowledge point to which the test question belongs" requires a high-quality corpus as a basis. In general, building and maintaining a high-quality corpus is complex: a large amount of high-quality text data is needed, and the text must also be segmented into words, weighted, and so on. Each of these processes needs manual intervention, and because of the particular characteristics of Chinese, Chinese word segmentation remains a very difficult bottleneck problem in academia. Meanwhile, the language used in different professions, industries or fields differs considerably, and even the same text can be segmented in different ways. Assigning weights to words after segmentation is also a highly subjective and specialised task, which requires a lot of manpower and places high professional demands on personnel.
2. The importance of the scorer's subjective judgment cannot be reflected. Existing methods essentially compare the answer to be evaluated with the standard answer through various algorithms and an existing data basis, adjust the comparison result, and interpret it from different angles to calculate the final score. This process ignores the importance of manual scoring: because of the diversity and complexity of language, algorithms can only measure similarity or perform classification from certain aspects and cannot completely replace human subjective judgment, so manual judgment is a prerequisite for obtaining a high-quality automatic scoring result.
3. Updating the scoring model is difficult. Although the existing method includes a step in which the automatic scoring result is checked manually, that step only binarises the scoring result, i.e., judges whether the result is reasonable. A reasonable result is regarded as a positive sample (the difference between the automatic score and the manual score is small) and is added to the corpus; negative samples (the automatic score and the manual score differ greatly) are simply discarded after the score has been corrected manually, and their value is ignored.
Compared with the above method, the automatic scoring method provided by the embodiment of the invention has the following advantages:
1. Independence from a corpus. The similarity calculation considers only whether words are identical and does not involve similarity at the semantic level, so no corpus support is needed.
2. Automatic scoring is performed on the basis of manual scoring, so the importance of the scorer's subjective judgment is fully reflected.
3. The automatic scoring model is updated and corrected based on negative samples: after automatic scoring is finished, deviations in the automatic scores are corrected through manual review, and the model is updated and corrected on the basis of the negative samples, which improves its accuracy.
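Advantage 1 can be illustrated with a similarity that looks only at whether words are identical. The sketch below uses Jaccard overlap on pre-tokenized answers; this particular measure and the name `word_overlap_similarity` are assumptions chosen for illustration, not the similarity formula of the invention, and tokenization is assumed to have been done beforehand.

```python
def word_overlap_similarity(answer_a: list[str], answer_b: list[str]) -> float:
    """Jaccard similarity over the sets of words in two pre-tokenized answers.

    Only exact word identity is used, so no word vectors or corpus are needed.
    """
    words_a, words_b = set(answer_a), set(answer_b)
    if not words_a and not words_b:
        return 1.0  # assumption: two empty answers are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


# Hypothetical tokenized answers, for illustration only.
reference = ["force", "equals", "mass", "times", "acceleration"]
student = ["force", "is", "mass", "times", "acceleration"]
print(word_overlap_similarity(reference, student))
```

Four of the six distinct words are shared here, so the two answers are rated as fairly similar without any semantic model.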
An automatic subjective question scoring system is described below. Referring to Fig. 1, the system illustratively includes:
an acquisition unit 1 for:
acquiring an answer set to be scored and a standard answer; the answer set to be scored comprises at least one subjective question answer to be scored;
forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer; wherein the reference scoring answer set comprises: n subjective question answers randomly selected from the answer set to be scored, and the standard answer; n is a positive integer; the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed;
obtaining a score set, wherein the score set comprises scores obtained by manually scoring n subjective question answers and scores corresponding to standard answers;
a similarity calculation system 2 for:
for any subjective question answer in the automatic scoring answer set, calculating the similarity between that answer and each subjective question answer in the reference scoring answer set; that answer is denoted as subjective question answer x;
sorting the similarities in descending order to obtain a sorting result;
an automatic scoring system 3 for: automatically scoring the subjective question answer x by using the sorting result and the score set to obtain an automatic score for the subjective question answer x.
Specific details are described in the foregoing description, and are not repeated here.
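The division of work between the acquisition unit 1 and the similarity calculation system 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the names `split_answers`, `rank_references` and `toy_similarity` are hypothetical, answers are plain strings, and the word-overlap similarity is only a placeholder for whatever similarity measure is actually used.

```python
import random


def split_answers(answers, standard_answer, n, seed=None):
    """Return (reference_set, auto_set).

    n randomly chosen answers plus the standard answer form the reference set;
    the remaining answers form the automatic scoring answer set.
    """
    rng = random.Random(seed)
    picked = set(rng.sample(range(len(answers)), n))
    reference_set = [answers[i] for i in sorted(picked)] + [standard_answer]
    auto_set = [a for i, a in enumerate(answers) if i not in picked]
    return reference_set, auto_set


def toy_similarity(a, b):
    # Placeholder measure (assumption): fraction of shared whitespace-separated words.
    words_a, words_b = set(a.split()), set(b.split())
    return len(words_a & words_b) / max(len(words_a | words_b), 1)


def rank_references(answer_x, reference_set, similarity):
    """Indices into reference_set, ordered by descending similarity to answer_x."""
    sims = [similarity(answer_x, ref) for ref in reference_set]
    return sorted(range(len(reference_set)), key=lambda i: sims[i], reverse=True)


answers = ["a b c", "a b", "x y", "a", "c d"]
reference_set, auto_set = split_answers(answers, "a b c d", n=2, seed=0)
for x in auto_set:
    print(x, rank_references(x, reference_set, toy_similarity))
```

Keeping the score set in the same order as `reference_set` lets the returned indices be used directly to look up the scores of the most similar reference answers.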
In other embodiments of the present invention, the automatic scoring system includes:
an automatic scoring model for: taking as input the scores in the score set that correspond to the first r positions of the sorting result, and automatically scoring the subjective question answer x to obtain the automatic score; r is a positive integer less than or equal to n.
Specific details are described in the foregoing description, and are not repeated here.
In other embodiments of the present invention, the automatic scoring model calculates the automatic score of the subjective question answer x according to the following formula:
$$c_x = \sum_{i=1}^{r} b_i c_i$$
wherein $c_x$ represents the automatic score of the subjective question answer x, i denotes the i-th of the first r positions, $c_i$ represents the score in the score set corresponding to the i-th position, and $b_i$ is the weight corresponding to the i-th position. The default value of $b_i$ is 1/r.
Specific details are described in the foregoing description, and are not repeated here.
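A minimal sketch of this scoring step, assuming the formula is the weighted sum of the first r reference scores that is implied by the definitions of ci and bi and by the default weight 1/r. The function name `auto_score` and the sample scores are hypothetical.

```python
def auto_score(ranked_scores, weights=None):
    """Weighted sum of the scores at the first r positions of the sorting result.

    ranked_scores: scores c_1..c_r of the r most similar reference answers.
    weights: b_1..b_r; defaults to 1/r each, which gives the plain average.
    """
    r = len(ranked_scores)
    if r == 0:
        raise ValueError("at least one reference score is required")
    if weights is None:
        weights = [1.0 / r] * r
    if len(weights) != r:
        raise ValueError("one weight per position is required")
    return sum(b * c for b, c in zip(weights, ranked_scores))


top_r = [8.0, 6.0, 7.0]                      # illustrative scores only
print(auto_score(top_r))                     # default weights: the average
print(auto_score(top_r, [0.5, 0.3, 0.2]))    # more weight on the closest match
```

With the default weights every one of the r nearest reference answers contributes equally; non-uniform weights let positions nearer the top of the sorting result count for more.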
In other embodiments of the present invention, the automatic subjective question scoring system may further include:
an updating system 4 for:
randomly extracting m subjective question answers from the automatic scoring answer set;
obtaining manual review scores of the m subjective question answers after manual review; m is a positive integer;
respectively calculating the score difference between the automatic score and the manual review score of each of the m subjective question answers; the subjective question answers whose score difference is larger than a preset threshold are negative samples;
and updating the weights corresponding to the first r positions by using the manual review scores of the negative samples.
Specific details are described in the foregoing description, and are not repeated here.
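The sampling and negative-sample test performed by the updating system 4 can be sketched as below. The function name `find_negative_samples`, the `review_fn` callback standing in for the human reviewer, and the sample data are all hypothetical.

```python
import random


def find_negative_samples(auto_scores, review_fn, m, threshold, seed=None):
    """Return {answer_id: manual review score f} for the negative samples.

    auto_scores: dict mapping answer id -> automatic score.
    review_fn:   callable returning the manual review score for an answer id.
    m:           number of answers drawn at random for manual review.
    threshold:   preset threshold on |automatic score - manual review score|.
    """
    rng = random.Random(seed)
    sampled = rng.sample(sorted(auto_scores), m)
    negatives = {}
    for answer_id in sampled:
        f = review_fn(answer_id)
        if abs(auto_scores[answer_id] - f) > threshold:
            negatives[answer_id] = f
    return negatives


auto_scores = {"s1": 7.0, "s2": 4.0, "s3": 9.0}
manual = {"s1": 7.5, "s2": 8.0, "s3": 9.0}
print(find_negative_samples(auto_scores, manual.__getitem__, m=3, threshold=2.0))
```

Only answers whose automatic score deviates from the reviewer's score by more than the threshold are passed on to the weight update.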
The sorting result corresponding to the negative sample is a target sorting result; the manual review score corresponding to the negative sample is denoted as f.
In other embodiments of the present invention, in terms of updating the weights corresponding to the first r positions using the manual review score of the negative sample, the updating system 4 may be specifically configured to: execute at least one round of weight updating operation until the update stop condition is met.
Wherein, each round of weight updating operation at least comprises:
determining a first target score and a second target score in a target score set; the target score set comprises: the scores in the score set corresponding to the first r positions of the target sorting result; the first target score is the score with the smallest difference from f, and the second target score is the score with the largest difference from f;
updating the weights of the positions corresponding to the first target score and the second target score respectively;
deleting the first target score and the second target score from the target score set;
recalculating the automatic scoring values of the negative samples using the updated weights; the recalculated automatic scoring score is denoted as f';
judging whether f and f' meet update stop conditions, if so, stopping;
if not, judging whether the target score set is empty or has only one element left;
if not, returning to execute the step of determining a first target score and a second target score in the target score set;
if yes, resetting the target score set to comprise the scores in the score set corresponding to the first r positions of the target sorting result, and returning to execute the step of determining the first target score and the second target score in the target score set.
Specific details are described in the foregoing description, and are not repeated here.
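The bookkeeping of the target score set in this variant can be sketched as follows, assuming the set is restored to the scores of the first r positions once it is empty or has a single element left, and that r is at least 2. The names `pick_targets` and `next_target_set` are hypothetical, and ties are broken by position order, which the text does not specify.

```python
def pick_targets(target_scores, f):
    """Return (first_pos, second_pos) for one round.

    target_scores: dict mapping position -> score still in the target score set.
    first_pos:  position whose score has the smallest difference from f.
    second_pos: position whose score has the largest difference from f.
    """
    first = min(target_scores, key=lambda p: abs(target_scores[p] - f))
    second = max(target_scores, key=lambda p: abs(target_scores[p] - f))
    return first, second


def next_target_set(target_scores, used, top_r_scores):
    """Delete the used positions; restore the full set if fewer than two remain."""
    remaining = {p: c for p, c in target_scores.items() if p not in used}
    if len(remaining) < 2:
        remaining = dict(enumerate(top_r_scores))
    return remaining


top_r_scores = [8.0, 6.0, 7.0, 3.0]   # illustrative scores of the first r positions
f = 6.5                               # manual review score of the negative sample
target = dict(enumerate(top_r_scores))
for _ in range(3):
    first, second = pick_targets(target, f)
    print(first, second)
    target = next_target_set(target, {first, second}, top_r_scores)
```

Each round consumes one "closest" and one "farthest" score, so every position is paired off before the set is refilled.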
Alternatively, each round of weight update operation includes at least:
determining a first target score and a second target score in a target score set; the target score set comprises: the scores in the score set corresponding to the first r positions of the target sorting result, each score in the target score set being initially marked as valid; the first target score is the score that is marked as valid and has the smallest difference from f, and the second target score is the score that is marked as valid and has the largest difference from f;
updating the weights of the positions corresponding to the first target score and the second target score respectively;
marking the first target score and the second target score as invalid;
recalculating the automatic scoring values of the negative samples using the updated weights; the recalculated automatic scoring score is denoted as f';
judging whether f and f' meet the updating stop condition, if so, stopping;
if not, judging whether the number of scores marked as valid in the target score set is less than 2;
if not, returning to execute the step of determining a first target score and a second target score in the target score set;
if yes, marking all the scores in the target score set as valid, and returning to execute the step of determining the first target score and the second target score in the target score set.
Specific details are described in the foregoing description, and are not repeated here.
In other embodiments of the present invention, the position corresponding to the first target score is denoted min, and its corresponding weight before the update is denoted $b_{\min}$; the position corresponding to the second target score is denoted max, and its corresponding weight before the update is denoted $b_{\max}$.
The updating system 4 may be specifically configured to, in terms of updating the weights of the positions corresponding to the first target score and the second target score respectively:
take the sum of $a \cdot b_{\min}$ and $b \cdot b_{\max}$ as the updated weight of position min;
take $c \cdot b_{\max}$ as the updated weight of position max;
wherein a, b and c are update coefficients.
Specific details are described in the foregoing description, and are not repeated here.
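Putting the valid-flag variant and the coefficient rule together, one possible reading of the update loop is sketched below. Several points are assumptions made only for illustration: the stop condition is taken to be |f - f'| <= `tolerance`, the coefficient values a = 1, b = 0.2, c = 0.8 are arbitrary, a `max_rounds` cap is added so that the loop always terminates, and the function names are hypothetical.

```python
def update_pair(weights, p_min, p_max, a, b, c):
    """New weight at p_min is a*b_min + b*b_max; new weight at p_max is c*b_max."""
    b_min, b_max = weights[p_min], weights[p_max]
    weights[p_min] = a * b_min + b * b_max
    weights[p_max] = c * b_max


def update_weights(scores, weights, f, a=1.0, b=0.2, c=0.8,
                   tolerance=0.5, max_rounds=50):
    """Adjust the weights of the first r positions for one negative sample.

    scores:  scores of the first r positions of the target sorting result.
    weights: current weights of those positions (left unmodified).
    f:       manual review score of the negative sample.
    """
    weights = list(weights)
    valid = [True] * len(scores)
    for _ in range(max_rounds):
        live = [i for i, ok in enumerate(valid) if ok]
        p_min = min(live, key=lambda i: abs(scores[i] - f))  # first target score
        p_max = max(live, key=lambda i: abs(scores[i] - f))  # second target score
        if p_min != p_max:
            update_pair(weights, p_min, p_max, a, b, c)
        valid[p_min] = valid[p_max] = False
        f_new = sum(w * s for w, s in zip(weights, scores))  # recalculated score f'
        if abs(f - f_new) <= tolerance:                      # assumed stop condition
            break
        if sum(valid) < 2:                                   # too few valid scores left
            valid = [True] * len(scores)
    return weights


scores = [9.0, 4.0, 8.0, 3.0]
new_weights = update_weights(scores, [0.25] * 4, f=8.0)
print(new_weights, sum(w * s for w, s in zip(new_weights, scores)))
```

With a = 1 and b + c = 1 the rule only moves weight from the farthest score to the closest one, so the weights keep the same total; other coefficient choices would rescale them.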
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
Those skilled in the art will further appreciate that the units and model steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or model described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (5)

1. An automatic subjective question scoring method, comprising:
acquiring an answer set to be scored and a standard answer; the answer set to be scored comprises at least one subjective question answer to be scored;
forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer; wherein the reference scoring answer set comprises: n subjective question answers randomly selected from the answer set to be scored, and the standard answer; n is a positive integer; the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed;
obtaining a score set, wherein the score set comprises scores obtained by manually scoring the n subjective question answers and scores corresponding to the standard answers;
calculating the similarity of any subjective question answer in the automatic scoring answer set and each subjective question answer in the reference scoring answer set; the answer of any subjective question is expressed as a subjective question answer x;
sorting the similarity in a descending order to obtain a sorting result;
automatically scoring the subjective question answer x by using the sorting result and the score set to obtain an automatic score for the subjective question answer x; wherein the automatically scoring the subjective question answer x by using the sorting result and the score set to obtain the automatic score comprises: taking the scores in the score set that correspond to the first r positions of the sorting result as the input of an automatic scoring model, the automatic scoring model automatically scoring the subjective question answer x to obtain the automatic score; r is a positive integer less than or equal to n;
wherein the method further comprises: randomly extracting m subjective question answers in the automatic scoring answer set;
obtaining the manual rechecking scores of the m subjective question answers after manual rechecking; m is a positive integer;
respectively calculating the score differences between the automatic score values and the manual review scores of the m subjective question answers; the subjective question answers with the score differences larger than a preset threshold are negative samples;
updating weights corresponding to the first r positions by using the manual rechecking values of the negative samples;
the sorting result corresponding to the negative sample is a target sorting result;
the manual review score corresponding to the negative sample is expressed as f;
the updating the weights corresponding to the first r positions by using the manual rechecking scores of the negative samples comprises: executing at least one round of weight updating operation until an updating stop condition is met;
each round of weight update operation at least comprises:
determining a first target score and a second target score in a target score set; the target score set comprises: the scores in the score set corresponding to the first r positions of the target sorting result, each score in the target score set being initially marked as valid; the first target score is the score that is marked as valid and has the smallest difference from f, and the second target score is the score that is marked as valid and has the largest difference from f;
updating the weights of the positions corresponding to the first target score and the second target score respectively;
marking the first target score and the second target score as invalid;
recalculating the automatic scoring values of the negative samples using the updated weights; the recalculated automatic scoring score is denoted as f';
judging whether f and f' meet the updating stop condition, if so, stopping;
if not, judging whether the number of scores marked as valid in the target score set is less than 2;
if not, returning to execute the step of determining a first target score and a second target score in the target score set;
if yes, marking all the scores in the target score set as valid, and returning to execute the step of determining the first target score and the second target score in the target score set.
2. The method according to claim 1, wherein the automatic scoring model calculates the automatic score of the subjective question answer x according to the following formula:
$$c_x = \sum_{i=1}^{r} b_i c_i$$
wherein $c_x$ represents the automatic score of the subjective question answer x, i denotes the i-th of the first r positions, $c_i$ represents the score in the score set corresponding to the i-th position, and $b_i$ is the weight corresponding to the i-th position.
3. The method of claim 1, wherein,
the position corresponding to the first target score is denoted min, and its corresponding weight before the update is denoted $b_{\min}$;
the position corresponding to the second target score is denoted max, and its corresponding weight before the update is denoted $b_{\max}$;
the updating the weights of the positions corresponding to the first target score and the second target score respectively comprises:
taking the sum of $a \cdot b_{\min}$ and $b \cdot b_{\max}$ as the updated weight of position min;
taking $c \cdot b_{\max}$ as the updated weight of position max;
wherein a, b and c are update coefficients.
4. An automatic subjective question scoring system, comprising:
an acquisition unit configured to:
acquiring an answer set to be scored and a standard answer; the answer set to be scored comprises at least one subjective question answer to be scored;
forming a reference scoring answer set and an automatic scoring answer set based on the answer set to be scored and the standard answer; wherein the reference scoring answer set comprises: n subjective question answers randomly selected from the answer set to be scored, and the standard answer; n is a positive integer; the automatic scoring answer set comprises the subjective question answers remaining in the answer set to be scored after the n subjective question answers are removed;
obtaining a score set, wherein the score set comprises scores obtained by manually scoring the n subjective question answers and scores corresponding to the standard answers;
a similarity calculation system for:
calculating the similarity of any subjective question answer in the automatic scoring answer set and each subjective question answer in the reference scoring answer set; the answer of any subjective question is expressed as a subjective question answer x;
sorting the similarity in a descending order to obtain a sorting result;
an automatic scoring system for: automatically scoring the subjective question answer x by using the sorting result and the score set to obtain an automatic score for the subjective question answer x; wherein the automatic scoring system comprises: an automatic scoring model for: taking as input the scores in the score set that correspond to the first r positions of the sorting result, and automatically scoring the subjective question answer x to obtain the automatic score; r is a positive integer less than or equal to n;
an updating system for:
randomly extracting m subjective question answers in the automatic scoring answer set;
obtaining the manual rechecking scores of the m subjective question answers after manual rechecking; m is a positive integer;
respectively calculating the score differences between the automatic score values and the manual review scores of the m subjective question answers; the subjective question answers with the score differences larger than a preset threshold are negative samples;
updating weights corresponding to the first r positions by using the manual rechecking values of the negative samples;
the sorting result corresponding to the negative sample is a target sorting result;
the manual review score corresponding to the negative sample is expressed as f;
the updating the weights corresponding to the first r positions by using the manual rechecking scores of the negative samples comprises: executing at least one round of weight updating operation until an updating stop condition is met;
each round of weight update operation at least comprises:
determining a first target score and a second target score in a target score set; the target score set comprises: the scores in the score set corresponding to the first r positions of the target sorting result, each score in the target score set being initially marked as valid; the first target score is the score that is marked as valid and has the smallest difference from f, and the second target score is the score that is marked as valid and has the largest difference from f;
updating the weights of the positions corresponding to the first target score and the second target score respectively;
marking the first target score and the second target score as invalid;
recalculating the automatic scoring values of the negative samples using the updated weights; the recalculated automatic scoring score is denoted as f';
judging whether f and f' meet the updating stop condition, if so, stopping;
if not, judging whether the number of scores marked as valid in the target score set is less than 2;
if not, returning to execute the step of determining a first target score and a second target score in the target score set;
if yes, marking all the scores in the target score set as valid, and returning to execute the step of determining the first target score and the second target score in the target score set.
5. The system of claim 4, wherein the automatic scoring model calculates the automatic score of the subjective question answer x according to the following formula:
$$c_x = \sum_{i=1}^{r} b_i c_i$$
wherein $c_x$ represents the automatic score of the subjective question answer x, i denotes the i-th of the first r positions, $c_i$ represents the score in the score set corresponding to the i-th position, and $b_i$ is the weight corresponding to the i-th position.
CN202011069722.4A 2020-09-30 2020-09-30 Automatic subjective question scoring method and system Active CN114333461B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011069722.4A CN114333461B (en) 2020-09-30 2020-09-30 Automatic subjective question scoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011069722.4A CN114333461B (en) 2020-09-30 2020-09-30 Automatic subjective question scoring method and system

Publications (2)

Publication Number Publication Date
CN114333461A CN114333461A (en) 2022-04-12
CN114333461B true CN114333461B (en) 2024-03-12

Family

ID=81032951

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011069722.4A Active CN114333461B (en) 2020-09-30 2020-09-30 Automatic subjective question scoring method and system

Country Status (1)

Country Link
CN (1) CN114333461B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115798022B (en) * 2023-02-07 2023-05-16 南京思优普信息科技有限公司 Artificial intelligence recognition method based on feature extraction

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6012056A (en) * 1998-02-18 2000-01-04 Cisco Technology, Inc. Method and apparatus for adjusting one or more factors used to rank objects
KR20010078675A (en) * 2000-02-09 2001-08-21 이재영 A System Grading Subjective Tests
KR20150014333A (en) * 2013-07-29 2015-02-06 한국교육과정평가원 Scoring management server and operating method thereof
CN108122181A (en) * 2017-12-20 2018-06-05 中州大学 A kind of computer application examination system
JP2018142249A (en) * 2017-02-28 2018-09-13 株式会社EduLab Answer sheet grading method
CN110096702A (en) * 2019-04-22 2019-08-06 安徽省泰岳祥升软件有限公司 A kind of subjective item methods of marking and device
CN110705278A (en) * 2018-07-09 2020-01-17 北大方正集团有限公司 Subjective question marking method and subjective question marking device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
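Research on Intelligent Assisted Marking Technology in Examination Systems; Ding Kangjian; China Master's Theses Full-text Database, Information Science and Technology Series; Vol. 2012 (No. 05); 24-25 *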

Also Published As

Publication number Publication date
CN114333461A (en) 2022-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant