CN113239195A

CN113239195A - Knowledge point difficulty grading method based on big data

Info

Publication number: CN113239195A
Application number: CN202110483655.9A
Authority: CN
Inventors: 周鹏
Original assignee: Xiaoerlang Technology Co ltd
Current assignee: Xiaoerlang Technology Co ltd
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2021-08-10
Anticipated expiration: 2041-04-30
Also published as: CN113239195B

Abstract

The invention discloses a knowledge point difficulty grading method based on big data, belonging to the technical field of big data, which comprises the following steps: obtaining open questions through big data and classifying the questions by field to form question types for different fields. The invention introduces a scheme in which every answerer, after answering, receives the answers of other answerers, so that each answerer is both a test-taker and a reviewer and a prediction score can be obtained. At the same time, the knowledge points of the questions are analyzed and judged by a big data algorithm to obtain a second score, and the two scores are averaged to obtain the final score.

Description

Knowledge point difficulty grading method based on big data

Technical Field

The invention belongs to the technical field of big data, and particularly relates to a knowledge point difficulty grading method based on big data.

Background

With the rapid development of information technology, the concept of big data has emerged. The term is widely used in the IT industry and generally refers to a data set that cannot be captured, managed, and processed by conventional software tools within a given time range. Big data is a high-growth, diversified information asset that gives people stronger decision-making power, insight, and process-optimization capability.

The difficulty of the knowledge points of a subject is usually judged manually. Manual judgment sometimes goes wrong because of errors or oversights, which ultimately reduces the accuracy of the difficulty grading. Big data has therefore been introduced to grade the difficulty of knowledge points through an analysis process. This avoids the misjudgments of manual grading, but it only works for knowledge points that have a specified unique answer and judges open-ended knowledge points poorly. To solve this problem, a knowledge point difficulty grading method based on big data that combines manual judgment with big data analysis is needed.

Disclosure of Invention

The invention aims to solve the following problem: the difficulty of knowledge points is usually judged manually, and errors or oversights sometimes cause misjudgments that reduce grading accuracy; big data analysis avoids these misjudgments, but it only works for knowledge points that have a specified unique answer and judges open-ended knowledge points poorly. A knowledge point difficulty grading method based on big data is provided to address this.

In order to achieve the purpose, the invention adopts the following technical scheme: the knowledge point difficulty grading method based on big data comprises the following steps:

S1, obtaining open questions through big data, classifying the questions by field to form question types for different fields, and building a large question bank;

S2, selecting a batch of specific answerers, who enter the question bank to answer within a set answering time;

S3, within the answering time, each answerer submits the question and answer after finishing, and the system processes the submitted questions and answers;

S4, for each received question and answer, each answerer predicts a score according to his or her own professional knowledge; the predicted scores are input into a big data system, and multiple groups of data are recorded to construct a matrix of questions and answers;

S5, constructing a matrix from the prediction scores;

S6, clustering the answerers' prediction scores using an algorithm;

S7, selecting the k nearest neighbors of each answerer according to similarity;

S8, adding the prediction scores to the matrix obtained in S4, continuing the above steps, and iterating;

S9, collecting the iteration results and performing certain processing on the data;

S10, collecting the processed data, inputting it into the big data system, and dividing it according to the evaluation standard;

S11, grading difficulty according to the scores in the different intervals;

S12, using the big data system to look up again, in the question bank, the fields of the questions contained in each grade.

As a further description of the above technical solution:

In S3, after an answerer finishes answering within the answering time and submits the question and answer, the system packages each question and answer, marks the package with the answerer's ID, shuffles the packages, and redistributes them to the answerers. During distribution, the system ensures that no answerer receives a package bound to his or her own ID.
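As a non-limiting illustration, the redistribution in S3 can be sketched as follows. The function name `redistribute`, the dictionary layout, and the shuffle-then-rotate strategy are assumptions made for this sketch; the source only requires that no answerer receives a package bound to his or her own ID.

```python
import random


def redistribute(packages, seed=None):
    """Assign each answerer one package written by somebody else.

    packages: dict mapping answerer ID -> that answerer's (question, answer).
    Returns a dict mapping reviewer ID -> (author ID, package).
    """
    ids = list(packages)
    if len(ids) < 2:
        raise ValueError("need at least two answerers to exchange packages")

    rng = random.Random(seed)
    shuffled = ids[:]
    rng.shuffle(shuffled)

    # Rotating the shuffled order by one position guarantees that nobody
    # is paired with their own ID (assumed strategy, not from the source).
    assignment = {}
    for pos, reviewer in enumerate(shuffled):
        author = shuffled[(pos + 1) % len(shuffled)]
        assignment[reviewer] = (author, packages[author])
    return assignment
```

With this strategy every package is reviewed exactly once and every answerer reviews exactly one package.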

As a further description of the above technical solution:

In S4, each answerer predicts a score for the received question and answer according to his or her own expertise, with a score range of 0 to 10 points.

As a further description of the above technical solution:

In S5, a matrix is constructed from the prediction scores: $R = (r_{ij})_{m \times n}$, where $r_{ij}$ is the score given by answerer $i$ to answer $j$, $m$ is the number of answerers, and $n$ is the number of questions.
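As a non-limiting illustration, the matrix of S5 can be assembled from individual score records as sketched below. The record layout `(answerer_id, item_id, score)`, the function name `build_score_matrix`, and the use of `None` for entries not yet scored are assumptions made for this sketch.

```python
def build_score_matrix(records, answerer_ids, item_ids):
    """Build R = (r_ij) with m = len(answerer_ids) rows and n = len(item_ids) columns.

    records: iterable of (answerer_id, item_id, score), score in 0..10.
    Entries that have not been scored yet are left as None (assumption).
    """
    row = {a: i for i, a in enumerate(answerer_ids)}
    col = {q: j for j, q in enumerate(item_ids)}
    R = [[None] * len(item_ids) for _ in answerer_ids]
    for answerer, item, score in records:
        if not 0 <= score <= 10:
            raise ValueError(f"score {score} is outside the 0-10 range")
        R[row[answerer]][col[item]] = score
    return R
```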

As a further description of the above technical solution:

In S6, the answerers' prediction scores are clustered using the K-means algorithm.

As a further description of the above technical solution:

In S8, the prediction scores are added to the matrix obtained in S4, and the above steps are repeated iteratively until every answerer has scored the questions and answers received.

As a further description of the above technical solution:

In S9, the iteration results are collected, each answerer's prediction for the received question recorded in S7 is compared with the score obtained by iteration, and the average of the two is calculated.
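As a non-limiting illustration, the averaging in S9 can be sketched as follows. The names `final_scores`, `predictions`, and `iterated` are hypothetical, and keying both inputs by a question identifier is an assumption of this sketch.

```python
def final_scores(predictions, iterated):
    """Average the answerer's predicted score and the iterated score per question.

    predictions: dict question_id -> score predicted by the answerer.
    iterated:    dict question_id -> score obtained by the iteration.
    Only questions present in both inputs are averaged (assumption).
    """
    return {
        qid: (predictions[qid] + iterated[qid]) / 2
        for qid in predictions
        if qid in iterated
    }
```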

As a further description of the above technical solution:

In S10, the processed data is collected and input into the big data system, and the resulting averages are divided into the intervals 0-3, 4-7, and 8-10, where the 0-3 interval is shown in red, the 4-7 interval in yellow, and the 8-10 interval in green.

As a further description of the above technical solution:

In S11, the 0-3 interval corresponds to the difficult grade, the 4-7 interval to the medium grade, and the 8-10 interval to the simple grade.
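As a non-limiting illustration, the interval division of S10 and the grading of S11 can be sketched as follows. The source defines the intervals only at integer boundaries and does not say how a non-integer average such as 3.5 is handled; rounding half up to the nearest point before bucketing is an assumption of this sketch, as is the function name `grade`.

```python
def grade(average):
    """Map an average score in 0..10 to a (color, difficulty grade) pair."""
    if not 0 <= average <= 10:
        raise ValueError(f"average {average} is outside the 0-10 range")
    # Assumption: round half up so that every value falls in one interval.
    points = int(average + 0.5)
    if points <= 3:
        return ("red", "difficult")
    if points <= 7:
        return ("yellow", "medium")
    return ("green", "simple")
```

A low average means the reviewers scored the answers poorly, which is why the 0-3 interval maps to the difficult grade.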

As a further description of the above technical solution:

In S12, after grading, the big data system looks up again in the question bank the question type of each question contained in the different grades, and the final difficulty grade of each question field is obtained from the proportion of the different grades within that field.
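As a non-limiting illustration, the backtracking of S12 can be sketched as follows. The source states only that the field grade follows from the proportion of the different grades in the field; choosing the grade with the largest proportion, and breaking ties toward the harder grade, are assumptions of this sketch, as are the names `field_difficulty` and `GRADE_ORDER`.

```python
from collections import Counter

# Hardest grade first, so that ties resolve toward the harder grade.
GRADE_ORDER = ["difficult", "medium", "simple"]


def field_difficulty(graded_questions):
    """Derive a difficulty grade per field from graded questions.

    graded_questions: iterable of (field, grade) pairs.
    Returns dict field -> (field grade, dict grade -> proportion).
    """
    counts = {}
    for field, g in graded_questions:
        counts.setdefault(field, Counter())[g] += 1

    result = {}
    for field, counter in counts.items():
        total = sum(counter.values())
        proportions = {g: counter[g] / total for g in GRADE_ORDER}
        # Assumed rule: the grade with the largest share wins; max() returns
        # the first maximal item, so ties go to the harder grade.
        level = max(GRADE_ORDER, key=lambda g: proportions[g])
        result[field] = (level, proportions)
    return result
```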

In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:

The method selects a group of specific people with relevant knowledge to answer the questions and introduces a scheme in which every answerer, after answering, receives the answers of others, so that each answerer is both a test-taker and a reviewer; a prediction score is obtained in this way. At the same time, the knowledge points of the questions are analyzed and judged by a big data algorithm to obtain a second score, and the two are averaged to obtain the final score. The method thus combines human judgment with big data analysis, improves the accuracy of knowledge point difficulty grading, and handles open questions that are difficult to judge by big data analysis alone; the final difficulty grade is determined by the big data system. The method also includes a backtracking step: after the knowledge points are graded, the corresponding question fields can be traced back and the fields themselves can be graded for difficulty, which broadens the scope of the method's judgment.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1

The invention provides a technical scheme that: the knowledge point difficulty grading method based on big data comprises the following steps:

S3, within the answering time, after an answerer finishes answering, the system packages each question and answer, marks the package with the answerer's ID, shuffles the packages, and redistributes them to the answerers, ensuring that no answerer receives a package bound to his or her own ID;

S4, each answerer predicts a score for the received question and answer according to his or her own professional knowledge, with a score range of 0-10 points; the predicted scores are input into the big data system, and multiple groups of data are recorded to construct a matrix of questions and answers;

S5, constructing a matrix from the prediction scores, $R = (r_{ij})_{m \times n}$, where $r_{ij}$ is the score given by answerer $i$ to answer $j$, $m$ is the number of answerers, and $n$ is the number of questions;

S6, clustering the answerers' prediction scores using the K-means algorithm;

S8, adding the prediction scores to the matrix obtained in S4, continuing the above steps, and iterating until every answerer has scored the questions and answers received;

S9, collecting the iteration results, comparing each answerer's prediction for the received question recorded in S7 with the score obtained by iteration, and calculating the average of the two.

In this embodiment, a group of specific people with relevant knowledge is selected to answer the questions, and a scheme is introduced in which every answerer, after answering, receives the answers of others, so that each answerer is both a test-taker and a reviewer; a prediction score is obtained in this way. At the same time, the knowledge points of the questions are analyzed and judged by a big data algorithm to obtain a second score, and the two are averaged to obtain the final score. The method combines human judgment with big data analysis, improves the accuracy of knowledge point difficulty grading, and handles open questions that are difficult to judge by big data analysis alone, with the final difficulty grade determined by the big data system.

Example 2

S6, clustering the answerers' prediction scores using the K-means algorithm;

S9, collecting the iteration results, comparing each answerer's prediction for the received question recorded in S7 with the score obtained by iteration, and calculating the average of the two;

S10, collecting the processed data, inputting it into the big data system, and dividing the resulting averages into the intervals 0-3 points, 4-7 points, and 8-10 points, where the 0-3 interval is shown in red, the 4-7 interval in yellow, and the 8-10 interval in green;

S11, grading difficulty according to the scores in the different intervals, where the 0-3 interval is the difficult grade, the 4-7 interval is the medium grade, and the 8-10 interval is the simple grade;

S12, after grading, using the big data system to look up again in the question bank the question type of each question contained in the different grades, and obtaining the final difficulty grade of each question field from the proportion of the different grades within that field.

In this embodiment, the method includes a backtracking step: after the knowledge points are graded, the corresponding question fields can be traced back and the fields themselves can be graded for difficulty, which broadens the scope and breadth of the method's judgment.

The following applies to both Example 1 and Example 2:

The K-means algorithm proceeds as follows:

S1, taking the first k answerers as k independent cluster centers, that is, the first k rows of R serve as the cluster centers;

S2, calculating the vector distance between each remaining row and every cluster center, and assigning each row as a member of the cluster whose center is nearest;

S3, summing all vectors in each cluster, dividing the sum by the number of vectors in the cluster, and taking the resulting vector as the new cluster center;

S4, repeating the previous two steps until the k clusters no longer change;

S5, measuring the similarity of the answerers' scores within each cluster.
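As a non-limiting illustration, steps S1 to S4 of the K-means procedure above can be sketched as follows. The source says only "vector distance"; Euclidean distance is an assumption of this sketch, as are complete rows without missing scores, the iteration cap `max_iter`, and the function name `kmeans_first_k`.

```python
import math


def kmeans_first_k(rows, k, max_iter=100):
    """Cluster the rows of R, seeding the centers with the first k rows.

    rows: list of equal-length numeric score vectors (one per answerer).
    Returns (labels, centers), where labels[i] is the cluster of rows[i].
    """
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of rows")

    centers = [list(r) for r in rows[:k]]  # S1: first k rows as centers
    labels = None
    for _ in range(max_iter):
        # S2: assign every row to the cluster with the nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(row, centers[c]))
            for row in rows
        ]
        if new_labels == labels:  # S4: stop once membership is stable
            break
        labels = new_labels
        # S3: the new center is the mean vector of the cluster members.
        for c in range(k):
            members = [rows[i] for i, lab in enumerate(labels) if lab == c]
            if members:  # an emptied cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Seeding with the first k rows makes the result depend on the order of the answerers, so a different row order can produce different clusters.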

To examine how similar the answerers' scores are, the consistency of the variation in their scores must be analyzed. In addition, because each answerer understands and interprets an open knowledge point differently when reading a question, scoring standards differ between answerers, and the deviation caused by these differing standards must be eliminated. The similarity function finally used is as follows:

where $\mathrm{SIM}(S_i, S_j)$ is the degree of similarity between the scores of answerer $i$ and answerer $j$, $C_i$ denotes the tests in which answerer $i$ takes part, $C_{ij}$ denotes the tests in which answerers $i$ and $j$ both take part, and $r_i$ denotes the score that answerer $i$ predicts for the received question and answer;

The similarity function is a correction of the vector cosine (a common measure of trend consistency) that reduces the influence of differing scoring standards on consistency. The values of all $\mathrm{SIM}(S_i, S_j)$ form a matrix:

$\mathrm{SIM} = (\mathrm{SIM}(S_i, S_j))_{m \times m}$
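As a non-limiting illustration, a similarity matrix of this shape can be computed as sketched below. The exact similarity formula is not reproduced in this text, so the sketch assumes the standard adjusted cosine: each answerer's scores are centered on that answerer's own mean over the tests taken, the numerator sums over the tests both answerers took, and each norm in the denominator sums over that answerer's own tests. The function names and the `None` convention for unscored entries are likewise assumptions. The helper `nearest_neighbors` shows how the k nearest neighbors of S7 could be read off the matrix.

```python
import math


def adjusted_cosine(ri, rj):
    """Assumed adjusted-cosine similarity of two rows of R (None = no score)."""
    ci = [c for c, v in enumerate(ri) if v is not None]  # tests taken by i
    cj = [c for c, v in enumerate(rj) if v is not None]  # tests taken by j
    if not ci or not cj:
        return 0.0
    mean_i = sum(ri[c] for c in ci) / len(ci)
    mean_j = sum(rj[c] for c in cj) / len(cj)
    common = sorted(set(ci) & set(cj))  # tests taken by both
    num = sum((ri[c] - mean_i) * (rj[c] - mean_j) for c in common)
    den = math.sqrt(sum((ri[c] - mean_i) ** 2 for c in ci)) * math.sqrt(
        sum((rj[c] - mean_j) ** 2 for c in cj)
    )
    return num / den if den else 0.0


def similarity_matrix(R):
    """Return the m x m matrix of pairwise similarities between rows of R."""
    return [[adjusted_cosine(a, b) for b in R] for a in R]


def nearest_neighbors(sim, i, k):
    """Indices of the k answerers most similar to answerer i."""
    others = [j for j in range(len(sim)) if j != i]
    return sorted(others, key=lambda j: sim[i][j], reverse=True)[:k]
```

Under this assumed form, two answerers whose scores differ only by a constant offset, for example [2, 4, 6] and [5, 7, 9], have a similarity of 1 up to floating-point rounding, which is the kind of scoring-standard deviation the text says should be eliminated.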

This algorithm reduces the deviation in the final result caused by differing scoring standards and thereby further improves the accuracy of the final knowledge point difficulty grading.

The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the present invention.

Claims

1. A knowledge point difficulty grading method based on big data, characterized by comprising the following steps:

S5, constructing a matrix from the prediction scores;

S6, clustering the answerers' prediction scores using an algorithm;

2. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S3, after an answerer finishes answering within the answering time and submits the question and answer, the system packages each question and answer, marks the package with the answerer's ID, shuffles the packages, and redistributes them to the answerers, ensuring during distribution that no answerer receives a package bound to his or her own ID.

3. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S4, each answerer predicts a score for the received question and answer according to his or her own expertise, and the score range is 0-10 points.

4. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S5, a matrix is constructed from the prediction scores, $R = (r_{ij})_{m \times n}$, where $r_{ij}$ is the score given by answerer $i$ to answer $j$, $m$ is the number of answerers, and $n$ is the number of questions.

5. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S6, the answerers' prediction scores are clustered using the K-means algorithm.

6. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S8, the prediction scores are added to the matrix obtained in S4, and the above steps are repeated iteratively until every answerer has scored the questions and answers received.

7. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S9, the iteration results are collected, each answerer's prediction for the received question recorded in S7 is compared with the score obtained by iteration, and the average of the two is calculated.

8. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S10, the processed data is collected and input into the big data system, and the resulting averages are divided into the intervals 0-3, 4-7, and 8-10, where the 0-3 interval is shown in red, the 4-7 interval in yellow, and the 8-10 interval in green.

9. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S11, the 0-3 interval is the difficult grade, the 4-7 interval is the medium grade, and the 8-10 interval is the simple grade.

10. The big-data-based knowledge point difficulty grading method of claim 1, wherein in S12, after grading, the big data system looks up again in the question bank the question type of each question contained in the different grades, and the final difficulty grade of each question field is obtained from the proportion of the different grades within that field.