CN110737771B - Topic distribution method and device based on big data - Google Patents

Topic distribution method and device based on big data

Info

Publication number
CN110737771B
Authority
CN
China
Prior art keywords
user
question
difficulty
users
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910866615.5A
Other languages
Chinese (zh)
Other versions
CN110737771A (en
Inventor
孙全智
耿溟
孙艺恬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Tenfen Technology Co ltd
Original Assignee
Beijing Tenfen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Tenfen Technology Co ltd
Priority to CN201910866615.5A
Publication of CN110737771A
Application granted
Publication of CN110737771B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 Scheduling, planning or task assignment for a person or group
    • G06Q10/063112 Skill-based matching of a person or a group to a task
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Educational Administration (AREA)
  • Theoretical Computer Science (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Technology (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a question allocation method and device based on big data, and belongs to the technical field of computer information. The method comprises a data establishment stage and a question allocation stage. In the data establishment stage, users are classified into a plurality of user category groups according to their answer data, and the difficulty coefficient of each question under each user category group is then determined from the answer data of that group. In the question allocation stage, the user category group of a user and the difficulty coefficient of each question under that group are obtained, and questions with a suitable difficulty coefficient are selected and allocated to the user. Because the difficulty of each question is determined from both the user's individual answer data and the answer data of the other users in the group to which the user belongs, question difficulty can be evaluated more accurately, and questions of suitable difficulty can be allocated according to the user's individual needs.

Description

Question distribution method and device based on big data
Technical Field
The invention belongs to the technical field of computer information, and particularly relates to a question allocation method and device based on big data.
Background
Online answering applications such as online learning, online examinations and online quiz games are a current trend. On websites and applications for online answering, questions are generally assigned to a user in one of several ways. For example, questions are drawn at random from a question bank and assigned to the user; or the user selects an answer mode, and all users in the same mode answer the same questions; or question difficulty is evaluated from the user's own answering record, and questions of moderate difficulty are selected and assigned to the user. These approaches either cannot assign questions according to each user's individual situation, or evaluate difficulty only from a single user's answering record, which is strongly affected by chance and makes the difficulty evaluation inaccurate, so a good learning effect cannot be achieved.
Disclosure of Invention
The invention aims to solve at least one of the technical problems in the prior art, and provides a question allocation method based on big data, which can accurately evaluate the difficulty of questions and allocate questions according to the evaluated difficulty and the individual needs of users, thereby achieving a better answering effect.
The technical solution adopted by the invention is a question allocation method based on big data, which comprises the following steps:
a data establishment stage:
establishing a plurality of special topics, and establishing a question bank for each special topic, wherein each question bank comprises a plurality of questions;
classifying the users into a plurality of user category groups, according to the answer data of each user, by means of a pre-trained classifier;
determining, according to a preset algorithm and the answer data of the users in each user category group, the difficulty coefficient of each question under that user category group;
a question allocation stage:
determining the special topic and the answer mode selected by a user;
acquiring the user category group to which the user belongs, and acquiring, according to that user category group, the difficulty coefficient of each question in the special topic selected by the user;
and selecting, according to the difficulty coefficient interval preset for the answer mode selected by the user, questions with matching difficulty coefficients from those questions, and allocating them to the user for answering.
According to the method provided by the invention, the difficulty coefficient of each question is determined from the user's individual answer data and the answer data of the other users in the user category group to which the user belongs, and questions with a suitable difficulty coefficient are then selected and allocated to the user according to the special topic and the answer mode selected by the user. The difficulty of the questions can thus be evaluated more accurately, and questions can be allocated according to the evaluated difficulty and the individual needs of the user, so that a better answering effect is achieved.
Preferably, the above method provided by the present invention further comprises:
after the user finishes answering, updating, according to the user's answer data from this session and the preset algorithm, the difficulty coefficient of each question answered in this session under the user category group to which the user belongs.
Preferably, in the above method provided by the present invention, each question in the question bank includes at least one label; the classifying of the users into a plurality of user category groups, according to the answer data of each user, by means of a pre-trained classifier specifically includes:
generating, for each user, associated data between the user and each label according to the answer data of the questions answered by the user and the labels included in those questions; the associated data is, for each label included in the questions answered by the user, the number of questions the user has answered and the number of questions the user has answered correctly;
and classifying the users into a plurality of user category groups according to the associated data and the pre-trained classifier.
Preferably, in the method provided by the present invention, the pre-trained classifier uses a clustering algorithm to classify each user into a plurality of user category groups.
Preferably, in the above method provided by the present invention, the clustering algorithm includes any one of a K-means clustering algorithm, a center point clustering algorithm, and a random selection clustering algorithm.
Preferably, in the method provided by the present invention, the preset algorithm satisfies the following condition:
[Formula image BDA0002201445250000031; not reproduced in this text]
wherein K1 is the initial difficulty coefficient of the question; AC is the total number of users in the user category group to which the user belongs; ACF is the number of those users who have attempted the question; and ACFR is the number of users who attempted the question and answered it correctly.
Correspondingly, the invention also provides a question allocation device based on big data, which comprises: a data establishing unit and a question allocation unit;
the data establishing unit specifically includes:
a question bank establishing module, used for establishing a plurality of special topics and establishing a question bank for each special topic, wherein each question bank comprises a plurality of questions;
a user classification module, used for classifying the users into a plurality of user category groups, according to the answer data of each user, by means of a pre-trained classifier;
a difficulty calculation module, used for determining, according to a preset algorithm and the answer data of the users in each user category group, the difficulty coefficient of each question under that user category group;
the question allocation unit specifically includes:
a selection module, used for determining the special topic and the answer mode selected by the user;
an acquisition module, used for acquiring the user category group to which the user belongs and acquiring, according to that user category group, the difficulty coefficient of each question in the special topic selected by the user;
and an allocation module, used for selecting, according to the difficulty coefficient interval preset for the answer mode selected by the user, questions with matching difficulty coefficients from those questions and allocating them to the user for answering.
Preferably, the above apparatus provided by the present invention further comprises:
a difficulty updating unit, used for updating, after the user finishes answering and according to the user's answer data from this session and the preset algorithm, the difficulty coefficient of each question answered in this session under the user category group to which the user belongs.
Preferably, in the above apparatus provided by the present invention, each question in the question bank includes at least one label; the user classification module specifically comprises:
a first module, used for generating, for each user, associated data between the user and each label according to the answer data of the questions answered by the user and the labels included in those questions; the associated data is, for each label included in the questions answered by the user, the number of questions the user has answered and the number of questions the user has answered correctly;
and a second module, used for classifying the users into a plurality of user category groups according to the associated data and the pre-trained classifier.
Preferably, in the apparatus provided by the present invention, in the second module, the pre-trained classifier classifies each user into a plurality of user category groups by using a clustering algorithm.
Preferably, in the above apparatus provided by the present invention, the clustering algorithm includes any one of a K-means clustering algorithm, a center point clustering algorithm, and a random selection clustering algorithm.
Preferably, in the above apparatus provided by the present invention, in the difficulty calculating module and/or the difficulty updating unit, the preset algorithm satisfies:
[Formula image BDA0002201445250000041; not reproduced in this text]
wherein K1 is the initial difficulty coefficient of the question; AC is the total number of users in the user category group to which the user belongs; ACF is the number of those users who have attempted the question; and ACFR is the number of users who attempted the question and answered it correctly.
Drawings
Fig. 1 is a flowchart of the data establishment stage in the question allocation method based on big data according to this embodiment;
Fig. 2 is a flowchart of the question allocation stage in the question allocation method based on big data according to this embodiment;
Fig. 3 is a detailed flowchart of step S12 of the data establishment stage in the question allocation method based on big data according to this embodiment;
Fig. 4 is a schematic structural diagram of the question allocation device based on big data according to this embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The shapes and sizes of the various elements in the drawings are not to scale and are merely intended to facilitate an understanding of the contents of the embodiments of the invention.
This embodiment provides a question allocation method based on big data, which comprises the following steps:
As shown in Fig. 1, in the data establishment stage:
S11, establishing a plurality of special topics, and establishing a question bank for each special topic, wherein each question bank comprises a plurality of questions.
Specifically, a plurality of special topics are established as needed, for example special topics for mathematics, politics and English, and a question bank is then established for each special topic according to the content it requires, the question bank of each special topic comprising a plurality of questions.
Further, after the question bank of each special topic is established, initialization information is established for each question in each question bank. The initialization information of a question may include its initial difficulty coefficient K1 and its labels, where K1 is the same for all questions and may be assigned arbitrarily. The labels of a question can be determined from its content, and each question has at least one label. For example, if the question bank of a history special topic contains the question "In which year did the An-Shi Rebellion occur?", labels such as "Chinese history", "Tang dynasty" and "year" may be established for that question. The specific labels may be established as required and are not limited herein.
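As an illustration only, the initialization information described above could be held in a small record per question. The following Python sketch is not taken from the patent: the names `Question` and `build_question_bank`, the identifier `h1` and the default K1 value of 1.0 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Question:
    """One question with its initialization information (hypothetical layout)."""
    question_id: str
    content: str
    labels: List[str]
    initial_difficulty: float  # K1: identical for all questions, value arbitrary

    def __post_init__(self) -> None:
        # The text requires every question to carry at least one label.
        if not self.labels:
            raise ValueError("each question must carry at least one label")


def build_question_bank(
    raw_questions: Iterable[Tuple[str, str, List[str]]], k1: float = 1.0
) -> Dict[str, Question]:
    """Build the question bank of one special topic; every question gets the same K1."""
    return {
        qid: Question(qid, content, list(labels), k1)
        for qid, content, labels in raw_questions
    }


# One sample history question with three labels.
history_bank = build_question_bank([
    ("h1", "In which year did the An-Shi Rebellion occur?",
     ["Chinese history", "Tang dynasty", "year"]),
])
```

Rejecting a question without labels at construction time is a design choice of this sketch; the text only states that each question has at least one label.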
S12, classifying the users into a plurality of user category groups, according to the answer data of each user, by means of a pre-trained classifier.
Specifically, the users can be classified according to their historical answer data and the labels of the questions they have answered. As shown in Fig. 3, S12 may specifically include:
S121, generating, for each user, associated data between the user and each label according to the answer data of the questions answered by the user and the labels included in those questions; the associated data may be, for each label, the number of questions the user has answered and the number of questions the user has answered correctly.
Specifically, all labels included in the questions answered by the user are obtained, the number of questions answered and the number answered correctly under each label are counted, and the associated data between the user and each label is generated.
For example, user A answers 10 questions and gets 9 of them right. Between them, the 10 questions carry label M and label N: under label M, user A answered 7 questions and got 6 right; under label N, user A answered 3 questions and got all 3 right. The associated data between user A and labels M and N can then be generated from the answer data of these 10 questions, as shown in Table 1-1:
User | Label | Correct / answered | Accuracy
A    | M     | 6/7                | 85%
A    | N     | 3/3                | 100%
Table 1-1
Of course, the associated data may take other forms and include other content; it can be designed as needed and is not limited herein.
It should be noted that a user's answer data specifically includes the questions the user has answered and whether each of them was answered correctly.
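The per-label counting of step S121 can be sketched in a few lines of Python. This is a minimal illustration, assuming answer data arrives as (question id, correct?) pairs; the function name `build_associated_data` and the question identifiers are hypothetical.

```python
from collections import defaultdict


def build_associated_data(answers, question_labels):
    """Count, per label, how many questions a user answered and how many correctly.

    answers: list of (question_id, is_correct) pairs for one user.
    question_labels: dict mapping question_id to its list of labels.
    Returns {label: (correct_count, answered_count)}.
    """
    stats = defaultdict(lambda: [0, 0])
    for question_id, is_correct in answers:
        for label in question_labels[question_id]:
            stats[label][1] += 1          # one more question answered under this label
            if is_correct:
                stats[label][0] += 1      # one more answered correctly
    return {label: (correct, answered) for label, (correct, answered) in stats.items()}


# User A from the example: 7 questions under label M (6 correct),
# 3 questions under label N (3 correct).
question_labels = {f"q{i}": ["M"] for i in range(1, 8)}
question_labels.update({f"q{i}": ["N"] for i in range(8, 11)})
answers_a = [(f"q{i}", i != 7) for i in range(1, 11)]  # only q7 is answered wrongly

associated_a = build_associated_data(answers_a, question_labels)
```

Here `associated_a` holds 6 correct out of 7 for label M and 3 out of 3 for label N, matching Table 1-1.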
S122, classifying the users into a plurality of user category groups according to the associated data of each user and the pre-trained classifier.
Specifically, based on the associated data of each user, the pre-trained classifier may classify the users into a plurality of user category groups using a clustering algorithm. The classifier takes the associated data of each user as input data, treats each label as a dimension, selects a suitable distance metric, such as the Euclidean or Manhattan distance, and performs cluster analysis on the users to obtain a plurality of data clusters. Each data cluster is a user category group, and each data point in a cluster is a user. Because users are grouped according to the questions they answered under each label and their accuracy on those questions, the answering behavior of users within the same user category group is the most closely related. Users can thus be classified more accurately, which avoids the inaccuracy of classifying users by a single variable. Of course, other methods may be used to classify the users; the specific method can be designed according to actual needs and is not limited herein.
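To make "each label as a dimension" concrete, the sketch below turns one user's associated data into a vector and defines the two distance metrics named above. Using per-label accuracy as the coordinate, and 0.0 for labels the user has not attempted, are assumptions of this sketch; the text does not fix the encoding.

```python
import math


def to_vector(associated, label_order):
    """Map {label: (correct, answered)} to one coordinate per label.

    Assumption: the coordinate is the accuracy under that label,
    and 0.0 when the user has answered nothing under it.
    """
    vector = []
    for label in label_order:
        correct, answered = associated.get(label, (0, 0))
        vector.append(correct / answered if answered else 0.0)
    return vector


def euclidean(a, b):
    """Straight-line distance between two users' vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute per-label differences between two users' vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


vector_a = to_vector({"M": (6, 7), "N": (3, 3)}, ["M", "N"])
```

Either metric can be passed to a clustering routine; which one groups users better depends on the data and is left open by the text.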
Optionally, the pre-trained classifier may use various clustering algorithms to classify the users, for example any one of a K-means clustering algorithm, a center-point clustering algorithm, and a random-selection clustering algorithm. Taking K-means as an example, if the users are to be classified into K user category groups, a multi-dimensional coordinate system is established from the associated data of the users. The associated data of K users is first selected at random as the initial cluster centers. The distance between each user's associated data and each cluster center is then calculated, and each user is assigned to the nearest cluster center. Once all users have been assigned, K data clusters are obtained, and K new cluster centers are calculated from the positions of the users in each data cluster. This process is repeated until a termination condition is met. The termination condition can be set as needed, for example that no users (or fewer than a minimum number) are reassigned to a different data cluster, or that no cluster centers (or fewer than a minimum number) change. The specific design can be chosen as required and is not limited herein.
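The K-means procedure described above can be sketched in plain Python as follows. This is a minimal illustration rather than the patent's implementation: it uses squared Euclidean distance, stops when no user changes cluster, and adds a hypothetical `initial_centers` parameter so that a run can be made reproducible instead of relying on the random choice of K users.

```python
import random


def kmeans(points, k, max_iter=100, initial_centers=None, seed=None):
    """Cluster user vectors into k groups; returns (assignment, centers)."""
    if initial_centers is None:
        # As in the text: pick the data of k users at random as initial centers.
        rng = random.Random(seed)
        centers = [list(p) for p in rng.sample(points, k)]
    else:
        centers = [list(c) for c in initial_centers]

    assignment = [None] * len(points)
    for _ in range(max_iter):
        # Assign every user to the nearest cluster center.
        new_assignment = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # termination: no user was reassigned
        assignment = new_assignment
        # Recompute each center as the mean position of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centers


# Six users described by accuracy under two labels: three strong on the
# first label, three strong on the second.
users = [[0.9, 0.1], [0.95, 0.15], [0.85, 0.05],
         [0.1, 0.9], [0.2, 0.95], [0.15, 0.85]]
groups, centers = kmeans(users, 2, initial_centers=[users[0], users[3]])
```

With these explicit starting centers the first three users land in one group and the last three in the other. With random starting centers K-means can settle in a poorer local optimum, which is why production code usually restarts it several times.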
Further, the user category group of each user may be updated according to the user's answer data.
S13, determining, according to a preset algorithm and the answer data of the users in each user category group, the difficulty coefficient of each question under that user category group.
Specifically, the preset algorithm satisfies:
[Formula image BDA0002201445250000071; not reproduced in this text]
wherein K1 is the initial difficulty coefficient of the question; AC is the total number of users in the user category group to which the user belongs; ACF is the number of those users who have attempted the question; and ACFR is the number of users who attempted the question and answered it correctly. Of course, the difficulty coefficient of a question may also be calculated in other ways; the specific calculation can be designed as required and is not limited herein.
For a given user category group, the first term of the preset formula is the basic difficulty of the question in that group, and the second term is the group's accuracy on the question. The difficulty coefficient is thus evaluated from big data, that is, from the answer data of many users, and combines the basic difficulty with the accuracy within the user category group. This allows the difficulty of a question to be evaluated more accurately and avoids the inaccuracy of calculating the difficulty coefficient from a single variable. In addition, because a separate difficulty coefficient is calculated for each user category group, questions can later be allocated according to the individual needs of users.
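The formula itself is an image that this text does not reproduce, so the following Python sketch deliberately stops short of it. It only gathers the inputs the text defines (AC, ACF, ACFR) and the group accuracy ACFR/ACF for one question; how these are combined with K1 is left open. The function name and the data layout are hypothetical.

```python
def group_statistics(group_answers, question_id):
    """Collect the formula inputs for one question within one user category group.

    group_answers: dict mapping user_id to {question_id: answered_correctly}.
    Returns AC, ACF, ACFR and the accuracy ACFR/ACF (None if nobody attempted it).
    """
    ac = len(group_answers)  # all users in the group
    acf = sum(1 for history in group_answers.values() if question_id in history)
    acfr = sum(1 for history in group_answers.values() if history.get(question_id))
    accuracy = acfr / acf if acf else None
    return {"AC": ac, "ACF": acf, "ACFR": acfr, "accuracy": accuracy}


group = {
    "u1": {"q1": True},
    "u2": {"q1": False},
    "u3": {},                       # in the group, but has not attempted q1
    "u4": {"q1": True, "q2": False},
}
stats_q1 = group_statistics(group, "q1")
```

For this sample group, question q1 has AC = 4, ACF = 3 and ACFR = 2, so two of the three users who attempted it answered correctly.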
As shown in Fig. 2, in the question allocation stage:
S21, determining the special topic and the answer mode selected by the user.
Specifically, in addition to multiple special topics, multiple answer modes may be set. For example, if the method of this embodiment is applied to learning software, a learning mode, an examination mode, an error-prone-question review mode and other modes may be provided. Each answer mode has its own preset difficulty coefficient interval, and the user can select an answer mode according to personal needs.
S22, acquiring the user category group to which the user belongs, and acquiring, according to that user category group, the difficulty coefficient of each question in the special topic selected by the user.
Specifically, each question has a different difficulty coefficient in different user category groups. Therefore, after the special topic selected by the user is determined, the user category group to which the user belongs is obtained, and the difficulty coefficient of each question in the selected special topic under that user category group is then obtained.
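Step S22 amounts to a lookup keyed by user category group and question. A minimal Python sketch, assuming the difficulty coefficients are stored in a table keyed by (group, question); all names and values below are hypothetical.

```python
def difficulties_for_user(user_id, special_topic, user_group,
                          topic_questions, difficulty_table):
    """Return {question_id: difficulty} for one user and one special topic.

    user_group: dict user_id -> group id.
    topic_questions: dict special topic -> list of question ids.
    difficulty_table: dict (group id, question id) -> difficulty coefficient.
    """
    group_id = user_group[user_id]
    return {qid: difficulty_table[(group_id, qid)]
            for qid in topic_questions[special_topic]}


user_group = {"alice": "g1", "bob": "g2"}
topic_questions = {"history": ["q1", "q2"]}
difficulty_table = {
    ("g1", "q1"): 0.3, ("g1", "q2"): 0.6,
    ("g2", "q1"): 0.7, ("g2", "q2"): 0.9,
}
alice_view = difficulties_for_user("alice", "history", user_group,
                                   topic_questions, difficulty_table)
bob_view = difficulties_for_user("bob", "history", user_group,
                                 topic_questions, difficulty_table)
```

The same two questions appear easier to the first user's group than to the second's, which is the per-group behavior the paragraph above describes.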
S23, selecting, according to the difficulty coefficient interval preset for the answer mode selected by the user, questions with matching difficulty coefficients from the questions in the special topic selected by the user, and allocating the selected questions to the user for answering.
Specifically, the questions whose difficulty coefficient falls within the difficulty coefficient interval of the selected mode are screened out of the question bank of the selected special topic, and the selected questions are then allocated to the user for answering according to a certain rule, for example in order of difficulty coefficient from low to high. Furthermore, each answer mode is preset with its own difficulty coefficient interval as needed. For example, a wider interval can be set for the learning mode, so that the questions allocated to the user cover a broader range. The specific design can be chosen as required and is not limited herein.
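The screening and ordering in step S23 can be sketched as a filter followed by a sort. In the Python sketch below, the interval values in `MODE_INTERVALS`, the closed interval bounds and the optional `limit` parameter are assumptions made for illustration; the text only says that each mode has a preset interval.

```python
# Hypothetical difficulty coefficient intervals per answer mode.
MODE_INTERVALS = {
    "learning": (0.2, 0.9),      # wider interval, broader question coverage
    "examination": (0.4, 0.7),
}


def select_questions(difficulties, mode, limit=None):
    """Pick the questions whose difficulty lies in the mode's interval,
    ordered from the lowest difficulty coefficient to the highest.

    difficulties: {question_id: difficulty} for the user's group and special topic.
    """
    low, high = MODE_INTERVALS[mode]
    matched = [(qid, d) for qid, d in difficulties.items() if low <= d <= high]
    matched.sort(key=lambda item: item[1])  # rule: low to high difficulty
    ordered = [qid for qid, _ in matched]
    return ordered if limit is None else ordered[:limit]


difficulties = {"q1": 0.5, "q2": 0.3, "q3": 0.95, "q4": 0.65}
learning_set = select_questions(difficulties, "learning")
exam_set = select_questions(difficulties, "examination")
```

With these sample values the learning mode yields q2, q1, q4 in that order, while the narrower examination interval yields only q1 and q4.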
In summary, in the method provided in this embodiment, the difficulty coefficient of each question is determined from the user's individual answer data and the answer data of the other users in the user category group to which the user belongs, and questions with a suitable difficulty coefficient are then selected and allocated to the user according to the special topic and the answer mode selected by the user. The difficulty of the questions can thus be evaluated more accurately, and questions can be allocated according to the evaluated difficulty and the individual needs of the user, so as to achieve a better answering effect.
Optionally, the method provided in this embodiment may further include:
S31, after the user finishes answering, updating, according to the user's answer data from this session and the preset algorithm, the difficulty coefficient of each question answered in this session under the user category group to which the user belongs.
Specifically, each time the user finishes a session, the questions answered in that session and whether each was answered correctly are recorded, and the difficulty coefficients of those questions are then updated using the algorithm of S13. The difficulty coefficient of each question is thus updated as users' group membership changes, as users' answer data changes, and as the preset algorithm changes. Because the difficulty coefficient is updated dynamically from the answer data of all users, it can be evaluated more accurately, and questions with a suitable difficulty coefficient can be better allocated according to the individual needs of the user.
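A possible shape for the update in S31 is sketched below in Python. Because the preset formula is not reproduced in this text, the sketch takes it as a caller-supplied function `compute(ac, acf, acfr)`; the stand-in used in the example simply returns its inputs and is not the patent's formula. Letting a user's latest result for a question replace an earlier one is also an assumption.

```python
def update_after_session(difficulty_table, group_answers, group_id,
                         user_id, session, compute):
    """Record one finished session and refresh the affected difficulty coefficients.

    group_answers: dict group id -> {user_id: {question_id: answered_correctly}}.
    session: {question_id: answered_correctly} for the questions just answered.
    compute: callable (AC, ACF, ACFR) -> difficulty; stands in for the preset algorithm.
    """
    members = group_answers[group_id]
    history = members.setdefault(user_id, {})
    history.update(session)  # assumption: the latest result per question wins

    # Only the questions answered in this session are recomputed.
    for question_id in session:
        ac = len(members)
        acf = sum(1 for h in members.values() if question_id in h)
        acfr = sum(1 for h in members.values() if h.get(question_id))
        difficulty_table[(group_id, question_id)] = compute(ac, acf, acfr)
    return difficulty_table


group_answers = {"g1": {"u1": {"q1": True}, "u2": {}}}
table = update_after_session(
    {}, group_answers, "g1", "u2", {"q1": False, "q2": True},
    compute=lambda ac, acf, acfr: (ac, acf, acfr),  # placeholder, not the real formula
)
```

After the session, the entry for q1 reflects two attempts with one correct answer, and q2 gains its first entry; questions outside the session are left untouched.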
Optionally, when a user answers for the first time, questions covering the various difficulty coefficients may be randomly selected from the multiple special topics and allocated to the user for answering, and the user is then classified according to the resulting answer data.
Correspondingly, as shown in fig. 4, the present embodiment further provides a title distribution device based on big data, including: a data establishing unit 1 and a topic allocating unit 2.
Specifically, the data establishing unit 1 specifically includes:
the question bank establishing module 11 is used for establishing a plurality of special subjects and respectively establishing question banks for the plurality of special subjects, wherein each question bank comprises a plurality of questions.
And the user classification module 12 is configured to classify each user into a plurality of user category groups through a classifier obtained through pre-training according to the answer data of each user.
And the difficulty calculating module 13 is configured to determine, according to the answer data of each user in each user category group, a difficulty coefficient of each question in the user category group according to a preset algorithm.
Specifically, the topic allocation unit 2 specifically includes:
and a selection module 21 for determining the special question and answer mode selected by the user.
The obtaining module 22 is configured to obtain a user category group to which the user belongs, and obtain a difficulty coefficient of each topic in the topics selected by the user according to the user category group to which the user belongs.
And the distribution module 23 is configured to select, according to a difficulty coefficient interval preset in the answer mode selected by the user, a question with a matching difficulty coefficient from among the questions in the special question selected by the user, and distribute the selected question to the user for answering.
Optionally, in the above apparatus provided in this embodiment, the apparatus further includes:
and the difficulty updating unit 3 is used for updating each question made by the user at this time according to the answer data of the user at this time and a preset algorithm after the user finishes answering the question, and the difficulty coefficient under the user category group to which the user belongs.
Optionally, in the above apparatus provided in this embodiment, each topic in the topic library building module 11 includes at least one tag. The user classification module 12 specifically includes:
the first module 01 is used for respectively generating associated data of each user and a label according to answer data of the question answered by each user and the label included in the question answered by each user; the associated data is the number of the questions answered by the user and the number of the questions answered by the user under the label included by the questions answered by each user.
A second module 02, configured to classify each user into a plurality of user category groups according to the associated data and a classifier obtained through pre-training.
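As a rough illustration of the associated data produced by the first module, the following Python sketch counts, per user and per label, how many questions were answered and how many were answered correctly. Reading the second count as "answered correctly" is an assumption, since the translated text repeats the same phrase for both counts; the function names and the `(user_id, question_id, is_correct)` record layout are likewise hypothetical.

```python
from collections import defaultdict


def build_association(answers, question_labels):
    """answers: iterable of (user_id, question_id, is_correct) records.
    question_labels: dict mapping question_id -> list of labels.
    Returns {user_id: {label: (answered_count, correct_count)}}."""
    assoc = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for user, qid, is_correct in answers:
        for label in question_labels[qid]:
            cell = assoc[user][label]
            cell[0] += 1                # questions answered under this label
            cell[1] += int(is_correct)  # assumed: questions answered correctly
    return {
        user: {label: tuple(counts) for label, counts in labels.items()}
        for user, labels in assoc.items()
    }


def to_vector(user_assoc, label_order):
    """Flatten one user's associated data into a fixed-length feature vector,
    using (0, 0) for labels the user has never answered."""
    vector = []
    for label in label_order:
        answered, correct = user_assoc.get(label, (0, 0))
        vector.extend([answered, correct])
    return vector
```

A fixed label order gives every user a vector of the same length, which is the form a classifier or clustering step such as the second module would typically consume.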
Optionally, in the apparatus provided in this embodiment, in the second module 02, a pre-trained classifier classifies each user into a plurality of user category groups by using a clustering algorithm.
Optionally, in the apparatus provided in this embodiment, in the second module 02, the clustering algorithm adopted by the pre-trained classifier includes any one of a K-means clustering algorithm, a center point clustering algorithm, and a random selection clustering algorithm.
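For orientation, a plain K-means pass (one of the clustering algorithms named above) over per-user feature vectors might look like the sketch below. The text does not say how the classifier is trained or initialized, so the naive initialization from the first `k` points, the iteration cap, and the function name are assumptions.

```python
import math


def kmeans(points, k, iterations=20):
    """Minimal Lloyd-style K-means. `points` is a list of equal-length
    numeric vectors; returns (assignment, centroids)."""
    # Naive initialization (assumption): use the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_assignment == assignment:
            break  # assignments stopped changing
        assignment = new_assignment
    return assignment, centroids
```

Each resulting cluster index would play the role of a user category group; users whose per-label answering counts are close end up in the same group.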
Optionally, in the apparatus provided in this embodiment, in the difficulty calculating module 13 and/or the difficulty updating unit 3, the preset algorithm satisfies:
Figure BDA0002201445250000111
wherein K1 is the initial difficulty coefficient of the question; AC is the total number of users in the user category group to which the user belongs; ACF is the number of users in that group who have attempted the question; and ACFR is the number of users, among those who have attempted the question, who answered it correctly.
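The formula itself appears only as a figure and is not reproduced in the text, so it is not reconstructed here. The sketch below merely shows how the three counts it consumes could be gathered from answer records. It takes ACFR to be the number of users who answered the question correctly (an interpretation of the translated definition), and it counts a user as correct if any of that user's attempts was correct; both points, the record layout, and the function name are assumptions.

```python
def difficulty_inputs(group_users, answers, question_id):
    """Collect AC, ACF and ACFR for one question within one user category group.
    answers: iterable of (user_id, question_id, is_correct) records."""
    members = set(group_users)
    attempted = set()
    correct = set()
    for user, qid, is_correct in answers:
        if qid != question_id or user not in members:
            continue  # ignore other questions and users outside the group
        attempted.add(user)
        if is_correct:
            correct.add(user)  # assumed: any correct attempt counts
    return {
        "AC": len(members),      # total users in the group
        "ACF": len(attempted),   # group users who attempted the question
        "ACFR": len(correct),    # of those, users who answered correctly
    }
```

These counts, together with the initial coefficient K1, would then be passed to whatever expression the figure defines.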
In summary, in the big-data-based question allocation method provided by the present invention, the difficulty coefficient of each question is determined from the answer data of the individual user and of the users in the user category group to which that user belongs. Questions with suitable difficulty coefficients are then selected, according to the special topic and the answering mode chosen by the user, and allocated to the user for answering. The difficulty of each question can thus be evaluated more accurately, and questions are allocated according to the evaluated difficulty and the user's individual needs, which achieves a better answering effect.
It will be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principles of the present invention, and the present invention is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (8)

1. A topic distribution method based on big data, characterized by comprising the following steps:
a data establishment stage:
establishing a plurality of special topics, and respectively establishing question banks for the plurality of special topics, wherein each question bank comprises a plurality of questions;
classifying each user into a plurality of user category groups through a classifier obtained through pre-training according to answer data of each user;
determining, according to a preset algorithm and the answer data of the users in each user category group, the difficulty coefficient of each question under that user category group;
a topic distribution stage:
determining a special topic and an answering mode selected by a user;
acquiring the user category group to which the user belongs, and acquiring, according to the user category group, the difficulty coefficient of each question in the special topic selected by the user;
selecting questions with matched difficulty coefficients from the questions to be distributed to the user for answering according to a difficulty coefficient interval preset in the answering mode selected by the user;
each question in the question bank comprises at least one label; the classifying of each user into a plurality of user category groups by a classifier obtained through pre-training according to the answer data of each user specifically includes:
respectively generating associated data of each user and a label according to the answer data of the question answered by each user and the label included in the question answered by each user; the associated data is the number of the questions answered by the user and the number of the questions answered by the user under the label included by the questions answered by each user;
classifying each user into a plurality of user category groups according to the associated data and a classifier obtained by pre-training; the relevance of the answers of all users is highest in the same user category group;
the preset algorithm satisfies the following conditions:
Figure FDA0003655825000000021
wherein K1 is the initial difficulty coefficient of the question; AC is the total number of users in the user category group to which the user belongs; ACF is the number of users, among the total number of users, who have attempted the question; and ACFR is the number of users, among those who have attempted the question, who answered it correctly.
2. The method of claim 1, further comprising:
after the user finishes answering, updating, according to the user's answer data from this session and the preset algorithm, the difficulty coefficient, under the user category group to which the user belongs, of each question the user answered in this session.
3. The method of claim 1, wherein the pre-trained classifier employs a clustering algorithm to classify users into a plurality of user class groups.
4. The method according to claim 3, wherein the clustering algorithm comprises any one of a K-means clustering algorithm, a center point clustering algorithm, and a random selection clustering algorithm.
5. A topic allocation device based on big data is characterized by comprising: a data establishing unit and a topic distributing unit;
the data establishing unit specifically includes:
a question bank establishing module, configured to establish a plurality of special topics and to establish a question bank for each of the special topics, wherein each question bank comprises a plurality of questions;
the user classification module is used for classifying each user into a plurality of user category groups through a classifier obtained through pre-training according to answer data of each user;
the difficulty calculation module is used for determining a difficulty coefficient of each question under each user category group according to a preset algorithm according to the answer data of each user in each user category group;
the topic distributing unit specifically comprises:
a selection module, configured to determine the special topic and the answering mode selected by the user;
the acquisition module is used for acquiring a user category group to which the user belongs and acquiring difficulty coefficients of all the topics in the special topics selected by the user according to the user category group;
the distribution module is used for selecting questions with matched difficulty coefficients from all the questions to distribute to the user for answering according to a difficulty coefficient interval preset in the answering mode selected by the user;
each question in the question bank comprises at least one label; the user classification module specifically comprises:
the first module is used for respectively generating associated data of each user and a label according to the answer data of the question answered by each user and the label included in the question answered by each user; the associated data is the number of the questions answered by each user and the number of the questions answered by the user under the label included by the questions answered by each user;
the second module is used for classifying each user into a plurality of user category groups according to the associated data and a classifier obtained by pre-training; wherein, in the same user category group, the relevance of each user answering is highest;
in the difficulty calculation module and/or the difficulty updating unit, the preset algorithm satisfies the following condition:
Figure FDA0003655825000000031
wherein K1 is the initial difficulty coefficient of the question; AC is the total number of users in the user category group to which the user belongs; ACF is the number of users, among the total number of users, who have attempted the question; and ACFR is the number of users, among those who have attempted the question, who answered it correctly.
6. The apparatus of claim 5, further comprising:
a difficulty updating unit, configured to update, after the user finishes answering and according to the user's answer data from this session and the preset algorithm, the difficulty coefficient, under the user category group to which the user belongs, of each question the user answered in this session.
7. The apparatus of claim 5, wherein the pre-trained classifier in the second module classifies users into a plurality of user class groups using a clustering algorithm.
8. The apparatus of claim 7, wherein the clustering algorithm comprises any one of a K-means clustering algorithm, a center point clustering algorithm, and a random selection clustering algorithm.
CN201910866615.5A 2019-09-12 2019-09-12 Topic distribution method and device based on big data Active CN110737771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910866615.5A CN110737771B (en) 2019-09-12 2019-09-12 Topic distribution method and device based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910866615.5A CN110737771B (en) 2019-09-12 2019-09-12 Topic distribution method and device based on big data

Publications (2)

Publication Number Publication Date
CN110737771A CN110737771A (en) 2020-01-31
CN110737771B true CN110737771B (en) 2022-09-27

Family

ID=69267900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910866615.5A Active CN110737771B (en) 2019-09-12 2019-09-12 Topic distribution method and device based on big data

Country Status (1)

Country Link
CN (1) CN110737771B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112164261A (en) * 2020-09-24 2021-01-01 浙江太学科技集团有限公司 Intelligent assessment method
CN112561377A (en) * 2020-12-23 2021-03-26 作业帮教育科技(北京)有限公司 Dynamic assignment method and device for questions and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107343223A (en) * 2017-07-07 2017-11-10 北京慕华信息科技有限公司 The recognition methods of video segment and device
CN108452527A (en) * 2018-03-15 2018-08-28 掌阅科技股份有限公司 Answer method, electronic equipment and the computer storage media of e-book problem
CN109949638A (en) * 2019-04-22 2019-06-28 软通智慧科技有限公司 Knowledge mastery degree determination method, device, terminal and medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504597A (en) * 2014-12-26 2015-04-08 湖南亿谷信息科技发展有限公司 Knowledge shop management system and knowledge shop management method of study platform
CN105138653B (en) * 2015-08-28 2018-08-21 天津大学 It is a kind of that method and its recommendation apparatus are recommended based on typical degree and the topic of difficulty
CN105787839A (en) * 2016-03-23 2016-07-20 成都准星云学科技有限公司 Method and device for pushing learning resources
CN106781785B (en) * 2017-01-04 2019-11-29 广东小天才科技有限公司 A kind of item difficulty construction method and device, service equipment based on big data
CN106846962A (en) * 2017-03-20 2017-06-13 安徽七天教育科技有限公司 A kind of wrong answer list generation method based on the wrong topic of student and accurate recommendation
CN107562769A (en) * 2017-05-24 2018-01-09 广东工业大学 A kind of online answer topic recommends method and device
CN108932246A (en) * 2017-05-24 2018-12-04 肖方良 User's on-line study capability evaluation and topic recommended method and system
CN108335242A (en) * 2017-12-20 2018-07-27 卓智网络科技有限公司 Student's differentiating method and device

Also Published As

Publication number Publication date
CN110737771A (en) 2020-01-31

Similar Documents

Publication Publication Date Title
US10467234B2 (en) Differentially private database queries involving rank statistics
Chen et al. Mining students' learning patterns and performance in Web-based instruction: a cognitive style approach
US8117202B1 (en) User segment population techniques
Gottfried Peer effects in urban schools: Assessing the impact of classroom composition on student achievement
US20090259606A1 (en) Diversified, self-organizing map system and method
CN110737771B (en) Topic distribution method and device based on big data
Ragab et al. HRSPCA: Hybrid recommender system for predicting college admission
WO2011133551A2 (en) Reducing the dissimilarity between a first multivariate data set and a second multivariate data set
CN112488863B (en) Dangerous seed recommendation method and related equipment in user cold start scene
CN106485529A (en) The sort method of advertisement position and device
CN109447103B (en) Big data classification method, device and equipment based on hard clustering algorithm
CN110727859B (en) Recommendation information pushing method and device
CN113076437B (en) Small sample image classification method and system based on label redistribution
CN108304428A (en) Information recommendation method and device
CN108629047A (en) A kind of song list generation method and terminal device
CN109272009A (en) A kind of crowd portrayal extracting method and device based on big data analysis
CN110399558A (en) A kind of examination question recommended method and system
CN111639077B (en) Data management method, device, electronic equipment and storage medium
CN111047201A (en) Dormitory allocation method and device based on deep learning
Barber et al. Who is ideological? Measuring ideological consistency in the American public
US20190005519A1 (en) Peak sale and one year sale prediction for hardcover first releases
JP7099521B2 (en) Scoring device, scoring method, recording medium
CN114926060A (en) Method for analyzing results of satisfaction degree of respondents aiming at network questionnaire
CN106549914B (en) identification method and device for independent visitor
CN112732891A (en) Office course recommendation method and device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant