CN117670269B - Method and device for realizing questions for multidimensional forced selection test and electronic equipment
- Publication number
- CN117670269B (CN202311655474.5A)
- Authority
- CN
- China
- Prior art keywords
- questions
- question
- answer
- selecting
- topic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/105—Human resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application relates to a method, a device and electronic equipment for realizing question selection for a multidimensional forced selection test, and belongs to the technical field of evaluation. The method comprises: a first selecting step of randomly selecting, without repetition, a plurality of questions from a question library as first answer questions and obtaining the answer data produced when a target subject answers them, wherein the dimensions measured by the first answer questions together cover all the dimensions to be measured; a second selecting step of selecting a question from the remaining questions in the question library as a second answer question, obtaining the target subject's answer data for it, and evaluating whether dynamic question selection based on the Fisher information amount can be performed, jumping to the third selecting step if so and otherwise repeating the second selecting step; and a third selecting step of dynamically selecting questions based on the Fisher information amount until the target subject's answering meets a preset requirement. The application helps complete the related test more efficiently.
Description
Technical Field
The application belongs to the technical field of evaluation, and particularly relates to a method and a device for realizing questions for multidimensional forced selection test and electronic equipment.
Background
The forced selection (forced-choice) test is a test format used in personnel selection to measure non-ability characteristics of a test-taker, such as personality, attitude and thinking. As personality traits are divided ever more finely, demand for high-dimensional tests (tests covering many characteristic dimensions) is growing; forced selection tests in current commercial use measure more than 20 characteristic dimensions.
In the technical field of talent evaluation, computerized adaptive testing (CAT) emerged so that the evaluation path can differ from person to person: each test-taker answers a different set of questions, and only questions matched to the test-taker's level, which reduces the number of questions answered and improves test efficiency. CAT is a measurement approach that arose with item response theory (IRT) and the development of computer technology. In the related art, CAT selects questions based on Fisher information, and Fisher-information-based question selection is relatively mature in the single-dimensional scenario.
However, existing CAT implementations do not consider how a high number of dimensions affects the calculation of the Fisher information matrix. In a multidimensional forced selection test, where many dimensions must be measured, directly migrating the traditional adaptive question selection method based on the Fisher information matrix causes the following problems: 1) the number of preliminary questions is unstable, and many questions may have to be answered before the Fisher information matrix can be computed, which affects overall test efficiency; 2) the same questions are always selected in the preliminary stage, so some questions are overexposed, which affects the final test result.
Therefore, how to realize more effective CAT choice questions under the multi-dimensional forced choice test scenario becomes a technical problem to be solved urgently.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
In order to overcome the problems existing in the related art at least to a certain extent, the application provides a method, a device and electronic equipment for realizing the questions for the multidimensional forced selection test, so as to solve the technical problem of how to realize more effective CAT questions under the scenario of the multidimensional forced selection test.
In order to achieve the above purpose, the application adopts the following technical scheme:
In a first aspect of the present invention,
The application provides a method for realizing the question selection for multidimensional forced selection test, which comprises the following steps:
A first selecting step, based on random selection, of not repeatedly selecting a plurality of questions from a pre-constructed question library as first answer questions, and obtaining answer data obtained by answering each first answer question by a target subject, wherein the measured dimensions of all the first answer questions cover all the dimensions to be measured;
A second selecting step, namely selecting the questions from the rest questions which are not selected in the question library in a preset mode as second answer questions, acquiring answer data corresponding to the second answer questions by the target subjects, evaluating whether dynamic questions can be selected based on Fisher information amount or not based on all acquired answer data and item parameter information of the corresponding questions, if yes, jumping to execute a third selecting step, otherwise, continuing to execute the second selecting step;
A third selecting step, based on a Fisher information quantity method, selecting a third answer question from the rest questions which are not selected in the question library, and obtaining answer data of the corresponding third answer questions of the target subject answer until the question answer condition of the target subject meets the preset requirement;
In the first selecting step, the process of selecting the first answer questions specifically includes:
Step one, establishing a dimension vector comprising all dimensions to be measured, and randomly selecting a question from the question library as the currently selected question;
Step two, applying a first mark to the currently selected question in the question library, and removing the dimensions measured by that question from the dimension vector;
Step three, judging whether the number of dimensions remaining in the dimension vector is zero; if so, taking all questions carrying the first mark as the first answer questions and ending the selection process, otherwise continuing to execute step four;
Step four, performing matching analysis, according to the dimension vector, on each of the other questions in the question library that do not carry the first mark, to obtain for each such question its dimension coincidence number, namely the number of dimensions measured by the question that are still in the dimension vector; comparing the dimension coincidence numbers of these questions, taking the question with the largest dimension coincidence number as the new currently selected question, and jumping to execute step two.
Optionally, in the second selecting step, the evaluating whether the dynamic question can be selected based on the Fisher information amount based on all the obtained answer data and the item parameter information of the corresponding question includes:
calculating a Fisher information matrix according to all acquired answer data and item parameter information of the corresponding subject;
When the calculated Fisher information matrix is invertible, determining that dynamic question selection based on the Fisher information amount can be performed; otherwise, determining that it cannot.
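The invertibility test above can be sketched as a rank check. This is a minimal illustration, assuming the Fisher information matrix is already available as a square list of lists; the function name `is_invertible` and the pivot tolerance `tol` are hypothetical and do not come from the application.

```python
from typing import List

Matrix = List[List[float]]


def is_invertible(info: Matrix, tol: float = 1e-10) -> bool:
    """Return True if the square matrix has full rank.

    Uses Gaussian elimination with partial pivoting; tol is an assumed
    threshold below which a pivot is treated as zero.
    """
    n = len(info)
    a = [row[:] for row in info]  # work on a copy
    for col in range(n):
        # Pick the row with the largest absolute entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= tol:
            return False  # no usable pivot: the matrix is singular
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return True


# Two dimensions, but only dimension 0 carries information so far: singular.
print(is_invertible([[2.0, 0.0], [0.0, 0.0]]))  # False
# Both dimensions carry information: dynamic selection could start.
print(is_invertible([[2.0, 0.5], [0.5, 1.0]]))  # True
```

In a high-dimensional test the matrix stays singular until the answered questions jointly inform every dimension, which is why the check is repeated after each second answer question.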
Optionally, in the third selecting step, a third answer question is selected from the remaining questions not selected in the question bank based on the Fisher information amount method, including:
For each remaining question not yet selected in the question library, combining that question with the questions already answered to obtain a question combination containing it;
Calculating the Fisher information matrix corresponding to each question combination, comparing the traces of the inverses of the resulting Fisher information matrices, and taking the remaining question in the combination whose inverse Fisher information matrix has the smallest trace as the third answer question selected this time.
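A minimal sketch of this selection rule follows. It assumes, for illustration only, that each question contributes a Fisher information matrix evaluated at the current trait estimate and that the matrix of a question combination is the sum of the contributions; the names `select_next`, `trace_of_inverse` and the example matrices are hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def inverse(m: Matrix) -> Matrix:
    """Gauss-Jordan inverse with partial pivoting; raises ValueError if singular."""
    n = len(m)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def trace_of_inverse(m: Matrix) -> float:
    inv = inverse(m)
    return sum(inv[i][i] for i in range(len(inv)))


def add(a: Matrix, b: Matrix) -> Matrix:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def select_next(answered_info: Matrix, candidates: Dict[str, Matrix]) -> str:
    """Return the candidate whose combination with the answered questions
    gives the smallest trace of the inverse information matrix."""
    return min(candidates,
               key=lambda q: trace_of_inverse(add(answered_info, candidates[q])))


answered = [[4.0, 0.0], [0.0, 1.0]]      # dimension 1 is weakly measured so far
pool = {
    "q7": [[1.0, 0.0], [0.0, 0.0]],      # informs dimension 0 only
    "q9": [[0.0, 0.0], [0.0, 1.0]],      # informs dimension 1 only
}
print(select_next(answered, pool))  # q9
```

Here adding `q9` gives a trace of 0.25 + 0.5 = 0.75, while adding `q7` gives 0.2 + 1.0 = 1.2, so the rule favours the question that informs the weakly measured dimension.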
Optionally, each question in the question library is a forced-choice question block with a fixed number n of items, where n is greater than or equal to 2.
Optionally, in the pre-construction process of the question library, a model is constructed based on Luce's choice axiom to obtain the question-block measurement model required for item parameter estimation;
The mathematical constraints of the question-block measurement model include that the difficulty parameters of all items in each question block sum to zero.
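To illustrate how Luce's choice axiom yields a probability for a full ranking, the sketch below chooses the top item from all items of a block, the next from those remaining, and so on. The parameterisation of item strength as exp(a * theta + d) is an assumption made only for this example and is not the expression of the application; the numeric parameter values are likewise hypothetical.

```python
import math
from itertools import permutations
from typing import Sequence


def ranking_probability(strengths: Sequence[float], ranking: Sequence[int]) -> float:
    """Probability of a full ranking under Luce's choice axiom: each item is
    chosen from the items still remaining, in proportion to its strength."""
    remaining = list(ranking)
    prob = 1.0
    for item in ranking:
        prob *= strengths[item] / sum(strengths[i] for i in remaining)
        remaining.remove(item)
    return prob


# Hypothetical parameterisation (an assumption, not the patent's formula):
# strength = exp(a * theta + d), with block difficulties d centred to sum to zero.
a = [1.2, 0.8, 1.0, 1.5]             # discrimination parameters
theta = [0.5, -0.3, 0.0, 1.0]        # trait levels on the measured dimensions
raw_d = [0.4, -0.1, 0.3, 0.2]
d = [x - sum(raw_d) / len(raw_d) for x in raw_d]   # enforce the sum-to-zero constraint
strengths = [math.exp(ai * ti + di) for ai, ti, di in zip(a, theta, d)]

# The probabilities of all 4! = 24 rankings of a four-item block sum to one.
total = sum(ranking_probability(strengths, r) for r in permutations(range(4)))
print(round(total, 10))
```

Centring the difficulty parameters within a block reflects the fact that only differences between items in the same block affect the ranking probabilities, so one linear constraint per block is needed for identification.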
Optionally, each question in the question library is a forced-choice question block with four items, and the answer format is ranking;
in the pre-construction process of the question library, the expression of the constructed question-block measurement model is as follows:
[The expression and its symbols are not recoverable from the source text.]
Wherein the symbols of the expression represent, in order:
the four items in question block j,
the trait parameters of the dimensions measured by the corresponding items,
the probability of the ranking selected by the subject when answering question block j,
the probability that an item is selected and the probability that it is not selected,
the discrimination parameters of the items in question block j,
the difficulty parameters of the items in question block j.
Optionally, in the pre-construction process of the question library, model parameters are estimated with an EM algorithm based on the question-block measurement model, using the test response data obtained for the question library, so as to obtain the item parameter information of each question;
wherein the M-step of the EM algorithm is implemented with a Newton-like algorithm.
In a second aspect of the present invention,
The application provides a choice question realizing device for multidimensional forced choice test, which comprises:
the first selecting and processing module is used for not repeatedly selecting a plurality of questions from a pre-constructed question library to serve as first answer questions based on random selection in a first selecting process, and acquiring answer data obtained by answering each first answer question by a target subject, wherein the measured dimensions of all the first answer questions cover all the dimensions to be measured;
the second selecting and processing module is used for selecting the questions from the rest questions which are not selected in the question library in a preset mode as second answer questions and obtaining answer data of corresponding second answer questions of a target subject, evaluating whether dynamic selection questions can be carried out based on Fisher information quantity or not based on all the obtained answer data and item parameter information of the corresponding questions, if yes, skipping to execute a third selecting step, otherwise, continuing to execute the second selecting step;
The third selecting and processing module is used for selecting a third answer question from the rest questions which are not selected in the question library based on the Fisher information amount method in the third selecting process, and acquiring answer data of the corresponding third answer questions of the target subject until the answer condition of the questions of the target subject meets the preset requirement;
In the first selecting step, the process of selecting the first answer questions specifically includes:
Step one, establishing a dimension vector comprising all dimensions to be measured, and randomly selecting a question from the question library as the currently selected question;
Step two, applying a first mark to the currently selected question in the question library, and removing the dimensions measured by that question from the dimension vector;
Step three, judging whether the number of dimensions remaining in the dimension vector is zero; if so, taking all questions carrying the first mark as the first answer questions and ending the selection process, otherwise continuing to execute step four;
Step four, performing matching analysis, according to the dimension vector, on each of the other questions in the question library that do not carry the first mark, to obtain for each such question its dimension coincidence number, namely the number of dimensions measured by the question that are still in the dimension vector; comparing the dimension coincidence numbers of these questions, taking the question with the largest dimension coincidence number as the new currently selected question, and jumping to execute step two.
In a third aspect of the present invention,
The present application provides an electronic device including:
A memory having an executable program stored thereon;
And a processor for executing the executable program in the memory to implement the steps of the method described above.
The application adopts the technical proposal and has at least the following beneficial effects:
In the technical scheme of the application, the question selection method comprises: a first selecting step of randomly selecting, without repetition, a plurality of questions from a pre-constructed question library as first answer questions and obtaining the answer data produced when a target subject answers each of them, wherein the dimensions measured by the first answer questions together cover all the dimensions to be measured; a second selecting step of selecting, in a preset manner, a question from the remaining unselected questions in the question library as a second answer question, obtaining the target subject's answer data for it, and evaluating, based on all answer data obtained so far and the item parameter information of the corresponding questions, whether dynamic question selection based on the Fisher information amount can be performed, jumping to the third selecting step if so and otherwise repeating the second selecting step; and a third selecting step of selecting third answer questions from the remaining unselected questions based on the Fisher information amount and obtaining the target subject's answer data for them, until the target subject's answering meets a preset requirement. Compared with existing CAT question selection implementations, the technical scheme of the application adds a preliminary, controlled question selection stage before dynamic question selection based on the Fisher information amount. This avoids the defects of applying the Fisher information amount directly and helps complete the related tests more efficiently in high-dimensional test scenarios.
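The overall control flow of the three selecting steps can be sketched as follows. Every callable passed to `run_test` is a hypothetical placeholder for logic the application describes elsewhere (random covering selection, the preset selection mode, the Fisher-information check and the stopping rule); the stubs in the usage part exist only to make the sketch runnable.

```python
from typing import Callable, Dict, List, Set


def run_test(
    bank: List[str],
    first_stage: Callable[[], List[str]],
    ask: Callable[[str], int],
    can_use_fisher: Callable[[Dict[str, int]], bool],
    pick_preset: Callable[[Set[str]], str],
    pick_by_fisher: Callable[[Set[str], Dict[str, int]], str],
    is_finished: Callable[[Dict[str, int]], bool],
) -> Dict[str, int]:
    answers: Dict[str, int] = {}
    # First selecting step: questions that together cover every dimension.
    for q in first_stage():
        answers[q] = ask(q)
    remaining = set(bank) - set(answers)
    # Second selecting step: one preset question at a time, until
    # Fisher-information-based selection becomes possible.
    while remaining:
        q = pick_preset(remaining)
        remaining.discard(q)
        answers[q] = ask(q)
        if can_use_fisher(answers):
            break
    # Third selecting step: dynamic selection until the stopping rule holds.
    while remaining and not is_finished(answers):
        q = pick_by_fisher(remaining, answers)
        remaining.discard(q)
        answers[q] = ask(q)
    return answers


# Usage with trivial stand-in stubs (all assumptions for illustration).
result = run_test(
    bank=["q1", "q2", "q3", "q4", "q5", "q6"],
    first_stage=lambda: ["q1", "q2"],
    ask=lambda q: 1,
    can_use_fisher=lambda ans: len(ans) >= 4,
    pick_preset=lambda rem: min(rem),
    pick_by_fisher=lambda rem, ans: min(rem),
    is_finished=lambda ans: len(ans) >= 5,
)
print(list(result))  # ['q1', 'q2', 'q3', 'q4', 'q5']
```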
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
The accompanying drawings are included to provide a further understanding of the technical solutions of the present application or of the prior art, and are incorporated in and constitute a part of this specification. The drawings illustrate the technical solutions of the present application and do not limit them.
FIG. 1 is a schematic illustration of various flow links for talent assessment implementation in the context of the present application;
FIG. 2 is a schematic flow diagram illustrating a method for implementing questions for multidimensional forced selection testing according to one embodiment of the present application;
FIG. 3 is a schematic illustration of the implementation of the first selection step of the method for implementing questions for multidimensional forced selection test in one embodiment of the present application;
FIG. 4 is a schematic illustration of an algorithm iStEM applied during the process of question bank construction in accordance with one embodiment of the present application;
FIG. 5 is a schematic diagram of a device for implementing questions for multidimensional forced selection test according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application are described in detail below. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments derived from the embodiments herein fall within the scope of the application as defined by the claims.
In order to facilitate understanding of the technical scene related to the present application, each flow link for realizing talent evaluation in the scene of the present application is briefly described below.
As shown in FIG. 1, talent evaluation mainly comprises the following links: question bank construction, question bank test, question bank item parameter calibration, test operation, question replenishment and iteration, and item parameter calibration together with new question calibration.
In the application, question bank construction, question bank test and question bank item parameter calibration are collectively called the pre-construction stage of the question bank. The test operation link is the actual test stage, and the question selection method of the application is applied in this stage. Question replenishment and iteration, and item parameter calibration and new question calibration, use substantially the same techniques as question bank construction and question bank item parameter calibration and serve to further improve the question bank. Each link is introduced conceptually below.
Question bank construction
In the question bank construction link (or initial question bank construction link), the test designer first determines the number and content of the characteristics to be measured, each characteristic being one dimension, and writes items according to the operational definition of each dimension, each item measuring only one dimension. The items are then combined into forced-choice question blocks (one block being one question) according to a certain design, so that each block contains several items from different dimensions. The forced selection test measures the test-taker's characteristics on the different dimensions by requiring the test-taker to select, within each question block, the items that meet the test requirement. The resulting question bank is the first version of the test and should contain enough questions to ensure measurement precision and the accuracy of parameter estimation.
The design of a forced-choice question block has two elements. The first is the number of items in each block; common designs are pairs, triplets and quadruplets, that is, blocks of two, three and four items. The second is the answer format, which is mainly one of: pick (PICK, selecting the most suitable item from the items of a block), most and least (MOLE, selecting the most suitable and the least suitable item), and rank (RANK, ordering the items of the block by how well they describe oneself). In practice, these three answer formats applied to blocks of two to four items cover most application requirements; more complex forced-choice questions are too difficult for test-takers to use and are unsuitable for commercial assessment and for most academic assessment.
In the present application, the specific manufacturing method and process of the forced choice questions in the question library in the related embodiments are explicitly described in the related literature, and are not described here in detail.
Question bank test
The question bank test link mainly collects a certain amount of test data so that the difficulty and discrimination parameters of the questions in the bank can be preliminarily estimated in the subsequent question bank item parameter calibration. During the test, each participating test-taker answers all questions in the question bank; in practice the test can be administered through an online test platform or on paper with the answers entered afterwards. For example, if the question blocks are triplet ranking questions, each question has 3! = 6 possible answer results, and the final response matrix (test data) has one row per test-taker and one column per question, each element being a value from 1 to 6 that represents one answer result.
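As a small illustration of the response matrix just described, the sketch below codes the six possible orderings of a triplet block as 1 to 6. The item labels `A`, `B`, `C`, the particular code assignment and the raw answers are hypothetical; the application only states that each element takes a value from 1 to 6.

```python
from itertools import permutations

# The 3! = 6 possible orderings of a triplet block, coded 1..6.
RANKING_CODES = {perm: code
                 for code, perm in enumerate(permutations("ABC"), start=1)}


def encode(ranking: str) -> int:
    """Map a ranking such as 'CAB' (most to least suitable) to its code."""
    return RANKING_CODES[tuple(ranking)]


# Hypothetical raw answers: 2 test-takers (rows) x 3 question blocks (columns).
raw = [["ABC", "CAB", "BAC"],
       ["CBA", "ABC", "ACB"]]
response_matrix = [[encode(r) for r in row] for row in raw]
print(response_matrix)  # [[1, 5, 3], [6, 1, 2]]
```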
Question bank item parameter calibration
In question bank item parameter calibration, the difficulty and discrimination parameters of each item in the question bank are estimated from the test data obtained in the previous link. As those skilled in the art know, this draws on the basic principles of item response theory (IRT), chiefly the construction of a measurement model and the estimation of model parameters. The improvements made in the technical scheme of the present application are described in detail later and are not repeated here.
Test operation
This link is the actual administration of the test, carried out through a relevant answer system; based on that system, the CAT mentioned in the background can be realized for each individual subject, improving test efficiency. The final output of this link is an evaluation of the subject based on the response data. The evaluation rests on statistical analysis of population data samples and describes the individual's relative standing in the whole population; it is not a subjective human judgment.
Question replenishment and iteration, item parameter calibration and new question calibration
In practice, the initial question library often has many limitations; for example, its measurement accuracy and efficiency may not fully realize the advantages of adaptive question selection, and test questions may need to be replaced during test operation. In some special forced selection tests, such as conditional inference tests, questions lose measurement validity as exposure increases and therefore need to be replaced from time to time; this is question replenishment and iteration.
Item parameter calibration and new question calibration jointly calibrate the parameters of new and old questions. In essence this enlarges the statistical analysis sample, so that the estimated parameters come closer to their ideal values.
In a word, the two links of question supplementation and iteration, item parameter calibration and new question calibration are performed in the process of perfecting a question bank, and the two processes can be circularly performed based on the duration of test operation in practice.
Based on the above description of the technical scenario, the method for implementing the questions for multidimensional forced selection test provided by the application is further described below.
As described in the background, forced selection tests in current commercial use measure more than 20 characteristic dimensions simultaneously, but existing CAT implementations do not consider how a high number of dimensions affects the calculation of the Fisher information matrix. In a multidimensional forced selection test, where many dimensions must be measured, directly migrating the traditional adaptive question selection method based on the Fisher information matrix causes the following problems: 1) the number of preliminary questions is unstable, and many questions may have to be answered before the Fisher information matrix can be computed, which affects overall test efficiency; 2) the same questions are always selected in the preliminary stage, so some questions are overexposed, which affects the final test result.
In view of this, the application proposes a method for realizing questions for multidimensional forced selection test. The method is applied in the test operation link of FIG. 1, and the execution subject is the back end of the relevant answer system. In one embodiment, as shown in FIG. 2, the method comprises:
A first selecting step: based on random selection, selecting a plurality of questions without repetition from a pre-constructed question bank as first answer questions, and obtaining the answer data produced when the target subject answers each first answer question, wherein the dimensions measured by all the first answer questions together cover all the dimensions to be measured;
Specifically, in one embodiment, in order to complete the first selecting step as quickly as possible, that is, with as few questions as possible, the process of selecting the first answer questions in the first selecting step is as shown in fig. 3 and comprises:
Step one: establishing a dimension vector comprising all dimensions to be measured, and randomly selecting a question from the question bank as the currently selected question. It should be noted that, at the question selection stage, the trait dimension measured by each item of every question in the question bank and the related item parameters are known (in practice this information is obtained and confirmed during pre-construction of the question bank and is built into the system);
Step two: applying a first mark to the question in the question bank that corresponds to the currently selected question, and removing the dimensions measured by that question from the dimension vector;
Step three: judging whether the number of elements remaining in the dimension vector is zero; if so, taking all questions carrying the first mark as the first answer questions and ending the selection process; otherwise, continuing to step four;
Step four: matching each question in the question bank that does not carry the first mark against the dimension vector, to obtain for each such question its dimension coincidence number, that is, the number of dimensions measured by the question that are still present in the dimension vector; comparing the dimension coincidence numbers and taking the question with the largest one as the new currently selected question (in other words, selecting the question that measures the most dimensions not yet covered); and jumping back to step two.
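Steps one to four amount to a greedy covering procedure. The following is a minimal sketch under stated assumptions: the question bank is represented as a hypothetical dictionary from question id to the set of dimensions it measures, ties in step four are broken at random (the text does not say how ties are handled), and the names `select_first_questions`, `bank` and `dims` are illustrative only.

```python
import random

def select_first_questions(bank, dimensions, rng=None):
    """Greedy sketch of steps one to four.

    bank: dict mapping question id -> set of dimensions the question measures.
    dimensions: iterable of all dimensions to be measured.
    Returns the marked question ids in selection order.
    """
    rng = rng or random.Random()
    remaining = set(dimensions)          # the "dimension vector"
    marked = []                          # questions carrying the first mark
    current = rng.choice(sorted(bank))   # step one: random starting question
    while True:
        marked.append(current)           # step two: mark the question ...
        remaining -= bank[current]       # ... and remove its dimensions
        if not remaining:                # step three: all dimensions covered
            return marked
        # Step four: for each unmarked question, count how many of its
        # dimensions are still present in the dimension vector.
        candidates = [q for q in sorted(bank) if q not in marked]
        overlap = {q: len(bank[q] & remaining) for q in candidates}
        best = max(overlap.values(), default=0)
        if best == 0:
            raise ValueError("question bank cannot cover all dimensions")
        # Assumption: ties are broken at random.
        current = rng.choice([q for q in candidates if overlap[q] == best])

# Toy bank of three-item blocks over six dimensions.
bank = {
    "q1": {"A", "B", "C"},
    "q2": {"C", "D", "E"},
    "q3": {"A", "D", "F"},
    "q4": {"B", "E", "F"},
    "q5": {"D", "E", "F"},
}
dims = "ABCDEF"
picked = select_first_questions(bank, dims, random.Random(0))
covered = set().union(*(bank[q] for q in picked))
```

Because only the starting question and the tie-breaks are random, different subjects tend to receive different first answer questions while the number of questions needed to cover every dimension stays small.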
When the test is actually administered to the target subject, the questions can, depending on specific requirements, be pushed one at a time, each question being pushed as soon as it receives the first mark and its answer data collected, or all first answer questions can be determined at once and then pushed to the subject in any manner for answering.
Returning to fig. 2, after the first selecting step a second selecting step is performed: selecting a question in a preset manner from the remaining unselected questions in the question bank as a second answer question, and obtaining the target subject's answer data for that second answer question; evaluating, based on all answer data obtained so far (including the answer data from the first selecting step) and the item parameter information of the corresponding questions, whether dynamic question selection based on the Fisher information amount is possible; if so, jumping to the third selecting step; otherwise, repeating the second selecting step;
It should be noted that the preset manner in the second selecting step may be any manner; for example, in one implementation scenario the preset manner is random selection, for ease of implementation;
Specifically, in the second selecting step, evaluating whether dynamic question selection based on the Fisher information amount is possible, based on all obtained answer data and the item parameter information of the corresponding questions, includes:
Calculating the Fisher information matrix from all obtained answer data and the item parameter information of the corresponding questions; the calculation is the same as in the prior art and is not repeated here;
When the calculated Fisher information matrix is invertible, determining that dynamic question selection based on the Fisher information amount is possible; otherwise, determining that it is not possible.
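The invertibility test can be illustrated with a small sketch. The way the information matrix is built here is an assumption made only for illustration: each answered item contributes a rank-one term w·a·aᵀ, as in a multidimensional 2PL model, whereas the forced-choice model of the application will have its own expression. The functions `information_matrix` and `is_invertible` and the tolerance `tol` are hypothetical.

```python
def information_matrix(loadings, weights):
    """Sum of rank-one terms w * a a^T (assumed, simplified form)."""
    dim = len(loadings[0])
    info = [[0.0] * dim for _ in range(dim)]
    for a, w in zip(loadings, weights):
        for i in range(dim):
            for j in range(dim):
                info[i][j] += w * a[i] * a[j]
    return info

def is_invertible(matrix, tol=1e-10):
    """Full-rank check by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in matrix]
    n = len(m)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            return False                 # no usable pivot: matrix is singular
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return True

# Three dimensions: two answered items cannot give a full-rank matrix.
two_items = information_matrix([[1, 0, 0], [0, 1, 0]], [0.25, 0.25])
three_items = information_matrix(
    [[1, 0, 0], [0, 1, 0], [1, 1, 1]], [0.25, 0.25, 0.25]
)
ready_after_two = is_invertible(two_items)      # False
ready_after_three = is_invertible(three_items)  # True
```

The toy case shows why the number of preliminary questions matters in high-dimensional tests: until the answered questions span every dimension, the matrix is singular and its inverse cannot be used for selection.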
As shown in fig. 2, in the technical solution of the present application, the third selecting step selects, based on the Fisher information amount method, a third answer question from the remaining unselected questions in the question bank and obtains the target subject's answer data for that question, and repeats this until the subject's answering status meets a preset requirement, for example until the estimation accuracy of the subject's trait parameters reaches a set standard or the test length reaches a set maximum, at which point the question selection and answering process ends. For instance, if the empirically set test length is 80 questions, the process ends once a subject has answered 80 questions;
It should be noted that the principles of question selection by the Fisher information amount method are described in the relevant publications and are not repeated here; in the present application, the selection process in the third selecting step includes:
For the remaining unselected questions in the question bank, combining each remaining question with the questions already answered to obtain a question combination containing that remaining question;
Calculating the Fisher information matrix corresponding to each question combination, comparing the traces of the inverses of the resulting Fisher information matrices, and taking the remaining question in the combination whose inverse has the smallest trace as the third answer question selected this time.
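The selection rule of the third selecting step can be sketched as follows. The sketch assumes that the information matrix of a question combination is the sum of the matrix for the answered questions and the matrix of the candidate question (the usual consequence of local independence); how each matrix is obtained from the model is outside the sketch. Minimising the trace of the inverse is the criterion usually called A-optimality. The names `invert`, `trace_of_inverse` and `pick_next` are illustrative.

```python
def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix."""
    n = len(matrix)
    aug = [
        list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def trace_of_inverse(matrix):
    inv = invert(matrix)
    return sum(inv[i][i] for i in range(len(inv)))

def pick_next(answered_info, candidate_infos):
    """Return (question id, trace) with the smallest trace of the inverse."""
    best_id, best_trace = None, float("inf")
    for qid, info in candidate_infos.items():
        # Assumed: information of the combination = answered + candidate.
        combined = [
            [a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(answered_info, info)
        ]
        t = trace_of_inverse(combined)
        if t < best_trace:
            best_id, best_trace = qid, t
    return best_id, best_trace

answered = [[2.0, 0.0], [0.0, 0.5]]          # dimension 2 is poorly measured
candidates = {
    "x": [[1.0, 0.0], [0.0, 0.0]],           # informs dimension 1 only
    "y": [[0.0, 0.0], [0.0, 1.0]],           # informs dimension 2 only
}
chosen, chosen_trace = pick_next(answered, candidates)
```

In the toy data the rule prefers question "y", because adding information on the poorly measured dimension reduces the summed error variance the most.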
Compared with existing CAT question selection, the technical solution of the present application adds a preliminary, controlled question selection stage before dynamic question selection based on the Fisher information amount. This avoids the drawbacks of applying the Fisher information amount directly and helps complete the test more efficiently in high-dimensional test scenarios.
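The control flow of the three selecting steps can be summarised in a short skeleton. All callables passed to `run_test` are hypothetical placeholders for the components described above, and the toy stopping rule uses only a length limit (8 instead of the 80 mentioned in the example) and omits the precision criterion.

```python
def run_test(first_questions, administer, pick_preset,
             fisher_ready, pick_by_fisher, should_stop):
    """Skeleton of the first, second and third selecting steps."""
    answers = {}
    # First selecting step: questions that together cover every dimension.
    for q in first_questions:
        answers[q] = administer(q)
    # Second selecting step: add preset (e.g. random) questions until the
    # Fisher information matrix is judged usable.
    while True:
        q = pick_preset(answers)
        answers[q] = administer(q)
        if fisher_ready(answers):
            break
    # Third selecting step: information-based selection until the stop rule.
    while not should_stop(answers):
        q = pick_by_fisher(answers)
        answers[q] = administer(q)
    return answers

# Toy stand-ins, for illustration only.
bank = list(range(100))
next_unused = lambda answers: next(q for q in bank if q not in answers)
result = run_test(
    first_questions=[0, 1, 2],
    administer=lambda q: "ranking for question %d" % q,
    pick_preset=next_unused,
    fisher_ready=lambda answers: len(answers) >= 5,
    pick_by_fisher=next_unused,
    should_stop=lambda answers: len(answers) >= 8,
)
```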
The following describes the related improvements of the present application in the pre-construction stage of the question bank.
In the prior art, the TIRT model and its parameter estimation R package are currently the most widely used, but both the model and the parameter estimation method leave room for optimization. Specifically:
1) As to the model, the TIRT model adopts Thurstone's law of comparative judgment to simulate the subject's psychological decision process within a question block, and assumes that all items in a block are compared pairwise. For example, in a block containing three items A, B and C, the subject makes three comparisons: AB, AC and BC. This decision process can make information redundant: once A > B and B > C are established, the ordering of the three items is determined, and the AC comparison is redundant because A > C follows without it. A simpler decision process is given by Luce's choice axiom, which treats ranking as a series of mutually independent best-choice steps. In the same triplet the subject needs only two choices: first, selecting from A, B and C the item that fits them best, say A; second, selecting from the remaining items B and C the one that fits them better, say B. The ordering of the three items is then determined. The advantage of Luce's choice axiom grows with block size: in a quadruplet, Thurstone's law of comparative judgment requires 6 comparisons, whereas Luce's choice axiom requires only 3.
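The comparison counts quoted above follow from simple counting. For a block of n items, pairwise comparison needs one judgment per pair of items, while sequential best-choice needs one choice per rank position except the last (the symbols N below are introduced here for illustration only):

```latex
\[
  N_{\text{pairwise}}(n) = \binom{n}{2} = \frac{n(n-1)}{2},
  \qquad
  N_{\text{sequential}}(n) = n - 1 .
\]
\[
  n = 3:\; 3 \text{ versus } 2,
  \qquad
  n = 4:\; 6 \text{ versus } 3 .
\]
```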
Therefore, in the technical solution of the present application, the model is constructed on the basis of Luce's choice axiom to obtain the question block measurement model required for item parameter estimation. This model helps reduce cognitive load, and mathematical constraints specific to multi-item blocks are added through rigorous mathematical derivation, making the model better suited to the ranking process in triplets and larger block types; it is named the 2PL-RANK model. In other words, in the present application, each question in the pre-constructed question bank is a fixed n-item forced-choice item group, where n is greater than or equal to 2; "fixed" here means that question blocks of different sizes cannot appear in the same test.
For example, in one embodiment, each question in the question bank is a fixed four-item forced-choice item group, and the response is a ranking;
In the pre-construction of the question bank, the constructed question block measurement model (the 2PL-RANK model) is expressed as follows:
(3)
(4)
(5)
(6)
(7)
The quantities appearing in expressions (3) to (7) are:
the four items in question block j;
the trait parameters of the dimensions measured by each of those items;
the probability that the subject gives a particular response when question block j is presented;
the probability that an item is selected and the probability that it is not selected;
the discrimination parameter of an item in question block j;
the difficulty parameter of an item in question block j.
It will be readily appreciated that the response used in the model expressions above is merely one example of a possible outcome; in practice the probability of every possible outcome is calculated.
In the technical solution of the present application, the mathematical constraints of the question block measurement model include that the difficulty parameters of all items in each block sum to zero. The reason is that the difficulty of items within a block is relative rather than absolute. For example, in a triplet consisting of items a, b and c, a relatively high difficulty for a means that, at the same trait level, a is less likely to be selected, but only less likely relative to b and c, while the total difficulty of a, b and c remains unchanged. Adding this constraint to the extended model improves the accuracy of model parameter estimation.
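To make the sequential-choice idea and the sum-to-zero constraint concrete, the following sketch computes ranking probabilities for a four-item block. It does not reproduce expressions (3) to (7): the 2PL form a·(θ − b), the way a "best" choice is normalised over the items still available, and all function names are assumptions made for illustration.

```python
import math
from itertools import permutations

def endorse_prob(theta, a, b):
    """Assumed 2PL probability that an item is selected."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def centre_difficulties(bs):
    """Impose the constraint that difficulties in a block sum to zero."""
    mean = sum(bs) / len(bs)
    return [b - mean for b in bs]

def choose_prob(chosen, pool, p1):
    """Probability that `chosen` is picked as best among the items in `pool`.

    Assumed form: the pattern "this item selected, all others not selected",
    normalised over the same pattern for every item in the pool.
    """
    def pattern(i):
        prod = p1[i]
        for k in pool:
            if k != i:
                prod *= 1.0 - p1[k]
        return prod
    return pattern(chosen) / sum(pattern(m) for m in pool)

def ranking_prob(ranking, p1):
    """Probability of a full ranking as a product of best-choice steps."""
    prob = 1.0
    pool = list(ranking)
    for item in ranking[:-1]:        # the last item is determined
        prob *= choose_prob(item, pool, p1)
        pool.remove(item)
    return prob

thetas = [0.5, -0.2, 1.0, 0.0]                     # trait level per item
a_params = [1.2, 0.8, 1.5, 1.0]                    # discrimination
b_params = centre_difficulties([0.3, -0.5, 0.9, 0.1])
p1 = {i: endorse_prob(thetas[i], a_params[i], b_params[i]) for i in range(4)}
total = sum(ranking_prob(r, p1) for r in permutations(range(4)))
```

Under these assumptions the probabilities of the 24 possible rankings of a four-item block sum to one, and a ranking needs only three best-choice steps.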
2) As to parameter estimation, in the prior art the TIRT model is generally estimated either with the Mplus software using a least squares method or by calling a related R package that uses an MCMC algorithm. However, Mplus cannot produce standard errors (SE), may fail to converge in high-dimensional settings, and consumes a large amount of memory; the MCMC algorithm takes a long time to estimate, and the code for each parameter estimation has to be written by hand, which is inconvenient.
In the technical solution of the present application, the iStEM algorithm, which offers high estimation accuracy and speed, is chosen and adapted to the special forced-choice question type, and a directly callable R package has been written for parameter estimation. That is, during pre-construction of the question bank, model parameters are estimated with the iStEM algorithm on the basis of the question block measurement model, using the test response data obtained for the question bank, to obtain the item parameter information of each question.
As shown in fig. 4, in practical implementation, the algorithm is implemented as follows:
(1) E-step: sampling the ability parameters with a Gibbs sampler;
(2) M-step: maximizing the likelihood function with the BFGS algorithm to estimate the difficulty and discrimination parameters;
The E-step and M-step are iterated continuously, and the parameter estimates of each iteration are stored. When the Geweke statistic reaches its standard value, only the results of the last 10 iterations are retained; iteration then continues, with the parameter estimates of each further iteration retained. When the group variance reaches its standard value, iteration stops and the mean of all retained iteration results is taken as the final estimate.
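The iteration control described above can be sketched for a single parameter. This is a simplified illustration under several assumptions: `step` stands in for one combined E-step and M-step and simply returns the next estimate; the Geweke statistic is approximated with plain sample variances rather than spectral density estimates; "group variance" is interpreted as the variance of the mean of the retained estimates; and the thresholds `z_limit`, `var_limit` and `min_burn` are hypothetical.

```python
import random
import statistics

def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke z-score: early segment mean versus late segment mean."""
    n = len(chain)
    head = chain[: max(2, int(n * first))]
    tail = chain[-max(2, int(n * last)):]
    spread = (statistics.variance(head) / len(head)
              + statistics.variance(tail) / len(tail))
    if spread == 0:
        return 0.0
    return (statistics.fmean(head) - statistics.fmean(tail)) / spread ** 0.5

def run_until_stable(step, z_limit=1.96, var_limit=1e-4,
                     keep=10, min_burn=20, max_iter=10_000):
    """Iterate, drop the burn-in once Geweke passes, stop on small variance."""
    chain, burned_in = [], False
    for _ in range(max_iter):
        chain.append(step())             # one E-step plus M-step (stand-in)
        if not burned_in:
            if len(chain) >= min_burn and abs(geweke_z(chain)) < z_limit:
                # Keep only the last `keep` results, then go on iterating.
                chain, burned_in = chain[-keep:], True
        elif statistics.variance(chain) / len(chain) < var_limit:
            break                        # assumed "group variance" criterion
    # Final estimate: mean of all retained iteration results.
    return statistics.fmean(chain), len(chain)

# Stand-in for the estimator: noisy draws around a true value of 2.0.
rng = random.Random(1)
estimate, retained = run_until_stable(lambda: 2.0 + rng.gauss(0.0, 0.1))
```

Averaging the retained iterations, rather than taking the last one, smooths out the sampling noise that the stochastic E-step introduces into each M-step estimate.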
The algorithm belongs to the family of EM algorithms. Because the estimates produced by the M-step form a time-homogeneous Markov chain, the difficulty and discrimination parameters obtained in successive iterations eventually converge stably to the true values. The convergence property of the Markov chain ensures the accuracy of sampling, and, compared with an MCMC algorithm that relies entirely on sampling, the iStEM algorithm introduces a quasi-Newton algorithm in the M-step, which improves efficiency. The algorithm therefore has sufficient accuracy and better time efficiency than the MCMC algorithm.
In the technical solution of the present application, item parameter estimation as implemented achieves higher time efficiency and estimation accuracy, saving computation and time and facilitating test development and iteration; the measurement model is flexible and accommodates a wider range of question types, which helps meet different testing needs.
Fig. 5 is a schematic structural diagram of a question selection implementation device for multidimensional forced-choice tests according to an embodiment of the present application. As shown in fig. 5, the question selection implementation device 500 includes:
A first selection processing module 501, configured to, in a first selection process, select a plurality of questions without repetition from a pre-constructed question bank as first answer questions based on random selection, and obtain the answer data produced when the target subject answers each first answer question, wherein the dimensions measured by all the first answer questions together cover all the dimensions to be measured;
A second selection processing module 502, configured to, in a second selection process, select a question in a preset manner from the remaining unselected questions in the question bank as a second answer question and obtain the target subject's answer data for that question; evaluate, based on all obtained answer data and the item parameter information of the corresponding questions, whether dynamic question selection based on the Fisher information amount is possible; and, if so, proceed to the third selection process, otherwise repeat the second selection process;
A third selection processing module 503, configured to, in a third selection process, select a third answer question from the remaining unselected questions in the question bank based on the Fisher information amount method and obtain the target subject's answer data for that question, until the subject's answering status meets the preset requirement.
With regard to the question selection implementation device 500 in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the method embodiment and is not repeated here.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application, as shown in fig. 6, the electronic device 600 includes:
a memory 601 on which an executable program is stored;
a processor 602 for executing the executable program in the memory 601 to implement the steps of the above method.
The specific manner in which the processor 602 executes the program in the memory 601 of the electronic device 600 in the above embodiment has been described in detail in the embodiment related to the method, and will not be described in detail here.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims.
Claims (9)
1. A question selection implementation method for multidimensional forced-choice tests, characterized by comprising the following steps:
A first selecting step: based on random selection, selecting a plurality of questions without repetition from a pre-constructed question bank as first answer questions, and obtaining the answer data produced when the target subject answers each first answer question, wherein the dimensions measured by all the first answer questions together cover all the dimensions to be measured;
A second selecting step: selecting a question in a preset manner from the remaining unselected questions in the question bank as a second answer question, obtaining the target subject's answer data for that second answer question, and evaluating, based on all obtained answer data and the item parameter information of the corresponding questions, whether dynamic question selection based on the Fisher information amount is possible; if so, jumping to a third selecting step; otherwise, repeating the second selecting step;
A third selecting step: based on the Fisher information amount method, selecting a third answer question from the remaining unselected questions in the question bank, and obtaining the target subject's answer data for that third answer question, until the target subject's answering status meets a preset requirement;
wherein, in the first selecting step, the process of selecting the first answer questions specifically includes:
Step one: establishing a dimension vector comprising all dimensions to be measured, and randomly selecting a question from the question bank as the currently selected question;
Step two: applying a first mark to the question in the question bank that corresponds to the currently selected question, and removing the dimensions measured by that question from the dimension vector;
Step three: judging whether the number of elements remaining in the dimension vector is zero; if so, taking all questions carrying the first mark as the first answer questions and ending the selection process; otherwise, continuing to step four;
Step four: matching each question in the question bank that does not carry the first mark against the dimension vector, to obtain for each such question its dimension coincidence number, that is, the number of dimensions measured by the question that are present in the dimension vector; comparing the dimension coincidence numbers, taking the question with the largest dimension coincidence number as the new currently selected question, and jumping back to step two.
2. The method according to claim 1, wherein in the second selecting step, evaluating whether dynamic question selection based on the Fisher information amount is possible, based on all obtained answer data and the item parameter information of the corresponding questions, includes:
calculating the Fisher information matrix from all obtained answer data and the item parameter information of the corresponding questions;
when the calculated Fisher information matrix is invertible, determining that dynamic question selection based on the Fisher information amount is possible; otherwise, determining that it is not possible.
3. The method of claim 1, wherein in the third selecting step, selecting a third answer question from the remaining unselected questions in the question bank based on the Fisher information amount method includes:
for the remaining unselected questions in the question bank, combining each remaining question with the questions already answered to obtain a question combination containing that remaining question;
calculating the Fisher information matrix corresponding to each question combination, comparing the traces of the inverses of the resulting Fisher information matrices, and taking the remaining question in the combination whose inverse has the smallest trace as the third answer question selected this time.
4. The method of claim 1, wherein each question in the question bank is a fixed n-item forced-choice item group, where n is greater than or equal to 2.
5. The method according to claim 4, wherein, in the pre-construction of the question bank, the model is constructed on the basis of Luce's choice axiom to obtain the question block measurement model required for item parameter estimation;
the mathematical constraints of the question block measurement model include that the difficulty parameters of all items in each block sum to zero.
6. The method of claim 5, wherein each question in the question bank is a fixed four-item forced-choice item group, and the response is a ranking;
in the pre-construction of the question bank, the constructed question block measurement model is expressed as follows:
wherein the expressions involve:
the four items in question block j;
the trait parameters of the dimensions measured by the corresponding items;
the probability that the subject gives a particular response when question block j is presented;
the probability that an item is selected and the probability that it is not selected;
the discrimination parameter of an item in question block j;
the difficulty parameter of an item in question block j.
7. The method according to claim 5, wherein in the pre-construction of the question bank, model parameters are estimated with an EM algorithm on the basis of the question block measurement model, using the test response data obtained for the question bank, to obtain the item parameter information of each question;
wherein the M-step of the EM algorithm is implemented with a quasi-Newton algorithm.
8. A question selection implementation device for multidimensional forced-choice tests, comprising:
a first selection processing module, configured to, in a first selection process, select a plurality of questions without repetition from a pre-constructed question bank as first answer questions based on random selection, and obtain the answer data produced when the target subject answers each first answer question, wherein the dimensions measured by all the first answer questions together cover all the dimensions to be measured;
a second selection processing module, configured to, in a second selection process, select a question in a preset manner from the remaining unselected questions in the question bank as a second answer question and obtain the target subject's answer data for that second answer question; evaluate, based on all obtained answer data and the item parameter information of the corresponding questions, whether dynamic question selection based on the Fisher information amount is possible; and, if so, jump to the third selection process, otherwise repeat the second selection process;
a third selection processing module, configured to, in a third selection process, select a third answer question from the remaining unselected questions in the question bank based on the Fisher information amount method, and obtain the target subject's answer data for that third answer question, until the target subject's answering status meets a preset requirement;
wherein, in the first selection process, the process of selecting the first answer questions specifically includes:
Step one: establishing a dimension vector comprising all dimensions to be measured, and randomly selecting a question from the question bank as the currently selected question;
Step two: applying a first mark to the question in the question bank that corresponds to the currently selected question, and removing the dimensions measured by that question from the dimension vector;
Step three: judging whether the number of elements remaining in the dimension vector is zero; if so, taking all questions carrying the first mark as the first answer questions and ending the selection process; otherwise, continuing to step four;
Step four: matching each question in the question bank that does not carry the first mark against the dimension vector, to obtain for each such question its dimension coincidence number, that is, the number of dimensions measured by the question that are present in the dimension vector; comparing the dimension coincidence numbers, taking the question with the largest dimension coincidence number as the new currently selected question, and jumping back to step two.
9. An electronic device, comprising:
A memory having an executable program stored thereon;
A processor for executing the executable program in the memory to implement the steps of the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311655474.5A CN117670269B (en) | 2023-12-05 | 2023-12-05 | Method and device for realizing questions for multidimensional forced selection test and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311655474.5A CN117670269B (en) | 2023-12-05 | 2023-12-05 | Method and device for realizing questions for multidimensional forced selection test and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117670269A CN117670269A (en) | 2024-03-08 |
CN117670269B true CN117670269B (en) | 2024-06-21 |
Family
ID=90069347
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311655474.5A Active CN117670269B (en) | 2023-12-05 | 2023-12-05 | Method and device for realizing questions for multidimensional forced selection test and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117670269B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108346030A (en) * | 2017-12-29 | 2018-07-31 | 北京北森云计算股份有限公司 | Computer adaptive ability testing method and device |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9165254B2 (en) * | 2008-01-14 | 2015-10-20 | Aptima, Inc. | Method and system to predict the likelihood of topics |
US20160019803A1 (en) * | 2014-07-21 | 2016-01-21 | New York University | System, method and computer-accessible medium for scalable testing and evaluation |
CN106871909B (en) * | 2017-02-20 | 2019-10-08 | 中国人民解放军国防科学技术大学 | Pulsar satellite selection method based on Fisher information matrix under a kind of multi spacecraft system |
CN110287103B (en) * | 2019-05-22 | 2022-05-17 | 深圳壹账通智能科技有限公司 | Software product evaluation processing method and device, computer equipment and storage medium |
CN110428911A (en) * | 2019-07-24 | 2019-11-08 | 北京智鼎优源管理咨询有限公司 | Adaptive assessment method and equipment |
US11340356B2 (en) * | 2020-02-13 | 2022-05-24 | Mitsubishi Electric Research Laboratories, Inc. | System and method for integer-less GNSS positioning |
CN111554143B (en) * | 2020-03-31 | 2021-08-27 | 北京课程帮科技有限公司 | Evaluation method and device based on CO-MIRT algorithm model |
CN114691856B (en) * | 2022-04-20 | 2024-07-26 | 平安科技(深圳)有限公司 | Question recommendation method, device, equipment and medium |
CN116109454A (en) * | 2023-02-09 | 2023-05-12 | 科大讯飞股份有限公司 | Method, device, equipment and storage medium for determining question difficulty in capability evaluation |
CN116562836B (en) * | 2023-06-27 | 2023-09-05 | 北森云计算有限公司 | Method, device, electronic equipment and storage medium for multidimensional forced choice question character test |
Non-Patent Citations (2)
Title |
---|
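A Multi-Unidimensional Pairwise-Preference Model for RANK Response Format Data; Juan Liu et al.; www.psyarxiv.com/rygu8/download; 20211231; main text, pp. 4-5 * |
Research on multidimensional computerized multistage adaptive testing: automated test assembly algorithm and its routing rules; Xu Lingling (徐玲玲); China Excellent Master's Theses; 20200315; main text, p. 16 * |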
Also Published As
Publication number | Publication date |
---|---|
CN117670269A (en) | 2024-03-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017157203A1 (en) | Reference test method and device for supervised learning algorithm in distributed environment | |
CN111651676B (en) | Method, device, equipment and medium for performing occupation recommendation based on capability model | |
CN111858906B (en) | Problem recommendation method and device, electronic equipment and computer readable storage medium | |
Zhuang et al. | Fully adaptive framework: Neural computerized adaptive testing for online education | |
CN111310918B (en) | Data processing method, device, computer equipment and storage medium | |
CN111737439A (en) | Question generation method and device | |
KR20180061998A (en) | System and method for diagnosing mass attributes of leaner | |
CN113827977A (en) | Game loss user prediction method and system based on BP neural network | |
US20210358317A1 (en) | System and method to generate sets of similar assessment papers | |
CN113988044A (en) | Method for judging error question reason type | |
CN117670269B (en) | Method and device for realizing questions for multidimensional forced selection test and electronic equipment | |
CN116562836B (en) | Method, device, electronic equipment and storage medium for multidimensional forced choice question character test | |
CN117825912A (en) | Chip testing method and device, computer equipment and storage medium | |
CN114860617B (en) | Intelligent pressure testing method and system | |
CN113393023B (en) | Mold quality evaluation method, apparatus, device and storage medium | |
CN115985152A (en) | Self-adaptive recommendation method for online programming teaching and related equipment | |
CN115203556A (en) | Score prediction model training method and device, electronic equipment and storage medium | |
CN113827981A (en) | Game loss user prediction method and system based on naive Bayes | |
CN114896105A (en) | Reliability evaluation method, device, equipment and medium for electronic equipment | |
CN112733036A (en) | Knowledge point recommendation method and device, storage medium and electronic device | |
CN111242235B (en) | Similar characteristic test data set generation method | |
CN110069783A (en) | A kind of answer content evaluating method and device | |
US20220040532A1 (en) | Utilizing machine learning and cognitive state analysis to track user performance | |
US20240029882A1 (en) | Diagnostic classification device and method | |
CN111507639B (en) | Financing risk analysis method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||