CN112365384B - Target event result index weight, influence factor value determination method and related device - Google Patents
- Publication number
- CN112365384B (application CN202110050453.5A)
- Authority
- CN
- China
- Prior art keywords
- index
- value
- weight
- correlation
- event result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Abstract
Embodiments of the invention provide a method for determining target event result index weights, a method for determining influence factor values, and related devices. The method for determining the target event result index weight comprises: acquiring a first training data set; training a target event result fitting model to be trained by using the target event result values and the first correlation index values of the data units of the first training data set until a target event result fitting model meeting the training requirement is obtained, and acquiring a fitting matrix; obtaining a weight matrix at least according to the fitting matrix, wherein the elements of the weight matrix correspond respectively to the first correlation indexes; and acquiring each first correlation index weight and its sign according to the elements of the weight matrix. The methods and related devices provided by the embodiments make it possible to determine the influence factor values that indirectly influence the target event result.
Description
Technical Field
Embodiments of the invention relate to the field of computers, and in particular to a method for determining target event result index weights, a method for determining target event result influence factor values, and related devices.
Background
With the development of computer technology and deep learning, the need to predict the probability that an event will occur in the future can now be met to a certain extent.
For example, in an education scenario, ensuring a stable student intake requires raising the re-enrollment rate of existing students and lowering their withdrawal rate, so the future re-enrollment rate or withdrawal rate needs to be estimated from the current situation of the existing students. The re-enrollment rate or withdrawal rate is influenced by the students' learning results. Those learning results are directly reflected by indexes such as learning tendency and learning preference, and are indirectly influenced by the teacher's teaching actions. Therefore, if the teacher knows how strongly the different teaching actions in each class influence the students' learning results, the re-enrollment rate can be raised or the withdrawal rate lowered by adjusting the teaching actions.
Therefore, how to determine the influence factor values that indirectly affect the target event result has become an urgent technical problem.
Disclosure of Invention
Embodiments of the invention provide a method for determining target event result index weights, a method for determining target event result influence factor values, and related devices, so as to determine the influence factor values that indirectly influence a target event result.
To solve the above problem, an embodiment of the present invention provides a method for determining target event result index weight, including:
acquiring a first training data set, wherein the first training data set comprises a target event result value and each first correlation index value of each data unit, the first correlation index value is a numerical value of each first index of which the correlation degree with the target event result meets a first correlation degree threshold, the target event result value is acquired at least based on each second correlation index value and a second correlation index weight of the data unit, the second correlation index is an index associated with the target event result, the second correlation index value directly reflects the target event result value, and the first correlation index value indirectly affects the target event result value;
training a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the target event result fitting model meeting the training requirement is obtained, and acquiring a fitting matrix;
obtaining a weight matrix at least according to the fitting matrix, wherein the elements of the weight matrix correspond respectively to the first correlation indexes;
and acquiring each first correlation index weight and the sign of each first correlation index weight according to the elements of the weight matrix.
In order to solve the above problem, an embodiment of the present invention further provides a method for determining a target event result influence factor value, where the method includes:
acquiring a first correlation index weight of each first correlation index of a data unit determined by a target event result index weight determination method according to any one of the embodiments and a sign of the first correlation index weight, and acquiring a first correlation index value of each first correlation index of the data unit;
and obtaining each influence factor value by using the first correlation index value, the first correlation index weight and the sign of the first correlation index weight which correspond to each other.
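The text states only that each influence factor value is obtained from the corresponding first correlation index value, weight, and sign; it does not give the combination rule at this point. The following is a minimal sketch that assumes a signed product (value × weight × sign). The function name and the sample data are hypothetical.

```python
def influence_factor_values(index_values, index_weights, weight_signs):
    """Combine each first correlation index value with its weight and sign.

    Assumption (not stated in the source): the combination is a signed
    product, value * weight * sign, computed index by index.
    """
    if not (len(index_values) == len(index_weights) == len(weight_signs)):
        raise ValueError("values, weights and signs must align one-to-one")
    return [v * w * s
            for v, w, s in zip(index_values, index_weights, weight_signs)]


# Hypothetical data unit with two first correlation indexes.
values = [0.8, 0.5]    # first correlation index values
weights = [0.6, 0.4]   # first correlation index weights (non-negative)
signs = [1, -1]        # signs of the weights
factors = influence_factor_values(values, weights, signs)
```

Keeping the weight magnitude and its sign separate mirrors the way the method reports them as two distinct outputs.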
In order to solve the above problem, an embodiment of the present invention further provides a target event result index weight determining device, including:
a first training data set acquisition unit, adapted to acquire a first training data set, wherein the first training data set includes a target event result value and each first correlation index value of each data unit, the first correlation index value is a numerical value of each first index whose correlation with the target event result satisfies a first correlation threshold, the target event result value is acquired at least based on each second correlation index value and a second correlation index weight of the data unit, the second correlation index is an index associated with the target event result, the second correlation index value directly reflects the target event result value, and the first correlation index value indirectly affects the target event result value;
a fitting matrix obtaining unit, adapted to train a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the target event result fitting model meeting the training requirement is obtained, and obtain a fitting matrix;
a weight matrix obtaining unit, adapted to obtain a weight matrix at least according to the fitting matrix, wherein each element of the weight matrix corresponds to each of the first correlation indexes;
and an index weight acquisition unit, adapted to acquire each first correlation index weight and the sign of each first correlation index weight according to the elements of the weight matrix.
In order to solve the above problem, an embodiment of the present invention further provides a device for determining a target event result influence factor value, including:
a data unit data value obtaining unit adapted to obtain a first correlation index weight of each first correlation index of a data unit and a sign of the first correlation index weight determined by the target event result index weight determination method according to any one of claims 1 to 14, and obtain a first correlation index value of each first correlation index of the data unit;
an influence factor value obtaining unit adapted to obtain each of the influence factor values by using the first correlation index values, the first correlation index weights, and signs of the first correlation index weights corresponding to each other.
To solve the above problems, embodiments of the present invention provide a storage medium storing a program suitable for target event result index weight determination to implement the target event result index weight determination method as described above, or a storage medium storing a program suitable for target event result influence factor value determination to implement the target event result influence factor value determination method as described in each embodiment.
To solve the above problem, an embodiment of the present invention provides an apparatus, including at least one memory and at least one processor; the memory stores a program that the processor calls to perform the target event result index weight determination method as described above or the target event result influence factor value determination method as described in the embodiments.
Compared with the prior art, the technical scheme of the invention has the following advantages:
In the method for determining target event result index weight, the method for determining target event result influence factor value and the related device provided by the embodiments of the invention, a first training data set is acquired first when the target event result index weight is to be obtained. The first training data set is a set of data information of a plurality of data units and comprises, for each data unit, a target event result value and each first correlation index value whose correlation with the target event result satisfies a first correlation threshold, the target event result value being obtained based on the directly related second correlation index values. A target event result fitting model is then trained with the first correlation index values of each data unit to obtain a predicted target event result; when the predicted target event result meets the training requirement, a fitting matrix is obtained. A weight matrix is then obtained at least according to the fitting matrix, and each first correlation index weight and its sign are obtained according to the elements of the weight matrix.
It can be seen that the target event result index weight determining method provided in the embodiment of the present invention trains the target event result fitting model with the target event result value, which is obtained based on the second correlation index values, and the first correlation index values, which indirectly influence the target event result value, to obtain the fitting matrix, and then obtains each first correlation index weight and its sign according to the fitting matrix. This prepares for obtaining the influence factor values that indirectly influence the target event result and for using those values as a reference when an actor adjusts behavior, which helps move the event probability affected by the target event result in the expected direction so that the actual future event probability meets expectations.
In an alternative scheme, each first correlation index further carries a first correlation index category labeled in advance. When the target event result index weight is determined, the target event result value and the first correlation index values of each data unit of the first training data set are used to train a dimensionality reduction model to be trained and a target event result fitting model to be trained, connected in series, to obtain a dimensionality reduction matrix and a fitting matrix. A weight matrix is then obtained according to the dimensionality reduction matrix and the fitting matrix, and each first correlation index weight is obtained based on the element absolute value of each element of the weight matrix and the index category absolute value of the elements corresponding to the same first correlation index category. In this way, on the one hand, the influence relationship with the target event result can be constructed from a large number of first correlation indexes and first correlation index values, which makes the constructed relationship more accurate; on the other hand, the first correlation indexes and first correlation index values are converted into influence factor values of fewer dimensions, so that the resulting influence factors are more concentrated. This avoids the problem that too many first correlation indexes make it hard for the actor to identify which behavior to adjust, so the behavior to be adjusted can be determined more conveniently while the constructed influence relationship remains accurate, and the event probability affected by the target event result changes in the expected direction.
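The alternative scheme can be illustrated with a small sketch. Two things here are assumptions, since the passage does not spell them out: the weight matrix is taken to be the product of the dimensionality reduction matrix and the fitting matrix, and each index weight is taken to be the element's absolute value divided by the sum of absolute values within its first correlation index category. The matrices, category labels, and function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def category_normalised_weights(weight_column, categories):
    """Return one (weight, sign) pair per first correlation index.

    Assumption: weight = |element| / sum of |element| over the indexes
    that share the same first correlation index category.
    """
    totals = {}
    for w, c in zip(weight_column, categories):
        totals[c] = totals.get(c, 0.0) + abs(w)
    pairs = []
    for w, c in zip(weight_column, categories):
        weight = abs(w) / totals[c] if totals[c] else 0.0
        sign = 1 if w >= 0 else -1
        pairs.append((weight, sign))
    return pairs


# Hypothetical example: 4 first correlation indexes reduced to 2 dimensions.
reduction_matrix = [[0.5, 0.0],
                    [0.5, 0.0],
                    [0.0, 1.0],
                    [0.0, -1.0]]          # 4 x 2
fitting_matrix = [[2.0], [1.0]]           # 2 x 1
weight_matrix = matmul(reduction_matrix, fitting_matrix)  # 4 x 1
column = [row[0] for row in weight_matrix]
categories = ["pre-class preparation", "pre-class preparation",
              "classroom teaching", "classroom teaching"]
weights_and_signs = category_normalised_weights(column, categories)
```

Composing the two matrices maps the fitted coefficients in the reduced space back onto the original first correlation indexes, which is why each element of the weight matrix can be read as belonging to one index.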
Drawings
Fig. 1 is a schematic flow chart of a method for determining target event result index weight according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of obtaining a target event result value in the target event result index weight determining method according to an embodiment of the present invention;
Fig. 3 is a schematic flow chart of obtaining a second correlation index weight in the target event result index weight determining method according to an embodiment of the present invention;
Fig. 4 is a schematic flow chart of acquiring a first training data set in the target event result index weight determining method according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of obtaining a first correlation index in the target event result index weight determining method according to an embodiment of the present invention;
Fig. 6 is another schematic flow chart of the method for determining target event result index weight according to an embodiment of the present invention;
Fig. 7 is a schematic flow chart of obtaining a first correlation index weight in the target event result index weight determining method according to an embodiment of the present invention;
Fig. 8 is a schematic flow chart of a method for determining a target event result influence factor value according to an embodiment of the present invention;
Fig. 9 is another schematic flow chart of the method for determining a target event result influence factor value according to an embodiment of the present invention;
Fig. 10 is a block diagram of a target event result index weight determining device according to an embodiment of the present invention;
Fig. 11 is a block diagram of a target event result influence factor value determining device according to an embodiment of the present invention;
Fig. 12 is an alternative hardware architecture of the device provided by an embodiment of the present invention.
Detailed Description
In the prior art, it is difficult to determine the influence factor values that indirectly influence a target event result, and it is therefore difficult to provide a basis for adjusting an actor's behavior.
In order to determine an influence factor value that indirectly influences a target event result, an embodiment of the present invention provides a target event result index weight determining method, including:
acquiring a first training data set, wherein the first training data set comprises a target event result value and each first correlation index value of each data unit, the first correlation index value is a numerical value of each first index of which the correlation degree with the target event result meets a first correlation degree threshold, the target event result value is acquired at least based on each second correlation index value and a second correlation index weight of the data unit, the second correlation index is an index associated with the target event result, and the second correlation index value directly reflects the target event result, and the first correlation index value indirectly influences the target event result value;
training a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the target event result fitting model meeting the training requirement is obtained, and acquiring a fitting matrix;
obtaining a weight matrix at least according to the fitting matrix, wherein the elements of the weight matrix correspond respectively to the first correlation indexes;
and acquiring each first correlation index weight and the sign of each first correlation index weight according to the elements of the weight matrix.
It can be seen that, in the method for determining target event result index weight provided by the embodiment of the present invention, a first training data set is acquired first. The first training data set is a set of data information of a plurality of data units and includes, for each data unit, a target event result value and each first correlation index value whose correlation with the target event result satisfies a first correlation threshold, the target event result value being obtained based on the directly related second correlation index values. A target event result fitting model is then trained with the first correlation index values of the data units to obtain a predicted target event result; when the predicted target event result meets the training requirement, a fitting matrix is obtained. A weight matrix is then obtained at least according to the fitting matrix, and each first correlation index weight and its sign are obtained according to the elements of the weight matrix.
In other words, when determining the target event result index weight, the method trains the target event result fitting model with the target event result value, which is obtained based on the second correlation index values, and the first correlation index values, which indirectly influence the target event result value, to obtain a fitting matrix, and then obtains each first correlation index weight and its sign according to the fitting matrix. This prepares for obtaining the influence factor values that indirectly influence the target event result and for using those values as a reference when an actor adjusts behavior, which helps move the event probability affected by the target event result in the expected direction so that the actual future event probability meets expectations.
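As an illustration of the steps above, the following sketch assumes the target event result fitting model is a plain linear model trained by batch gradient descent, that the "training requirement" is a mean-squared-error tolerance, and that the weight matrix equals the fitting matrix. None of these choices is fixed by the text; the function name, hyperparameters, and data are hypothetical.

```python
def train_linear_fit(rows, targets, lr=0.1, tol=1e-10, max_epochs=20000):
    """Fit targets ~ rows * w + b by batch gradient descent.

    Training stops once the mean squared error drops below `tol`
    (a stand-in for "meets the training requirement") or after
    `max_epochs`. Returns (w, b); w plays the role of the fitting matrix.
    """
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(max_epochs):
        errors = [sum(wj * xj for wj, xj in zip(w, x)) + b - y
                  for x, y in zip(rows, targets)]
        mse = sum(e * e for e in errors) / n
        if mse < tol:
            break
        for j in range(d):
            grad = 2.0 * sum(e * x[j] for e, x in zip(errors, rows)) / n
            w[j] -= lr * grad
        b -= lr * 2.0 * sum(errors) / n
    return w, b


# Hypothetical first training data set: each row holds the first correlation
# index values of one data unit; targets are the target event result values.
rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
targets = [0.7 * x1 - 0.3 * x2 + 0.1 for x1, x2 in rows]

fitting_matrix, bias = train_linear_fit(rows, targets)
index_weights = [abs(c) for c in fitting_matrix]            # weight magnitudes
weight_signs = [1 if c >= 0 else -1 for c in fitting_matrix]  # weight signs
```

With a linear model, the sign of a coefficient indicates whether raising that index tends to raise or lower the target event result value, which is the information the weight sign is meant to carry.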
For convenience of understanding, some terms used in the present invention are explained first:
Event probability: the probability that something happens, for example the re-enrollment rate and the withdrawal rate in the education field, or the audience rating and viewing rate in the film and television field; in other fields it is the probability that other events occur. For ease of understanding and description, the following embodiments of the invention are described using the education field as an example.
Event result category: a description of an event situation, which can be a third correlation index itself or a cluster category labeled in advance for the third correlation indexes, and which includes a target event result and at least one non-target event result. The target event result is the event result, among the event result categories that directly affect the event probability, for which index weights need to be determined; a non-target event result is any other event result in the event result categories. For example, in the education field, the event results that directly influence the re-enrollment rate can include students learning the material, students receiving attention, students liking the course, and so on, and any one of them can be taken as the target event result. When student learning is the target event result, student attention and student liking of the course are the non-target event results; of course, in other embodiments, student attention or student liking of the course can also be the target event result. Event result category value: the specific numerical value of each event result category.
First index: an index that indirectly affects the target event result value and thereby indirectly affects the event probability. For example, in the education field, the first indexes that influence student learning can include the teacher's lesson preparation time, the usage rate of random roll call in class, the number of show-on-wall initiations, and the like; in the education field the first indexes are therefore teacher-side indexes. The first correlation indexes are those first indexes whose correlation with the target event result, obtained by correlation calculation, satisfies a first correlation threshold.
Second index: an index that directly reflects the target event result value and directly influences the event probability. For example, the second indexes that directly reflect student learning may include the answer accuracy for multiple-choice questions, the understanding rate, and the like. The second correlation indexes are the indexes associated with the target event result among the third correlation indexes, that is, among the third indexes whose correlation with the event probability, obtained by correlation calculation, satisfies a third correlation threshold; they may also be limited to those third correlation indexes that are associated with the target event result but not with any non-target event result.
Third index: an index that directly reflects the event result values (including the target event result value and the non-target event result values) and directly affects the event probability. For example, indexes reflecting student learning include the answer accuracy for multiple-choice questions, the understanding rate, and the like; indexes reflecting student liking of the course include the like rate, the participation rate in red-envelope grabbing, and the like; and indexes reflecting student attention include the number of times a student is praised, the rate of individual show-and-like, and the like. The third correlation indexes are the third indexes whose correlation with the event probability satisfies the third correlation threshold, and the second correlation indexes are the part of the third correlation indexes associated with the target event result; in the education field, the second correlation indexes and the third indexes are therefore both student-side indexes.
First correlation index category: a category into which first correlation indexes are grouped. In the education field the categories may include pre-class preparation, classroom teaching, classroom atmosphere, student motivation, post-class service, attention to students, and so on; each first correlation index category may contain a plurality of first correlation indexes.
Influence factor value: a value obtained from a first correlation index value together with the corresponding first correlation index weight and its sign; when the first correlation indexes are clustered, an influence factor value can also be obtained for each first correlation index category after clustering.
Therefore, in the embodiment of the present invention, the target event result is obtained based on the second correlation index, the weight of each first correlation index is obtained based on the target event result and each first correlation index, and then the influence factor value is obtained.
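The definitions above select first correlation indexes by comparing a correlation with the target event result against a first correlation threshold. The sketch below assumes Spearman's rank correlation as the correlation measure and assumes that "satisfies the threshold" means the absolute correlation is at least the threshold; both are assumptions, and the index names and data are hypothetical.

```python
def ranks(values):
    """Average 1-based ranks, so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def screen_first_indexes(index_columns, result_values, threshold):
    """Keep indexes whose |Spearman correlation| with the result meets the threshold."""
    return [name for name, col in index_columns.items()
            if abs(spearman(col, result_values)) >= threshold]


# Hypothetical data for five data units (for example, five classes).
result_values = [0.2, 0.4, 0.5, 0.9, 0.7]
index_columns = {
    "lesson_prep_time": [10, 20, 30, 50, 40],
    "random_roll_call_rate": [0.3, 0.1, 0.5, 0.2, 0.4],
}
selected = screen_first_indexes(index_columns, result_values, threshold=0.5)
```

A rank-based measure is a natural choice when indexes have very different scales or only a monotonic relationship with the result is expected, but any correlation measure could be substituted in `screen_first_indexes`.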
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flow chart illustrating a method for determining a target event result indicator weight according to an embodiment of the present invention.
As shown in the figure, the method for determining target event result index weight provided by the embodiment of the present invention includes the following steps:
Step S10: acquire a first training data set, wherein the first training data set comprises the target event result value and the first correlation index values of each data unit.
It is easy to understand that the target event result index weights express how strongly different indexes influence the occurrence of a given target event result, and they must be derived from a large amount of data. Therefore, to obtain the target event result index weights, a data set covering a large number of data units, namely the aforementioned first training data set, must be acquired first.
In order to obtain a target event result index weight, the first training data set needs to include a target event result value and a first correlation index value of each data unit, wherein the target event result value is obtained at least based on a second correlation index value and a second correlation index weight of the data unit, the second correlation index is an index associated with the target event result, and the second correlation index value directly reflects the target event result value; and the first correlation index value is an index value indirectly affecting the target event result value, and the first correlation index value is a numerical value of each first index whose correlation with the target event result satisfies a first correlation threshold.
The data unit refers to the unit to which a group of corresponding values belongs, namely the target event result value, the first correlation index values, the second correlation index values and the event probability. Taking a class as the data unit as an example, the target event result value of the class may be a "students learned" value, a "students like the course" value or a "students pay attention" value. When the target event result value is the "students learned" value, the first correlation index values may be the usage rate of the teacher lesson preparation function, the average lesson preparation time, the standard deviation of the lesson preparation time, the usage rate of random roll call, the average number of random roll calls per class, the average number of times the flying wall is started per class, and the like, and the second correlation index values may be the question accuracy, the question participation rate, the comprehension rate, and the like.
It is easily understood that the target event result value has to be obtained at least from the second correlation index values and the second correlation index weights, whereas the first correlation index values are obtained by collecting the first index values of the data units and calculating their correlation with the target event result. The target event result value therefore needs to be obtained first.
First, the timing at which the target event result value is obtained can be chosen:
In one embodiment, the target event result value may be obtained in advance, before the target event result index weight determination method provided by the embodiment of the present invention is executed, and used directly when the method is executed; in another specific embodiment, the target event result value may be calculated from the second correlation index values and the second correlation index weights while the target event result index weight is being determined.
Whichever timing is used to obtain the target event result value, it falls within the protection scope of the target event result index weight determination method provided by the embodiment of the invention.
The target event result value may be obtained as follows:
To obtain the target event result value conveniently, in a specific embodiment, please refer to fig. 2, which is a schematic flow chart of obtaining the target event result value in the target event result index weight determination method according to an embodiment of the present invention.
As shown in the figure, the method for acquiring the target event result value may include:
Step S100: determine the second correlation indexes of each data unit that are related to the target event result.
In order to obtain the target event result value, the second correlation indexes are first determined according to the target event result. The target event result is one of the event result categories; in a specific embodiment, the event result categories may be determined by clustering the third correlation indexes, so that once the target event result is determined, the second correlation indexes can be found among the third correlation indexes by lookup.
For example, when the target event result is "students learned", the question accuracy and the comprehension rate can be determined as second correlation indexes.
Step S101: obtain the target event result value of each data unit according to the second correlation index weight and the second correlation index value corresponding to each second correlation index.
After the second correlation indexes are determined, the second correlation index values and the second correlation index weights are obtained, and the target event result value is then obtained by an operation on the second correlation index values and the second correlation index weights; the process is simple.
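The "operation" in step S101 is not spelled out in the text. A minimal sketch, assuming it is a weighted sum of the second correlation index values, might look as follows; the function name, index names and numbers are hypothetical.

```python
def target_event_result_value(index_values, index_weights):
    """Combine second correlation index values with their weights.

    Assumption: the combination is a weighted sum. The source only says
    the value is obtained "through the operation" of values and weights.
    """
    missing = set(index_values) - set(index_weights)
    if missing:
        raise KeyError(f"no weight for indexes: {sorted(missing)}")
    return sum(index_values[name] * index_weights[name] for name in index_values)


# Hypothetical values and weights for one class (data unit).
values = {"question_accuracy": 0.8, "comprehension_rate": 0.6}
weights = {"question_accuracy": 0.5, "comprehension_rate": 0.5}
learned_value = target_event_result_value(values, weights)
```

With these illustrative numbers the "students learned" value of the class is the average of the two index values.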
Of course, it is easily understood that, in order to obtain the target event result value, the second correlation index value and the second correlation index weight also need to be determined first.
First, regarding the second correlation index values: since the second correlation indexes are a subset of the third correlation indexes, the second correlation index values are also a subset of the third correlation index values and can be obtained by screening the third correlation index values according to the target event result. Obtaining the second correlation index values is thereby converted into obtaining the third correlation index values. Since a third correlation index is a third index whose correlation with the event probability satisfies the third correlation threshold, the third correlation index values can be obtained by a correlation calculation between the third index values, which can be collected statistically, and the event probability.
The method for obtaining the third correlation index values is described first. In one embodiment, it is as follows:
Since the indexes that can be collected and might be related to the event probability are not necessarily related to it, the collected third index values of the data units first have to be processed to determine the third indexes whose correlation with the event probability satisfies the third correlation threshold.
However, the third correlation indexes are obtained from a large number of data units, and for different sets of data units the third indexes whose correlation with the event probability satisfies the third correlation threshold are likely to differ. Taking a class as the data unit as an example:
When the data units of all classes are combined into one set, the indexes whose correlation satisfies the third correlation threshold may be A, B and C. When the classes are classified, the indexes may be B, C and D for primary school class data units, and C, D and E for middle school class data units. When data units of different grades are selected, the indexes for the lower grades of primary school may differ from those for the upper grades of primary school. When the event probability is actually predicted, the indexes whose correlation satisfies the third correlation threshold are obtained separately under each classification mode and different predictions are made accordingly; it is therefore necessary to determine the classification mode used for subsequent prediction and the indexes whose correlation satisfies the third correlation threshold under each classification mode.
Therefore, in order to determine accurate third correlation indexes, it is necessary to first acquire an original second training data set, then acquire a target classification dimension and the third indexes corresponding to the target classification dimension, and finally acquire the pre-classification data set classified according to the target classification dimension and screen out, from the expected-relevant third indexes, the third correlation indexes corresponding to the target classification dimension.
It is readily understood that the original second training data set is the unprocessed training data set that is directly acquired. It comprises the actual probability of each data unit and the expected-relevant third index values, i.e. the values of the third indexes that are expected to be related to the event probability. The original second training data set can be obtained through corresponding statistical software.
In addition, it should be noted that the target classification dimension described herein refers to a classification dimension determined by comparison and having a good prediction effect on event probability, and in a specific embodiment, in order to obtain the target classification dimension, the target classification dimension may be determined by performing correlation calculation after data classification on a plurality of classification dimensions.
For this purpose, the original second training data set is first classified according to each predetermined classification dimension to obtain a pre-classification data set.
Here, the predetermined classification dimensions are the classification dimensions that are possible given the characteristics of the data units. For example, for teaching-related data units, the predetermined classification dimensions may be: grade, school, class type, whole, and the like.
And classifying the original data sets according to different preset classification dimensions to obtain each pre-classified data set.
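The pre-classification step can be sketched as a grouping of data units, once per predetermined classification dimension. The record fields and the handling of the "whole" dimension as a single group are illustrative assumptions, not details given in the text.

```python
from collections import defaultdict


def pre_classify(data_units, dimensions):
    """Group data units once per predetermined classification dimension.

    Returns {dimension: {group_value: [data_unit, ...]}}. The special
    dimension "whole" keeps every data unit in one group (assumption).
    """
    result = {}
    for dimension in dimensions:
        groups = defaultdict(list)
        for unit in data_units:
            key = "all" if dimension == "whole" else unit[dimension]
            groups[key].append(unit)
        result[dimension] = dict(groups)
    return result


# Hypothetical class-level records; field names are illustrative only.
units = [
    {"class_id": 1, "grade": 3, "school": "A"},
    {"class_id": 2, "grade": 3, "school": "B"},
    {"class_id": 3, "grade": 7, "school": "A"},
]
pre_classified = pre_classify(units, ["grade", "school", "whole"])
```

Each inner list plays the role of one pre-classified data set on which the correlation calculation is then run.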
Then, using a correlation calculation algorithm on the expected-relevant third index values and the actual probabilities of the data units in each pre-classified data set, the correlation degree between each expected-relevant third index and the actual probability is obtained, and the third indexes whose correlation degree satisfies the third correlation threshold are identified as third correlation indexes. The predetermined classification dimension with the largest number of third correlation indexes is taken as the target classification dimension, and the third indexes corresponding to it are taken as the third correlation indexes of the target classification dimension. Here, the number of third correlation indexes is the number of third indexes whose correlation degree satisfies the third correlation threshold.
It should be noted that the correlation may be a positive value indicating that the indicator and the event probability are in positive correlation, or may be a negative value indicating that the indicator and the event probability are in negative correlation, where the correlation satisfies the third correlation threshold, which means that the absolute value of the correlation is greater than or equal to the third correlation threshold.
If the third correlation threshold is too high, too few indexes satisfy it, which hinders the subsequent acquisition of the second correlation index weights; if it is too low, too many third indexes satisfy it, which increases the amount of computation needed to obtain the second correlation index weights. In a specific embodiment, the third correlation threshold may therefore be set to 0.2, that is, a third index whose correlation has an absolute value greater than or equal to 0.2 is determined as a third correlation index, which balances accuracy and computation.
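The screening rule (absolute value of the correlation greater than or equal to the threshold, with 0.2 as the example value) can be sketched as follows. The index names and correlation values are made up for illustration.

```python
def screen_by_correlation(correlations, threshold=0.2):
    """Keep the indexes whose absolute correlation is at least the threshold."""
    return {
        name: value
        for name, value in correlations.items()
        if abs(value) >= threshold
    }


# Hypothetical correlation degrees between third indexes and the event probability.
correlations = {
    "question_accuracy": 0.41,
    "lesson_prep_time": -0.27,
    "roll_call_rate": 0.05,
}
third_correlation_indexes = screen_by_correlation(correlations)
```

The negatively correlated index is kept because only the absolute value is compared with the threshold.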
When the correlation calculation is performed, different correlation calculation algorithms can be selected, because the pre-classified data sets obtained by classifying the original second training data set according to the predetermined classification dimensions contain different numbers of data units.
Specifically, a Spearman rank correlation coefficient calculation algorithm or a Kendall rank correlation coefficient (Kendall's tau-b) calculation algorithm may be selected.
The Spearman rank correlation coefficient calculation algorithm does not require the data set to follow a normal distribution and is applicable to any relation that can be described by a monotonic function; when the sample size is relatively large, the Spearman algorithm is the better choice. When the sample size is relatively small, the Kendall's tau-b correlation coefficient calculation algorithm is less sensitive to errors and more accurate.
For this purpose, when the correlation indexes are determined, a specific calculation algorithm may first be selected according to the number of data units in the pre-classified data set.
Specifically, the number of data units in the pre-classified data set is obtained first, and it is then judged whether this number exceeds a first data amount threshold; if so, the Spearman rank correlation coefficient calculation algorithm is selected; if not, the Kendall rank correlation coefficient calculation algorithm may be selected.
To avoid the uncertainty that too small a data volume introduces when determining the indexes that satisfy the correlation threshold, when the number of data units does not exceed the first data amount threshold it may be further judged whether it exceeds a second data amount threshold; if so, the Kendall rank correlation coefficient calculation algorithm may be selected; if not, the pre-classified data set is discarded directly.
The specific values of the first data amount threshold and the second data amount threshold can be determined as needed, and it is easily understood that the first data amount threshold is larger than the second data amount threshold; for example, the first data amount threshold may be 150, 200, etc., and the second data amount threshold may be 10, 8, etc.
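The two-threshold selection can be sketched as a small decision function. The default thresholds 150 and 10 are taken from the example values above; returning None for a data set that is too small is an assumption about how "discarded" would be represented.

```python
def choose_correlation_algorithm(unit_count, first_threshold=150, second_threshold=10):
    """Pick a rank-correlation algorithm from the number of data units.

    Returns "spearman", "kendall", or None when the data set is too
    small to be used (assumed representation of "discarded").
    """
    if not first_threshold > second_threshold:
        raise ValueError("first threshold must be larger than second threshold")
    if unit_count > first_threshold:
        return "spearman"
    if unit_count > second_threshold:
        return "kendall"
    return None


choices = [choose_correlation_algorithm(n) for n in (200, 150, 50, 10, 5)]
```

If SciPy is available, `scipy.stats.spearmanr` and `scipy.stats.kendalltau` (which computes tau-b by default) could then perform the actual calculation; that pairing is a suggestion rather than something the text prescribes.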
After the correlation calculation of each pre-classified data set, each third index meeting a third correlation threshold corresponding to each pre-classified data set can be obtained, and then a target classification dimension needs to be selected from the predetermined classification dimensions according to each third index meeting the third correlation threshold corresponding to each pre-classified data set.
In order to ensure the accuracy of the third correlation index weights, the predetermined classification dimension with the largest number of third indexes satisfying the third correlation threshold may be selected as the target classification dimension, so that the third indexes corresponding to the target classification dimension, i.e. the third correlation indexes, are also obtained.
In this way, a suitable target classification dimension, and a third correlation index value corresponding to the target classification dimension can be obtained.
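Choosing the target classification dimension can be sketched as picking the dimension with the most screened indexes. For simplicity the sketch assumes one list of screened indexes per dimension and breaks ties in favour of the first dimension listed; the text does not specify either detail.

```python
def select_target_dimension(screened_by_dimension):
    """Return the dimension with the most indexes meeting the threshold.

    screened_by_dimension maps each predetermined classification
    dimension to the indexes that met the third correlation threshold.
    Ties go to the dimension listed first (assumption).
    """
    target = max(screened_by_dimension, key=lambda d: len(screened_by_dimension[d]))
    return target, screened_by_dimension[target]


# Hypothetical screening results per predetermined classification dimension.
screened = {
    "whole": ["A", "B", "C"],
    "grade": ["B", "C", "D", "E"],
    "school": ["C"],
}
target_dimension, target_indexes = select_target_dimension(screened)
```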
Then, based on the determined target event result, a second correlation index value can be obtained by screening from the third correlation index values.
Of course, the third correlation indexes may also be determined in advance, before the target event result index weight determination method provided by the embodiment of the present invention is executed, in which case the second correlation index values are obtained by screening according to the target event result.
For the second related index weight, the second related index weight may be obtained before the target event result index weight determination method provided in the embodiment of the present invention is executed, and may be directly used by searching, or may be obtained based on the event probability and the third related index before the target event result is obtained in the process of executing the target event result index weight determination method provided in the embodiment of the present invention. Of course, both of the foregoing two manners are within the protection scope of the target event result index weight determination method provided in the embodiment of the present invention.
Since the second related indicator is a part of the third related indicator, the second related indicator weight may be obtained by obtaining the third related indicator weight and performing a screening process, and therefore the third related indicator weight needs to be obtained.
In order to ensure the acquisition of the weight of the third relevant index (including the weight of the second relevant index of course) and improve the accuracy of the acquired weight of the third relevant index, a large number of third indexes can be selected, and of course, more third relevant indexes can be obtained, and in order to reduce the dispersion degree, the dimension of the third relevant index can be reduced by a clustering method. In order to obtain the second correlation index weight, an embodiment of the present invention provides a method for obtaining the second correlation index weight, please refer to fig. 3, and fig. 3 is a schematic diagram illustrating a flow of obtaining the second correlation index weight of the target event result index weight determining method according to the embodiment of the present invention.
As shown in the figure, the step of obtaining the second correlation index weight provided by the embodiment of the present invention may include:
Step S1010: acquire a second training data set, wherein the second training data set comprises the actual probability and the third correlation index values of each data unit, and the third correlation index values are the values of the third indexes whose correlation degree with the actual probability meets a third correlation threshold.
It should be noted that the data units included in the first training data set described herein and the data units included in the second training data set described herein at least include the same data units, so that the target event result value obtained based on the obtained second relevant index weight can be used for determining the target event result index weight, and of course, the second training data set may also be completely identical to the data units included in the first training data set; the index set of each third index includes each second index, and the event result category labeled in advance for each third related index at least includes the target event result, the actual probability is the true probability of the event occurrence of the data unit, and the numerical value of the actual probability is affected by the target event result value and the non-target event result value.
It is easy to understand that the acquisition of the third correlation index value may be directly acquired according to the third correlation index determined in the foregoing manner.
In addition, for convenience of understanding, the event result category of the third relevant index is now described with reference to the case in the foregoing teaching scenario:
For the third correlation indexes, namely question accuracy, question participation rate, comprehension rate, course preference, red envelope participation rate, points, praise count, personal show viewing rate and personal show like rate (each including mean and standard deviation), the event result categories are labeled, for example:
Third correlation index | Event result category
Question accuracy, question participation rate, comprehension rate | Learned
Course preference, red envelope participation rate, points | Like
Praise count, personal show viewing rate | Attention
It is easily understood that the actual probability is the real value of the event probability of each of the data units, and the third correlation index value is the numerical value of each of the third indexes whose correlation with the event probability satisfies the third correlation threshold.
After the target classification dimension is obtained based on the obtaining method of the third correlation index, a pre-classification data set classified according to the target classification dimension is obtained, and the data value of the third correlation index of each data unit in the pre-classification data set is selected to obtain each second training data set.
The resulting second training data set thus comprises the actual probabilities and the third correlation index values corresponding to the respective data units.
It is easy to understand that, because the second training data sets are obtained by classifying the original second training data set according to the target classification dimension, there are a plurality of second training data sets, and the third correlation index weights (including the second correlation index weights) trained on different second training data sets are likely to differ. When the target event result value is obtained, the corresponding third correlation index weights (including the second correlation index weights) can therefore be selected according to the category to which the specific data unit belongs.
Step S1011: train a third dimension reduction model and a probability prediction model in series using the actual probability and the third correlation index values of each data unit, until a third dimension reduction model and a probability prediction model that meet a preset target are obtained, thereby obtaining a third dimension reduction matrix, wherein the rows of the third dimension reduction matrix correspond to the respective third correlation indexes.
After the second training data set is obtained, a third dimension reduction matrix is obtained first in order to obtain the second correlation index weights. To this end, the actual probabilities and the third correlation index values of the data units in the same second training data set are used to train the third dimension reduction model and the probability prediction model in series until the loss between the predicted probability and the actual probability meets the requirement; the third dimension reduction model and the probability prediction model that meet the preset target are then obtained, and with them the third dimension reduction matrix.
Based on the arrangement of the third correlation index values, the third correlation indexes respectively corresponding to each row of the third dimension reduction matrix can be obtained.
It is easily understood that the third dimension reduction model and the probability prediction model may be constructed in advance. The third dimension reduction model may be an RBM (Restricted Boltzmann Machine) model or a PCA (Principal Component Analysis) model, and the probability prediction model may be a polynomial fitting model, a regression tree model, a random forest regression model, or the like.
Step S1012: obtain the third correlation index weights according to the third dimension reduction matrix, and screen the third correlation index weights according to the second correlation indexes to obtain the second correlation index weights.
After the third dimension reduction matrix is obtained, the third correlation index weights are further obtained as follows:
First, the third dimension reduction matrix is adjusted using the index correlation direction value and the event result category corresponding to each third correlation index, to obtain a weight square matrix whose numbers of rows and columns both equal the number of event result categories, and the target elements of the weight square matrix, which represent the event result categories, are obtained;
then, the dimension reduction target elements corresponding to the target elements are determined in the third dimension reduction matrix according to the position of each target element in the weight square matrix;
finally, the third correlation index weights are obtained using the third correlation indexes corresponding to the dimension reduction target elements.
The index correlation direction value is the direction of the correlation between the third correlation index and the actual probability: if the correlation degree is positive, the index is positively correlated with the event probability and the direction value is 1; if the correlation degree is negative, the index is negatively correlated with the event probability and the direction value is -1.
In actual operation, an event probability that should be raised (such as a renewal rate) and an event probability that should be lowered (such as a refund rate) relate to the indexes in opposite ways. Whether the index correlation direction values obtained in the correlation calculation need to be negated therefore depends on the type of event probability. If the probability is one to be raised (such as the renewal rate), the third dimension reduction matrix is adjusted directly with the index correlation direction values obtained by the correlation calculation and the event result categories, yielding a weight square matrix whose numbers of rows and columns equal the number of event result categories. If the probability is one to be lowered (such as the refund rate), the index correlation direction values obtained by the correlation calculation are first negated once, and the third dimension reduction matrix is then adjusted with the negated index correlation direction values and the event result categories, yielding a weight square matrix whose numbers of rows and columns equal the number of event result categories.
In this way, it can be ensured that the event result category values obtained by the subsequent operations are all positive values that represent a favorable direction; for example, in the foregoing case, the higher the "learned" score, the lower the refund rate.
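The direction values and the optional negation can be sketched as below. The flag name `probability_should_decrease` and the index names are hypothetical. The sketch assumes no correlation is exactly zero, which holds for indexes that passed the correlation threshold.

```python
def index_direction_values(correlations, probability_should_decrease=False):
    """Map each correlation degree to a direction value of 1 or -1.

    When the event probability is one that should go down (for example
    a refund rate), every direction value is negated once.
    Assumes no correlation is exactly zero.
    """
    flip = -1 if probability_should_decrease else 1
    return {
        name: flip * (1 if value > 0 else -1)
        for name, value in correlations.items()
    }


# Hypothetical correlation degrees with the event probability.
correlations = {"accuracy_mean": 0.35, "accuracy_std": -0.28}
for_renewal_rate = index_direction_values(correlations)
for_refund_rate = index_direction_values(correlations, probability_should_decrease=True)
```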
1) To obtain the weight square matrix, the index correlation direction value of the third correlation index represented by each row of the third dimension reduction matrix is obtained. According to the correspondence between the third correlation indexes and the event result categories, the elements in the same column that belong to third correlation indexes of the same event result category are collected and summed with their index correlation direction values as weights, giving the element at the corresponding position of the weight square matrix. Doing this for every event result category and every column yields the weight square matrix.
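The weighted summation described in 1) can be sketched as follows, under the reading that row k of the weight square matrix collects the rows of the third dimension reduction matrix that belong to event result category k. The matrix values are invented for illustration; only the direction values and category layout follow the example in the text.

```python
def weight_square_matrix(reduction_matrix, directions, categories, category_order):
    """Build the weight square matrix from the dimension reduction matrix.

    reduction_matrix: one row per third correlation index, one column per
        reduced dimension (as many columns as event result categories).
    directions: direction value (1 or -1) of each row.
    categories: event result category of each row.
    category_order: categories in the order of the square matrix rows.
    """
    size = len(category_order)
    square = [[0.0] * size for _ in range(size)]
    for row, direction, category in zip(reduction_matrix, directions, categories):
        k = category_order.index(category)
        for j in range(size):
            square[k][j] += direction * row[j]
    return square


# Hypothetical 6 x 3 dimension reduction matrix.
W = [
    [0.1, 0.2, 0.9],  # accuracy mean      (learned)
    [0.2, 0.1, 0.3],  # accuracy std       (learned)
    [0.2, 0.9, 0.1],  # preference mean    (like)
    [0.1, 0.2, 0.2],  # preference std     (like)
    [0.6, 0.1, 0.2],  # praise count mean  (attention)
    [0.1, 0.3, 0.1],  # praise count std   (attention)
]
directions = [1, -1, 1, -1, 1, -1]
categories = ["learned", "learned", "like", "like", "attention", "attention"]
M = weight_square_matrix(W, directions, categories, ["learned", "like", "attention"])
```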
To facilitate understanding of the specific method for obtaining the weight square matrix based on the third dimension reduction matrix, the following is given as an example:
the third dimension reduction matrix, the third correlation indexes represented by each row of the third dimension reduction matrix and the event result categories corresponding to the third correlation indexes are as follows:
The index correlation direction values of the third correlation indexes (if the event probability is one to be lowered, these are the values after negation) are (1, -1, 1, -1, 1, -1) respectively, that is, the direction values of the accuracy mean, the preference mean and the praise count mean are 1, and those of the other three indexes are -1.
During the adjustment, the following calculations are performed:
……
a weight matrix is thus obtained, as follows:
It is easy to understand that, in the weight square matrix, the event result category corresponding to the first row is "learned", that corresponding to the second row is "like", and that corresponding to the third row is "attention".
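The matrices of this example do not appear in the text. A layout consistent with the element names used later in the example (w13, w32, m22 and so on), with the direction values (1, -1, 1, -1, 1, -1) and with the weighted-sum rule in 1) would be the following. It is a reconstruction under those assumptions, not the original figure.

```latex
% Rows of W: accuracy mean, accuracy std, preference mean, preference std,
% praise count mean, praise count std.
W =
\begin{pmatrix}
w_{11} & w_{12} & w_{13} \\
w_{21} & w_{22} & w_{23} \\
w_{31} & w_{32} & w_{33} \\
w_{41} & w_{42} & w_{43} \\
w_{51} & w_{52} & w_{53} \\
w_{61} & w_{62} & w_{63}
\end{pmatrix},
\qquad
M =
\begin{pmatrix}
m_{11} & m_{12} & m_{13} \\
m_{21} & m_{22} & m_{23} \\
m_{31} & m_{32} & m_{33}
\end{pmatrix}

% Weighted sums with the direction values as weights, for j = 1, 2, 3:
m_{1j} = w_{1j} - w_{2j}, \qquad
m_{2j} = w_{3j} - w_{4j}, \qquad
m_{3j} = w_{5j} - w_{6j}
```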
2) And after the weight square matrix is obtained, further determining each target element in the weight square matrix.
The target elements in the weight matrix are determined, so that the dimension reduction target elements can be determined, and the third correlation index weight is obtained.
In a specific embodiment, the element with the largest value among the elements of the weight square matrix may be obtained first and taken as a target element, and the row and the column in which the target element is located are recorded.
In the weight matrix in the above example, it is assumed that m22 is the maximum value element with the largest value, so that m22 is the first target element, and the row and the column of the target element are the second row and the second column, respectively.
Then, each element in the row where the target element is located and each element in the column where the target element is located in the weight square matrix are ignored, and an adjustment square matrix is obtained.
Ignoring the elements of the second row and second column as in the weight matrix in the above example, the adjusted matrix is obtained as follows:
and further, taking the adjusting square matrix as a new weight square matrix, and acquiring new target elements until all the target elements are obtained.
Continuing with the above example, the maximum value element of the adjustment square matrix is obtained; assuming it is m13, m13 is the second target element, and the row and the column in which it is located are the first row and the third column, respectively.
Then, the elements of the first row and the third column are ignored in turn, resulting in a new adjustment square matrix:
The maximum value element of the new adjustment square matrix, m31, is then obtained, so that m31 is the third target element, and its row and column are the third row and the first column, respectively.
Since the weight square matrix is a 3 × 3 square matrix, all target elements have now been obtained.
In this way, each row of the obtained weight square matrix corresponds to an event result category of the third correlation indexes, and the number of the target elements is equal to the number of the event result categories.
It can be seen that the above method makes it convenient to obtain the maximum value element of the weight square matrix and the adjustment square matrix. On the one hand, the target elements obtained are more reasonable, that is, each target element identifies the index category that is most important for the third correlation index weights; on the other hand, the index categories reduced to the target dimension are guaranteed to be complete, with no index category repeated and none missing. For example, all three index categories "learned", "like" and "attention" are present, and a result such as "learned", "learned", "like" cannot occur.
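The selection procedure in 2) amounts to repeatedly taking the largest remaining element and then ignoring its row and column. A minimal sketch, using zero-based positions and an invented weight square matrix chosen so that the picks match the example (m22, then m13, then m31):

```python
def select_target_elements(square):
    """Pick one element per row and column, largest remaining value first.

    Returns zero-based (row, column) positions in selection order.
    """
    size = len(square)
    free_rows = set(range(size))
    free_cols = set(range(size))
    targets = []
    while free_rows:
        row, col = max(
            ((r, c) for r in sorted(free_rows) for c in sorted(free_cols)),
            key=lambda pos: square[pos[0]][pos[1]],
        )
        targets.append((row, col))
        # Ignoring the row and column yields the "adjustment square matrix".
        free_rows.remove(row)
        free_cols.remove(col)
    return targets


# Hypothetical weight square matrix (rows: learned, like, attention).
M = [
    [-0.1, 0.1, 0.6],
    [0.1, 0.7, -0.1],
    [0.5, -0.2, 0.1],
]
targets = select_target_elements(M)
```

Because each pick removes one row and one column, every event result category ends up with exactly one target element.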
3) And after each target element of the weight square matrix is obtained, determining a dimension reduction target element in the third dimension reduction matrix according to the position of each target element in the matrix square matrix.
Please continue with the previous example:
The target elements of the weight square matrix are m22, m13 and m31, and the dimension reduction target elements corresponding to them are as follows:
m22 corresponds to w32 and w42; m13 corresponds to w13 and w23; m31 corresponds to w51 and w61.
Therefore, the dimensionality reduction target element value corresponding to each dimensionality reduction target element can be obtained.
4) And after obtaining the value of the dimensionality reduction target element and the index corresponding to the dimensionality reduction target element, further obtaining the weight of a third relevant index.
First, for each event result category, the sum of the dimension reduction target element values of the third correlation indexes included in that category is obtained; then the proportion of each dimension reduction target element value in that sum is obtained.
Continuing the previous example: the index corresponding to w13 is the correct rate mean, the index corresponding to w23 is the correct rate standard deviation, the index corresponding to w32 is the favorability mean, the index corresponding to w42 is the favorability standard deviation, the index corresponding to w51 is the mean number of raises, and the index corresponding to w61 is the standard deviation of the number of raises.
Then each third correlation index weight is calculated from the dimension reduction target element values, specifically using the following formulas:
the weight A1 of the correct rate mean is: w13/(w13+w23);
the weight A2 of the correct rate standard deviation is: w23/(w13+w23);
the weight B1 of the favorability mean is: w32/(w32+w42);
the weight B2 of the favorability standard deviation is: w42/(w32+w42);
the weight C1 of the mean number of raises is: w51/(w51+w61);
the weight C2 of the standard deviation of the number of raises is: w61/(w51+w61).
Then the corresponding second correlation index weights are selected according to the determined target event result; for example, for the target event result "learned", the second correlation index weights are the correct rate mean weight A1 and the correct rate standard deviation weight A2.
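As an illustration of steps 3) and 4), the sketch below maps each target element to its dimension reduction target elements and normalizes them within the category, reproducing the ratios w13/(w13+w23) and so on. The matrix values, the `category_rows` mapping and the function name are assumptions made for the example, and the values are taken to be positive after the direction adjustment.

```python
def third_index_weights(W, category_rows, targets):
    """Return {row index of W: weight} using the within-category ratios.

    W             -- third dimension reduction matrix (one row per index)
    category_rows -- hypothetical mapping: category number -> rows of W
    targets       -- (category, column) positions of the target elements
    """
    weights = {}
    for category, column in targets:
        rows = category_rows[category]
        # Sum of the dimension reduction target element values of this category.
        total = sum(W[r][column] for r in rows)
        for r in rows:
            weights[r] = W[r][column] / total
    return weights


# Hypothetical 6x3 third dimension reduction matrix (0-based: W[0][2] is w13).
W = [
    [0.1, 0.2, 0.6],  # correct rate mean
    [0.3, 0.1, 0.2],  # correct rate standard deviation
    [0.2, 0.3, 0.1],  # favorability mean
    [0.1, 0.9, 0.3],  # favorability standard deviation
    [0.5, 0.2, 0.1],  # mean number of raises
    [1.5, 0.4, 0.2],  # standard deviation of the number of raises
]
category_rows = {0: [0, 1], 1: [2, 3], 2: [4, 5]}
targets = [(1, 1), (0, 2), (2, 0)]  # m22, m13, m31 in 0-based form

weights = third_index_weights(W, category_rows, targets)
print(weights)
```

The weights of each category sum to 1 by construction, so selecting the second correlation index weights for one target event result amounts to reading off the entries of the corresponding category.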
Therefore, in the target event result index weight determining method provided in the embodiment of the present invention, when the third correlation index weights (including the second correlation index weights) are calculated, the meaning of each element of the obtained third dimension reduction matrix cannot be determined directly. The weight square matrix is therefore used to determine the dimension reduction target elements in the third dimension reduction matrix and the meaning each of them represents, and the third correlation index weights (including the second correlation index weights) are then obtained from the dimension reduction target element values. In this way, the logic of converting indexes into index categories is used to reveal the otherwise undeterminable meaning of the elements of the third dimension reduction matrix, which makes the black-box information transparent, enables the third correlation index weights (including the second correlation index weights) to be obtained, and ensures that the target event result category value can be obtained from the second correlation index values.
In the process of obtaining the third correlation index weights, the third dimension reduction matrix is obtained through series training, which converts the third correlation indexes into the target event result categories, that is, converts a larger number of values into a smaller number of values. Using a larger number of third correlation indexes improves accuracy, while the conversion reduces dispersion and improves concentration.
In another embodiment, to further improve the accuracy of each third correlation index weight (including each second correlation index weight), the data units of one second training data set may be randomly sampled to obtain a plurality of second training data subsets. Accordingly, the number of third dimension reduction models is at least equal to the number of second training data subsets, and each third correlation index weight (including each second correlation index weight) is obtained through the following steps:
respectively utilizing the actual probability and the third correlation index value of the data unit of each second training data subset to carry out series training on a third dimension reduction model and a probability prediction model corresponding to the actual probability and the third correlation index value until each third dimension reduction model and each probability prediction model meeting a preset target are obtained, and obtaining each third dimension reduction matrix;
adjusting each third dimension reduction matrix by using the index correlation direction value corresponding to each third correlation index and the event result category to obtain each weight square matrix with the number of rows and columns equal to the number of the event result categories, and acquiring each target element of each weight square matrix, which is used for representing the event result category;
determining the weight square matrixes with the same positions and the largest quantity of target elements in each weight square matrix to obtain each consistent weight square matrix, and determining each third dimension reduction matrix corresponding to each consistent weight square matrix;
determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in each third dimension reduction matrix according to the position of each target element in each consistent weight square matrix, obtaining the mean value of each dimension reduction target element value at the same position in each third dimension reduction matrix to obtain the mean value of the dimension reduction target elements, and obtaining each third related index weight by using each dimension reduction target element mean value and the index corresponding to each dimension reduction target element.
Compared with the foregoing manner of obtaining the third correlation index weights, in this embodiment a third dimension reduction matrix is obtained for each third dimension reduction model after the series training, so that if there are n third dimension reduction models, n third dimension reduction matrixes are obtained.
After each third dimension reduction matrix is obtained, it is adjusted by using the index correlation direction value corresponding to each third correlation index and the event result category, to obtain a weight square matrix whose numbers of rows and columns both equal the number of event result categories, and the target elements of each weight square matrix are obtained. For the specific manner of obtaining the target elements, please refer to the foregoing description.
It is easy to understand that if n third dimension reduction matrixes are obtained through the foregoing steps, n weight square matrixes are obtained, and then n groups of target elements are obtained.
After the group of target elements corresponding to each weight square matrix is obtained, the positions of the target elements may differ from one weight square matrix to another. For example, continuing the foregoing case, the target elements of some weight square matrixes may be located at the first row and third column, the second row and second column, and the third row and first column, while the target elements of other weight square matrixes are located at other positions.
Determining the weight square matrixes with the same positions of the target elements according to the obtained positions of all groups of target elements, then counting the number of the weight square matrixes with the same positions of all groups of target elements, taking the group of the weight square matrixes with the largest number as consistent weight square matrixes, and then obtaining all third dimension reduction matrixes before conversion according to all the consistent weight square matrixes.
And further, determining the dimensionality reduction target elements and the dimensionality reduction target element values in the corresponding third dimensionality reduction matrix according to the positions of the target elements in the consistent weight square matrixes.
Assuming that there are k consistent weight square matrixes, k groups of dimension reduction target elements and dimension reduction target element values are determined, and then the average value of each dimension reduction target element value of a third dimension reduction matrix corresponding to each consistent weight square matrix, namely the average value of the k groups of dimension reduction target element values, is calculated to obtain the dimension reduction target element average value.
And then, acquiring the weight of each third correlation index by using the mean value of each dimension reduction target element and the third correlation index corresponding to each dimension reduction target element.
Specifically, the third correlation indexes labeled with the same event result category are determined, to obtain the third correlation indexes of the same category;
the sum of the dimension reduction target element mean values of the third correlation indexes of the same category is obtained, to obtain a dimension reduction index sum value;
and then the weight of each third correlation index is obtained by using the dimension reduction target element mean value of each third correlation index of the same category and the corresponding dimension reduction index sum value.
Therefore, the third relevant index weight can be conveniently obtained, and each second relevant index weight can be obtained in a screening mode.
In this way, a plurality of third dimension reduction matrixes are obtained through a plurality of third dimension reduction models, a plurality of weight square matrixes are further obtained, then, the third dimension reduction matrixes for calculating the weights of the third relevant indexes are determined by determining and selecting the positions of the target elements of the weight square matrixes, and the weights of the third relevant indexes are obtained by using the average value, so that the accuracy of the obtained weights of the third relevant indexes can be improved, namely, the accuracy of the weights of the second relevant indexes is improved.
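The consistency screening and averaging described in this embodiment can be sketched as follows. The sketch assumes that the target element positions and the dimension reduction target element values have already been extracted for each trained model; the function name, index names and numbers are hypothetical.

```python
from collections import Counter


def mean_over_consistent(target_positions, target_values):
    """Average target element values over the largest group of models
    whose weight square matrixes share the same target positions.

    target_positions -- one collection of (row, col) positions per model
    target_values    -- one {index name: value} dict per model
    """
    keys = [frozenset(p) for p in target_positions]
    # The most frequent position pattern defines the consistent matrixes.
    winner, k = Counter(keys).most_common(1)[0]
    kept = [v for key, v in zip(keys, target_values) if key == winner]
    means = {name: sum(v[name] for v in kept) / k for name in kept[0]}
    return means, k


# Three hypothetical models; the first two agree on the target positions.
positions = [
    [(1, 1), (0, 2), (2, 0)],
    [(0, 2), (1, 1), (2, 0)],
    [(0, 0), (1, 1), (2, 2)],
]
values = [
    {"correct_rate_mean": 0.6, "correct_rate_std": 0.2},
    {"correct_rate_mean": 0.8, "correct_rate_std": 0.4},
    {"correct_rate_mean": 0.1, "correct_rate_std": 0.9},
]
means, k = mean_over_consistent(positions, values)
print(k, means)
```

The resulting mean values can then be normalized within each category in the same way as the single-model case, for example 0.7/(0.7+0.3) for the correct rate mean in this toy data.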
Of course, in another embodiment, for example when there are few third correlation indexes, each third correlation index may itself be used directly as an event result category. In this case, the probability prediction model can be trained directly with the actual probabilities and the third correlation index values of the second training data set to obtain the model matrix of a probability prediction model that meets the training requirement; each third correlation index weight is then obtained from the model matrix, each second correlation index weight is obtained by screening, and no dimension reduction operation is needed.
After the second correlation index weight corresponding to the target event result is obtained, the target event result value of each data unit can be obtained according to the second correlation index weight of each second correlation index and the second correlation index value.
And after the target event result value of each data unit is obtained, acquiring a first relevant index based on the target event result value and the first index value of each data unit.
Specifically, referring to fig. 4 and fig. 5, fig. 4 is a schematic diagram illustrating a flow of acquiring a first training data set of a target event result index weight determining method according to an embodiment of the present invention; fig. 5 is a schematic diagram illustrating obtaining of a first correlation index of the target event result index weight determining method according to the embodiment of the present invention.
After the target event result value and each first index value of each data unit are obtained, a first correlation index value needs to be further obtained to obtain a first training data set.
In one embodiment, as shown in fig. 4, the step of obtaining the first training data set comprises:
Step S102: An original first training data set is obtained, wherein the original first training data set comprises the target event result value and the first index values of each data unit, and the first indexes are all indexes expected to be relevant to the target event result.
It is easy to understand that the original first training data set is the data set after the target event result value and each first index value are obtained.
Step S103: and classifying the data units according to a preset target classification dimension to obtain a classification data set.
It is easy to understand that the predetermined target classification dimension is a target classification dimension obtained when the third correlation index is obtained, so as to classify the data units to maintain consistency.
Step S104: and obtaining the correlation degree of each first index and the target event result by using the first index value and the target event result value of each data unit of each classified data set through a correlation calculation algorithm, obtaining each first correlation index of which the correlation degree meets a first correlation threshold value, and obtaining the first training data set.
As shown in fig. 5, after the classification data set is obtained, the target event result value and the first index value of each data unit of the classification data set are input into the correlation calculation algorithm, and each first correlation index whose correlation with the target event result satisfies the first correlation threshold is obtained.
Similar to the third correlation index, the correlation obtained at this time may be a positive value indicating that the index and the target event result are positively correlated, or a negative value indicating that the index and the target event result are negatively correlated, where the correlation satisfying the first correlation threshold means that the absolute value of the correlation is greater than or equal to the correlation threshold.
Similarly, if the first correlation threshold is too high, too few indexes satisfy it, which is unfavorable for subsequently obtaining the target event result index weights; if it is too low, too many first indexes satisfy it, which increases the amount of computation. In a specific embodiment, the first correlation threshold may therefore be set to 0.2, that is, a first index whose correlation has an absolute value greater than or equal to 0.2 is determined to be a first correlation index, thereby balancing accuracy and computation. Of course, the first correlation threshold may be adjusted as needed, and may be the same as or different from the third correlation threshold.
As before, after the original first training data set is classified according to the target classification dimension, the data unit quantity of the original first training data set is affected, and the data unit quantity of each classification data set is different, so that different correlation calculation algorithms can be selected respectively.
Specifically, a Spearman rank correlation coefficient calculation algorithm or a Kendall rank correlation coefficient (Kendall's tau-b) calculation algorithm may be selected.
The Spearman rank correlation coefficient calculation algorithm does not require the data set to follow a normal distribution and is applicable to any relationship that can be described by a monotonic function; it is the better choice when the sample size is relatively large. When the sample size is relatively small, the Kendall's tau-b correlation coefficient calculation algorithm is less sensitive to errors and more accurate.
For the specific selection manner, refer to the foregoing description, which is not repeated here; when the correlation indexes are determined, the specific calculation algorithm only needs to be selected according to the number of data units in each classified data set.
And obtaining the correlation degree of each first index and the target event result, and obtaining each first correlation index of which the correlation degree meets a first correlation degree threshold value.
And acquiring data values of the first relevant indexes of the data units, and combining target event result values to obtain the first training data sets.
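The screening in step S104 can be illustrated with a small, dependency-free sketch. It computes the Spearman rank correlation (Pearson correlation of average ranks) between each first index and the target event result value and keeps the indexes whose absolute correlation is at least 0.2. The index names and data are invented, and the Kendall's tau-b branch for small classified data sets is omitted here for brevity.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def screen_indexes(index_values, target_values, threshold=0.2):
    """Keep indexes whose |correlation| with the target meets the threshold."""
    kept = {}
    for name, values in index_values.items():
        rho = spearman(values, target_values)
        if abs(rho) >= threshold:
            kept[name] = rho  # the sign shows positive or negative correlation
    return kept


# Hypothetical classified data set with five data units.
target = [1, 2, 3, 4, 5]
indexes = {
    "index_a": [2, 4, 6, 8, 10],  # rises with the target
    "index_b": [5, 4, 3, 2, 1],   # falls as the target rises
    "index_c": [2, 5, 3, 1, 4],   # no monotonic relationship
}
kept = screen_indexes(indexes, target)
print(kept)
```

In practice a statistics library would normally be used for both coefficients; the hand-written version is shown only so that the thresholding on the absolute value is explicit.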
Of course, based on the target classification dimension, a plurality of first training data sets are obtained, so that different target event result index weights are obtained based on different first training data sets, and when a target event influence factor value is obtained in the subsequent step, a corresponding target event result index weight is selected based on the first training data set to which the data unit of the specific target event result belongs.
In addition, it is easy to understand that different target event result index weights are obtained based on different target event results, so when obtaining the target event influence factor value, the corresponding target event result index weight needs to be selected according to the type of the target event result.
Step S11: and training a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the target event result fitting model meeting the training requirement is obtained, and acquiring a fitting matrix.
After the target event result value and the first correlation index values of each data unit are obtained, the target event result fitting model to be trained is trained with the first training data set: a predicted target event result value is obtained and compared with the target event result value (obtained based on the second correlation index values) to obtain a loss, and the loss is compared with a loss threshold. Training continues until the loss satisfies the loss threshold, at which point the target event result fitting model meeting the training requirement is obtained and the fitting matrix is acquired.
It is easy to understand that the target event result fitting model to be trained is also constructed in advance, and can be various fitting models such as a polynomial fitting model, a regression tree model or a random forest regression model.
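To make the train-until-the-loss-threshold-is-met loop of step S11 concrete, the following sketch fits a plain linear model by gradient descent and returns its coefficient vector as the fitting matrix. The linear model, learning rate, threshold and data are all assumptions chosen for illustration; the text above allows other fitting models, and nothing here is prescribed by the source.

```python
def train_fit_model(X, y, loss_threshold=1e-6, lr=0.2, max_steps=10_000):
    """Gradient descent on mean squared error; stops once the loss
    satisfies the loss threshold (or after max_steps as a safeguard)."""
    n, k = len(X), len(X[0])
    w = [0.0] * k  # fitting matrix: one coefficient per first correlation index
    loss = float("inf")
    for _ in range(max_steps):
        preds = [sum(wi * xi for wi, xi in zip(w, row)) for row in X]
        errors = [p - t for p, t in zip(preds, y)]
        loss = sum(e * e for e in errors) / n
        if loss <= loss_threshold:
            break
        grad = [2.0 / n * sum(e * row[j] for e, row in zip(errors, X))
                for j in range(k)]
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w, loss


# Hypothetical first training data set: two first correlation index values
# per data unit, with target event result values generated as 2*x1 - x2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0, -1.0, 1.0, 3.0]

fitting_matrix, final_loss = train_fit_model(X, y)
print(fitting_matrix, final_loss)
```

With a linear model the fitting matrix has one signed coefficient per first correlation index, which is the case in which the fitting matrix can serve directly as the weight matrix.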
Step S12: and obtaining a weight matrix at least according to the fitting matrix, wherein each element of the weight matrix corresponds to each first relevant index respectively.
After the fitting matrix is obtained, the weight matrix is further obtained. It should be noted that if only the fitting matrix is obtained in step S11, the fitting matrix itself is the weight matrix, and each element of the weight matrix corresponds to one first correlation index.
Step S13: and acquiring each first relevant index weight and the sign of each first relevant index weight according to each element of the weight matrix.
Obtaining a weight matrix, further obtaining each first related index weight according to each element of the weight matrix, and determining a sign of the first related index weight according to a sign of each element of the weight matrix.
Specifically, in order to obtain the weights of the first relevant indexes, the absolute values of the elements of the weight matrix may be first obtained to obtain element absolute values, then the sum of the absolute values of the elements is obtained, then the ratio of the absolute values of the elements to the sum of the absolute values is obtained to obtain the weights of the first relevant indexes, and preparation is made for subsequently obtaining the influence factor values of the target event result.
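The calculation in step S13 can be written out as a short sketch: each weight is the element's absolute value divided by the sum of all absolute values, and the sign is read from the element itself. The function name and the example values are hypothetical.

```python
def weights_and_signs(weight_matrix):
    """Return (weights, signs) for a weight matrix given as a flat list."""
    abs_values = [abs(v) for v in weight_matrix]
    total = sum(abs_values)  # sum of the element absolute values
    weights = [a / total for a in abs_values]
    signs = [1 if v >= 0 else -1 for v in weight_matrix]
    return weights, signs


# Hypothetical weight matrix with one element per first correlation index.
weights, signs = weights_and_signs([0.5, -1.5, 2.0])
print(weights)  # [0.125, 0.375, 0.5]
print(signs)    # [1, -1, 1]
```

Keeping the sign separate from the weight lets the weights sum to 1 while still recording whether an index pushes the target event result up or down.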
It can be seen that, in the target event result index weight determining method provided in the embodiment of the present invention, the target event result fitting model is trained with the target event result values obtained based on the second correlation index values and with the first correlation index values, which have an indirect influence on the target event result value, to obtain the fitting matrix; each first correlation index weight and its sign are then obtained from the fitting matrix. This prepares for obtaining the influence factor values that indirectly influence the target event result and for using those values as a reference when adjusting the behaviors of the actor, which helps the event probability influenced by the target event result to change in the expected direction, so that the real event probability in the future meets expectations.
However, to improve the accuracy of the obtained target event result index weights, more first indexes need to be used, so that more first correlation indexes are obtained. This results in a larger number of first correlation index weights and, in turn, a larger number of subsequently obtained influence factors of the target event result, which is unfavorable for the subsequent behavior adjustment of the actor. Therefore, in another specific implementation, an embodiment of the present invention further provides a target event result index weight determining method in which the obtained target event result index weights support clustering of the first correlation indexes, so that the subsequently obtained influence factors of the target event result are more concentrated and the influences can be determined hierarchically, which facilitates behavior adjustment by the actor. Please refer to fig. 6, which is another flowchart of the target event result index weight determining method according to an embodiment of the present invention.
As shown in the figure, the method for determining target event result index weight provided by the embodiment of the present invention includes:
step S20: acquiring a first training data set, wherein the first training data set comprises target event result values of all data units and first related index values, each first related index comprises a pre-labeled first related index type, and the number of the first related index types is smaller than that of the first related indexes.
For details of the step S20, please refer to the detailed description of the step S10 in fig. 1-5, which is not repeated herein.
It should be added that, in this embodiment, in order to determine the target event result index weights while, on the basis of ensuring accuracy, improving the concentration of the finally obtained influence factors of the target event result and making it easier to determine which behaviors of the actor should be adjusted, dimension reduction may be performed on the first correlation indexes whose correlation satisfies the first correlation threshold.
To make the dimension reduction possible, after the first correlation indexes are obtained, the correspondence between each first correlation index before dimension reduction and a first correlation index category after dimension reduction needs to be determined. In a specific embodiment, a first correlation index category may be labeled in advance for each first correlation index according to the association in actual meaning between the index and the category; that is, each first correlation index includes a pre-labeled first correlation index category. It is easily understood that the first correlation index categories are labeled to achieve dimension reduction of the first correlation indexes, so the number of first correlation index categories is smaller than the number of first correlation indexes.
For convenience of understanding, a description is now given, with reference to the case in the foregoing teaching scenario, of a first relevant index category of the first relevant index:
for each first correlation index: lesson preparation function utilization rate, average lesson preparation time, lesson preparation time standard deviation, controllable countdown initiation rate, controllable countdown average initiation times, controllable countdown initiation times standard deviation, random roll utilization rate, lesson average random roll times, random roll times standard deviation, on-the-fly on-the-wall class initiation times, on-the-fly on-the-wall initiation times standard deviation, on-the-fly on-the-wall cause dimension number average, on-the-fly on-the-wall cause dimension number standard deviation, on-the-fly on-the-wall initiation rate, total mark lesson average initiation quantity, total mark initiation quantity standard deviation, total mark initiation quantity on-one-stop initiation rate, total mark deduction average initiation times, total mark deduction initiation times standard deviation, subjective question remark utilization rate, subjective question remark lesson average record number, question remark average record number, number of people, total mark on average record number, total mark initiation times, total mark deduction initiation times, total mark initiation, total mark record number of the system initiation, total mark initiation, the sending rate of the class show and the completion degree of the report of the personal show are marked as the following table:
First correlation index | First correlation index category |
Usage rate of lesson preparation function, average lesson preparation time, standard deviation of lesson preparation time | Preparing lessons before class |
Controllable countdown initiation rate, controllable countdown class-average initiation times, controllable countdown initiation times standard deviation, random roll call utilization rate, class-average random roll call times, random roll call times standard deviation | Classroom teaching |
The number of times of class initiation, standard deviation, mean value, standard deviation, rate and integral of the initiation Standard deviation of starting rate, total mark class starting quantity and total mark starting quantity | Student incentive |
The initiating rate of one station to the bottom, the initiating times of one station to the bottom class, and the standard deviation of the initiating times of one station to the bottom | Classroom atmosphere |
Point deduction initiation rate, class-average point deduction initiation times, point deduction initiation times standard deviation, subjective question remark utilization rate, learning situation record utilization rate, class-average number of subjective question remarks, class-average number of people recorded in subjective question remarks | Student attention |
Class show transmission rate, individual show report completion | Post-session services |
Thus, the plurality of first correlation indexes are subjected to dimension reduction to obtain 6-dimensional first correlation index categories.
Step S21: and performing series training on a first dimension reduction model to be trained and a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the first dimension reduction model and the target event result fitting model meeting the training requirements are obtained, and acquiring a first dimension reduction matrix and the fitting matrix.
After a first training data set marked with a first relevant index category is obtained, in order to realize dimension reduction and ensure the accuracy of a dimension reduction result, a first relevant index value of a data unit is used for inputting a first dimension reduction model to be trained and a target event result fitting model to be trained, the target event result is predicted to obtain each predicted target event result value, then event result loss is obtained further according to the difference between the target event result value and the predicted target event result value until the event result loss meets an event result loss threshold value, and the first dimension reduction model and the target event result fitting model which meet training requirements, as well as a first dimension reduction matrix and a fitting matrix are obtained.
Of course, in the process of performing the tandem training, the output of the first dimension reduction model is directly input into the target event result fitting model, and the two are performed in tandem without data output.
Specifically, the first dimension reduction model may be a Restricted Boltzmann Machines (RBM) model or a Principal Component Analysis (PCA) model.
The number of rows of the first dimension reduction matrix equals the number of first correlation indexes, and its number of columns equals the number of first correlation index categories after dimension reduction; the number of rows of the fitting matrix equals the number of first correlation index categories after dimension reduction, and the number of columns of the fitting matrix is 1.
It should be noted that, the training of the first dimension reduction model and the target event result fitting model is only to obtain the first dimension reduction matrix and the fitting matrix.
Step S22: and obtaining a weight matrix according to the first dimension reduction matrix and the fitting matrix.
After the first dimension reduction matrix and the fitting matrix are obtained, corresponding operation is carried out on the first dimension reduction matrix and the fitting matrix to obtain a weight matrix, and preparation is made for obtaining the weight of each first relevant index and subsequently obtaining the category value (namely the influence factor value) of the first relevant index.
The obtained weight matrix can further reflect the influence degree of each first related index on the target event result by utilizing the fitting matrix, and the first related index is ensured to be strongly associated with the target event result.
In a specific embodiment, the weight matrix may be obtained by performing an inner product operation on the first dimension reduction matrix and the fitting matrix, so that not only the influence degree of each first correlation index on the target event result can be more accurately reflected, but also the sign of the weight of the first correlation index can be determined.
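Given the shapes stated above (one row of the first dimension reduction matrix per first correlation index, one fitting matrix entry per category), the product in step S22 can be sketched as a matrix-vector multiplication. The function name and all numbers are hypothetical.

```python
def weight_matrix(reduction_matrix, fitting_matrix):
    """Multiply an n x k dimension reduction matrix by a k x 1 fitting
    matrix, giving one signed element per first correlation index."""
    return [
        sum(d * f for d, f in zip(row, fitting_matrix))
        for row in reduction_matrix
    ]


# Hypothetical example: 4 first correlation indexes reduced to 2 categories.
D = [
    [1.0, 0.0],
    [0.5, 2.0],
    [0.0, 1.0],
    [2.0, 1.0],
]
F = [2.0, -1.0]  # fitting matrix stored as a flat list of k values

wm = weight_matrix(D, F)
print(wm)  # [2.0, -1.0, -1.0, 3.0]
```

Each resulting element combines how strongly an index loads on every category with how strongly each category drives the fitted target event result, and its sign carries over to the corresponding first correlation index weight.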
Step S23: and obtaining the absolute value of the element value of each element of the weight matrix and the sign of the element value of each element to obtain the absolute value of the element and the sign of the first correlation index weight.
After the weight matrix is obtained, in order to obtain a first correlation index weight for converting a first correlation index value into a first correlation index class value, an absolute value of an element value of each element of the weight matrix may be first obtained to obtain an element absolute value, and a sign of the element value of each element of the weight matrix may be used as a sign of each first correlation index weight.
Step S24: obtain each first correlation index weight according to the element absolute values corresponding to the same first correlation index category.
After the element absolute values are obtained, the first correlation index weights are further obtained from them.
In order to obtain each first relevant index weight, an embodiment of the present invention further provides a method for determining a target event result index weight, as shown in fig. 7, where fig. 7 is a schematic flow chart of obtaining the first relevant index weight of the method for determining a target event result index weight provided by the embodiment of the present invention.
The step of obtaining the first correlation index weight comprises:
step S240: obtain the sum of the element absolute values corresponding to the same first related index category, to obtain an index category absolute value.
Since the first correlation index categories of the respective first correlation indexes are not completely the same, the first correlation index categories are taken as a unit when the first correlation index weights are calculated.
The element absolute values belonging to the same index category are collected, and their sum is taken as the index category absolute value of that category.
Continuing the foregoing example: first, the element absolute values of the first relevant indexes in the weight matrix that belong to the pre-class lesson preparation category are obtained, namely those of the lesson preparation function usage rate, the average lesson preparation time and the lesson preparation time standard deviation; the sum of these three element absolute values is the index category absolute value.
Step S241: obtain the ratio of each element absolute value corresponding to the same first correlation index category to the index category absolute value, to obtain each first correlation index weight.
After the index category absolute value is obtained, the ratio of each element absolute value of the same first related index category to that index category absolute value is computed, giving the first related index weight.
Continuing the example, the element absolute value corresponding to the lesson preparation function usage rate, the element absolute value corresponding to the average lesson preparation time and the element absolute value corresponding to the lesson preparation time standard deviation in the weight matrix are each divided by the index category absolute value of the pre-class lesson preparation category to obtain the first related index weight of each of these first related indexes. The first related index weights of the other first related index categories are calculated in the same way.
Therefore, on one hand, the first relevant index weight can be simply obtained according to the weight matrix, and on the other hand, the obtained first relevant index weight corresponds to the ratio of each first relevant index in the first relevant index category, so that the accuracy of the subsequently calculated influence factor value corresponding to the first relevant index category is ensured.
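Steps S23, S240 and S241 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weight matrix is passed as a flat list with one element per first correlation index, the category labels and numeric values are hypothetical, and the handling of zero elements (treated as positive, and given weight 0.0 when a whole category sums to zero) is a choice made for this sketch that the text does not specify.

```python
from collections import defaultdict


def category_weights(elements, categories):
    """Return (weights, signs) for weight-matrix elements grouped by category.

    elements:   one weight-matrix element value per first correlation index
    categories: the pre-labeled category of each index, in the same order
    """
    # Step S240: sum of element absolute values per category.
    totals = defaultdict(float)
    for value, cat in zip(elements, categories):
        totals[cat] += abs(value)

    # Step S241: ratio of each element absolute value to its category total.
    # Returning 0.0 for an all-zero category is an assumption of this sketch.
    weights = [
        abs(value) / totals[cat] if totals[cat] else 0.0
        for value, cat in zip(elements, categories)
    ]

    # Step S23: the sign of each element is the sign of the index weight.
    signs = [1 if value >= 0 else -1 for value in elements]
    return weights, signs


# Hypothetical element values for five indexes in two categories.
elements = [0.5, -1.0, 0.5, 3.0, -1.0]
categories = ["lesson_preparation"] * 3 + ["roll_call"] * 2

weights, signs = category_weights(elements, categories)
print(weights)  # [0.25, 0.5, 0.25, 0.75, 0.25]
print(signs)    # [1, -1, 1, 1, -1]
```

Within each category the weights sum to 1, which is what lets the category's influence factor value be read as a weighted combination of its indexes.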
In addition, it can be seen that reducing the dimension of the first relevant indexes by means of the first relevant index categories, and then obtaining the first relevant index weights, has two benefits. On the one hand, the influence relationship can be constructed from a large number of first relevant indexes, first relevant index values and target event results, which makes the constructed influence relationship more accurate. On the other hand, the first relevant indexes and first relevant index values are converted into influence factor values of fewer dimensions, so that the obtained influence factors are more concentrated; this avoids the problem that too many first relevant indexes make it hard for the actor to identify accurately which behavior to adjust. The behavior to be adjusted can thus be determined more conveniently while the constructed influence relationship remains highly accurate, so that the event probability influenced by the target event result changes in the expected direction.
To further improve the accuracy of the obtained first relevant index weights, in another specific embodiment, data units may be randomly extracted from one first training data set to obtain a plurality of first training data subsets. In this case, the number of first dimension reduction models is at least equal to the number of first training data subsets, and each first relevant index weight is obtained through the following steps:
respectively utilizing the target event result value and the first correlation index value of the data unit of each first training data subset to carry out series training on a corresponding first dimension reduction model and a corresponding target event result fitting model until each first dimension reduction model and each target event result fitting model meeting the training requirement are obtained, and obtaining each first dimension reduction matrix and each fitting matrix;
obtaining initial weight matrixes according to the first dimension reduction matrix and the fitting matrix obtained by using the same first training data subset, and obtaining each initial weight matrix;
and acquiring the weight matrix according to each initial weight matrix.
Unlike the foregoing way of obtaining the first correlation index weights, in this embodiment the series training of the models yields a plurality of first dimension reduction matrices and a plurality of fitting matrices.
After the first dimension reduction matrices and the fitting matrices are obtained, an initial weight matrix is obtained from each pair of first dimension reduction matrix and fitting matrix obtained from the same first training data subset. For the specific way of obtaining an initial weight matrix, please refer to the foregoing description of the weight matrix.
The weight matrix is then obtained from the initial weight matrices.
In one embodiment, the weight matrix may be obtained by averaging the element values at the same position across the initial weight matrices.
Therefore, a plurality of initial weight matrixes are obtained through the plurality of first dimension reduction matrixes and the plurality of fitting matrixes, and the weight matrixes are obtained through the plurality of initial weight matrixes, so that the accuracy of the obtained first relevant index weight can be improved.
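The averaging embodiment can be illustrated with a short Python sketch. It assumes that all initial weight matrices have the same shape and that "average value of element values" means a position-by-position mean; the function name `average_matrices` and the numbers are hypothetical.

```python
def average_matrices(matrices):
    """Element-wise mean of equally shaped matrices (lists of rows)."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    for m in matrices:
        if len(m) != rows or any(len(r) != cols for r in m):
            raise ValueError("all initial weight matrices must share one shape")
    n = len(matrices)
    return [
        [sum(m[i][j] for m in matrices) / n for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical initial weight matrices from two first training data subsets,
# each with one element per first correlation index.
initial_weight_matrices = [
    [[1.0], [-2.0]],
    [[3.0], [-4.0]],
]

weight_matrix = average_matrices(initial_weight_matrices)
print(weight_matrix)  # [[2.0], [-3.0]]
```

Averaging over subsets damps the effect of any single random draw of data units on the final weights.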
In order to determine the target event result influence factor value and further implement guidance on behavior adjustment of an agent, an embodiment of the present invention further provides a method for determining a target event result influence factor value, please refer to fig. 8, where fig. 8 is a flowchart illustrating the method for determining a target event result influence factor value according to the embodiment of the present invention.
As shown in the figure, the method for determining the target event result influence factor value according to the embodiment of the present invention includes:
step S30: the method includes obtaining a first correlation index weight of each first correlation index of a data unit determined by a target event result index weight determination method as described above and a sign of the first correlation index weight, and obtaining a first correlation index value of each first correlation index of the data unit.
It is easily understood that although the first index of the data unit is many, it is known through the target event result index weight determination process that only the first correlation index value, that is, the first correlation index value of the first index whose correlation satisfies the first correlation threshold value, obtained through the correlation calculation, needs to be obtained here.
The data unit for obtaining the target event result influence factor value may be a data unit used in the target event result index weight determination process, or may be a data unit unused in the target event result index weight determination process, and of course, the first correlation index weight of each obtained first correlation index and the sign of the first correlation index weight are only related to the first correlation index, and are not related to whether the data unit is used in the target event result index weight determination process. In the method for determining the target event result index weight, the data units are classified according to the target classification dimension, the obtained first relevant index weight and the sign of the first relevant index weight are also relevant to the target classification dimension, and when the target event result influence factor value is obtained, the corresponding first relevant index weight and the sign of the first relevant index weight are also selected according to the data set where the data units are located.
Step S31: obtain each influence factor value by using the mutually corresponding first correlation index value, first correlation index weight and sign of the first correlation index weight.
After the first correlation index value, the first correlation index weight and the sign of the first correlation index weight are obtained, the influence factor value is further obtained.
Specifically, when the sign of the first correlation index weight is positive, the influence factor value may be the product of the first correlation index value and the first correlation index weight; when the sign of the first correlation index weight is negative, the influence factor value may be the product of the difference (1 - first correlation index value) and the first correlation index weight.
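The two cases can be written compactly as follows. The symbols are introduced here only for illustration and do not appear in the original text: x_i denotes the first correlation index value, w_i the (non-negative) first correlation index weight, s_i the sign of that weight, and f_i the resulting influence factor value. The use of 1 - x_i suggests that the index values are scaled to the interval [0, 1]; that scaling is an assumption, since this passage does not state it.

```latex
f_i =
\begin{cases}
w_i \, x_i, & s_i > 0, \\
w_i \, (1 - x_i), & s_i < 0.
\end{cases}
```

Under that assumption, a negatively signed index contributes more to the influence factor value the smaller its index value is.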
It can be seen that the method for determining the target event result influence factor value provided in the embodiment of the present invention obtains, from the first relevant index weight, its sign and the first relevant index value, influence factor values that indirectly affect the target event result. The actions of the agent can then be adjusted based on the influence factor values, which helps move the event probability influenced by the target event result in the expected direction, so that the future real event probability meets expectations.
In order to determine the target event result influence factor value and further implement guidance on behavior adjustment of an agent, an embodiment of the present invention further provides another method for determining the target event result influence factor value, please refer to fig. 9, where fig. 9 is another schematic flow chart of the method for determining the target event result influence factor value according to the embodiment of the present invention.
As shown in the figure, a method for determining a target event result influence factor value according to another embodiment of the present invention includes:
step S40: obtain the first correlation index weight of each first correlation index of a data unit, and the sign of that weight, as determined by the target event result index weight determination method described above, and obtain the first correlation index value of each first correlation index of the data unit.
For details of step S40, please refer to the related description of step S30, which is not repeated herein.
It should be noted that the first correlation index weight of each first correlation index obtained at this time and the sign of the first correlation index weight are obtained based on the first dimension reduction matrix and the fitting matrix.
Step S41: obtain the influence value of each first relevant index category according to the first relevant index values, the first relevant index weights and the signs of the first relevant index weights corresponding to the same first relevant index category, so as to obtain the influence factor values.
After a first correlation index value, a first correlation index weight and a sign of the first correlation index weight are obtained, an index influence value of each first correlation index is obtained based on each first correlation index value, the first correlation index weight and the sign of the first correlation index weight corresponding to the same first correlation index category, and the influence factor value is obtained through the index influence value.
Specifically, when the sign of the first correlation index weight is positive, the index influence value may be the product of the first correlation index value and the first correlation index weight; when the sign is negative, the index influence value may be the product of the difference (1 - first correlation index value) and the first correlation index weight. The index influence values corresponding to the same first correlation index category are then added together to obtain the influence factor value of each first correlation index category.
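A minimal Python sketch of step S41 follows. It assumes index values scaled to [0, 1] (suggested by the "1 - value" form but not stated in this passage) and signs encoded as +1 or -1; the function names, category labels and numbers are hypothetical.

```python
from collections import defaultdict


def index_influence(value, weight, sign):
    """Index influence value: weight * value if sign is positive,
    otherwise weight * (1 - value)."""
    return weight * value if sign > 0 else weight * (1.0 - value)


def influence_factors(values, weights, signs, categories):
    """Sum the index influence values within each category."""
    factors = defaultdict(float)
    for v, w, s, c in zip(values, weights, signs, categories):
        factors[c] += index_influence(v, w, s)
    return dict(factors)


# Hypothetical data unit with four first correlation indexes in two categories.
values = [0.5, 0.25, 1.0, 0.0]
weights = [0.5, 0.5, 0.25, 0.75]
signs = [1, -1, 1, -1]
categories = ["lesson_preparation", "lesson_preparation", "roll_call", "roll_call"]

factors = influence_factors(values, weights, signs, categories)
print(factors)  # {'lesson_preparation': 0.625, 'roll_call': 1.0}
```

The result has one influence factor value per category rather than one per index, which is the dimension reduction the embodiment is aiming for.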
It can be seen that the method for determining the target event result influence factor value provided in the embodiment of the present invention uses a larger number of first related indexes to improve the accuracy of the influence relationship, while mitigating the problem that too many first related indexes are too dispersed for the actor to identify accurately which behavior to adjust. The obtained influence factor values therefore retain high accuracy and make it more convenient to determine the behavior the actor should adjust, so that the event probability influenced by the target event result changes in the expected direction.
In the following, the target event result index weight determining apparatus and the target event result influencing factor value determining apparatus provided in the embodiments of the present invention are introduced, and the target event result index weight determining apparatus and the target event result influencing factor value determining apparatus described below may be considered as a functional module architecture that is required to be set by an electronic device (e.g., a PC) to respectively implement the target event result index weight determining method and the target event result influencing factor value determining method provided in the embodiments of the present invention. The contents of the target event result index weight determination means and the target event result influencing factor value determination means described below may be referred to in correspondence with the contents of the target event result index weight determination method and the target event result influencing factor value determination method described above, respectively.
Fig. 10 is a block diagram of a target event result index weight determining apparatus provided in an embodiment of the present invention, where the target event result index weight determining apparatus is applicable to both a client and a server, and referring to fig. 10, the target event result index weight determining apparatus includes:
a first training data set obtaining unit 100 adapted to obtain a first training data set, wherein the first training data set includes a target event result value and each first correlation index value of each data unit, the first correlation index value is a numerical value of each first index whose correlation with the target event result satisfies a first correlation threshold, the target event result value is obtained based on at least the data unit and each second correlation index value and a second correlation index weight, the second correlation index is an index associated with the target event result, and the second correlation index value directly reflects the target event result value, the first correlation index value indirectly affects the target event result value;
a fitting matrix obtaining unit 110, adapted to train a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the target event result fitting model meeting the training requirement is obtained, and obtain a fitting matrix;
a weight matrix obtaining unit 120, adapted to obtain a weight matrix at least according to the fitting matrix, where each element of the weight matrix corresponds to each of the first correlation indexes;
an index weight obtaining unit 130 adapted to obtain each first relevant index weight and a sign of each first relevant index weight according to each element of the weight matrix.
It is easy to understand that the target event result index weight actually means the influence of different indexes on the generation of a certain target event result, and needs to be constructed based on a large amount of data, so in order to achieve the acquisition of the target event result index weight, it is necessary to acquire the data set first, i.e. the aforementioned first training data set, involving a large number of data units.
In order to obtain a target event result index weight, the first training data set needs to include a target event result value and a first correlation index value of each data unit, wherein the target event result value is obtained at least based on a second correlation index value and a second correlation index weight of the data unit, the second correlation index is an index associated with the target event result, and the second correlation index value directly reflects the target event result value; and the first correlation index value is an index value indirectly affecting the target event result value, and the first correlation index value is a numerical value of each first index whose correlation with the target event result satisfies a first correlation threshold.
The data unit refers to the unit to which a group of corresponding values belongs, namely a target event result value, the first correlation index values, the second correlation index values and the event probability. Taking a class as an example, the target event result value corresponding to the class may be a student session value, a student favorite course value or a student attention value; when the target event result value is determined to be the student session value, the first related index values may be the teacher lesson preparation function usage rate, the average lesson preparation time, the lesson preparation time standard deviation, the random roll call usage rate, the class average random roll call frequency, the flying wall class average start frequency, and the like, and the second related index values may be the topic accuracy rate, the topic participation rate, the understanding rate, and the like.
Optionally, the first training data set obtaining unit 100 is adapted to obtain a first training data set, wherein the first training data set includes the target event result value and each first correlation index value of each data unit, and includes:
determining respective second correlation indicators for respective ones of the data units that are correlated with the target event result;
and acquiring the target event result value of each data unit according to a second correlation index weight corresponding to each second correlation index and the second correlation index value.
Optionally, the first training data set obtaining unit 100 is adapted to obtain the target event result value of each data unit according to a second correlation index weight corresponding to each second correlation index and the second correlation index value, where the obtaining of the second correlation index weight includes:
acquiring a second training data set, wherein the second training data set comprises an actual probability and a third correlation index value of each data unit, the third correlation index value is a numerical value of each third index of which the correlation with the actual probability meets a third correlation threshold, the index set of each third index comprises each second index, a pre-labeled event result type of each third correlation index at least comprises the target event result, the actual probability is a real probability of an event occurrence of the data unit, and the numerical value of the actual probability is influenced by the target event result value and the non-target event result value;
performing series training on a third dimension reduction model and a probability prediction model by using the actual probability and the third relevant index values of each data unit until the third dimension reduction model and the probability prediction model meeting a preset target are obtained, so as to obtain a third dimension reduction matrix, wherein each row of the third dimension reduction matrix corresponds to each third relevant index;
and obtaining each third relevant index weight according to the third dimension reduction matrix, and screening from each third relevant index weight according to the second relevant index to obtain each second relevant index weight.
Optionally, the first training data set obtaining unit 100, adapted to obtain each third relevant index weight according to the third dimension reduction matrix, includes:
adjusting the third dimension reduction matrix by using an index correlation direction value and the event result category corresponding to each third correlation index to obtain a weight square matrix with the number of rows and columns equal to the number of the event result categories, and acquiring each target element of the weight square matrix, which is used for representing the event result categories, wherein the index correlation direction value is a correlation direction value of the correlation degree between the third correlation index and the actual probability, each row of the weight square matrix corresponds to the event result category of each third correlation index, and the number of the target elements is equal to the number of the event result categories;
determining each dimension reduction target element corresponding to each target element in the third dimension reduction matrix according to the position of each target element in the weight square matrix;
and acquiring the weight of each third relevant index by using the third relevant index corresponding to each dimension reduction target element.
In a specific embodiment, the second training data set includes respective second training data subsets, and the number of the third dimension reduction models is at least equal to the number of the second training data subsets; optionally, the first training data set obtaining unit 100 is adapted to obtain the target event result value of each of the data units according to a second correlation index weight corresponding to each of the second correlation indexes and the second correlation index value, and the obtaining of the second correlation index weight includes:
respectively utilizing the actual probability and the third correlation index value of the data unit of each second training data subset to carry out series training on a third dimension reduction model and a probability prediction model corresponding to the actual probability and the third correlation index value until each third dimension reduction model and each probability prediction model meeting a preset target are obtained, and obtaining each third dimension reduction matrix;
adjusting each third dimension reduction matrix by using the index correlation direction value corresponding to each third correlation index and the event result category to obtain each weight square matrix with the number of rows and columns equal to the number of the event result categories, and acquiring each target element of each weight square matrix, which is used for representing the event result category;
determining, among the weight square matrices, the largest group of weight square matrices whose target elements are located at the same positions, to obtain the consistent weight square matrices, and determining the third dimension reduction matrix corresponding to each consistent weight square matrix;
determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in each third dimension reduction matrix according to the position of each target element in each consistent weight square matrix, obtaining the mean value of each dimension reduction target element value at the same position in each third dimension reduction matrix to obtain the mean value of the dimension reduction target elements, and obtaining each third related index weight by using each dimension reduction target element mean value and the index corresponding to each dimension reduction target element.
Optionally, the obtaining, by the first training data set obtaining unit 100, a third correlation index weight by using the mean value of each dimension-reducing target element and an index corresponding to each dimension-reducing target element includes:
determining each third relevant index with the same event result type to obtain each same type index;
obtaining the sum of the dimension reduction target element mean values of the indexes of the same category, to obtain a dimension reduction index sum value;
and obtaining each third related index weight by using the dimension reduction target element mean value of each index of the same category and the corresponding dimension reduction index sum value.
Optionally, the first training data set obtaining unit 100, adapted to obtain each target element of the weight square matrix, which is used for representing the event result category, includes:
obtaining the element with the maximum value among the elements of the weight square matrix as a target element, and obtaining the row and the column in which the target element is located;
and ignoring the elements of the row and the elements of the column in which the target element is located in the weight square matrix to obtain an adjustment square matrix, and taking the adjustment square matrix as a new weight square matrix to obtain a new target element, until all the target elements are obtained.
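The iterative selection of target elements can be sketched as a greedy loop in Python. This is an illustration only: the function name `select_target_elements` and the matrix values are hypothetical, the maximum is taken over raw element values as the text states, and the handling of ties (the first maximum in row-major order wins) is an assumption the text does not address.

```python
def select_target_elements(square):
    """Greedily pick one target element per row and column of a square matrix.

    Repeatedly takes the largest remaining element, then ignores its row and
    column, until every row and column has been used once.
    Returns the (row, column) positions in the order they were selected.
    """
    n = len(square)
    if any(len(row) != n for row in square):
        raise ValueError("weight square matrix must be n x n")

    free_rows, free_cols = set(range(n)), set(range(n))
    targets = []
    while free_rows:
        # Largest element among rows and columns not yet ignored.
        r, c = max(
            ((i, j) for i in sorted(free_rows) for j in sorted(free_cols)),
            key=lambda rc: square[rc[0]][rc[1]],
        )
        targets.append((r, c))
        free_rows.remove(r)
        free_cols.remove(c)
    return targets


# Hypothetical 3 x 3 weight square matrix (three event result categories).
square = [
    [0.9, 0.2, 0.1],
    [0.8, 0.3, 0.7],
    [0.1, 0.6, 0.4],
]

targets = select_target_elements(square)
print(targets)  # [(0, 0), (1, 2), (2, 1)]
```

Because each pick removes one row and one column, the number of target elements always equals the number of event result categories.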
After the target event result value and the first correlation index values of each data unit are obtained, the fitting matrix obtaining unit 110 trains the target event result fitting model to be trained by using the first training data set: it obtains a predicted target event result value, compares it with the target event result value (obtained based on the second correlation index values) to obtain a loss, and compares the loss with a loss threshold. When the loss meets the loss threshold, the target event result fitting model meeting the training requirement is obtained, and the fitting matrix is obtained from it.
It is easy to understand that the target event result fitting model to be trained is also constructed in advance, and can be various fitting models such as a polynomial fitting model, a regression tree model or a random forest regression model.
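To make the "train until the loss meets the threshold, then read off the fitting matrix" loop concrete, the sketch below swaps in the simplest possible fitting model: a linear model without intercept, trained by plain gradient descent on mean squared error. The linear model, the optimizer, the function name `train_linear_fit`, the hyperparameters and the data are all assumptions made for illustration; the text itself names polynomial, regression tree and random forest models and does not prescribe a training procedure.

```python
def train_linear_fit(features, targets, loss_threshold=1e-6, lr=0.1, max_epochs=10000):
    """Fit targets ~ features @ coef by gradient descent on mean squared error.

    Stops as soon as the loss meets the threshold (or after max_epochs).
    Returns (fitting_matrix, last_measured_loss), where fitting_matrix is a
    column matrix with one row per first correlation index.
    """
    n, d = len(features), len(features[0])
    coef = [0.0] * d
    loss = float("inf")
    for _ in range(max_epochs):
        # Predicted target event result values and their errors.
        preds = [sum(c * x for c, x in zip(coef, row)) for row in features]
        errors = [p - t for p, t in zip(preds, targets)]
        loss = sum(e * e for e in errors) / n
        if loss <= loss_threshold:
            break
        # Gradient of the mean squared error with respect to each coefficient.
        grads = [
            2.0 / n * sum(e * row[j] for e, row in zip(errors, features))
            for j in range(d)
        ]
        coef = [c - lr * g for c, g in zip(coef, grads)]
    return [[c] for c in coef], loss


# Hypothetical first training data set: two first correlation index values per
# data unit, with target event result values generated from coefficients 2 and -1.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
targets = [2.0, -1.0, 1.0, 0.5]

fitting_matrix, final_loss = train_linear_fit(features, targets)
print([[round(v[0], 2)] for v in fitting_matrix])
```

In this simplified setting the learned coefficients themselves play the role of the fitting matrix, and their signs indicate whether each index pushes the target event result value up or down.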
After the fitting matrix is obtained, the weight matrix obtaining unit 120 further obtains the weight matrix. It should be noted that if only the fitting matrix is obtained, the fitting matrix itself serves as the weight matrix; in either case, each element of the weight matrix corresponds to one first relevant index.
After the weight matrix is obtained, the index weight obtaining unit 130 further obtains each first related index weight according to the elements of the weight matrix, and may determine the sign of each first related index weight from the sign of the corresponding element.
Specifically, in order to obtain the weights of the first relevant indexes, the absolute values of the elements of the weight matrix may be first obtained to obtain element absolute values, then the sum of the absolute values of the elements is obtained, then the ratio of the absolute values of the elements to the sum of the absolute values is obtained to obtain the weights of the first relevant indexes, and preparation is made for subsequently obtaining the influence factor values of the target event result.
It can be seen that, in the target event result index weight determining apparatus provided in the embodiment of the present invention, the target event result value obtained based on the second correlation index value and the first correlation index value having an indirect influence on the target event result value are used to train the target event result fitting model to obtain the fitting matrix, and then each first correlation index weight and its symbol are obtained according to the fitting matrix, so that a preparation may be made for obtaining an influence factor value having an indirect influence on the target event result, and a preparation may be made for providing a reference for behavior action adjustment of an actor based on the influence factor value, which is helpful for changing the event probability influenced by the target event result toward an expected direction, so that the real event probability in the future satisfies an expectation.
In another specific embodiment, improving the accuracy of the obtained target event result index weights requires a larger number of first indexes, so the number of first related indexes obtained is also larger. As a result, the number of first related index weights is larger, and so is the number of factors subsequently found to influence the target event result, which is not conducive to the subsequent adjustment of an agent's behavior.
In this embodiment, the first training data set obtaining unit 100 of the target event result index weight determining apparatus provided in the embodiment of the present invention is adapted to obtain a first training data set, where each obtained first relevant index includes a first pre-labeled relevant index class, and the number of the first relevant index classes is smaller than the number of the first relevant indexes;
the fitting matrix obtaining unit 110 is adapted to train a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the target event result fitting model meeting the training requirement is obtained, and obtaining the fitting matrix includes:
and performing series training on a first dimension reduction model to be trained and a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the first dimension reduction model and the target event result fitting model meeting the training requirements are obtained, and acquiring a first dimension reduction matrix and the fitting matrix.
The weight matrix obtaining unit 120, adapted to obtain the weight matrix according to at least the fitting matrix, includes:
obtaining a weight matrix according to the first dimension reduction matrix and the fitting matrix;
the index weight obtaining unit 130, adapted to obtain each first relevant index weight and the sign of each first relevant index weight according to each element of the weight matrix, includes:
obtaining the absolute value of the element value of each element of the weight matrix and the sign of the element value of each element to obtain the absolute value of the element and the sign of the first correlation index weight;
and acquiring the weight of each first correlation index according to each absolute value of the element corresponding to the same first correlation index category.
To make the dimension reduction possible, once the first relevant indexes are obtained, the correspondence between each first relevant index before dimension reduction and a first relevant index category after dimension reduction must be determined. In a specific embodiment, this correspondence is established by labeling each first relevant index in advance with a first relevant index category, according to the association in actual meaning between the index and the category; that is, each first relevant index includes a pre-labeled first relevant index category. It is easily understood that the first relevant index categories are labeled in order to reduce the dimension of the first relevant indexes, and therefore the number of first relevant index categories is smaller than the number of first relevant indexes.
After the first training data set labeled with the first relevant index categories is obtained, in order to implement dimension reduction and ensure the accuracy of the dimension reduction result, the fitting matrix obtaining unit 110 inputs the first relevant index values of each data unit into the first dimension reduction model to be trained and the target event result fitting model to be trained, predicts the target event result to obtain each predicted target event result value, and then obtains an event result loss from the difference between the target event result value and the predicted target event result value. Training continues until the event result loss meets an event result loss threshold, yielding the first dimension reduction model and the target event result fitting model that meet the training requirements, together with the first dimension reduction matrix and the fitting matrix.
Of course, during the series training, the output of the first dimension reduction model is input directly into the target event result fitting model; the two models run in series with no intermediate data output.
Specifically, the first dimension reduction model may be a Restricted Boltzmann Machine (RBM) model or a Principal Component Analysis (PCA) model.
Each row of the first dimension reduction matrix corresponds to one first relevant index, and the number of columns of the first dimension reduction matrix is the number of first relevant index categories after dimension reduction; the number of rows of the fitting matrix equals the number of first relevant index categories after dimension reduction, and the number of columns of the fitting matrix is 1.
It should be noted that, the training of the first dimension reduction model and the target event result fitting model is only to obtain the first dimension reduction matrix and the fitting matrix.
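The series training described above can be illustrated with a minimal sketch. It is not the RBM or PCA model named in this embodiment: as a stated assumption, both models are replaced here by plain linear layers trained jointly by gradient descent on a mean squared event result loss, and the function name `series_train`, the learning rate, the loss threshold, and all data values are hypothetical.

```python
def series_train(X, y, D, F, lr=0.1, loss_threshold=1e-4, max_epochs=2000):
    """Jointly train a linear dimension reduction matrix D and a fitting matrix F.

    X: data units x first correlation index values
    y: target event result value of each data unit
    D: initial matrix, first correlation indexes (rows) x categories (columns)
    F: initial matrix, categories (rows) x 1 column
    Assumption: linear layers and a mean squared loss stand in for the models.
    """
    n, d, k = len(X), len(D), len(F)
    D = [row[:] for row in D]
    F = [row[:] for row in F]
    losses = []
    for _ in range(max_epochs):
        # The reduced output feeds straight into the fitting model.
        reduced = [[sum(x[i] * D[i][j] for i in range(d)) for j in range(k)] for x in X]
        predicted = [sum(h[j] * F[j][0] for j in range(k)) for h in reduced]
        err = [p - t for p, t in zip(predicted, y)]
        loss = sum(e * e for e in err) / n
        losses.append(loss)
        if loss <= loss_threshold:  # event result loss meets the threshold
            break
        grad_F = [2.0 / n * sum(err[s] * reduced[s][j] for s in range(n)) for j in range(k)]
        grad_D = [[2.0 / n * sum(err[s] * X[s][i] for s in range(n)) * F[j][0]
                   for j in range(k)] for i in range(d)]
        for j in range(k):
            F[j][0] -= lr * grad_F[j]
        for i in range(d):
            for j in range(k):
                D[i][j] -= lr * grad_D[i][j]
    return D, F, losses


# Hypothetical data: 4 data units, 3 first correlation indexes, 2 categories.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
y = [1.0, 0.5, -0.5, 1.5]
D, F, losses = series_train(
    X, y,
    D=[[0.1, 0.2], [0.2, 0.1], [0.1, -0.1]],
    F=[[0.3], [0.2]],
)
```

The returned `D` has one row per first correlation index and one column per category, and `F` has one row per category and a single column, matching the matrix shapes stated above.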
After obtaining the first dimension reduction matrix and the fitting matrix, the weight matrix obtaining unit 120 is adapted to perform corresponding operations on the first dimension reduction matrix and the fitting matrix to obtain a weight matrix, and prepare for obtaining weights of the first relevant indexes and subsequently obtaining category values (i.e., influence factor values) of the first relevant indexes. The obtained weight matrix can further reflect the influence degree of each first related index on the target event result by utilizing the fitting matrix, and the first related index is ensured to be strongly associated with the target event result.
In a specific embodiment, the weight matrix may be obtained by performing an inner product operation on the first dimension reduction matrix and the fitting matrix, so that not only the influence degree of each first correlation index on the target event result can be more accurately reflected, but also the sign of the weight of the first correlation index can be determined.
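As an illustration of this operation, the sketch below multiplies a first dimension reduction matrix by a fitting matrix, so that each row of the result is the inner product of one index's row with the fitting column. The helper name `matmul` and all numeric values are hypothetical, and treating a zero element as positive is an assumption.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("column count of a must equal row count of b")
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
            for row in a]


# Hypothetical first dimension reduction matrix:
# 3 first correlation indexes (rows) x 2 categories (columns).
dim_reduction = [[0.5, 0.0],
                 [0.25, 0.0],
                 [0.0, -1.0]]
# Hypothetical fitting matrix: 2 categories (rows) x 1 column.
fitting = [[2.0], [0.5]]

# One element per first correlation index.
weight_matrix = matmul(dim_reduction, fitting)
# Sign of each element, later used as the sign of the index weight.
signs = [1 if row[0] >= 0 else -1 for row in weight_matrix]
```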
After obtaining the weight matrix, in order to obtain a first correlation index weight for converting a first correlation index value into a first correlation index class value, the index weight obtaining unit 130 may first obtain an absolute value of an element value of each element of the weight matrix to obtain an element absolute value, and meanwhile, obtain a sign of the element value of each element of the weight matrix as a sign of each first correlation index weight, and then obtain each first correlation index weight according to each absolute value of each element corresponding to the same first correlation index class.
After obtaining the absolute values of the elements, the index weight obtaining unit 130 is adapted to obtain a first related index weight, including:
acquiring the sum of the absolute values of all the elements corresponding to the same first related index category to obtain an index category absolute value;
and respectively obtaining the ratio of each absolute value of the element corresponding to the same first correlation index type to the absolute value of the index type to obtain the weight of each first correlation index.
Since the first correlation index categories of the respective first correlation indexes are not completely the same, the first correlation index categories are taken as a unit when the first correlation index weights are calculated.
The absolute values of all elements belonging to the same index category are acquired and then summed to obtain the index category absolute value of each category.
Continuing the foregoing example: first, the absolute values of the elements of the weight matrix corresponding to the first relevant indexes of the pre-class lesson preparation category, namely the lesson preparation function usage rate, the average lesson preparation time, and the standard deviation of lesson preparation time, are acquired; the index category absolute value is then the sum of these three element absolute values.
After the index category absolute value is obtained, the ratio of each element absolute value of the same first related index category to that index category absolute value is computed, giving the first related index weight.
Continuing the example, the element absolute values in the weight matrix corresponding to the lesson preparation function usage rate, the average lesson preparation time, and the standard deviation of lesson preparation time are each divided by the index category absolute value of the pre-class lesson preparation category to obtain the first related index weight of each of these first related indexes. The first related index weights of the other first related index categories are calculated in the same way.
Therefore, on one hand, the first relevant index weight can be simply obtained according to the weight matrix, and on the other hand, the obtained first relevant index weight corresponds to the ratio of each first relevant index in the first relevant index category, so that the accuracy of the subsequently calculated influence factor value corresponding to the first relevant index category is ensured.
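The per-category normalization described above can be sketched as follows. The function name, the category labels, and the element values are hypothetical; treating a zero element as positive and assigning weight 0 when a category sum is zero are assumptions not stated in the text.

```python
from collections import defaultdict


def category_normalized_weights(weight_column, categories):
    """Return (weights, signs) for the elements of a single-column weight matrix.

    weight_column: one element value per first correlation index
    categories:    the pre-labeled category of each first correlation index
    """
    abs_values = [abs(v) for v in weight_column]
    signs = [1 if v >= 0 else -1 for v in weight_column]
    # Index category absolute value: sum of element absolute values per category.
    category_totals = defaultdict(float)
    for value, category in zip(abs_values, categories):
        category_totals[category] += value
    # Weight: ratio of each element absolute value to its category total.
    weights = [value / category_totals[category] if category_totals[category] else 0.0
               for value, category in zip(abs_values, categories)]
    return weights, signs


# Hypothetical elements: usage rate, average time, time standard deviation, one other index.
weight_column = [0.5, -0.25, 0.25, 2.0]
categories = ["pre_class_lesson_preparation"] * 3 + ["other_category"]
weights, signs = category_normalized_weights(weight_column, categories)
```

Within each category the resulting weights sum to 1, which is why they can be read as the share of each first correlation index in its category.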
In addition, it can be seen that reducing the dimension of the first relevant indexes by means of the first relevant index categories, and then obtaining the first relevant index weights, has two benefits. On one hand, the influence relationship can be constructed from a large number of first relevant indexes, first relevant index values, and target event results, which makes the constructed influence relationship more accurate. On the other hand, the first relevant indexes and first relevant index values are converted into influence factor values with fewer dimensions, so the obtained influence factors are more concentrated. This avoids the problem that too many first relevant indexes make it hard for the actor to identify accurately which behavior to adjust, so that, while the influence relationship remains highly accurate, the actor's behavior to be adjusted can be determined more conveniently and the probability of the event influenced by the target event result changes toward the expected direction.
In another specific embodiment, to further improve the accuracy of the first relevant index weights, data units may be randomly extracted from one first training data set to obtain a plurality of first training data subsets; accordingly, the number of first dimension reduction models is at least equal to the number of first training data subsets:
a fitting matrix obtaining unit 110, adapted to perform series training on the corresponding first dimension reduction model and the corresponding target event result fitting model by using, respectively, the target event result value and each first correlation index value of the data units of each first training data subset, until each first dimension reduction model and each target event result fitting model meeting the training requirements are obtained, so as to obtain each first dimension reduction matrix and each fitting matrix;
a weight matrix obtaining unit 120, adapted to obtain initial weight matrices according to the first dimensionality reduction matrix and the fitting matrix obtained by using the same first training data subset, so as to obtain each initial weight matrix;
and acquiring the weight matrix according to each initial weight matrix.
Compared with the foregoing embodiment, in this embodiment the series training of the models yields a plurality of first dimension reduction matrices and a plurality of fitting matrices.
After each first dimension reduction matrix and each fitting matrix are obtained, an initial weight matrix is obtained from the first dimension reduction matrix and the fitting matrix that were obtained from the same first training data subset. For the specific way of obtaining the initial weight matrix, please refer to the foregoing description of the weight matrix.
The weight matrix is then obtained from the initial weight matrices.
In one embodiment, the weight matrix may be obtained by averaging, element by element, the element values of the initial weight matrices.
Therefore, a plurality of initial weight matrixes are obtained through the plurality of first dimension reduction matrixes and the plurality of fitting matrixes, and the weight matrixes are obtained through the plurality of initial weight matrixes, so that the accuracy of the obtained first relevant index weight can be improved.
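A minimal sketch of this subset-and-average scheme is given below. The text does not say how the random extraction is performed, so sampling without replacement with a fixed seed is an assumption, and the function names and values are hypothetical.

```python
import random


def random_subsets(data_units, n_subsets, subset_size, seed=0):
    """Draw training data subsets by random extraction (without replacement; assumed)."""
    rng = random.Random(seed)
    return [rng.sample(data_units, subset_size) for _ in range(n_subsets)]


def average_matrices(matrices):
    """Element-by-element average of equally shaped matrices (lists of rows)."""
    count = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / count for j in range(cols)]
            for i in range(rows)]


# Hypothetical data units, identified here only by an integer id.
subsets = random_subsets(list(range(10)), n_subsets=3, subset_size=4)

# Hypothetical initial weight matrices, one per first training data subset.
initial_weight_matrices = [
    [[1.0], [-0.5]],
    [[0.5], [-1.5]],
]
weight_matrix = average_matrices(initial_weight_matrices)
```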
In order to determine the target event result influence factor value and further implement guidance on behavior adjustment of an agent, an embodiment of the present invention further provides a target event result influence factor value determining apparatus, please refer to fig. 11, where fig. 11 is a block diagram of the target event result influence factor value determining apparatus provided in the embodiment of the present invention.
As shown in the figure, the target event result influence factor value determining apparatus provided in the embodiment of the present invention includes:
a data unit data value obtaining unit 200 adapted to obtain a first correlation index weight and a sign of the first correlation index weight of each first correlation index of a data unit determined by the target event result index weight determination method as described above, and obtain a first correlation index value of each first correlation index of the data unit;
an influence factor value obtaining unit 210, adapted to obtain each of the influence factor values by using the mutually corresponding first correlation index values, first correlation index weights, and signs of the first correlation index weights.
It is easily understood that, although a data unit has many first indexes, the target event result index weight determination process shows that only the first correlation index values need to be obtained here, that is, the values of those first indexes whose correlation, obtained through the correlation calculation, satisfies the first correlation threshold.
The data unit for which the target event result influence factor value is obtained may be a data unit used in the target event result index weight determination process or a data unit not used in that process; the first correlation index weight of each first correlation index and the sign of that weight depend only on the first correlation index, not on whether the data unit was used in the target event result index weight determination process. When, in the target event result index weight determination method, the data units are classified according to the target classification dimension, the obtained first correlation index weights and their signs also depend on the target classification dimension; in that case, when the target event result influence factor value is obtained, the first correlation index weights and signs corresponding to the classified data set to which the data unit belongs are selected.
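The correlation screening referred to here can be sketched with a Spearman rank correlation computed in plain Python. The index names and data values are hypothetical, and interpreting "satisfies the first correlation threshold" as an absolute correlation at or above the threshold is an assumption.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for t in range(pos, end + 1):
            ranks[order[t]] = shared
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def screen_indexes(index_columns, result_values, threshold):
    """Keep the names of indexes whose |Spearman correlation| reaches the threshold."""
    return [name for name, column in index_columns.items()
            if abs(spearman(column, result_values)) >= threshold]


# Hypothetical first index values for 5 data units and their target event result values.
result_values = [1.0, 2.0, 3.0, 4.0, 5.0]
index_columns = {
    "index_a": [10.0, 20.0, 30.0, 40.0, 50.0],
    "index_b": [5.0, 4.0, 3.0, 2.0, 1.0],
    "index_c": [2.0, 2.0, 2.0, 2.0, 2.0],
}
first_correlation_indexes = screen_indexes(index_columns, result_values, threshold=0.8)
```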
After the first correlation index value, the first correlation index weight, and the sign of the first correlation index weight are obtained, the influence factor value obtaining unit further obtains the influence factor value.
Specifically, when the sign of the first correlation index weight is positive, the influence factor value may be the product of the first correlation index value and the first correlation index weight; when the sign of the first correlation index weight is negative, the influence factor value may be the product of the difference (1 minus the first correlation index value) and the first correlation index weight.
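The sign-dependent rule can be written as a short function. The sketch assumes, as the expression 1 minus the index value suggests but the text does not state, that first correlation index values are normalized to the range 0 to 1; the function name and the numbers are hypothetical.

```python
def influence_factor_value(index_value, weight, sign):
    """Influence factor value of one first correlation index.

    Assumption: index_value is normalized to [0, 1].
    sign is +1 or -1, the sign of the first correlation index weight.
    """
    if not 0.0 <= index_value <= 1.0:
        raise ValueError("index_value is assumed to lie in [0, 1]")
    if sign >= 0:
        return index_value * weight
    # Negative sign: a higher index value lowers the result, so use 1 - value.
    return (1.0 - index_value) * weight


positive_case = influence_factor_value(0.75, 0.5, 1)
negative_case = influence_factor_value(0.75, 0.5, -1)
```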
It can be seen that, with the apparatus for determining an influence factor value of a target event result provided in the embodiment of the present invention, an influence factor value that indirectly affects a target event result can be obtained through the first relevant index weight, the sign thereof, and the first relevant index value, so that an action of an agent can be adjusted based on the influence factor value, which is helpful for changing an event probability that affects the target event result to an expected direction, so that a real event probability in the future meets an expectation.
In order to determine a target event result influence factor value and further implement guidance on behavior adjustment of an actor, an embodiment of the present invention further provides another target event result influence factor value determining apparatus, in which each first relevant index has a first relevant index category, and the influence factor value obtaining unit 210, adapted to obtain each influence factor value by using the mutually corresponding first relevant index values, first relevant index weights, and signs of the first relevant index weights, includes:
and obtaining the influence value of each first relevant index category according to each first relevant index value, the first relevant index weight and the sign of the first relevant index weight corresponding to the same first relevant index category, so as to obtain the influence factor value.
In this embodiment, the first correlation index weight of each first correlation index and the sign of the first correlation index weight are obtained based on a first dimension reduction matrix and a fitting matrix.
After obtaining a first correlation index value, a first correlation index weight, and a sign of the first correlation index weight, the influence factor value obtaining unit 210 obtains an index influence value of each first correlation index based on each first correlation index value, the first correlation index weight, and the sign of the first correlation index weight corresponding to the same first correlation index category, and obtains the influence factor value through the index influence value.
Specifically, when the sign of the first correlation index weight is positive, the index influence value may be the product of the first correlation index value and the first correlation index weight; when the sign of the first correlation index weight is negative, the index influence value may be the product of the difference (1 minus the first correlation index value) and the first correlation index weight. The index influence values of all indexes corresponding to the same first correlation index category are then added to obtain the influence factor value corresponding to each first correlation index category.
It can be seen that the apparatus for determining the target event result influence factor value provided in the embodiment of the present invention uses a larger number of first related indexes, which improves the accuracy of the influence relationship, while avoiding the problem that too many scattered first related indexes make it hard for the actor to identify accurately which behavior to adjust. The influence factor values obtained are therefore both accurate and convenient to act on, so that the actor's behavior to be adjusted can be determined more easily and the probability of the event influenced by the target event result changes toward the expected direction.
Of course, the apparatus provided in the embodiment of the present invention may load the above program module architecture in program form to implement the target event result index weight determining method or the target event result influence factor value determining method provided in the embodiment of the present invention; the hardware device may be applied to an electronic device with data processing capability, for example a terminal device or a server device.
Optionally, FIG. 12 shows an optional hardware device architecture of the device provided in the embodiment of the present invention, which may include at least one memory 3 and at least one processor 1, where the memory stores a program that the processor calls to execute the target event result index weight determination method or the target event result influence factor value determination method, and may further include at least one communication interface 2 and at least one communication bus 4. The processor 1 and the memory 3 may be located in the same electronic device, for example in a server device or a terminal device; the processor 1 and the memory 3 may also be located in different electronic devices.
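The category-level aggregation can be sketched as below, again assuming index values normalized to the range 0 to 1. The function name, the category labels, and the numbers are hypothetical.

```python
from collections import defaultdict


def category_influence_factors(records):
    """Sum index influence values per first correlation index category.

    records: iterable of (category, index_value, weight, sign) tuples,
             with index_value assumed to be normalized to [0, 1].
    """
    factors = defaultdict(float)
    for category, value, weight, sign in records:
        # Positive sign: value * weight; negative sign: (1 - value) * weight.
        index_influence = value * weight if sign >= 0 else (1.0 - value) * weight
        factors[category] += index_influence
    return dict(factors)


# Hypothetical data unit with three indexes in one category and one in another.
records = [
    ("pre_class_lesson_preparation", 0.5, 0.5, 1),
    ("pre_class_lesson_preparation", 0.75, 0.25, -1),
    ("pre_class_lesson_preparation", 1.0, 0.25, 1),
    ("other_category", 0.5, 1.0, 1),
]
influence_factor_values = category_influence_factors(records)
```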
As an optional implementation of the disclosure of the embodiment of the present invention, the memory 3 may store a program, and the processor 1 may call the program to execute the target event result index weight determining method or the target event result influence factor value determining method provided by the above-described embodiment of the present invention.
In the embodiment of the present invention, the electronic device may be a tablet computer, a notebook computer, or the like, which is capable of determining the target event result index weight or determining the target event result influence factor value.
In the embodiment of the present invention, the number of the processor 1, the communication interface 2, the memory 3, and the communication bus 4 is at least one, and the processor 1, the communication interface 2, and the memory 3 complete mutual communication through the communication bus 4; it is clear that the communication connections of the processor 1, the communication interface 2, the memory 3 and the communication bus 4 shown in the figure are only an alternative;
optionally, the communication interface 2 may be an interface of a communication module, such as an interface of a GSM module;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the invention.
The memory 3 may comprise a high-speed RAM memory and may also comprise a non-volatile memory, such as at least one disk memory.
It should be noted that the above-mentioned apparatus may also include other components (not shown); since these other components are not necessary for understanding the disclosure of the embodiments of the present invention, they are not individually described herein.
Embodiments of the present invention further provide a computer-readable storage medium, where computer-executable instructions are stored, and when executed by a processor, the instructions may implement the target event result index weight determining method or the target event result influence factor value determining method as described above.
According to the computer-executable instructions stored in the storage medium provided by the embodiment of the invention, the target event result value obtained based on the second correlation index value, and the first correlation index value indirectly influencing the target event result value, are used to train the target event result fitting model to obtain the fitting matrix, and each first correlation index weight and its sign are then obtained according to the fitting matrix. This prepares for obtaining the influence factor value that indirectly influences the target event result and for providing a reference for adjusting an actor's behavior based on the influence factor value, which helps the event probability influenced by the target event result change toward the expected direction, so that the real event probability in the future can meet the expectation.
The embodiments of the present invention described above are combinations of elements and features of the present invention. Unless otherwise mentioned, the elements or features may be considered optional. Each element or feature may be practiced without being combined with other elements or features. In addition, the embodiments of the present invention may be configured by combining some elements and/or features. The order of operations described in the embodiments of the present invention may be rearranged. Some configurations of any embodiment may be included in another embodiment, and may be replaced with corresponding configurations of the other embodiment. It is obvious to those skilled in the art that claims that are not explicitly cited in each other in the appended claims may be combined into an embodiment of the present invention or may be included as new claims in a modification after the filing of the present application.
Embodiments of the invention may be implemented by various means, such as hardware, firmware, software, or a combination thereof. In a hardware configuration, the method according to an exemplary embodiment of the present invention may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and the like.
In a firmware or software configuration, embodiments of the present invention may be implemented in the form of modules, procedures, functions, and the like. The software codes may be stored in memory units and executed by processors. The memory unit is located inside or outside the processor, and may transmit and receive data to and from the processor via various known means.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Although the embodiments of the present invention have been disclosed, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (20)
1. A target event result index weight determination method is characterized by comprising the following steps:
obtaining a first training data set, wherein the first training data set includes a target event result value and each first correlation index value of each data unit, the first correlation index value is a value of each first index whose correlation with the target event result satisfies a first correlation threshold, the target event result value is obtained based on at least each second correlation index value and a second correlation index weight of the data unit, the second correlation index is an index associated with the target event result, and the second correlation index value directly reflects the target event result value, the first correlation index value indirectly affects the target event result value, wherein the target event result value is an event result value for which an index weight determination is required among event results directly affecting occurrence of an event probability, the event probability is the probability of a certain event, and comprises a back rate or a report rate, the third index is each index which directly reflects the event result value and directly influences the event probability, the first index is an index of a teacher end, the second index and the third index are indexes of a student end, each first related index comprises a first related index category which is labeled in advance, and the number of the first related index categories is smaller than that of the first related indexes;
performing series training on a first dimension reduction model to be trained and a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the first dimension reduction model and the target event result fitting model meeting the training requirements are obtained, and acquiring a first dimension reduction matrix and a fitting matrix;
obtaining a weight matrix according to the first dimension reduction matrix and the fitting matrix, wherein each element of the weight matrix corresponds to each first relevant index;
obtaining the absolute value of the element value of each element of the weight matrix and the sign of the element value of each element to obtain the absolute value of the element and the sign of the first correlation index weight;
and acquiring the weight of each first correlation index according to each absolute value of the element corresponding to the same first correlation index category.
2. The method of claim 1, wherein the step of obtaining each of the first correlation index weights based on each of the element absolute values corresponding to the same first correlation index category comprises:
acquiring the sum of the absolute values of all the elements corresponding to the same first related index category to obtain an index category absolute value;
and respectively obtaining the ratio of each absolute value of the element corresponding to the same first correlation index type to the absolute value of the index type to obtain the weight of each first correlation index.
3. The method of claim 1, wherein the step of obtaining a weight matrix from the first reduced-dimension matrix and the fitting matrix comprises:
and obtaining the inner product of the first dimension reduction matrix and the fitting matrix to obtain the weight matrix.
4. The method of claim 1, wherein the first training data set includes a respective first subset of training data, the number of first dimension reduction models being at least equal to the number of first subsets of training data;
the step of performing series training on a first dimension reduction model to be trained and a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the first dimension reduction model and the target event result fitting model meeting the training requirements are obtained, and the step of obtaining a first dimension reduction matrix and the fitting matrix comprises the steps of:
respectively utilizing the target event result value and each first correlation index value of each data unit of each first training data subset to perform series training on the corresponding first dimension reduction model and the corresponding target event result fitting model until each first dimension reduction model and each target event result fitting model meeting the training requirements are obtained, and obtaining each first dimension reduction matrix and each fitting matrix;
the step of obtaining a weight matrix according to the first dimension reduction matrix and the fitting matrix comprises:
obtaining initial weight matrixes according to the first dimension reduction matrix and the fitting matrix obtained by using the same first training data subset, and obtaining each initial weight matrix;
and acquiring the weight matrix according to each initial weight matrix.
5. The method of claim 4, wherein the step of obtaining the weight matrix based on each of the initial weight matrices comprises:
and obtaining the average value of the element values of each initial weight matrix to obtain the weight matrix.
6. The method of claim 1, wherein the step of obtaining a first training data set comprises:
acquiring an original first training data set, wherein the original first training data set comprises target event result values and first index values of all data units, and the first indexes are all indexes expected to be related to the target event results;
classifying the data units according to a preset target classification dimension to obtain a classification data set;
and obtaining the correlation degree of each first index and the target event result by using the first index value and the target event result value of each data unit of each classified data set through a correlation calculation algorithm, obtaining each first correlation index of which the correlation degree meets a first correlation threshold value, and obtaining the first training data set.
7. The target event result indicator weight determination method of claim 6, wherein the correlation calculation algorithm comprises: a Spearman rank correlation coefficient calculation algorithm and a Kendall rank correlation coefficient calculation algorithm.
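The screening of claims 6 and 7 can be illustrated with a small Spearman-based filter. This is a minimal sketch, not the patented procedure: it skips the classification of data units, implements only the Spearman coefficient (as the Pearson correlation of average ranks), and assumes that "meets the first correlation threshold" means the absolute coefficient is at least the threshold. The function names and sample data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman rank correlation computed as Pearson on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_correlated(index_values, result_values, threshold):
    """Keep indexes whose |rho| with the result meets the threshold (assumed rule)."""
    selected = {}
    for name, values in index_values.items():
        rho = spearman(values, result_values)
        if abs(rho) >= threshold:
            selected[name] = rho
    return selected


# Hypothetical first index values for five data units.
results = [1, 2, 3, 4, 5]
indexes = {"a": [10, 20, 30, 40, 50], "b": [5, 4, 3, 2, 1], "c": [3, 1, 4, 1, 5]}
selected = select_correlated(indexes, results, threshold=0.5)
```

Here indexes `a` and `b` are kept because they are perfectly monotonic with the result, while `c` falls below the threshold.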
8. The target event result index weight determination method of any one of claims 1 to 7, wherein the obtaining of the target event result value for each of the data units comprises:
determining respective second correlation indicators for respective ones of the data units that are correlated with the target event result;
and acquiring the target event result value of each data unit according to a second correlation index weight corresponding to each second correlation index and the second correlation index value.
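As an illustration of claim 8, the target event result value of one data unit could be formed as a weighted sum of its second correlation index values. The claim only says the value is obtained "according to" the weights and values, so the weighted sum, the function name, and the numbers below are assumptions.

```python
def target_event_result_value(index_values, index_weights):
    """Weighted sum of second correlation index values (assumed combination rule)."""
    return sum(index_weights[name] * value for name, value in index_values.items())


# Hypothetical second correlation indexes of one data unit.
values = {"x": 4.0, "y": 8.0, "z": 0.0}
weights = {"x": 0.5, "y": 0.25, "z": 0.25}
result_value = target_event_result_value(values, weights)
```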
9. The target event result index weight determination method of claim 8, wherein the obtaining of the second correlation index weight comprises:
acquiring a second training data set, wherein the second training data set comprises an actual probability and a third correlation index value of each data unit, the third correlation index value is a numerical value of each third index of which the correlation with the actual probability meets a third correlation threshold, the index set of each third index comprises a second index, a pre-labeled event result type of each third correlation index at least comprises the target event result, the actual probability is a real probability of an event occurrence of the data unit, and the numerical value of the actual probability is influenced by the target event result value and a non-target event result value;
performing series training on a third dimension reduction model and a probability prediction model by using the actual probability and the third relevant index values of each data unit until the third dimension reduction model and the probability prediction model meeting a preset target are obtained, so as to obtain a third dimension reduction matrix, wherein each row of the third dimension reduction matrix corresponds to each third relevant index;
and obtaining each third relevant index weight according to the third dimension reduction matrix, and screening from each third relevant index weight according to the second relevant index to obtain each second relevant index weight.
10. The method of claim 9, wherein the step of obtaining each third associated metric weight according to the third reduced-dimension matrix comprises:
adjusting the third dimension reduction matrix by using an index correlation direction value and the event result category corresponding to each third correlation index to obtain a weight square matrix with the number of rows and columns equal to the number of the event result categories, and acquiring each target element of the weight square matrix, which is used for representing the event result categories, wherein the index correlation direction value is a correlation direction value of the correlation degree between the third correlation index and the actual probability, each row of the weight square matrix corresponds to the event result category of each third correlation index, and the number of the target elements is equal to the number of the event result categories;
determining each dimension reduction target element corresponding to each target element in the third dimension reduction matrix according to the position of each target element in the weight square matrix;
and acquiring the weight of each third relevant index by using the third relevant index corresponding to each dimension reduction target element.
11. The method of claim 10, wherein the second set of training data includes respective second subsets of training data, and the number of third dimension reduction models is at least equal to the number of second subsets of training data;
the step of performing series training on a third dimension reduction model and a probability prediction model by using the actual probability and the third relevant index value of each data unit until the third dimension reduction model and the probability prediction model meeting a predetermined target are obtained, and obtaining a third dimension reduction matrix comprises the following steps:
respectively utilizing the actual probability and the third correlation index value of the data unit of each second training data subset to carry out series training on a third dimension reduction model and a probability prediction model corresponding to the actual probability and the third correlation index value until each third dimension reduction model and each probability prediction model meeting a preset target are obtained, and obtaining each third dimension reduction matrix;
the step of adjusting the third dimensionality reduction matrix by using the index relevance direction value corresponding to each third relevance index and the event result category to obtain a weight square matrix with the number of rows and the number of columns equal to the number of the event result categories, and the step of obtaining each target element of the weight square matrix, which is used for representing the event result category, includes:
adjusting each third dimension reduction matrix by using the index correlation direction value corresponding to each third correlation index and the event result category to obtain each weight square matrix with the number of rows and columns equal to the number of the event result categories, and acquiring each target element of each weight square matrix, which is used for representing the event result category;
the step of determining each dimension-reduced target element corresponding to each target element in the third dimension-reduced matrix according to the position of each target element in the weight matrix, and obtaining the weight of each third relevant index by using the third relevant index corresponding to each dimension-reduced target element includes:
determining, among the weight square matrices, the largest group of weight square matrices whose target elements are located at the same positions to obtain each consistent weight square matrix, and determining each third dimension reduction matrix corresponding to each consistent weight square matrix;
determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in each third dimension reduction matrix according to the position of each target element in each consistent weight square matrix, obtaining the mean value of each dimension reduction target element value at the same position in each third dimension reduction matrix to obtain the mean value of the dimension reduction target elements, and obtaining each third related index weight by using each dimension reduction target element mean value and the index corresponding to each dimension reduction target element.
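The selection of consistent weight square matrices in claim 11 can be read as a majority vote over target element positions. The sketch below covers only that vote, under the assumption that "consistent" means the largest group of matrices sharing exactly the same set of target positions; the mapping back to dimension reduction target elements is not shown. Names and data are hypothetical.

```python
from collections import Counter


def consistent_matrix_ids(target_positions_per_matrix):
    """Return ids of the matrices in the largest group with identical target positions."""
    keys = [tuple(sorted(positions)) for positions in target_positions_per_matrix]
    most_common_key, _count = Counter(keys).most_common(1)[0]
    return [i for i, key in enumerate(keys) if key == most_common_key]


# Hypothetical target element positions (row, column) found in three
# weight square matrices, one per second training data subset.
positions = [
    [(0, 1), (1, 0)],
    [(0, 0), (1, 1)],
    [(1, 0), (0, 1)],
]
consistent_ids = consistent_matrix_ids(positions)
```

Only the third dimension reduction matrices belonging to `consistent_ids` would then contribute to the per-position mean values.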
12. The method for determining target event result index weight according to claim 11, wherein the step of obtaining each third related index weight by using each dimension-reduced target element mean value and the index corresponding to each dimension-reduced target element comprises:
determining each third relevant index with the same event result type to obtain each same type index;
obtaining the sum of the dimension reduction target element mean values of the indexes of the same category to obtain a dimension reduction index sum value;
and obtaining the weight of each third related index by using the dimension reduction target element mean value of each index of the same category and the dimension reduction index sum value corresponding to that mean value.
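One plausible reading of claim 12 is a per-category normalization: each index's mean value is divided by the sum of the mean values of all indexes sharing its event result category. The claim does not state the exact formula, so the division below is an assumption, and the names and numbers are hypothetical.

```python
from collections import defaultdict


def category_normalized_weights(mean_values, categories):
    """Divide each index's mean value by the sum over its category (assumed rule)."""
    totals = defaultdict(float)
    for name, value in mean_values.items():
        totals[categories[name]] += value
    weights = {}
    for name, value in mean_values.items():
        total = totals[categories[name]]
        # Guard against an all-zero category rather than dividing by zero.
        weights[name] = value / total if total else 0.0
    return weights


# Hypothetical dimension reduction target element mean values per index.
mean_values = {"a": 0.5, "b": 1.5, "c": 2.0}
categories = {"a": "k1", "b": "k1", "c": "k2"}
index_weights = category_normalized_weights(mean_values, categories)
```

With this rule the weights within each category sum to one, which makes weights comparable across indexes of the same event result category.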
13. The method of claim 10, wherein the step of obtaining each target element of the weight matrix representing the event result category comprises:
obtaining a maximum value element with the maximum value in each element of the weight square matrix to obtain a target element, and obtaining a row where the target element is located and a column where the target element is located;
and ignoring each element of the row where the target element is located and each element of the column where the target element is located in the weight square matrix to obtain an adjusted square matrix, and taking the adjusted square matrix as a new weight square matrix to obtain a new target element, until all the target elements are obtained.
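The iterative selection in claim 13 amounts to a greedy search: take the largest remaining element, then exclude its row and column, and repeat. The sketch below compares raw element values (the claim does not say whether absolute values are meant) and breaks ties by the lowest row and column; both choices, along with the names and data, are assumptions.

```python
def select_target_elements(square):
    """Greedily pick the largest element, then drop its row and column, until none remain."""
    n = len(square)
    free_rows, free_cols = set(range(n)), set(range(n))
    targets = []
    while free_rows:
        candidates = [(i, j) for i in sorted(free_rows) for j in sorted(free_cols)]
        row, col = max(candidates, key=lambda pos: square[pos[0]][pos[1]])
        targets.append((row, col))
        # Ignoring the row and column is equivalent to building the adjusted square matrix.
        free_rows.remove(row)
        free_cols.remove(col)
    return targets


# Hypothetical 3x3 weight square matrix (three event result categories).
weight_square = [
    [0.2, 0.9, 0.1],
    [0.8, 0.7, 0.3],
    [0.4, 0.6, 0.5],
]
target_positions = select_target_elements(weight_square)
```

The result contains exactly one target element per row and per column, so the number of target elements equals the number of event result categories, as claim 10 requires.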
14. A method for determining a value of an influencing factor of a target event result is characterized by comprising the following steps:
acquiring a first correlation index weight of each first correlation index of a data unit and a sign of the first correlation index weight determined by the target event result index weight determination method according to any one of claims 1 to 13, and acquiring a first correlation index value of each first correlation index of the data unit;
and obtaining each influence factor value by using the first correlation index value, the first correlation index weight and the sign of the first correlation index weight which correspond to each other.
15. The method of claim 14, wherein the step of obtaining each of the influence factor values using the signs of the first correlation index value, the first correlation index weight, and the first correlation index weight corresponding to each other comprises:
and acquiring an index influence value of each first relevant index according to each first relevant index value, the first relevant index weight and the sign of the first relevant index weight corresponding to the same first relevant index category, and acquiring the influence factor value through the index influence value.
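Claims 14 and 15 can be sketched as a signed, weighted aggregation per first correlation index category. The claims do not fix the formula, so this example assumes that an index influence value is sign × weight × value and that the influence factor value of a category is the sum of its index influence values. All names and numbers are hypothetical.

```python
from collections import defaultdict


def influence_factor_values(index_values, index_weights, weight_signs, index_categories):
    """Sum sign * weight * value over the indexes of each category (assumed rule)."""
    factors = defaultdict(float)
    for name, value in index_values.items():
        influence = weight_signs[name] * index_weights[name] * value
        factors[index_categories[name]] += influence
    return dict(factors)


# Hypothetical first correlation indexes of one data unit.
values = {"a": 2.0, "b": 4.0, "c": 3.0}
weights = {"a": 0.25, "b": 0.75, "c": 1.0}
signs = {"a": 1, "b": -1, "c": 1}
categories = {"a": "k1", "b": "k1", "c": "k2"}
factors = influence_factor_values(values, weights, signs, categories)
```

A negative factor, as for `k1` here, would indicate that the category as a whole pushes the target event result value down for this data unit.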
16. A target event result index weight determination apparatus, comprising:
a first training data set obtaining unit adapted to obtain a first training data set, wherein the first training data set includes a target event result value and a first correlation index value of each data unit, the first correlation index value is a numerical value of each first index whose correlation with the target event result satisfies a first correlation threshold, the target event result value is obtained based on at least a second correlation index value and a second correlation index weight of the data unit, the second correlation index is an index associated with the target event result, and the second correlation index value directly reflects the target event result value, the first correlation index value indirectly affects the target event result value, wherein the target event result value is an event result value for which an index weight determination is required among event results directly affecting occurrence of an event probability, the event probability is the probability of a certain event, and comprises a back rate or a report rate, the third index is each index which directly reflects the event result value and directly influences the event probability, the first index is an index of a teacher end, the second index and the third index are indexes of a student end, each first related index comprises a first related index category which is labeled in advance, and the number of the first related index categories is smaller than that of the first related indexes;
a fitting matrix obtaining unit, adapted to perform series training on a first dimension reduction model to be trained and a target event result fitting model to be trained by using the target event result value and each first correlation index value of each data unit of the first training data set until the first dimension reduction model and the target event result fitting model meeting training requirements are obtained, and obtain a first dimension reduction matrix and the fitting matrix;
a weight matrix obtaining unit, adapted to obtain a weight matrix according to the first dimension reduction matrix and the fitting matrix, where each element of the weight matrix corresponds to each first correlation index;
an index weight obtaining unit adapted to obtain an absolute value of the element value of each of the elements of the weight matrix and a sign of the element value of each of the elements, to obtain an element absolute value and a sign of the first correlation index weight, and to obtain each of the first correlation index weights according to the element absolute values corresponding to the same first correlation index category.
17. A target event result influencing factor value determining apparatus, comprising:
a data unit data value obtaining unit adapted to obtain a first correlation index weight of each first correlation index of a data unit and a sign of the first correlation index weight determined by the target event result index weight determination method according to any one of claims 1 to 13, and obtain a first correlation index value of each first correlation index of the data unit;
an influence factor value obtaining unit adapted to obtain each of the influence factor values by using the first correlation index values, the first correlation index weights, and signs of the first correlation index weights corresponding to each other.
18. The target event result influencing factor value determining apparatus of claim 17, wherein the influence factor value obtaining unit, adapted to obtain each of the influence factor values by using the first correlation index values, the first correlation index weights, and the signs of the first correlation index weights corresponding to each other, is adapted to:
acquire an index influence value of each first correlation index according to each first correlation index value, the first correlation index weight and the sign of the first correlation index weight corresponding to the same first correlation index category, and acquire the influence factor value through the index influence value.
19. A storage medium storing a program adapted for target event result index weight determination to implement the target event result index weight determination method according to any one of claims 1 to 13, or a program adapted for target event result influencing factor value determination to implement the target event result influencing factor value determination method according to claim 14 or 15.
20. An electronic device comprising at least one memory and at least one processor; the memory stores a program that the processor calls to execute the target event result indicator weight determination method according to any one of claims 1 to 13 or the target event result influencing factor value determination method according to claim 14 or 15.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110050453.5A CN112365384B (en) | 2021-01-14 | 2021-01-14 | Target event result index weight, influence factor value determination method and related device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110050453.5A CN112365384B (en) | 2021-01-14 | 2021-01-14 | Target event result index weight, influence factor value determination method and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112365384A (en) | 2021-02-12 |
CN112365384B (en) | 2021-08-27 |
Family
ID=74535000
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110050453.5A Active CN112365384B (en) | 2021-01-14 | 2021-01-14 | Target event result index weight, influence factor value determination method and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112365384B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113176769B (en) * | 2021-06-29 | 2021-09-03 | 浙江大胜达包装股份有限公司 | Corrugated paper process control optimization method and system based on application demand data model |
CN116258373B (en) * | 2023-03-15 | 2024-02-09 | 杭州盈禾嘉田科技有限公司 | Disease and pest detection, prediction and early warning system and method based on big data |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102570278B1 (en) * | 2017-07-31 | 2023-08-24 | 삼성전자주식회사 | Apparatus and method for generating training data used to training student model from teacher model |
CN108491817B (en) * | 2018-03-30 | 2021-02-26 | 国信优易数据股份有限公司 | Event detection model training method and device and event detection method |
CN109472412A (en) * | 2018-11-09 | 2019-03-15 | 百度在线网络技术(北京)有限公司 | A kind of prediction technique and device of event |
CN110414627A (en) * | 2019-08-07 | 2019-11-05 | 北京嘉和海森健康科技有限公司 | A kind of training method and relevant device of model |
CN112101516A (en) * | 2020-07-30 | 2020-12-18 | 鹏城实验室 | Generation method, system and device of target variable prediction model |
2021-01-14: CN application CN202110050453.5A, patent CN112365384B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN112365384A (en) | 2021-02-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhou et al. | Modeling context-aware features for cognitive diagnosis in student learning | |
WO2021180249A1 (en) | Occupation recommendation method and apparatus, and device and medium | |
CN109299344A (en) | The generation method of order models, the sort method of search result, device and equipment | |
CN112365384B (en) | Target event result index weight, influence factor value determination method and related device | |
Yao et al. | New fairness metrics for recommendation that embrace differences | |
CN109816265B (en) | Knowledge characteristic mastery degree evaluation method, question recommendation method and electronic equipment | |
US20230353828A1 (en) | Model-based data processing method and apparatus | |
CN109690581B (en) | User guidance system and method | |
WO2021208535A1 (en) | Recommendation method and device based on automatic feature grouping | |
Alipourfard et al. | Using Simpson’s paradox to discover interesting patterns in behavioral data | |
CN111798138B (en) | Data processing method, computer storage medium and related equipment | |
CN112231516B (en) | Training method of video abstract generation model, video abstract generation method and device | |
US20150178659A1 (en) | Method and System for Identifying and Maintaining Gold Units for Use in Crowdsourcing Applications | |
CN115588485B (en) | Self-adaptive intervention method, system, device and medium based on social story training | |
CN110766438A (en) | Method for analyzing user behaviors of power grid users through artificial intelligence | |
US20210390263A1 (en) | System and method for automated decision making | |
CN112529750A (en) | Learning event recommendation method and system based on graph neural network model | |
US20240265410A1 (en) | Method for analyzing public satisfaction, storage medium and electronic device | |
CN114021029A (en) | Test question recommendation method and device | |
CN109409670A (en) | Personnel's matching process, device, system and block chain node device | |
CN116662497A (en) | Visual question-answer data processing method, device and computer equipment | |
CN112995719B (en) | Bullet screen text-based problem set acquisition method and device and computer equipment | |
CN116933800B (en) | Template-based generation type intention recognition method and device | |
CN111369063B (en) | Test paper model training method, test paper combining method and related device | |
CN112381338B (en) | Event probability prediction model training method, event probability prediction method and related device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||