CN112381338B - Event probability prediction model training method, event probability prediction method and related device - Google Patents

Publication number
CN112381338B
Authority
CN
China
Prior art keywords
index
event probability
value
weight
prediction model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110049876.5A
Other languages
Chinese (zh)
Other versions
CN112381338A (en)
Inventor
傅云凤
童洋
易善鸿
刘慧军
闫智慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xintang Sichuang Educational Technology Co Ltd
Original Assignee
Beijing Xintang Sichuang Educational Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xintang Sichuang Educational Technology Co Ltd
Priority to CN202110049876.5A
Publication of CN112381338A
Application granted
Publication of CN112381338B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/20 Education
    • G06Q50/205 Education administration or guidance

Abstract

The embodiments of the invention provide an event probability prediction model training method, an event probability prediction method and a related device. The training method comprises: acquiring a training data set; using the event probability prediction model to be trained to obtain a predicted event probability value from the related index data values of each data unit; obtaining a probability loss from the predicted event probability value and the actual event probability value; and adjusting the parameters of the event probability prediction model according to the probability loss until the probability loss meets a loss threshold, thereby obtaining the trained event probability prediction model and an event probability weight matrix, each element of which is the weight of the corresponding index. The training method, the prediction method and the related device can thus predict the event probability and determine the influence factor values that affect it.

Description

Event probability prediction model training method, event probability prediction method and related device
Technical Field
The embodiments of the invention relate to the field of computers, and in particular to an event probability prediction model training method, an event probability prediction method and a related device.
Background
With the development of computer technology and deep learning, the demand for predicting the probability that an event will occur in the future can now be met to a certain extent.
For example, in an education scenario, keeping the student base stable requires raising the renewal rate of existing students and lowering their withdrawal rate. The future renewal rate or withdrawal rate therefore needs to be predicted from the current situation of the existing students, so that the existing teaching activities can be adjusted to raise the renewal rate or lower the withdrawal rate.
Therefore, how to predict the event probability and determine the influence factor values that affect it has become an urgent technical problem.
Disclosure of Invention
The embodiment of the invention provides an event probability prediction model training method, an event probability prediction method and a related device, which are used for realizing prediction of event probability and determining an influence factor value influencing the event probability.
In order to solve the above problem, an embodiment of the present invention provides an event probability prediction model training method, including:
acquiring a training data set, wherein the training data set comprises an actual event probability value and a related index data value corresponding to each data unit, the actual event probability value is a real value of the event probability of each data unit, and the related index data value is a numerical value of each index of which the degree of correlation with the event probability meets a correlation threshold;
and acquiring a predicted event probability value according to the relevant index data value of each data unit by using the event probability prediction model to be trained, acquiring probability loss according to the predicted event probability value and the actual event probability value, adjusting parameters of the event probability prediction model according to the probability loss until the probability loss meets a loss threshold value, and acquiring the trained event probability prediction model and an event probability weight matrix, wherein each element of the event probability weight matrix is the weight of each corresponding index.
In order to solve the above problem, an embodiment of the present invention further provides an event probability prediction method, including:
acquiring a prediction related index data value of a data unit to be subjected to event probability prediction;
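using the event probability prediction model obtained by the above event probability prediction model training method to obtain a predicted event probability according to the prediction related index data value, and obtaining the influence factor value of each index according to the prediction related index data value and the event probability weight matrix obtained by the above event probability prediction model training method.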
In order to solve the above problem, an embodiment of the present invention further provides an event probability prediction method, including:
acquiring a prediction related index data value of a data unit to be subjected to event probability prediction;
acquiring a prediction index class value of the data unit by using the index weight obtained by the event probability prediction model training method and a prediction related index data value corresponding to the index weight;
using the event probability prediction model obtained by the above event probability prediction model training method to obtain a predicted event probability according to the index class value, and obtaining the influence factor value of each index class according to the index class value and the event probability weight matrix obtained by the above event probability prediction model training method.
In order to solve the above problem, an embodiment of the present invention further provides an event probability prediction model training device, including:
a data set acquisition unit, adapted to acquire a training data set, wherein the training data set comprises an actual event probability value and a related index data value corresponding to each data unit, the actual event probability value is a real value of the event probability of each data unit, and the related index data value is a numerical value of each index whose degree of correlation with the event probability meets a correlation threshold;
an event probability prediction model and event probability weight matrix obtaining unit, adapted to obtain a predicted event probability value according to the relevant index data value of each data unit by using the event probability prediction model to be trained, obtain a probability loss according to the predicted event probability value and the actual event probability value, adjust parameters of the event probability prediction model according to the probability loss until the probability loss meets a loss threshold, and obtain a trained event probability prediction model and an event probability weight matrix, wherein each element of the event probability weight matrix is a weight of each corresponding index.
In order to solve the above problem, an embodiment of the present invention further provides an event probability prediction apparatus, including:
a prediction related index data value acquisition unit, adapted to acquire a prediction related index data value of a data unit to be subjected to event probability prediction;
a predicted event probability and influence factor value obtaining unit, adapted to obtain a predicted event probability according to the prediction related index data value by using the event probability prediction model obtained by the above event probability prediction model training method, and to obtain the influence factor value of each index according to the prediction related index data value and the event probability weight matrix obtained by the above event probability prediction model training method.
In order to solve the above problem, an embodiment of the present invention further provides an event probability prediction apparatus, including:
a prediction related index data value acquisition unit, adapted to acquire a prediction related index data value of a data unit to be subjected to event probability prediction;
a prediction index class value obtaining unit, adapted to obtain a prediction index class value of the data unit by using the index weight obtained by the above event probability prediction model training method and the prediction related index data value corresponding to the index weight;
a predicted event probability and influence factor value obtaining unit, adapted to obtain a predicted event probability according to the index class value by using the event probability prediction model obtained by the above event probability prediction model training method, and to obtain the influence factor value of each index class according to the index class value and the event probability weight matrix obtained by the above event probability prediction model training method.
To solve the above problem, an embodiment of the present invention provides a storage medium storing a program adapted to training an event probability prediction model to implement the event probability prediction model training method as described above, or storing a program adapted to predicting an event probability to implement the event probability prediction method as described above.
To solve the above problem, an embodiment of the present invention provides an apparatus, including at least one memory and at least one processor; the memory stores a program that the processor invokes to perform an event probability prediction model training method as described above or an event probability prediction method as described above.
Compared with the prior art, the technical scheme of the invention has the following advantages:
In the event probability prediction model training method, the event probability prediction method and the related device provided by the embodiments of the invention, training begins by acquiring a training data set, which is a set of data information of a plurality of data units and comprises the actual event probability value and the related index data values of each data unit. A predicted event probability is then obtained from the related index data values of each data unit, and a probability loss is obtained from the predicted event probability and the actual event probability, which is used to adjust the parameters of the event probability prediction model. When the probability loss meets the loss threshold, the trained event probability prediction model is obtained together with an event probability weight matrix, which gives the weight of each index affecting the event probability.
It can be seen that the training method trains the event probability prediction model on related index data values and actual event probability values that are obtainable, and builds into the model the relationship between the related index data values and the event probability. This prepares for predicting a future event probability from the trained model and the index data values available at the current time, and ensures the accuracy of the prediction. In addition, the weight of each index is obtained during training. At prediction time, both the predicted event probability and the influence factor value of each index for that predicted probability can therefore be obtained, the latter from the related index data value of each index and the weight of that index. Preparations can thus be made in advance to reach, or to avoid reaching, the predicted event probability, so that the event probability actually obtained in the future better meets expectations.
In an alternative scheme, each index further has a pre-labeled index category and a pre-acquired index correlation direction value. When the event probability prediction model is trained, the dimension reduction model to be trained and the event probability prediction model to be trained are first trained in series on the training data set to obtain a dimension reduction weight matrix of the dimension reduction model, and each index weight is then obtained from the dimension reduction weight matrix. An index class value of each data unit is next obtained from the index weight and the related index data value that correspond to the same index, and the event probability prediction model is further trained with the index class values of the data units, yielding an event probability prediction model trained on index class values and the weight of each index class. On the one hand, a large number of indexes and index data values can be selected and converted into the corresponding index class values, so that more indexes can be used when the trained model predicts event probabilities, which improves prediction accuracy. On the other hand, both training and subsequent prediction operate on the index class values of the data units, which reduces the amount of data and computation and improves training and prediction efficiency. Accuracy and efficiency of model training and event probability prediction are thus both taken into account.
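The conversion from index data values to index class values can be illustrated with a minimal Python sketch. The text does not give the aggregation formula, so the weighted sum below is an assumption, as are the function name `index_class_values` and the example index and category names. The index correlation direction value is not modeled here.

```python
from collections import defaultdict


def index_class_values(index_values, index_weights, index_category):
    """Collapse per-index data values into one value per index category.

    Assumption: the class value is the sum of weight * value over the
    indexes labeled with that category. The patent text does not state
    the exact formula.
    """
    totals = defaultdict(float)
    for name, value in index_values.items():
        category = index_category[name]
        totals[category] += index_weights[name] * value
    return dict(totals)


# Hypothetical data for one data unit (for example, one class).
example_values = {"accuracy": 0.8, "participation": 0.6, "liking": 0.9}
example_weights = {"accuracy": 0.5, "participation": 0.5, "liking": 1.0}
example_category = {
    "accuracy": "learning",
    "participation": "learning",
    "liking": "satisfaction",
}
example_result = index_class_values(
    example_values, example_weights, example_category
)
```

Here three index values are reduced to two index class values, which is the reduction in data volume that the paragraph above describes.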
In another alternative scheme of the event probability prediction model training method provided by the embodiments of the invention, each index has not only a related index data value but also a pre-acquired index correlation direction value. When the index weights are obtained, the dimension reduction weight matrix is first converted into a weight square matrix whose numbers of rows and columns both equal the number of index categories, and target elements are obtained from the weight square matrix. Each dimension reduction target element in the dimension reduction weight matrix, and its value, is then determined according to the position of each target element in the weight square matrix, and each index weight is obtained from each dimension reduction target element value and the index corresponding to that element. Converting indexes into index categories maps many numerical values onto fewer numerical values, so a dimension reduction weight matrix is obtained during model training, but the meaning of each of its elements cannot be determined directly. The weight square matrix is therefore used to locate the dimension reduction target elements in the dimension reduction weight matrix and to determine what they represent, and the index weights are then obtained from the dimension reduction target element values. By exploiting the logic of converting indexes into index categories, the otherwise undeterminable meaning of the elements of the dimension reduction weight matrix is made explicit, the black box information becomes transparent, the index weights are obtained, and index class values can be derived from related index data values during subsequent event probability prediction.
Drawings
FIG. 1 is a flow chart of a method for training an event probability prediction model according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a training data set obtaining step of the event probability prediction model training method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the acquisition of a target classification dimension and of the indexes corresponding to the target classification dimension in the event probability prediction model training method according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart illustrating a method for training an event probability prediction model according to an embodiment of the present invention;
FIG. 5 is a schematic flowchart of a step of obtaining an index weight in the event probability prediction model training method according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating a method for predicting event probability according to an embodiment of the present invention;
FIG. 7 is a schematic flow chart illustrating a method for predicting event probability according to an embodiment of the present invention;
FIG. 8 is a block diagram of an event probability prediction model training apparatus according to an embodiment of the present invention;
FIG. 9 is another block diagram of an event probability prediction model training apparatus provided in an embodiment of the present invention;
FIG. 10 is a block diagram of an event probability prediction apparatus provided in an embodiment of the present invention;
FIG. 11 is an alternative hardware device architecture of the device provided by an embodiment of the present invention.
Detailed Description
In the prior art, when event probability prediction is performed on a text, it is difficult to predict the event probability and at the same time determine the influence factor values that affect it.
In order to realize prediction of event probability and determine influence factor values influencing the event probability, the embodiment of the invention provides an event probability prediction model training method, which comprises the following steps:
acquiring a training data set, wherein the training data set comprises an actual event probability value and a related index data value corresponding to each data unit, the actual event probability value is a real value of the event probability of each data unit, and the related index data value is a numerical value of each index of which the degree of correlation with the event probability meets a correlation threshold;
and acquiring a predicted event probability value according to the relevant index data value of each data unit by using the event probability prediction model to be trained, acquiring probability loss according to the predicted event probability value and the actual event probability value, adjusting parameters of the event probability prediction model according to the probability loss until the probability loss meets a loss threshold value, and acquiring the trained event probability prediction model and an event probability weight matrix, wherein each element of the event probability weight matrix is the weight of each index.
It can be seen that, in the event probability prediction model training method provided by the embodiments of the invention, training begins by acquiring a training data set, which is a set of data information of a plurality of data units and comprises the actual event probability value and the related index data values of each data unit. A predicted event probability is then obtained from the related index data values of each data unit, and a probability loss is obtained from the predicted event probability and the actual event probability, which is used to adjust the parameters of the event probability prediction model.
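The training procedure described above (predict, compute the loss, adjust the parameters, stop when the loss meets the threshold) can be sketched in Python as follows. The text does not fix the model form, the loss function or the optimizer, so a single sigmoid layer, a mean squared error loss and plain gradient descent are assumptions made only for illustration. The names `train`, `lr`, `loss_threshold` and `max_epochs` are hypothetical. In this simplified setting the returned weight list plays the role of the event probability weight matrix, with one weight per index.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(units, targets, lr=0.5, loss_threshold=1e-3, max_epochs=20000):
    """Fit one weight per index until the probability loss meets the threshold.

    units:   one list of related index data values per data unit
    targets: actual event probability value of each data unit, in [0, 1]
    """
    n_units = len(units)
    weights = [0.0] * len(units[0])
    bias = 0.0
    loss = float("inf")
    for _ in range(max_epochs):
        # Predicted event probability value of each data unit.
        preds = [
            sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
            for row in units
        ]
        errors = [p - t for p, t in zip(preds, targets)]
        # Probability loss (assumed here to be the mean squared error).
        loss = sum(e * e for e in errors) / n_units
        if loss <= loss_threshold:
            break
        # Gradient of the loss with respect to each weight and the bias.
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, p, e in zip(units, preds, errors):
            g = 2.0 * e * p * (1.0 - p) / n_units
            for j, x in enumerate(row):
                grad_w[j] += g * x
            grad_b += g
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias, loss
```

Calling `train(units, targets)` returns the per-index weights, the bias and the final loss. The `max_epochs` cap is a safeguard added for the sketch, so that the loop terminates even if the loss threshold cannot be reached.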
It can be seen that the training method trains the event probability prediction model on related index data values and actual event probability values that are obtainable, and builds into the model the relationship between the related index data values and the event probability. This prepares for predicting a future event probability from the trained model and the index data values available at the current time, and ensures the accuracy of the prediction. In addition, the weight of each index is obtained during training. At prediction time, both the predicted event probability and the influence factor value of each index for that predicted probability can therefore be obtained, the latter from the related index data value of each index and the weight of that index. Preparations can thus be made in advance to reach, or to avoid reaching, the predicted event probability, so that the event probability actually obtained in the future better meets expectations.
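How an influence factor value might be derived from an index data value and its weight can be sketched in Python. The text only says that the influence factor value is obtained "according to" these two quantities, so taking their product is an assumption, and the function names and example indexes are hypothetical.

```python
def influence_factors(index_values, index_weights):
    """Influence factor value of each index.

    Assumption: influence = related index data value * index weight.
    """
    return {
        name: index_values[name] * index_weights[name]
        for name in index_values
    }


def rank_by_influence(factors):
    """Index names ordered from the strongest to the weakest influence."""
    return sorted(factors, key=lambda name: abs(factors[name]), reverse=True)


# Hypothetical data for one data unit.
example_factors = influence_factors(
    {"accuracy": 0.8, "participation": 0.5},
    {"accuracy": 0.5, "participation": -0.2},
)
example_ranking = rank_by_influence(example_factors)
```

Ranking by absolute value shows which index contributes most to the predicted event probability, and the sign shows whether it pushes the probability up or down.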
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, fig. 1 is a flow chart of a method for training an event probability prediction model according to an embodiment of the present invention.
As shown in the figure, the method for training the event probability prediction model provided by the embodiment of the present invention includes the following steps:
Step S10: A training data set is obtained.
It is easy to understand that, in order to implement training of the event probability prediction model, it is necessary to first obtain training data, that is, a training data set, where the training data set includes actual event probability values corresponding to the data units and related index data values, the actual event probability values are actual values of the event probabilities of the data units, and the related index data values are values of the indexes whose correlation degree with the event probabilities satisfies a correlation degree threshold.
A data unit is the unit to which a group of corresponding actual event probability values and related index data values belongs, for example a class. The event probability corresponding to a class may be a renewal rate or a withdrawal rate, and the indexes possibly related to that event probability may include the question accuracy rate, question participation rate, understanding rate, course liking rate, red-envelope-grabbing participation rate, score, number of times of showing, personal show viewing rate, personal show approval rate, and the like.
Since the indexes that can be obtained and may be related to the event probability are not necessarily related to the event probability, before the event probability prediction model is trained, it is necessary to first process the data of each obtained data unit and determine each index whose correlation degree with the event probability satisfies the correlation degree threshold.
However, for data units classified in different ways, the indexes whose correlation with the event probability meets the correlation threshold are likely to differ. Continuing with the class as the data unit:
When the data units of all classes are pooled, the indexes whose correlation meets the correlation threshold may be A, B and C. When the classes are classified, the indexes may be B, C and D for elementary school classes and C, D and E for middle school classes. When the data units are divided by grade, the indexes for the lower grades of elementary school may also differ from those for the higher grades. At prediction time, however, the indexes meeting the correlation threshold are not obtained separately for every classification mode with a separate prediction made for each. The classification mode to be used for subsequent prediction, and the indexes whose correlation meets the correlation threshold under that classification mode, therefore need to be determined.
Therefore, referring to fig. 2, fig. 2 is a flowchart illustrating a training data set obtaining step of the event probability prediction model training method according to the embodiment of the present invention.
As shown in the figure, in one embodiment, in order to obtain the training data set, the method comprises the following steps:
Step S100: An original training data set is obtained.
It is readily understood that the original training data set is an unprocessed training data set directly acquired, which includes actual event probability values corresponding to respective data units and predicted relevant index data values, which are numerical values of respective indices predicted to be related to the event probability.
In particular, the original training data set may be obtained by corresponding statistical software.
Step S101: A target classification dimension and the indexes corresponding to the target classification dimension are obtained.
It should be noted that the target classification dimension described herein refers to a classification dimension determined by comparison and having a better prediction effect on the event probability.
The target classification dimension must be determined before the data are actually classified. In a specific embodiment, it may be determined by classifying the data along a plurality of classification dimensions and then performing a correlation calculation on each result.
To this end, the original training data set is first classified according to each dimension to be classified to obtain pre-classified data sets.
A dimension to be classified is a possible classification dimension chosen according to the characteristics of the data units. For example, for education-related data units, the dimensions to be classified may be grade, school, class type, the whole, and so on.
Classifying the original training data set according to the different dimensions to be classified yields the pre-classified data sets.
Then, a correlation calculation algorithm is applied to the predicted relevant index data values and the actual event probability values of the data units of each pre-classified data set to obtain the correlation between each predicted relevant index data value and the actual event probability value, and the indexes whose correlation meets the correlation threshold are obtained. The dimension to be classified with the largest number of relevant indexes, together with the indexes corresponding to it, is taken as the target classification dimension and the indexes corresponding to the target classification dimension, where the number of relevant indexes is the number of indexes whose correlation meets the correlation threshold.
Specifically, referring to fig. 3, fig. 3 is a schematic flow chart of acquiring the target classification dimension and the indexes corresponding to the target classification dimension in the event probability prediction model training method according to the embodiment of the present invention.
After each pre-classified data set is obtained, the data units of each pre-classified data set, together with the expected relevant index data values and the actual event probability value of each data unit, are obtained and input into the correlation calculation algorithm to obtain the indexes whose correlation degree with the actual event probability value meets the correlation threshold.
It should be noted that the correlation degree may be positive, indicating that the index is positively correlated with the event probability, or negative, indicating that the index is negatively correlated with the event probability; a correlation degree meeting the correlation threshold means that its absolute value is greater than or equal to the correlation threshold.
If the correlation threshold is too high, too few indexes meet it, which hampers the subsequent training of the model; if it is too low, too many indexes meet it, which increases the computation of model training. Therefore, in a specific implementation, the correlation threshold may be set to 0.2, that is, an index whose correlation degree has an absolute value greater than or equal to 0.2 is determined to be an index of the event probability, so as to balance the accuracy and the computation of model training.
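For ease of understanding, the screening of indexes by the absolute value of the correlation degree may be sketched as follows. This is only a minimal illustration: the function name `select_related_indexes`, the index names and the correlation values are hypothetical and are not part of the embodiment.

```python
def select_related_indexes(correlations, threshold=0.2):
    """Keep the indexes whose |correlation degree| meets the threshold.

    `correlations` maps a (hypothetical) index name to its correlation
    degree with the actual event probability value.
    """
    return {
        name: degree
        for name, degree in correlations.items()
        if abs(degree) >= threshold
    }


# Hypothetical correlation degrees, for illustration only.
example = {"accuracy_mean": 0.35, "accuracy_std": -0.22, "points_mean": 0.05}
related = select_related_indexes(example)
print(related)
```

Both positively and negatively correlated indexes are kept, since only the absolute value is compared with the threshold.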
After the original training data set is classified according to each undetermined classification dimension, the number of data units in each pre-classified data set differs, depending on the number of data units in the original training data set, so a different correlation calculation algorithm may be selected for each pre-classified data set.
Specifically, a Spearman rank correlation coefficient calculation algorithm or a Kendall rank correlation coefficient (Kendall's tau-b) calculation algorithm may be selected.
The Spearman rank correlation coefficient calculation algorithm does not require the data set to follow a normal distribution and is applicable to any relationship that can be described by a monotonic function; when the sample size is relatively large, the Spearman rank correlation coefficient calculation algorithm is the better choice. When the sample size is relatively small, the Kendall's tau-b correlation coefficient calculation algorithm is less sensitive to errors and more accurate.
For this reason, when determining the relevant indexes, the specific calculation algorithm may first be selected according to the number of data units of the pre-classified data set.
Specifically, the number of data units of the pre-classified data set is obtained first, and it is then judged whether this number exceeds a first data amount threshold; if so, the Spearman rank correlation coefficient calculation algorithm is selected, and if not, the Kendall rank correlation coefficient calculation algorithm may be selected.
To avoid the uncertainty that too little data would introduce into the determination of the indexes meeting the correlation threshold, when the number of data units does not exceed the first data amount threshold, it may further be judged whether the number exceeds a second data amount threshold; if so, the Kendall rank correlation coefficient calculation algorithm may be selected, and if not, the pre-classified data set is directly discarded.
The specific values of the first data amount threshold and the second data amount threshold can be determined as needed. It is easily understood that the first data amount threshold is larger than the second data amount threshold; for example, the first data amount threshold may be 150, 200, etc., and the second data amount threshold may be 10, 8, etc.
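The selection logic described above may be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the returned labels are hypothetical, "exceeds" is read as strictly greater than, and the default thresholds 150 and 10 are simply two of the example values mentioned above.

```python
def choose_correlation_algorithm(n_units, first_threshold=150, second_threshold=10):
    """Pick a rank-correlation algorithm from the number of data units.

    Returns "spearman", "kendall_tau_b", or None when the pre-classified
    data set is too small and is to be discarded.
    """
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must be larger than second threshold")
    if n_units > first_threshold:
        return "spearman"
    if n_units > second_threshold:
        return "kendall_tau_b"
    return None  # too few data units: discard this pre-classified data set


for size in (500, 150, 40, 10):
    print(size, choose_correlation_algorithm(size))
```

In practice the returned label could be mapped to an existing implementation, for example `scipy.stats.spearmanr` or `scipy.stats.kendalltau`; that mapping is a suggestion and is not specified by the embodiment.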
After the correlation calculation has been performed on each pre-classified data set, the indexes meeting the correlation threshold for each pre-classified data set are obtained, and a target classification dimension then needs to be selected from the undetermined classification dimensions according to these indexes.
To ensure the accuracy of model training, the undetermined classification dimension with the largest number of indexes meeting the correlation threshold may be selected as the target classification dimension, so that the indexes corresponding to the target classification dimension are obtained.
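The choice of the target classification dimension may be sketched as follows. This is only an illustration: it assumes that the indexes meeting the correlation threshold have already been collected into one list per undetermined classification dimension, and that ties are resolved in favour of the dimension listed first; neither detail is specified by the embodiment, and all names and data are hypothetical.

```python
def pick_target_dimension(related_by_dimension):
    """Return the dimension with the most related indexes, and those indexes.

    `related_by_dimension` maps each undetermined classification dimension
    to the list of indexes whose correlation degree meets the threshold.
    """
    best = max(related_by_dimension, key=lambda dim: len(related_by_dimension[dim]))
    return best, related_by_dimension[best]


# Hypothetical result of the correlation calculation.
related = {
    "grade": ["accuracy_mean", "points_mean"],
    "school": ["accuracy_mean"],
    "class_type": ["accuracy_mean", "points_mean", "liking_mean"],
}
print(pick_target_dimension(related))
```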
Step S102: acquiring the pre-classified data sets classified according to the target classification dimension, and screening, from the expected relevant index data values, those of the indexes corresponding to the target classification dimension to obtain the training data set.
After the target classification dimension is obtained, the pre-classified data sets classified according to the target classification dimension are acquired, and for each data unit in them the data values of the indexes corresponding to the target classification dimension are selected to obtain the training data set.
Therefore, the obtained training data set includes, for each data unit, an actual event probability value and relevant index data values, where the actual event probability value is the true value of the event probability of the data unit, and a relevant index data value is the value of an index whose correlation degree with the event probability meets the correlation threshold.
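The screening of step S102 may be sketched as follows, assuming for illustration that each data unit is stored as a dictionary. The key names, including `actual_event_probability`, are hypothetical.

```python
def build_training_set(data_units, related_indexes, target_key="actual_event_probability"):
    """Keep only the related index values and the actual event probability."""
    training_set = []
    for unit in data_units:
        features = [unit[name] for name in related_indexes]
        training_set.append((features, unit[target_key]))
    return training_set


# Hypothetical data units of one pre-classified data set.
units = [
    {"accuracy_mean": 0.9, "points_mean": 3.0, "actual_event_probability": 0.7},
    {"accuracy_mean": 0.4, "points_mean": 5.0, "actual_event_probability": 0.2},
]
print(build_training_set(units, ["accuracy_mean"]))
```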
It is easy to understand that, because the training data sets are obtained by classifying the original training data sets according to the target classification dimensions, there are a plurality of training data sets, and based on different training data sets, the trained event probability prediction models are likely to be different, so that during prediction, based on the class to which a specific data unit belongs, the corresponding event probability prediction model can be selected for prediction.
Step S11: acquiring a predicted event probability value according to the relevant index data values of each data unit by using the event probability prediction model to be trained, and acquiring a probability loss according to the predicted event probability value and the actual event probability value.
After the training data set is obtained, the relevant index data values of the data units are input into the event probability prediction model to be trained to predict the event probability and obtain each predicted event probability value; the difference between the predicted event probability value and the actual event probability value is then computed to obtain the probability loss.
Step S12: judging whether the probability loss meets a loss threshold; if so, executing step S14, and if not, executing step S13.
After the probability loss is obtained, it is compared with the loss threshold. If the loss threshold is met, the event probability prediction model meets the accuracy requirement of prediction, and step S14 is executed; otherwise, the prediction accuracy of the event probability prediction model does not meet the accuracy requirement, and step S13 is executed.
Step S13: adjusting parameters of the event probability prediction model according to the probability loss, and executing the step S11.
Based on the probability loss, the parameters of the event probability prediction model are adjusted, and then step S11 is executed again to obtain the predicted event probability value again by using the event probability prediction model after the parameters are adjusted.
Step S14: obtaining a trained event probability prediction model and an event probability weight matrix.
When the probability loss meets the loss threshold, a trained event probability prediction model and an event probability weight matrix are obtained, where each element of the event probability weight matrix is the weight of the corresponding index.
Therefore, when the event probability prediction model is used for predicting the event probability, the predicted event probability value can be obtained, and the influence factor value of each index on the event probability can be obtained by using the event probability weight matrix.
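The loop formed by steps S11 to S14 may be sketched as follows. The embodiment does not specify the model structure, the loss function or the optimizer, so this sketch makes its own assumptions: a single-layer logistic model, a mean squared error as the probability loss, plain gradient descent, and a maximum number of steps as a safeguard. All names and numbers are hypothetical.

```python
import math


def train_event_model(features, targets, loss_threshold=0.005, lr=0.5, max_steps=10000):
    """Repeat predict -> loss -> adjust until the loss meets the threshold."""
    n_features = len(features[0])
    weights = [0.0] * n_features  # one weight per index (assumed model form)
    bias = 0.0
    loss = float("inf")
    for _ in range(max_steps):
        # Step S11: predicted event probability values and probability loss.
        preds = [
            1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(weights, row)) + bias)))
            for row in features
        ]
        loss = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
        # Step S12: stop once the loss meets the loss threshold.
        if loss <= loss_threshold:
            break
        # Step S13: adjust the parameters according to the loss (gradient step).
        deltas = [2.0 * (p - t) * p * (1.0 - p) for p, t in zip(preds, targets)]
        for j in range(n_features):
            grad = sum(d * row[j] for d, row in zip(deltas, features)) / len(targets)
            weights[j] -= lr * grad
        bias -= lr * sum(deltas) / len(targets)
    # Step S14: the trained parameters; `weights` plays the role of the weight matrix.
    return weights, bias, loss


weights, bias, loss = train_event_model([[0.0], [1.0]], [0.2, 0.8])
print(weights, bias, loss)
```

In this sketch the learned `weights` list stands in for the event probability weight matrix, with one entry per index.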
It can be seen that, in the event probability prediction model training method provided by the embodiment of the present invention, the event probability prediction model to be trained is trained with the obtainable relevant index data values and the actual event probability values, so that the relationship between the relevant index data values and the event probability is built into the model. This prepares for predicting a future event probability with the trained event probability prediction model and the index data values obtainable at the current time, and ensures the accuracy of the prediction. Meanwhile, the weight of each index is also obtained during model training. When the event probability is actually predicted, not only the predicted event probability but also the influence factor value of each index on it can be obtained, according to the relevant index data value of each index and the weight of the corresponding index, so that preparations can be made in advance to reach, or to avoid reaching, the predicted event probability, and the event probability actually obtained in the future better meets expectations.
To further improve the accuracy of model training, refer to fig. 4, which is another schematic flow chart of the event probability prediction model training method according to the embodiment of the present invention.
As shown in the figure, in another specific implementation manner, the method for training the event probability prediction model according to the embodiment of the present invention includes:
Step S20: acquiring a training data set.
For details of the step S20, please refer to the detailed description of the step S10 in fig. 1, which is not repeated herein.
In this embodiment, in order to train the event probability prediction model to be trained while reducing the computation of training without sacrificing training accuracy, dimension reduction may be performed on the indexes whose correlation degree meets the correlation threshold, so that more relevant indexes can be fused.
To achieve dimension reduction, it is necessary, on the one hand, to determine which index category after dimension reduction each index before dimension reduction corresponds to, and, on the other hand, to determine how to obtain the index category value of each index category after dimension reduction from the relevant index data values of the indexes before dimension reduction.
In a specific embodiment, to establish the correspondence between the indexes before dimension reduction and the index categories after dimension reduction, an index category may be labeled in advance for each index according to the actual meaning that links the index to the index category; that is, each index carries a pre-labeled index category. It is easily understood that the index categories are labeled in order to reduce the dimension of the indexes, so the number of index categories is smaller than the number of indexes.
For ease of understanding, the index categories of the indexes are explained below with a case from a teaching scenario:
for the indexes: question accuracy rate (including mean and standard deviation), question participation rate (including mean and standard deviation), comprehension rate (including mean and standard deviation), course liking degree (including mean and standard deviation), red-packet grabbing participation rate (including mean and standard deviation), points (including mean and standard deviation), number of praises (including mean and standard deviation), personal show viewing rate (including mean and standard deviation), personal show like rate (including mean and standard deviation), the index categories are labeled, for example, as follows:
Index | Index category
Question accuracy rate, question participation rate, comprehension rate | Learned
Course liking degree, red-packet participation rate, points | Favorite
Number of times of showing, personal show viewing rate | Concerned
Accordingly, the relevant index data values of the 16 indexes are reduced in dimension to 3 index categories.
Step S21: performing series training on the dimension reduction model to be trained and the event probability prediction model to be trained by using the training data set until a dimension reduction model and an event probability prediction model meeting a preset training target are obtained, and obtaining a dimension reduction weight matrix of the dimension reduction model.
To determine how to obtain the index category value of each index category after dimension reduction from the relevant index data values of the indexes before dimension reduction, it is first necessary to determine the calculation weights used to obtain each index category value from the relevant index data values, that is, the dimension reduction weights.
In a specific embodiment, to obtain the dimension reduction weights, a dimension reduction weight matrix may be obtained by performing series training on a pre-constructed, untrained dimension reduction model and the event probability prediction model to be trained, where each row of the dimension reduction weight matrix corresponds to one index, so the dimension reduction weight matrix has as many rows as there are indexes.
Specifically, the dimension reduction model may be an RBM (Restricted Boltzmann Machine) model or a PCA (Principal Component Analysis) model.
The step of obtaining the dimensionality reduction weight matrix specifically comprises the following steps:
obtaining each predicted event probability by using the dimension reduction model to be trained and the event probability prediction model to be trained according to the relevant index data values of each data unit of the training data set;
and obtaining a probability loss according to each predicted event probability and each actual event probability, and adjusting the parameters of the dimension reduction model and the event probability prediction model by using the probability loss until the probability loss meets a second loss threshold, to obtain the dimension reduction model and the event probability prediction model that meet the preset training target.
Specifically, the relevant index data values of each data unit of the training data set are input into the dimension reduction model, and the resulting output is then input into the event probability prediction model to obtain each predicted event probability.
Of course, during the series training, the output of the dimension reduction model is input directly into the event probability prediction model; the two models run in series, and no intermediate data are output.
After the predicted event probabilities are obtained, the probability loss is obtained according to the predicted event probabilities and the actual event probabilities, and it is judged whether the probability loss meets the second loss threshold. If not, the parameters of the dimension reduction model and the event probability prediction model are adjusted simultaneously according to the probability loss, and the predicted event probabilities are obtained again with the adjusted models. If so, the dimension reduction model and the event probability prediction model that meet the preset training target are obtained, together with the dimension reduction weight matrix of the dimension reduction model, where each row of the dimension reduction weight matrix corresponds to one index.
The event probability prediction model obtained by the series training is obtained for the purpose of training the dimension reduction model, and is not a model used for prediction in the subsequent stage.
Step S22: acquiring the weight of each index according to the dimension reduction weight matrix.
After the dimension reduction weight matrix is obtained, the weight of each index is further obtained.
In an embodiment, for the acquisition of the index weights, refer to fig. 5, which is a schematic flow chart of the step of acquiring the index weights in the event probability prediction model training method according to the embodiment of the present invention.
As shown in the figure, in one embodiment, each index weight may be obtained by:
Step S220: adjusting the dimension reduction weight matrix by using the index correlation direction values corresponding to the indexes and the correspondence between the indexes and the index categories to obtain a weight square matrix whose numbers of rows and columns are both equal to the number of index categories, and acquiring each target element of the weight square matrix that represents an index category.
To obtain the index weights, the dimension reduction weight matrix is first adjusted by using the index correlation direction values corresponding to the indexes and the correspondence between the indexes and the index categories, so as to obtain a weight square matrix whose numbers of rows and columns are both equal to the number of index categories. The index correlation direction value is a numerical value representing the direction of correlation between the index and the event probability: if the correlation degree is positive, the index is positively correlated with the event probability and the correlation direction value is 1; if the correlation degree is negative, the index is negatively correlated with the event probability and the correlation direction value is -1.
In actual operation, the relationship between the indexes and the event probability differs depending on whether the event probability to be predicted is one that should be increased (such as the re-enrollment rate) or one that should be decreased (such as the refund rate). It is therefore necessary to decide, according to the type of event probability to be predicted, whether to negate the index correlation direction values obtained in the correlation calculation: if the probability to be predicted is one that should be increased (such as the re-enrollment rate), the dimension reduction weight matrix is adjusted directly with the index correlation direction values obtained by the correlation calculation and the index categories to obtain a weight square matrix whose numbers of rows and columns are both equal to the number of index categories; if the probability to be predicted is one that should be decreased (such as the refund rate), the index correlation direction values obtained by the correlation calculation are first negated once, and the dimension reduction weight matrix is then adjusted with the negated index correlation direction values and the index categories to obtain the weight square matrix.
In this way, it can be ensured that the index category values obtained by the subsequent operations are all positive-direction values that represent a good direction; for example, in the foregoing case, the higher the learned score, the lower the refund rate.
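The derivation of the index correlation direction values, including the negation for event probabilities that should be decreased, may be sketched as follows. The function name and the flag `event_should_decrease` are hypothetical, and the correlation degrees are made-up values.

```python
def direction_values(correlation_degrees, event_should_decrease=False):
    """Map each correlation degree to 1 or -1, negating once if required.

    Indexes that passed the correlation threshold have a non-zero degree,
    so the zero case is not treated specially here.
    """
    values = [1 if degree > 0 else -1 for degree in correlation_degrees]
    if event_should_decrease:
        # e.g. a refund rate: negate so that larger still means "better".
        values = [-value for value in values]
    return values


print(direction_values([0.3, -0.25, 0.4]))                               # event to be increased
print(direction_values([0.3, -0.25, 0.4], event_should_decrease=True))   # event to be decreased
```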
Each row of the weight square matrix corresponds to the index type of each index, and the number of the target elements is equal to the number of the index types.
The index correlation direction value of the index represented by each row of the dimension reduction weight matrix is obtained; then, according to the correspondence between the indexes and the index categories, and with the index correlation direction values as weights, the elements in the same column of all the rows (indexes) belonging to the same index category are weighted and summed to obtain the element at the corresponding position of the weight square matrix.
To facilitate understanding of the adjustment manner of the dimension reduction weight matrix, the following is exemplified:
the dimensionality reduction weight matrix, the indexes represented by each row of the dimensionality reduction weight matrix and the index types corresponding to the indexes are as follows:
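[Matrix image: the dimension reduction weight matrix with elements w11 to w63 (six rows, one per index, and three columns), annotated with the index represented by each row and its index category: rows 1 and 2, accuracy mean and accuracy standard deviation (learned); rows 3 and 4, liking-degree mean and liking-degree standard deviation (favorite); rows 5 and 6, hand-raise-count mean and hand-raise-count standard deviation (concerned)]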
the index correlation direction values of the indexes (of course, if the event probability to be predicted is one that should be decreased, these are the index correlation direction values after the negation operation) are (1, -1, 1, -1, 1, -1), respectively; that is, the index correlation direction values of the accuracy mean, the liking-degree mean and the hand-raise-count mean are 1, and those of the other three indexes are -1.
During the adjustment, the following calculations are performed:
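[Formula image: each element of the weight square matrix is the sum, weighted by the index correlation direction values, of the same-column elements of the rows of the dimension reduction weight matrix that belong to one index category, for example m11 = 1 × w11 + (-1) × w21]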
……
a weight square matrix is thus obtained, as follows:
[Matrix image: the 3 × 3 weight square matrix with elements m11 to m33]
It is easy to understand that, in the weight square matrix, the index category corresponding to the first row is learned, the index category corresponding to the second row is favorite, and the index category corresponding to the third row is concerned.
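The adjustment of the dimension reduction weight matrix into the weight square matrix may be sketched as follows. The numerical values of the matrix are invented purely for illustration, since the embodiment gives only symbols; the function and variable names are likewise hypothetical, and rows and columns are numbered from 0 in the code.

```python
def build_weight_square_matrix(reduction_matrix, directions, category_of_row, categories):
    """Sum, per index category, the direction-weighted rows of the matrix.

    Row i of the result corresponds to categories[i]; the reduction matrix
    is assumed to have one column per index category.
    """
    size = len(categories)
    square = [[0.0] * size for _ in range(size)]
    for row, direction, category in zip(reduction_matrix, directions, category_of_row):
        i = categories.index(category)
        for j in range(size):
            square[i][j] += direction * row[j]
    return square


# Hypothetical 6 x 3 dimension reduction weight matrix (rows: w1j ... w6j).
W = [
    [0.1, 0.2, 0.8],  # accuracy mean
    [0.3, 0.1, 0.2],  # accuracy standard deviation
    [0.2, 0.9, 0.1],  # liking-degree mean
    [0.1, 0.2, 0.2],  # liking-degree standard deviation
    [0.7, 0.2, 0.1],  # hand-raise-count mean
    [0.2, 0.1, 0.3],  # hand-raise-count standard deviation
]
directions = [1, -1, 1, -1, 1, -1]
category_of_row = ["learned", "learned", "favorite", "favorite", "concerned", "concerned"]
M = build_weight_square_matrix(W, directions, category_of_row, ["learned", "favorite", "concerned"])
for line in M:
    print([round(value, 3) for value in line])
```

With these made-up numbers the element in the second row and second column is the largest, which matches the assumption made later in the example.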
And after the weight square matrix is obtained, further determining each target element in the weight square matrix.
The target elements in the weight matrix are determined, so that the dimension reduction target elements can be determined, and the acquisition of index weight is realized.
In a specific embodiment, the element with the largest value among the elements of the weight square matrix may be obtained first as a target element, and the row and the column in which the target element is located are obtained.
In the weight square matrix of the above example, assume that m22 is the element with the largest value, so that m22 is the first target element, and the row and the column in which it is located are the second row and the second column, respectively.
Then, the elements in the row and the elements in the column in which the target element is located are ignored in the weight square matrix to obtain an adjustment square matrix.
In the above example, ignoring the elements of the second row and the second column of the weight square matrix gives the following adjustment square matrix:
[Matrix image: the adjustment square matrix obtained by ignoring the second row and the second column of the weight square matrix]
and further, taking the adjusting square matrix as a new weight square matrix, and acquiring new target elements until all the target elements are obtained.
Continuing with the above example, the maximum value element in the adjustment square matrix is obtained and is assumed to be m13, so that m13 is the second target element, and the row and the column in which it is located are the first row and the third column, respectively.
Then, the elements of the first row and the third column are ignored in turn, giving a new adjustment square matrix:
[Matrix image: the new adjustment square matrix obtained by further ignoring the first row and the third column]
and then acquiring a maximum value element in the new adjustment square matrix: m31, so that m31 is the third target element, and the row and column of the target element are the third row and the first column, respectively.
Since the weight square matrix is a 3-dimensional square matrix, all target elements are obtained.
It can be seen that, by the above method, on the one hand, the maximum value element of the weight square matrix and the adjustment square matrix can be obtained conveniently, and all the target elements are determined more reasonably, that is, each target element is assigned to the index category for which its index weight is most important. This is comparable to a metal mixture containing 90% gold, 5% silver and 5% copper, which is more reasonably classified as gold. On the other hand, the index categories reduced to the target dimension are guaranteed to be complete, without any index category being repeated or missing; for example, the three index categories learned, favorite and concerned are all present, and a result such as learned, learned and favorite cannot occur.
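The repeated selection of the maximum value element while ignoring the rows and columns already used may be sketched as follows. The matrix values are invented for illustration, the function name is hypothetical, and positions are numbered from 0, so that (1, 1) denotes m22.

```python
def pick_target_elements(square):
    """Greedily pick the largest element, then ignore its row and column."""
    free_rows = list(range(len(square)))
    free_cols = list(range(len(square)))
    targets = []
    while free_rows:
        row, col = max(
            ((r, c) for r in free_rows for c in free_cols),
            key=lambda rc: square[rc[0]][rc[1]],
        )
        targets.append((row, col))
        free_rows.remove(row)  # ignore the row of the target element
        free_cols.remove(col)  # ignore the column of the target element
    return targets


# Hypothetical weight square matrix in which m22 > m13 > m31.
M = [
    [-0.2, 0.1, 0.6],
    [0.1, 0.7, -0.1],
    [0.5, 0.1, -0.2],
]
print(pick_target_elements(M))  # positions of m22, m13, m31 (0-based)
```

Because each selection removes one row and one column, every index category receives exactly one target element and no column is used twice.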
Step S221: determining, according to the position of each target element in the weight square matrix, the dimension reduction target elements in the dimension reduction weight matrix that correspond to each target element, and the value of each dimension reduction target element.
The dimension reduction target elements in the dimension reduction weight matrix are determined according to the positions of the target elements in the weight square matrix.
Continuing with the previous example:
the target elements of the weight square matrix are m22, m13 and m31, and the dimension reduction target elements corresponding to them are:
m22 corresponds to w32 and w42; m13 corresponds to w13 and w23; m31 corresponds to w51 and w61.
Therefore, the dimensionality reduction target element value corresponding to each dimensionality reduction target element can be obtained.
Step S222: obtaining each index weight by using each dimension reduction target element value and the index corresponding to each dimension reduction target element.
After the dimension reduction target element values and the indexes corresponding to the dimension reduction target elements are obtained, the index weights are further obtained.
Specifically, the dimension reduction target element values of the indexes included in each index category are obtained, and then the proportion of each dimension reduction target element value in the sum of the dimension reduction target element values of that index category is obtained.
Continuing with the previous example: the index corresponding to w13 is the accuracy mean, the index corresponding to w23 is the accuracy standard deviation, the index corresponding to w32 is the liking-degree mean, the index corresponding to w42 is the liking-degree standard deviation, the index corresponding to w51 is the hand-raise-count mean, and the index corresponding to w61 is the hand-raise-count standard deviation.
Then, the index weights are calculated from the dimension reduction target element values, specifically by the following formulas:
the weight A1 of the accuracy mean is: A1 = w13 / (w13 + w23);
the weight A2 of the accuracy standard deviation is: A2 = w23 / (w13 + w23);
the weight B1 of the liking-degree mean is: B1 = w32 / (w32 + w42);
the weight B2 of the liking-degree standard deviation is: B2 = w42 / (w32 + w42);
the weight C1 of the hand-raise-count mean is: C1 = w51 / (w51 + w61);
the weight C2 of the hand-raise-count standard deviation is: C2 = w61 / (w51 + w61).
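The calculation of the index weights from the dimension reduction target element values may be sketched as follows. The matrix values are invented for illustration, the function and parameter names are hypothetical, and rows and columns are numbered from 0, so that `W[2][1]` denotes w32. The sketch assumes that the sum in each denominator is non-zero.

```python
def index_weights(reduction_matrix, rows_by_category, target_elements):
    """Normalise, per index category, the elements in its target column.

    `rows_by_category[i]` lists the rows (indexes) of category i, and each
    entry of `target_elements` is (category position, target column).
    """
    weights = {}
    for category_pos, column in target_elements:
        rows = rows_by_category[category_pos]
        total = sum(reduction_matrix[r][column] for r in rows)
        for r in rows:
            weights[r] = reduction_matrix[r][column] / total
    return weights


# Hypothetical 6 x 3 dimension reduction weight matrix.
W = [
    [0.1, 0.2, 0.8],
    [0.3, 0.1, 0.2],
    [0.2, 0.9, 0.1],
    [0.1, 0.2, 0.2],
    [0.7, 0.2, 0.1],
    [0.2, 0.1, 0.3],
]
rows_by_category = [[0, 1], [2, 3], [4, 5]]  # learned, favorite, concerned
targets = [(1, 1), (0, 2), (2, 0)]           # m22, m13, m31
result = index_weights(W, rows_by_category, targets)
print({row: round(weight, 3) for row, weight in sorted(result.items())})
```

Within each index category the resulting weights sum to 1, which corresponds to A1 + A2 = 1, B1 + B2 = 1 and C1 + C2 = 1 in the formulas above.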
Therefore, in the event probability prediction model training method provided by the embodiment of the present invention, the conversion from indexes to index categories is a conversion from a larger number of values to a smaller number of values. The dimension reduction weight matrix is obtained during model training, but the meaning represented by each of its elements cannot be determined directly. The weight square matrix is therefore used to determine the dimension reduction target elements in the dimension reduction weight matrix and the meaning they represent, and the index weights are then obtained from the dimension reduction target element values. In this way, the logic of converting indexes into index categories is used to reveal the meaning of elements of the dimension reduction weight matrix that could not otherwise be determined, which makes the black-box information transparent, enables the index weights to be obtained, and ensures that index category values can be obtained from the relevant index data values in subsequent event probability prediction.
Step S23: acquiring the data value of each index category of each data unit according to the index weights and the relevant index data values of each data unit that correspond to the same index, to obtain the index category values of each data unit.
After the index weights are obtained, the relevant index data values corresponding to the same indexes in each data unit are further obtained, and, in combination with the correspondence between the indexes and the index categories, the data value of each index category of each data unit is obtained by weighted summation, giving the index category values of each data unit.
Continuing with the previous example:
suppose that the relevant index data values of a data unit are:
the data value of the accuracy mean is: D1;
the data value of the accuracy standard deviation is: D2;
the data value of the liking-degree mean is: D3;
the data value of the liking-degree standard deviation is: D4;
the data value of the hand-raise-count mean is: D5;
the data value of the hand-raise-count standard deviation is: D6.
Then the corresponding index category values are:
the learned index category value is: A1 × D1 + A2 × (1 - D2);
the favorite index category value is: B1 × D3 + B2 × (1 - D4);
the concerned index category value is: C1 × D5 + C2 × (1 - D6).
By calculating the index category values of each data unit of the training data set, the index category values of all the data units of the whole training data set are obtained.
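The weighted summation of the example may be sketched as follows. The use of (1 - D) for an index whose correlation direction value is -1 is inferred from the example formulas and presumes data values normalised to the range 0 to 1; this is an assumption of the sketch, as are the function name and the numbers.

```python
def category_value(weights, data_values, directions):
    """Weighted sum of one category's index values.

    A value D enters as D when the direction value is 1 and as (1 - D)
    when it is -1 (assumes values normalised to [0, 1]).
    """
    total = 0.0
    for weight, value, direction in zip(weights, data_values, directions):
        total += weight * (value if direction > 0 else 1.0 - value)
    return total


# Hypothetical "learned" category: A1 = 0.8, A2 = 0.2, D1 = 0.9, D2 = 0.1.
learned = category_value([0.8, 0.2], [0.9, 0.1], [1, -1])
print(round(learned, 3))  # A1 * D1 + A2 * (1 - D2)
```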
Step S24: acquiring a predicted event probability according to the index category values of each data unit by using the event probability prediction model to be trained, and acquiring a probability loss according to the predicted event probability and the actual event probability.
After the index category values of each data unit are obtained, the predicted event probability of each data unit is obtained by using the event probability prediction model to be trained, and the probability loss is then obtained according to the predicted event probability and the actual event probability corresponding to the same data unit.
Step S25: judging whether the probability loss meets a loss threshold; if so, executing step S27, and if not, executing step S26.
Step S26: adjusting parameters of the event probability prediction model according to the probability loss, and executing the step S24.
Step S27: and obtaining a trained event probability prediction model and an event probability weight matrix.
For details of steps S25-S27, please refer to the descriptions of steps S12-S14 shown in fig. 1, which are not repeated herein.
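The loop formed by steps S24 to S27 can be sketched as below. The source does not fix the model structure or the loss in this passage, so the single-layer logistic model, the mean squared error loss, the learning rate and the `max_steps` safeguard are all assumptions made for illustration.

```python
import math


def train_event_model(inputs, targets, loss_threshold, lr=0.5, max_steps=10000):
    """Repeat S24-S26 until the probability loss meets the threshold (S27)."""
    n_features = len(inputs[0])
    weights = [0.0] * n_features  # one weight per index category
    bias = 0.0
    loss = float("inf")
    for _ in range(max_steps):  # safeguard against endless loops (assumption)
        # S24: predicted event probability and probability loss
        preds = [
            1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(weights, row)) + bias)))
            for row in inputs
        ]
        loss = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
        # S25: stop once the loss meets the loss threshold
        if loss <= loss_threshold:
            break
        # S26: adjust the parameters along the negative gradient of the loss
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, p, t in zip(inputs, preds, targets):
            g = 2.0 * (p - t) * p * (1.0 - p) / len(targets)
            for j, x in enumerate(row):
                grad_w[j] += g * x
            grad_b += g
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    # S27: the learned weights play the role of the event probability weight matrix
    return weights, bias, loss


# Hypothetical index category values and actual event probabilities.
category_values = [[0.9, 0.8], [0.2, 0.1], [0.5, 0.5], [0.7, 0.3]]
actual_probabilities = [0.8, 0.3, 0.55, 0.6]
weights, bias, final_loss = train_event_model(
    category_values, actual_probabilities, loss_threshold=0.005
)
```

Because the inputs are index category values, each returned weight corresponds to one index category rather than to an individual index.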
It should be understood that each element of the event probability weight matrix obtained by training the event probability prediction model with the index category values is the weight of an index category, not the weight of an individual index.
In this way, the influence factor value of each index type can be obtained according to the weight of each index type.
Therefore, according to the event probability prediction model training method provided by the embodiment of the invention, on one hand, a large number of indexes and index data values can be selected and converted into corresponding index category values through dimensionality reduction, so that more indexes can be utilized when the trained event probability prediction model is used for event probability prediction, and the prediction accuracy of the event probability prediction model is improved; on the other hand, the data used during event probability prediction model training and subsequent event probability prediction are the index category values of the data units, which reduces the data amount and the operation amount during model training and model prediction and improves the efficiency of both. Accuracy and efficiency are thus achieved together in model training and in event probability prediction.
In another specific embodiment, in order to further improve the accuracy of the obtained index weights, the training data set includes a plurality of training data subsets, and the number of dimension reduction models to be trained is at least equal to the number of training data subsets. In this case, the event probability prediction model training method provided in the embodiment of the present invention includes:
respectively utilizing each training data subset to carry out series training on a dimensionality reduction model to be trained and the event probability prediction model to be trained corresponding to the training data subset until each dimensionality reduction model and the event probability prediction model meeting a preset training target are obtained, and obtaining each dimensionality reduction weight matrix of each dimensionality reduction model;
adjusting each dimensionality reduction weight matrix by using the index correlation direction value corresponding to the index and the index category to obtain each weight square matrix with the number of rows and columns equal to the number of the index category, and acquiring each target element of each weight square matrix;
determining the weight square matrixes with the same positions and the largest quantity of target elements in each weight square matrix to obtain each consistent weight square matrix, and determining each dimension reduction weight matrix corresponding to each consistent weight square matrix;
determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in each dimension reduction weight matrix according to the position of each target element in each consistent weight matrix, obtaining the mean value of each dimension reduction target element value at the same position in each dimension reduction weight matrix to obtain the mean value of the dimension reduction target elements, and obtaining each index weight by using each dimension reduction target element mean value and the index corresponding to each dimension reduction target element.
It is easy to understand that each training data subset can be obtained by splitting the training data set, and each dimension reduction model can be an RBM model or a PCA model.
In the event probability prediction model training process, each data unit in a training data subset is used to perform series training on a dimensionality reduction model and the event probability prediction model to be trained, and for a specific training mode, reference is made to the description of step S21 in fig. 4, which is not described herein again.
After model series training, the dimension reduction weight matrix of each dimension reduction model is obtained, so if there are n dimension reduction models, n dimension reduction weight matrices can be obtained.
After obtaining each dimension reduction weight matrix, adjusting each dimension reduction weight matrix, that is, adjusting each dimension reduction weight matrix by using the index correlation direction value and the index type corresponding to the index to obtain each weight square matrix with the number of rows and columns equal to the number of the index type, and obtaining each target element of each weight square matrix.
Please refer to the description of step S220 in fig. 5, which is not repeated herein.
It is easy to understand that, based on each dimension reduction weight matrix, each weight square matrix and the target elements of each weight square matrix are obtained, and if n dimension reduction weight matrices are obtained through the foregoing steps, n weight square matrices are obtained, and then n groups of target elements are obtained.
After each group of target elements corresponding to each weight square matrix is obtained, the positions of the target elements may differ from one weight square matrix to another. For example, continuing the foregoing case, the target elements of some weight square matrixes may occupy one combination of row and column positions, while the target elements of other weight square matrixes occupy a different combination, and many such combinations are possible. In order to ensure that the subsequent operations can be carried out and to improve the training accuracy, the event probability prediction model training method provided by the embodiment of the invention further comprises:
determining the weight square matrix with the same position and the largest quantity of each target element in each weight square matrix to obtain each consistent weight square matrix, and determining each dimension reduction weight matrix corresponding to each consistent weight square matrix.
Specifically, the positions of each group of target elements are compared to identify the weight square matrixes whose target elements occupy the same positions; the weight square matrixes are grouped accordingly and the number of weight square matrixes in each group is counted; the group with the largest number is taken as the consistent weight square matrixes, and the dimension reduction weight matrix from which each consistent weight square matrix was converted is then obtained.
Furthermore, according to the position of each target element in each consistent weight square matrix, determining a dimensionality reduction target element and a dimensionality reduction target element value in the corresponding dimensionality reduction weight matrix.
If there are k consistent weight square matrixes, k groups of dimension reduction target elements and dimension reduction target element values are determined. The dimension reduction target element values at the same position in the k corresponding dimension reduction weight matrixes are then averaged, that is, the mean over the k groups is calculated position by position, to obtain each dimension reduction target element mean value.
And then, acquiring each index weight by using each dimension reduction target element mean value and each index corresponding to the dimension reduction target element.
Specifically, the indexes labeled with the same index category are first determined to obtain the indexes of the same category;
the sum of the dimension reduction target element mean values of the indexes of the same category is obtained to give the dimension reduction index sum value;
and the index weight is then obtained from the dimension reduction target element mean value of each index of the same category and the dimension reduction index sum value corresponding to that category.
In this way, the index weights can be obtained conveniently.
For details, please refer to the description of step S222 in fig. 5; the only difference is that the value used here is the dimension reduction target element mean value.
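The selection of consistent weight square matrixes and the averaging described above can be sketched as follows. The data layout (one tuple of target element positions and one dictionary of dimension reduction target element values per dimension reduction model) is hypothetical, and computing the index weight as the mean value divided by the category sum is an assumption, since the details of step S222 are not reproduced in this passage.

```python
from collections import Counter


def averaged_index_weights(target_positions, target_values, categories):
    """Average target element values over the consistent matrixes.

    target_positions: per model, a tuple of (row, column) target positions
                      found in its weight square matrix
    target_values:    per model, index name -> dimension reduction target
                      element value
    categories:       index name -> index category name
    """
    # The most frequent position pattern defines the consistent matrixes.
    pattern, _ = Counter(target_positions).most_common(1)[0]
    consistent = [vals for pos, vals in zip(target_positions, target_values)
                  if pos == pattern]
    k = len(consistent)
    # Mean of each target element value over the k consistent matrixes.
    means = {name: sum(vals[name] for vals in consistent) / k
             for name in categories}
    # Sum of the mean values of all indexes of the same category.
    sums = {}
    for name, mean in means.items():
        sums[categories[name]] = sums.get(categories[name], 0.0) + mean
    # Assumption: index weight = mean value / sum value of its category.
    return {name: means[name] / sums[categories[name]] for name in means}


# Hypothetical results from three dimension reduction models.
positions = [((0, 0), (1, 1)), ((0, 1), (1, 0)), ((0, 0), (1, 1))]
element_values = [{"a": 0.2, "b": 0.6}, {"a": 0.9, "b": 0.9}, {"a": 0.4, "b": 0.2}]
index_categories = {"a": "learning", "b": "learning"}
index_weights = averaged_index_weights(positions, element_values, index_categories)
```

In this sketch the second model is discarded because its target elements sit at positions that differ from the majority pattern, so only the first and third models contribute to the mean values.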
It can be seen that in the event probability prediction model training method provided in the embodiment of the present invention, a plurality of dimension reduction weight matrices are obtained through a plurality of dimension reduction models, and then a plurality of weight square matrices are obtained, and then a dimension reduction weight matrix for performing index weight calculation is determined by using the determination and selection of the position of the target element of each weight square matrix, and the obtaining of the index weight is realized by using the average value, so that the accuracy of the obtained index weight can be improved, and the accuracy of model training and the accuracy of subsequent event probability prediction are further improved.
In order to predict the event probability and determine the influence factor value influencing the event probability, an embodiment of the present invention further provides an event probability prediction method, please refer to fig. 6, where fig. 6 is a flowchart illustrating the event probability prediction method according to the embodiment of the present invention.
As shown in the figure, the event probability prediction method provided in the embodiment of the present invention includes:
step S40: and acquiring a prediction related index data value of a data unit to be subjected to event probability prediction.
It is easy to understand that a data unit to be subjected to event probability prediction has many indexes, so only the prediction related index data values need to be obtained here, that is, the data values of the indexes which correspond to the target classification dimension and whose correlation degree, obtained through correlation calculation, satisfies the correlation threshold.
Step S41: and obtaining the influence factor value of each index according to the data value of the prediction related index and the event probability weight matrix obtained by the training method of the event probability prediction model.
And after the data value of the prediction related index is obtained, event probability prediction and acquisition of the influence factor value of the index are further carried out.
It is easy to understand that under the influence of the target classification dimension determined in the model training phase, when performing event probability prediction, it is also necessary to determine a training data set where a data unit to be subjected to event probability prediction is located according to the target classification dimension, and then determine a corresponding trained event probability prediction model and an event probability weight matrix.
During specific prediction, the prediction related index data values of the data unit to be subjected to event probability prediction are input into the corresponding event probability prediction model to obtain the predicted event probability, and the influence factor value of each index is then obtained by using the prediction related index data value of that index and the weight of the same index in the event probability weight matrix.
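One possible way to combine a data value with the weight of the same index is shown below. The source only states that both are used, so taking their product as the influence factor value is an assumption, and the index names and numbers are hypothetical.

```python
def influence_factors(data_values, index_weights):
    """Influence factor value per index.

    Assumption: the factor is the product of the prediction related index
    data value and the weight of the same index.
    """
    return {name: index_weights[name] * value
            for name, value in data_values.items()}


# Hypothetical data values and weights for two indexes.
factors = influence_factors(
    {"accuracy": 0.8, "liking": 0.5},
    {"accuracy": 1.5, "liking": -0.4},
)
# Indexes ordered from the strongest to the weakest influence.
ranked = sorted(factors, key=lambda name: abs(factors[name]), reverse=True)
```

A negative factor here would indicate an index that pushes the predicted event probability down under this assumed model.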
Therefore, the event probability prediction method provided by the embodiment of the invention not only can realize the prediction of the event probability, but also can obtain the influence factor values of each index of the event probability, and can make corresponding preparation for achieving the expected event probability later or avoiding achieving the predicted event probability in advance, so that the event probability really obtained in the future can better meet the expectation.
In another specific embodiment, whether an event occurs may be further determined according to the obtained predicted event probability and a predetermined event probability threshold, and when the predicted event probability exceeds the predetermined event probability threshold, it is determined that the event occurs, otherwise, it is determined that the event does not occur.
Therefore, whether the event occurs or not can be judged more directly, and the result obtained by the user is clear at a glance.
Of course, the predetermined event probability threshold may be adjusted as needed, so that the final judgment result is more accurate.
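The threshold judgment can be written in a few lines. The function name and the default threshold of 0.5 are hypothetical; the strict comparison follows the wording that the predicted event probability must exceed the threshold.

```python
def event_occurs(predicted_probability, threshold=0.5):
    """Return True when the predicted probability exceeds the threshold."""
    return predicted_probability > threshold


decisions = [event_occurs(p, threshold=0.6) for p in (0.72, 0.60, 0.41)]
```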
In order to predict the event probability and determine the influence factor value influencing the event probability, an embodiment of the present invention further provides another event probability prediction method, please refer to fig. 7, and fig. 7 is another schematic flow chart of the event probability prediction method provided in the embodiment of the present invention.
As shown in the figure, an event probability prediction method provided by another embodiment of the present invention includes:
step S50: and acquiring a prediction related index data value of a data unit to be subjected to event probability prediction.
For details of step S50, please refer to the related description of step S40, which is not repeated herein.
Step S51: and acquiring the prediction index class value of the data unit by using the index weight obtained by the event probability prediction model training method and the prediction related index data value corresponding to the index weight.
After the prediction related index data values are obtained, the index weights obtained by the event probability prediction model training method need to be obtained; the index weights obtained must belong to the same classification, under the target classification dimension, as the data unit to be predicted.
Then, weighted summation is carried out according to the correspondence between the index weights and the prediction related index data values (that is, both correspond to the same index) and the correspondence between indexes and index categories (that is, which index category each index belongs to), to obtain each prediction index category value of the data unit.
Step S52: obtaining the predicted event probability according to the prediction index category values by using the event probability prediction model obtained by the foregoing event probability prediction model training method, and obtaining the influence factor value of each index category according to the prediction index category values and the event probability weight matrix obtained by the foregoing event probability prediction model training method.
For details of step S52, please refer to the related description of step S41. It should be noted that, in this embodiment, the input to the event probability prediction model when obtaining the predicted event probability is the index category values, and each element of the event probability weight matrix used is the weight of an index category, so that the influence factor value of each index category is obtained.
Therefore, the event probability prediction method provided by the embodiment of the invention can realize more accurate prediction of event probability through smaller operation amount, and can obtain the influence factor value of the index type, and further adjust the current behavior according to the influence factor value of the index type, so that the actual event probability obtained in the future can meet the expectation.
In another specific embodiment, whether an event occurs may be further determined according to the obtained predicted event probability and a predetermined event probability threshold, and when the predicted event probability exceeds the predetermined event probability threshold, it is determined that the event occurs, otherwise, it is determined that the event does not occur.
Therefore, whether the event occurs or not can be judged more directly, and the result obtained by the user is clear at a glance.
Of course, the predetermined event probability threshold may be adjusted as needed, so that the final judgment result is more accurate.
The event probability prediction model training device and the event probability prediction device provided by the embodiments of the present invention are introduced below, and the event probability prediction model training device and the event probability prediction device described below may be regarded as a functional module architecture that is required to be set by an electronic device (e.g., a PC) to respectively implement the event probability prediction model training method and the event probability prediction method provided by the embodiments of the present invention. The contents of the event probability prediction model training apparatus and the event probability prediction apparatus described below may be referred to in correspondence with the contents of the event probability prediction model training method and the event probability prediction method described above, respectively.
Fig. 8 is a block diagram of an event probability prediction model training apparatus according to an embodiment of the present invention, where the event probability prediction model training apparatus is applicable to both a client and a server, and referring to fig. 8, the event probability prediction model training apparatus includes:
a data set obtaining unit 100 adapted to obtain a training data set, wherein the training data set includes an actual event probability value corresponding to each data unit and a related index data value, the actual event probability value is a true value of an event probability of each data unit, and the related index data value is a numerical value of each index whose degree of correlation with the event probability satisfies a correlation threshold;
an event probability prediction model and event probability weight matrix obtaining unit 110, adapted to obtain a predicted event probability value according to the relevant index data value of each data unit by using the event probability prediction model to be trained, obtain a probability loss according to the predicted event probability value and the actual event probability value, adjust parameters of the event probability prediction model according to the probability loss until the probability loss meets a loss threshold, and obtain a trained event probability prediction model and an event probability weight matrix, where each element of the event probability weight matrix is a weight of each corresponding index.
It is easy to understand that, in order to implement training of the event probability prediction model, it is necessary to first obtain training data, that is, a training data set, where the training data set includes actual event probability values corresponding to the data units and related index data values, the actual event probability values are actual values of the event probabilities of the data units, and the related index data values are values of the indexes whose correlation degree with the event probabilities satisfies a correlation degree threshold.
The data unit refers to the unit to which a group of corresponding actual event probability values and related index data values belongs. For example, when the data unit is a class, the event probability corresponding to the class may be a renewal rate or a withdrawal rate, and the indexes possibly related to the event probability of the class may include a title accuracy rate, a title participation rate, an understanding rate, a course liking rate, a red-envelope grabbing participation rate, a score, a number of times of showing, a personal show viewing rate, a personal show approval rate, and the like.
Since the indexes that can be obtained and may be related to the event probability are not necessarily related to the event probability, before the event probability prediction model is trained, it is necessary to first process the data of each obtained data unit and determine each index whose correlation degree with the event probability satisfies the correlation degree threshold.
However, for different types of data units, the indexes whose correlation degrees with the event probability satisfy the correlation threshold are likely to be different, and when the prediction of the event probability is specifically performed, the indexes whose correlation degrees satisfy the correlation threshold are respectively obtained according to different classification methods, and then different predictions are respectively performed, so that it is also necessary to determine the classification method used for subsequent prediction and the indexes whose correlation degrees satisfy the correlation threshold under different classification methods.
In one embodiment, to obtain the training data set, the data set obtaining unit 100 is adapted to obtain the training data set, and comprises:
acquiring an original training data set;
acquiring a target classification dimension and an index corresponding to the target classification dimension;
and acquiring a pre-classification data set classified according to the target classification dimension, and screening the predicted relevant index data values of the indexes corresponding to the target classification dimension to obtain the training data set.
It is readily understood that the original training data set is an unprocessed training data set directly acquired, which includes actual event probability values corresponding to respective data units and predicted relevant index data values, which are numerical values of respective indices predicted to be related to the event probability.
In particular, the original training data set may be obtained by corresponding statistical software.
It should be noted that the target classification dimension described herein refers to a classification dimension determined by comparison and having a better prediction effect on the event probability. Of course, the target classification dimension needs to be obtained before the specific classification, and in a specific embodiment, in order to obtain the target classification dimension, the target classification dimension may be determined by performing correlation calculation after data classification on a plurality of classification dimensions.
Therefore, the original training data set is classified according to each dimension to be classified to obtain a pre-classification data set.
The dimension to be classified is a possible classification dimension determined according to the characteristics of the data units. For example, for education-related data units, the dimensions to be classified may be: grade, school, class type, the whole, etc.
And classifying the original data sets according to different preset classification dimensions to obtain each pre-classified data set.
Then, using a correlation calculation algorithm together with the predicted relevant index data values and the actual event probability values of the data units of each pre-classified data set, the correlation degree between each predicted relevant index and the actual event probability value is obtained, and the indexes whose correlation degree satisfies the correlation threshold are determined. The dimension to be classified with the largest number of relevant indexes, together with its corresponding indexes, is taken as the target classification dimension and the indexes corresponding to the target classification dimension, where the number of relevant indexes is the number of indexes whose correlation degree satisfies the correlation threshold.
After each pre-classified data set is obtained, the corresponding data unit, the predicted relevant index data value and the actual event probability value of each data unit are obtained according to different pre-classified data sets, and then the data values and the actual event probability values are input into a correlation calculation algorithm to obtain each index of which the correlation degree with the actual event probability value meets the correlation degree threshold value.
It should be noted that the correlation degree may be a positive value indicating that the indicator and the event probability are in positive correlation, or may be a negative value indicating that the indicator and the event probability are in negative correlation, where the correlation degree satisfying the correlation degree threshold means that the absolute value of the correlation degree is greater than or equal to the correlation degree threshold.
If the correlation threshold is too high, too few indexes satisfy it, which is unfavorable for subsequent model training; if the correlation threshold is too low, too many indexes satisfy it, which increases the operation amount of model training. Therefore, in a specific implementation, the correlation threshold may be set to 0.2, that is, an index whose correlation degree has an absolute value greater than or equal to 0.2 is determined to be an index related to the event probability, so that both the accuracy and the operation amount of model training are taken into account.
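Screening indexes by the absolute value of their correlation degree can be sketched as follows. The function name, the index names and the correlation values are hypothetical; only the 0.2 threshold and the use of the absolute value come from the text.

```python
def select_relevant_indexes(correlations, threshold=0.2):
    """Keep indexes whose |correlation degree| is at least the threshold."""
    return {name: r for name, r in correlations.items() if abs(r) >= threshold}


# Hypothetical correlation degrees between indexes and the event probability.
relevant = select_relevant_indexes(
    {"accuracy": 0.35, "liking": -0.27, "score": 0.05, "hand_raises": 0.2}
)
```

Negatively correlated indexes such as the hypothetical `liking` entry are kept, because the comparison uses the absolute value.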
After the original training data set is classified according to each dimension to be classified, the data unit amount of each pre-classified data set depends on the data unit amount of the original training data set and differs from one pre-classified data set to another, so a different correlation calculation algorithm may be selected for each.
Specifically, the Spearman rank correlation coefficient calculation algorithm or the Kendall rank correlation coefficient (Kendall's tau-b) calculation algorithm may be selected.
The Spearman rank correlation coefficient calculation algorithm does not require the data to follow a normal distribution and is applicable to any relationship that can be described by a monotonic function; it is the better choice when the sample size is relatively large. When the sample size is relatively small, the Kendall's tau-b correlation coefficient calculation algorithm is less sensitive to errors and more accurate.
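As a plain illustration of why the Spearman coefficient only needs a monotonic relationship, it can be computed as the Pearson correlation of the ranks. The helper names below are hypothetical and the sketch is for explanation only; a statistics library would normally be used in practice.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but non-linear relationship still gives a coefficient of 1.
rho_up = spearman([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
rho_down = spearman([1, 2, 3, 4, 5], [25, 16, 9, 4, 1])
```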
For this purpose, in the determination of the relevance indicator, it is also possible to first select a specific calculation algorithm depending on the amount of data units of the pre-classified data set.
Specifically, the data unit amount of the pre-classified data set is obtained first, and it is then judged whether the data unit amount exceeds a first data amount threshold; if so, the Spearman rank correlation coefficient calculation algorithm is selected for calculation, and if not, the Kendall rank correlation coefficient calculation algorithm may be selected.
In order to prevent an excessively small data amount from introducing uncertainty into the determination of the indexes that satisfy the correlation threshold, when the data unit amount is judged not to exceed the first data amount threshold, it may be further judged whether the data unit amount exceeds a second data amount threshold; if so, the Kendall rank correlation coefficient calculation algorithm may be selected, and if not, the pre-classified data set is directly discarded.
Specific values of the first data amount threshold and the second data amount threshold may be determined as needed, and it is easily understood that the first data amount threshold is greater than the second data amount threshold.
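The two-threshold selection can be sketched as below. The function name, the returned labels and the threshold values 100 and 30 are hypothetical, since the source leaves the specific values open.

```python
def choose_correlation_method(unit_count, first_threshold, second_threshold):
    """Pick the correlation algorithm from the size of a pre-classified set.

    Returns "spearman", "kendall", or None when the set is too small and
    should be discarded.
    """
    if first_threshold <= second_threshold:
        raise ValueError("the first threshold must exceed the second threshold")
    if unit_count > first_threshold:
        return "spearman"
    if unit_count > second_threshold:
        return "kendall"
    return None


# Hypothetical thresholds of 100 and 30 data units.
methods = [choose_correlation_method(n, 100, 30) for n in (500, 60, 12)]
```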
After the correlation calculation of each pre-classified data set, each index meeting the correlation threshold corresponding to each pre-classified data set can be obtained, and then a target classification dimension needs to be selected from the dimensions to be classified according to each index meeting the correlation threshold corresponding to each pre-classified data set.
In order to ensure the accuracy of model training, the dimension to be classified with the largest number of indexes satisfying the correlation threshold may be selected as the target classification dimension, so that each index corresponding to the target classification dimension is obtained.
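Choosing the target classification dimension then reduces to a maximum over the number of relevant indexes. In the sketch below the relevant indexes are assumed to have already been collected per dimension to be classified; how the results of several pre-classified data sets of one dimension are merged is not specified here, and all names are hypothetical.

```python
def select_target_dimension(relevant_by_dimension):
    """Return the dimension with the most relevant indexes and those indexes.

    relevant_by_dimension: dimension name -> set of indexes whose
    correlation degree satisfies the correlation threshold.
    """
    best = max(relevant_by_dimension,
               key=lambda dim: len(relevant_by_dimension[dim]))
    return best, relevant_by_dimension[best]


# Hypothetical screening results for three dimensions to be classified.
target_dimension, target_indexes = select_target_dimension({
    "grade": {"accuracy", "liking", "score"},
    "school": {"accuracy"},
    "class_type": {"accuracy", "liking"},
})
```

When two dimensions tie, `max` keeps the first one encountered; a real implementation would need an explicit tie-breaking rule.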
And after the target classification dimension is obtained, acquiring a pre-classification data set classified according to the target classification dimension, and then selecting the data value of the index of each data unit in the pre-classification data set, which corresponds to the target classification dimension, to obtain a training data set.
Therefore, the obtained training data set includes an actual event probability value corresponding to each data unit and a related index data value, where the actual event probability value is a true value of the event probability of each data unit, and the related index data value is a value of each index whose degree of correlation with the event probability satisfies a correlation threshold.
It is easy to understand that, because the training data sets are obtained by classifying the original training data sets according to the target classification dimensions, there are a plurality of training data sets, and based on different training data sets, the trained event probability prediction models are likely to be different, so that during prediction, based on the class to which a specific data unit belongs, the corresponding event probability prediction model can be selected for prediction.
After the training data set is obtained, the related index data values of the data units are input into the event probability prediction model to be trained to predict the event probability and obtain each predicted event probability value; the difference between the predicted event probability value and the actual event probability value is then calculated to obtain the probability loss.
After the probability loss is obtained, it is compared with the loss threshold. If the loss threshold is met, the event probability prediction model satisfies the accuracy requirement of prediction, and the trained event probability prediction model and the event probability weight matrix are obtained, wherein each element of the event probability weight matrix is the weight of the corresponding index; otherwise, the prediction accuracy of the event probability prediction model does not yet satisfy the accuracy requirement, the parameters of the event probability prediction model are adjusted according to the probability loss, and the event probability prediction model after parameter adjustment is used to obtain the predicted event probability value again.
Therefore, when the event probability prediction model is used for predicting the event probability, the predicted event probability value can be obtained, and the influence factor value of each index on the event probability can be obtained by using the event probability weight matrix.
It can be seen that, when the event probability prediction model to be trained is trained, the event probability prediction model training device provided by the embodiment of the invention utilizes the obtainable relevant index data value and the actual event probability value to construct the relationship between the relevant index data value and the event probability in the event probability prediction model, so as to realize prediction of the future event probability by utilizing the trained event probability prediction model and the index data value obtainable at the current moment, prepare for the prediction and ensure the accuracy of the prediction; meanwhile, in the process of model training, the weight of each index is also obtained, when the event probability is actually predicted, the predicted event probability can be obtained, and the influence factor value of each index for obtaining the predicted event probability can be obtained according to the related index data value of each index and the weight of the index corresponding to the related index data value, so that corresponding preparation can be made in advance for achieving the predicted event probability later or avoiding achieving the predicted event probability, and the event probability really obtained in the future can meet the expectation better.
In another specific embodiment, in order to reduce the computation amount of training on the basis of ensuring the training accuracy, an event probability prediction model training apparatus according to an embodiment of the present invention is provided, please refer to fig. 9, where fig. 9 is another block diagram of the event probability prediction model training apparatus according to the embodiment of the present invention, and the event probability prediction model training apparatus according to the embodiment of the present invention further includes:
a dimension reduction weight matrix obtaining unit 120, adapted to perform serial training of a dimension reduction model to be trained and the event probability prediction model to be trained by using the training data set until the dimension reduction model and the event probability prediction model meeting a predetermined training target are obtained, and obtain a dimension reduction weight matrix of the dimension reduction model, where each row of the dimension reduction weight matrix corresponds to each index;
an index weight obtaining unit 130, adapted to obtain each index weight according to the dimensionality reduction weight matrix;
an index category value obtaining unit 140, adapted to obtain a data value of the index category of each data unit according to the index weight corresponding to the same index and the related index data value of each data unit, so as to obtain an index category value of each data unit;
the event probability prediction model and event probability weight matrix obtaining unit 110 is adapted to obtain, by using the event probability prediction model to be trained, a predicted event probability according to the relevant index data value of each data unit, including:
and acquiring a predicted event probability according to the index category value of each data unit by using the event probability prediction model to be trained, wherein when the trained event probability prediction model is acquired, each element of the acquired event probability weight matrix is the weight of each index category.
To train the event probability prediction model while reducing the computation of training without sacrificing training accuracy, dimension reduction may be performed on the indexes whose degree of correlation satisfies the correlation threshold, so that more related indexes can be fused.
To implement dimension reduction, it must be determined, on the one hand, which index category after dimension reduction each index before dimension reduction corresponds to and, on the other hand, how to obtain the index category value of each index category after dimension reduction from the related index data values of the indexes before dimension reduction.
In a specific embodiment, in order to implement correspondence between the indexes before dimension reduction and the index categories after dimension reduction, the index categories may be labeled in advance for each index, and the index categories are labeled according to the actual meaning association between the indexes and the index categories, that is, each index includes the labeled index category in advance. It is easily understood that the index categories are labeled to achieve dimension reduction of the index, and thus the number of index categories is smaller than the number of indexes.
In order to determine how to obtain the index class value of the index class after dimensionality reduction based on the relevant index data value of each index before dimensionality reduction, it is first necessary to determine a calculation weight for obtaining each index class value based on each relevant index data value, that is, a dimensionality reduction weight.
In a specific embodiment, in order to obtain the dimension reduction weight, a dimension reduction weight matrix may be obtained by performing series training on a pre-constructed dimension reduction model to be trained and an event probability prediction model to be trained, where each row of the dimension reduction weight matrix corresponds to one index, so the matrix has as many rows as there are indexes.
Specifically, the dimension reduction model may be an RBM (Restricted Boltzmann Machine) model or a PCA (Principal Component Analysis) model.
Obtaining the dimensionality reduction weight matrix specifically comprises the following steps:
obtaining each prediction event probability by using the dimension reduction model to be trained and the event probability prediction model to be trained according to the relevant index data value of each data unit of the training data set;
and obtaining probability loss according to each predicted event probability and each actual event probability, and adjusting parameters of the dimensionality reduction model and the event probability prediction model by using the probability loss until the probability loss meets a second loss threshold value to obtain the dimensionality reduction model and the event probability prediction model which meet a preset training target.
Specifically, the relevant index data values of each data unit of the training data set are input into the dimension reduction model, the obtained output is continuously input into the event probability prediction model, and each prediction event probability is obtained.
Of course, during the series training, the output of the dimension reduction model is fed directly into the event probability prediction model; the two run in series, and no intermediate data is output.
After the predicted event probabilities are obtained, the probability loss is obtained from the predicted event probabilities and the actual event probabilities, and it is judged whether the probability loss meets a second loss threshold. If not, the parameters of the dimension reduction model and of the event probability prediction model are adjusted simultaneously according to the probability loss, and the predicted event probabilities are obtained again with the adjusted models. If so, the dimension reduction model and the event probability prediction model that meet the predetermined training target are obtained, together with the dimension reduction weight matrix of the dimension reduction model, each row of which corresponds to one index.
The event probability prediction model obtained by the series training is obtained for the purpose of training the dimension reduction model, and is not a model used for prediction in the subsequent stage.
After the dimension reduction weight matrix is obtained, each index weight is further obtained. In an embodiment, the operation by which the index weight obtaining unit 130 obtains each index weight may include:
adjusting the dimension reduction weight matrix by using the index correlation direction value corresponding to each index and the correspondence between the indexes and the index categories to obtain a weight square matrix whose number of rows and number of columns both equal the number of index categories, and acquiring each target element of the weight square matrix, where each target element represents an index category;
determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in the dimension reduction weight matrix according to the position of each target element in the weight square matrix;
and obtaining each index weight by using each dimension reduction target element value and each index corresponding to the dimension reduction target element.
In order to obtain the index weights, the dimension reduction weight matrix is first adjusted by using the index correlation direction value corresponding to each index and the correspondence between the indexes and the index categories, giving a weight square matrix whose number of rows and number of columns both equal the number of index categories. The index correlation direction value is a numerical value expressing the direction of the correlation between an index and the event probability: if the degree of correlation is positive, the index is positively correlated with the event probability and the correlation direction value is 1; if the degree of correlation is negative, the index is negatively correlated with the event probability and the correlation direction value is -1.
In actual operation, the event probability to be predicted may be one that should be increased (such as a re-enrollment rate) or one that should be decreased (such as a refund rate), and the two kinds relate to the indexes in opposite ways. Whether to negate the index correlation direction values obtained in the correlation calculation therefore depends on which kind of event probability is being predicted. If the probability is one that should be increased (such as the re-enrollment rate), the dimension reduction weight matrix is adjusted directly with the index correlation direction values obtained by the correlation calculation and the index categories, giving a weight square matrix whose number of rows and number of columns both equal the number of index categories. If the probability is one that should be decreased (such as the refund rate), the index correlation direction values are first negated once, and the dimension reduction weight matrix is then adjusted with the negated index correlation direction values and the index categories to obtain the weight square matrix.
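The sign rule and the optional negation described above can be sketched as follows. The function name and the `probability_should_increase` flag are hypothetical, and treating a correlation of exactly zero as negative is an assumption (the text only covers positive and negative values).

```python
def correlation_direction_values(correlations, probability_should_increase=True):
    """Map each index's degree of correlation to a direction value of 1 or -1.

    If the predicted probability is one that should be decreased, every
    direction value is negated once before it is used.
    """
    # Assumption: zero correlation does not occur, since only indexes that
    # satisfy the correlation threshold are kept; it falls into the -1 branch.
    directions = [1 if c > 0 else -1 for c in correlations]
    if not probability_should_increase:
        directions = [-d for d in directions]
    return directions
```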
Each row of the weight square matrix corresponds to the index type of each index, and the number of the target elements is equal to the number of the index types.
Specifically, the index correlation direction value is acquired for the index represented by each row of the dimension reduction weight matrix. Then, according to the correspondence between the indexes and the index categories, the elements in the same column across all indexes belonging to the same index category are summed with the index correlation direction values as weights, giving the element at the corresponding position of the weight square matrix.
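As an illustration of this weighted summation, the sketch below collapses the rows of a dimension reduction weight matrix into one row per index category. It assumes the matrix is a list of rows (one per index) with one column per reduced dimension, that index categories are numbered from 0, and that `index_categories[i]` is the pre-labeled category of index `i`; the function name is hypothetical.

```python
def build_weight_square_matrix(reduction_weights, directions, index_categories, n_categories):
    """Sum the rows of the dimension reduction weight matrix per index category,
    column by column, using the correlation direction values as weights."""
    n_cols = len(reduction_weights[0])
    if n_cols != n_categories:
        raise ValueError("the result is square only if columns == index categories")
    square = [[0.0] * n_cols for _ in range(n_categories)]
    for row, direction, category in zip(reduction_weights, directions, index_categories):
        for col, value in enumerate(row):
            square[category][col] += direction * value
    return square


# Four indexes reduced to two index categories.
square = build_weight_square_matrix(
    reduction_weights=[[0.9, 0.1], [0.7, 0.2], [0.1, -0.8], [0.2, 0.6]],
    directions=[1, 1, -1, 1],
    index_categories=[0, 0, 1, 1],
    n_categories=2,
)
```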
And after the weight square matrix is obtained, further determining each target element in the weight square matrix.
Determining the target elements in the weight square matrix makes it possible to determine the dimension reduction target elements and thus to obtain the index weights.
In a specific embodiment, the element with the largest value among the elements of the weight square matrix may be obtained first and taken as a target element, and the row and the column in which it is located are recorded.
Then, each element in the row where the target element is located and each element in the column where the target element is located in the weight square matrix are ignored, and an adjustment square matrix is obtained.
And further, taking the adjusting square matrix as a new weight square matrix, and acquiring new target elements until all the target elements are obtained.
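The repeated selection described above (take the largest element, ignore its row and column, repeat on the remaining square matrix) can be sketched as a greedy loop. The function name is hypothetical, and resolving ties by the lowest row and column is an assumption, since the text does not say how ties are handled.

```python
def select_target_elements(square):
    """Return the (row, column) positions of the target elements, one per
    index category, in the order in which they are selected."""
    n = len(square)
    free_rows = list(range(n))
    free_cols = list(range(n))
    targets = []
    while free_rows:
        # Largest remaining element; max() keeps the first maximum on ties.
        row, col = max(
            ((r, c) for r in free_rows for c in free_cols),
            key=lambda rc: square[rc[0]][rc[1]],
        )
        targets.append((row, col))
        # Ignore the whole row and column of the selected target element.
        free_rows.remove(row)
        free_cols.remove(col)
    return targets
```

Because every selected row and column is removed, each index category (row) and each reduced dimension (column) is used exactly once.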
It can be seen that this approach makes it convenient to obtain the maximum element of the weight square matrix and the adjustment square matrix, and makes the choice of target elements more reasonable: each target element is assigned to the index category in which its weight is most significant. By analogy, a metal mixture containing 90% gold, 5% silver and 5% copper is most reasonably classified as gold. The approach also guarantees that the index categories at the target dimension are complete, with no index category repeated or missing. For example, with the three index categories scholarly, favorite and concerned, all three are present, and an assignment such as scholarly, scholarly and favorite cannot occur.
The dimension reduction target elements in the dimension reduction weight matrix are then determined in reverse from the positions of the target elements in the weight square matrix.
And after obtaining the value of the dimensionality reduction target element and the index corresponding to the dimensionality reduction target element, further obtaining the index weight.
Specifically, the dimension reduction target element values of the indexes included in each index category are obtained, and the proportion of each dimension reduction target element value in the sum of those values is taken as the corresponding index weight.
Therefore, in the embodiment of the present invention, when the index weights are calculated, the conversion from indexes to index categories is a conversion from a larger number of values to a smaller number of values. The dimension reduction weight matrix is obtained during model training, but the meaning of each of its elements cannot be determined directly. The weight square matrix is therefore used to locate the dimension reduction target elements in the dimension reduction weight matrix and to determine what they represent, and the index weights are then obtained from the dimension reduction target element values. By exploiting the logic of converting indexes into index categories, the otherwise undeterminable meaning of the elements of the dimension reduction weight matrix is made explicit, so that the black-box information becomes transparent, the index weights can be obtained, and index category values can be obtained from the related index data values during subsequent event probability prediction.
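A possible reading of this proportion step is sketched below. It assumes that a target element at (category, column) in the weight square matrix maps back to the elements in that column of the dimension reduction weight matrix for all indexes of that category, that the raw element values are used without the direction values, and that the values within a category do not sum to zero. The function name is hypothetical.

```python
def index_weights(reduction_weights, index_categories, targets):
    """reduction_weights: one row per index, one column per reduced dimension.
    index_categories: pre-labeled category number of each index.
    targets: (category, column) positions of the target elements.
    Returns one weight per index; weights within a category sum to 1."""
    category_to_col = dict(targets)
    # Dimension reduction target element value of every index.
    values = [
        reduction_weights[i][category_to_col[category]]
        for i, category in enumerate(index_categories)
    ]
    # Sum of the target element values per index category.
    totals = {}
    for value, category in zip(values, index_categories):
        totals[category] = totals.get(category, 0.0) + value
    # Proportion of each value in the sum for its category.
    return [value / totals[category] for value, category in zip(values, index_categories)]
```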
And after the index weight is obtained, further acquiring related index data values corresponding to the same index in each data unit, and acquiring the data value of each index type of each data unit in a weighted summation mode by combining the corresponding relation between the index and the index type, so as to obtain the index type value of each data unit.
And acquiring the index class value of each data unit of the whole training data set by calculating the index class value of each data unit of the training data set.
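The weighted summation that turns the related index data values of one data unit into its index category values can be sketched as follows, assuming index categories are numbered from 0 and using hypothetical names throughout.

```python
def index_category_values(data_unit, weights, index_categories, n_categories):
    """data_unit: related index data values of one data unit (one per index).
    weights: index weight of each index.
    index_categories: category number of each index.
    Returns one value per index category."""
    values = [0.0] * n_categories
    for x, w, category in zip(data_unit, weights, index_categories):
        values[category] += w * x
    return values


# Four indexes collapse into two index category values.
print(index_category_values([10, 20, 4, 8], [0.5, 0.5, 0.25, 0.75], [0, 0, 1, 1], 2))
# prints [15.0, 7.0]
```

Applying the function to every data unit gives the index category values of the whole training data set.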
And after the index category value of each data unit is obtained, obtaining the predicted event probability of each data unit by using an event probability prediction model to be trained, and further obtaining the probability loss according to the obtained predicted event probability and the actual event probability corresponding to the same data unit.
It is then judged whether the probability loss meets the loss threshold. If so, the trained event probability prediction model and the event probability weight matrix are obtained; because the event probability prediction model performs prediction with index category values, each element of the resulting event probability weight matrix is the weight of an index category rather than the weight of an individual index. If not, the parameters of the event probability prediction model are adjusted according to the probability loss, and training is performed again.
Therefore, on the one hand, the event probability prediction model training device provided by the embodiment of the invention can select a large number of indexes and index data values and obtain the corresponding index category values through dimension reduction conversion, so that more indexes can be used when the trained event probability prediction model performs event probability prediction, which improves its prediction accuracy. On the other hand, the data used in training and in subsequent event probability prediction are the index category values of the data units, which reduces the amount of data and computation in both model training and model prediction and improves training and prediction efficiency. Accuracy and efficiency of event probability prediction can thus be achieved together.
In another specific embodiment, in order to further improve the accuracy of the obtained index weights, the training data set includes training data subsets, and the number of the dimension reduction models to be trained is at least equal to the number of the training data subsets, the dimension reduction weight matrix obtaining unit 120 of the event probability prediction model training device according to the embodiment of the present invention is adapted to perform series training of the corresponding dimension reduction models to be trained and the event probability prediction models to be trained by using the training data subsets, respectively, until obtaining each dimension reduction model and the event probability prediction model that satisfy a predetermined training target, and obtaining each dimension reduction weight matrix of each dimension reduction model;
adjusting each dimensionality reduction weight matrix by using the index correlation direction value corresponding to the index and the index category to obtain each weight square matrix with the number of rows and columns equal to the number of the index category, and acquiring each target element of each weight square matrix;
determining, among the weight square matrices, the largest group of weight square matrices whose target elements are at the same positions, to obtain the consistent weight square matrices, and determining the dimension reduction weight matrix corresponding to each consistent weight square matrix;
an index weight obtaining unit 130, adapted to determine, according to a position of each target element in each consistent weight matrix, each dimension-reduced target element and each dimension-reduced target element value corresponding to each target element in each dimension-reduced weight matrix, obtain a mean value of each dimension-reduced target element value at the same position in each dimension-reduced weight matrix, obtain a mean value of dimension-reduced target elements, and obtain each index weight by using each dimension-reduced target element mean value and an index corresponding to each dimension-reduced target element.
It is easy to understand that each training data subset can be obtained by splitting the training data set, and each dimension reduction model can be an RBM model or a PCA model.
In the training process of the event probability prediction model, each data unit in a training data subset is used for carrying out series training on a dimensionality reduction model and the event probability prediction model to be trained.
After model series training, the dimension reduction weight matrix of each dimension reduction model is obtained, so if there are n dimension reduction models, n dimension reduction weight matrices can be obtained.
After obtaining each dimension reduction weight matrix, adjusting each dimension reduction weight matrix, that is, adjusting each dimension reduction weight matrix by using the index correlation direction value and the index type corresponding to the index to obtain each weight square matrix with the number of rows and columns equal to the number of the index type, and obtaining each target element of each weight square matrix.
It is easy to understand that, based on each dimension reduction weight matrix, each weight square matrix and the target elements of each weight square matrix are obtained, and if n dimension reduction weight matrices are obtained through the foregoing steps, n weight square matrices are obtained, and then n groups of target elements are obtained.
After the group of target elements corresponding to each weight square matrix is obtained, the positions of the target elements may differ from one weight square matrix to another. To make the subsequent operations possible and to improve training accuracy, the dimension reduction weight matrix obtaining unit 120 of the event probability prediction model training device provided by the embodiment of the present invention is further adapted to determine the largest group of weight square matrices whose target elements are at the same positions, to obtain the consistent weight square matrices, and to determine the dimension reduction weight matrix corresponding to each consistent weight square matrix.
Specifically, the weight square matrices whose target elements are at the same positions are grouped according to the positions of the obtained target elements, the number of weight square matrices in each group is counted, the group containing the most weight square matrices is taken as the consistent weight square matrices, and the dimension reduction weight matrices from which the consistent weight square matrices were converted are then obtained.
Furthermore, according to the position of each target element in each consistent weight square matrix, determining a dimensionality reduction target element and a dimensionality reduction target element value in the corresponding dimensionality reduction weight matrix.
If there are k consistent weight square matrixes, k groups of dimension reduction target elements and dimension reduction target element values are determined, and then the average value of the dimension reduction target element values of the dimension reduction weight matrix corresponding to each consistent weight square matrix, namely the average value of the dimension reduction target element values of the k groups, is calculated to obtain the dimension reduction target element average value.
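The grouping by target element positions and the averaging over the k consistent matrices can be sketched as follows. The sketch assumes each model's target positions are given as (category, column) pairs and that a tie between equally large groups is resolved in favor of the group seen first; all names are hypothetical.

```python
from collections import defaultdict


def mean_consistent_target_values(reduction_matrices, target_positions, index_categories):
    """reduction_matrices: n dimension reduction weight matrices (rows = indexes).
    target_positions: for each matrix, the (category, column) target positions
    found in its weight square matrix.
    Returns (ids of the consistent matrices, mean target element value per index)."""
    # Group the matrices whose target elements sit at the same positions.
    groups = defaultdict(list)
    for k, positions in enumerate(target_positions):
        groups[frozenset(positions)].append(k)
    # The largest group forms the consistent weight square matrices.
    consistent = max(groups.values(), key=len)
    category_to_col = dict(target_positions[consistent[0]])
    # Average the dimension reduction target element values position by position.
    means = []
    for i, category in enumerate(index_categories):
        col = category_to_col[category]
        total = sum(reduction_matrices[k][i][col] for k in consistent)
        means.append(total / len(consistent))
    return consistent, means
```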
And then, acquiring each index weight by using each dimension reduction target element mean value and each index corresponding to the dimension reduction target element.
Specifically, firstly, determining each index with the same labeled index type to obtain each index with the same type;
obtaining the sum of the dimension reduction target element mean values of all the indexes of the same category, to obtain a dimension reduction index sum;
and then obtaining each index weight by using the dimension reduction target element mean value of each index of the same category and the dimension reduction index sum corresponding to that mean value.
Thus, the index weight can be conveniently obtained.
It can be seen that, in the event probability prediction model training device provided in the embodiment of the present invention, a plurality of dimension reduction weight matrices are obtained through a plurality of dimension reduction models, and then a plurality of weight square matrices are obtained, and then a dimension reduction weight matrix for performing index weight calculation is determined by using determination and selection of positions of target elements of each weight square matrix, and acquisition of index weights is achieved by using an average value, so that accuracy of the obtained index weights can be improved, and accuracy of model training and accuracy of subsequent event probability prediction are further improved.
In order to predict the event probability and determine the influence factor value that affects the event probability, an embodiment of the present invention further provides an event probability prediction apparatus, please refer to fig. 10, where fig. 10 is a block diagram of the event probability prediction apparatus provided in the embodiment of the present invention, and includes:
a prediction related index data value obtaining unit 200 adapted to obtain a prediction related index data value of a data unit to be event probability predicted;
the prediction related index data value and influence factor value obtaining unit 210 is adapted to obtain a prediction event probability according to the prediction related index data value by using an event probability prediction model obtained by the aforementioned event probability prediction model training method, and obtain an influence factor value of each index according to the prediction related index data value and an event probability weight matrix obtained by the aforementioned event probability prediction model training method.
Since there are many indexes of the data unit to be event probability predicted, here, the prediction related index data value obtaining unit 200 only needs to obtain the prediction related index data value, that is, the prediction related index data value of the index whose correlation satisfies the correlation threshold value, which is obtained through correlation calculation and corresponds to the target classification dimension.
And after the data value of the prediction related index is obtained, event probability prediction and acquisition of the influence factor value of the index are further carried out.
It is easy to understand that under the influence of the target classification dimension determined in the model training phase, when performing event probability prediction, it is also necessary to determine a training data set where a data unit to be subjected to event probability prediction is located according to the target classification dimension, and then determine a corresponding trained event probability prediction model and an event probability weight matrix.
During specific prediction, the prediction related index data values of the data unit to be subjected to event probability prediction are input into the corresponding event probability prediction model to obtain the predicted event probability, and the influence factor value of each index is then obtained by using its prediction related index data value and the weight of the same index in the event probability weight matrix.
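The text does not give the formula that combines a data value with its weight. One plausible reading, shown here purely as an assumption, is the element-wise product of each prediction related index data value and the weight of the same index; the function name is hypothetical.

```python
def influence_factor_values(index_values, event_weights):
    """Assumed formula: influence factor value = index data value * index weight."""
    if len(index_values) != len(event_weights):
        raise ValueError("one weight is required per index")
    return [x * w for x, w in zip(index_values, event_weights)]


print(influence_factor_values([2.0, 0.5, 4.0], [0.5, -2.0, 0.25]))
# prints [1.0, -1.0, 1.0]
```

Under this reading, a positive value means the index pushes the predicted event probability up and a negative value means it pushes the probability down.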
It can be seen that the event probability prediction device provided in the embodiment of the present invention not only can predict the event probability, but also can obtain the influence factor values of each index of the event probability, and can make corresponding preparations in advance for subsequently achieving the expected event probability or avoiding achieving the predicted event probability, so that the event probability actually obtained in the future can better meet the expectation.
In another specific embodiment, the unit 210 may further determine whether an event occurs according to the obtained predicted event probability and a predetermined event probability threshold, determine that the event occurs when the predicted event probability exceeds the predetermined event probability threshold, and otherwise determine that the event does not occur.
Therefore, whether the event occurs or not can be judged more directly, and the result obtained by the user is clear at a glance.
Of course, the predetermined event probability threshold may be adjusted as needed, so that the final judgment result is more accurate.
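The threshold judgment is a single comparison, sketched below. The default threshold of 0.5 is only an illustrative assumption, since the text leaves the predetermined event probability threshold adjustable.

```python
def event_occurs(predicted_probability, threshold=0.5):
    """Return True when the predicted event probability exceeds the threshold."""
    return predicted_probability > threshold
```

A probability exactly equal to the threshold is treated as "does not occur", because the text requires the probability to exceed the threshold.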
In order to predict the event probability and determine an influence factor value that affects the event probability, an embodiment of the present invention further provides another event probability prediction apparatus, further including: a prediction index class value obtaining unit, adapted to obtain a prediction index class value of the data unit by using the index weight obtained by the event probability prediction model training method according to any one of the preceding claims and a prediction-related index data value corresponding to the index weight;
the unit 210 for obtaining data values of prediction-related indexes and values of influence factors is further adapted to obtain a probability of a predicted event according to the index class value by using an event probability prediction model obtained by the training method of the event probability prediction model according to any one of the preceding items, and obtain values of influence factors of each index class according to the index class value and an event probability weight matrix obtained by the training method of the event probability prediction model according to any one of the preceding items.
After the prediction related index data value is obtained, the prediction index class value obtaining unit further needs to obtain the index weight obtained by the event probability prediction model training method, and when the index weight is obtained, the obtained index weight and the obtained data unit to be predicted belong to the same class under the target class dimension.
Then, weighted summation is performed according to the correspondence between the index weights and the prediction related index data values (that is, a weight and a data value that correspond to the same index) and the correspondence between the indexes and the index categories (that is, which index category each index belongs to), to obtain each prediction index category value of the data unit.
In this embodiment, when obtaining the predicted event probability, the index class value is used, and each element of the event probability weight matrix used is the weight of the index class, so as to obtain the influence factor value of each index class.
Therefore, the event probability prediction method provided by the embodiment of the invention can realize more accurate prediction of event probability through smaller operation amount, and can obtain the influence factor value of the index type, and further adjust the current behavior according to the influence factor value of the index type, so that the actual event probability obtained in the future can meet the expectation.
Of course, an embodiment of the present invention further provides a device, which may load the above program module architecture in the form of a program to implement the event probability prediction model training method or the event probability prediction method provided by the embodiments of the present invention. The hardware device can be applied to an electronic device with data processing capability, such as a terminal device or a server device.
Optionally, fig. 11 shows an optional hardware architecture of the device provided in the embodiment of the present invention, which may include at least one memory 3 and at least one processor 1, as well as at least one communication interface 2 and at least one communication bus 4; the memory stores a program, and the processor calls the program to execute the aforementioned event probability prediction model training method or event probability prediction method. The processor 1 and the memory 3 may be located in the same electronic device, for example in a server device or a terminal device; the processor 1 and the memory 3 may also be located in different electronic devices.
As an alternative implementation of the disclosure of the embodiment of the present invention, the memory 3 may store a program, and the processor 1 may call the program to execute the event probability prediction model training method or the event probability prediction method provided by the above-described embodiment of the present invention.
In the embodiment of the present invention, the electronic device may be a tablet computer, a notebook computer, or the like, which is capable of performing event probability prediction model training or event probability prediction.
In the embodiment of the present invention, there is at least one each of the processor 1, the communication interface 2, the memory 3, and the communication bus 4, and the processor 1, the communication interface 2, and the memory 3 communicate with one another through the communication bus 4; the communication connection of the processor 1, the communication interface 2, the memory 3, and the communication bus 4 shown in fig. 11 is only one option;
optionally, the communication interface 2 may be an interface of a communication module, such as an interface of a GSM module;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement an embodiment of the invention.
The memory 3 may comprise a high-speed RAM memory and may also comprise a non-volatile memory, such as at least one disk memory.
It should be noted that the above device may also include other components (not shown) that are not necessary for understanding the disclosure of the embodiments of the present invention, and these are not described individually here.
Embodiments of the present invention further provide a computer-readable storage medium, where computer-executable instructions are stored, and when executed by a processor, the instructions may implement the event probability prediction model training method or the event probability prediction method as described above.
With the computer-executable instructions stored in the storage medium provided by the embodiment of the invention, the event probability prediction model to be trained is trained by using the obtainable related index data values and the actual event probability values, so that the relationship between the related index data values and the event probability is built into the event probability prediction model; the trained event probability prediction model can then be used to predict a future event probability from the index data values obtained at the current moment, so that preparations can be made in advance while prediction accuracy is ensured. Meanwhile, the weight of each index is also obtained during model training, so that when the event probability is actually predicted, not only the predicted event probability but also the influence factor value of each index on the predicted event probability can be obtained from the related index data value of each index and the corresponding index weight; corresponding preparations can therefore be made in advance, either to achieve the predicted event probability or to avoid it, so that the event probability actually obtained in the future better meets expectations.
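The training procedure summarized above can be sketched as a plain gradient-descent loop. The sketch assumes a logistic model, a mean squared error as the probability loss, and a fixed loss threshold as the stopping rule; none of these specifics is prescribed by the embodiment, and the data are synthetic.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_event_probability_model(samples, targets, lr=0.5,
                                  loss_threshold=1e-4, max_epochs=10000):
    """Fit weights and a bias until the probability loss meets the threshold.

    Assumption: the loss is the mean squared error between predicted and
    actual event probabilities. Returns (weights, bias, final_loss).
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    loss = float("inf")
    n = len(samples)
    for _ in range(max_epochs):
        preds = [sigmoid(bias + sum(w * x for w, x in zip(weights, row)))
                 for row in samples]
        loss = sum((p - t) ** 2 for p, t in zip(preds, targets)) / n
        if loss <= loss_threshold:
            break  # the probability loss meets the loss threshold
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, p, t in zip(samples, preds, targets):
            # Derivative of the squared error through the sigmoid.
            delta = 2.0 * (p - t) * p * (1.0 - p) / n
            for j, x in enumerate(row):
                grad_w[j] += delta * x
            grad_b += delta
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias, loss


# Synthetic data: two index values per data unit; the "actual" event
# probabilities come from a made-up model so that the threshold is reachable.
samples = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [0.1, 0.9]]
targets = [sigmoid(2.0 * a - 1.0 * b) for a, b in samples]
weights, bias, final_loss = train_event_probability_model(samples, targets)
```

The returned weights play the role of the per-index weights mentioned above; with real data the threshold and learning rate would need to be chosen for the task.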
The embodiments of the present invention described above are combinations of elements and features of the present invention. Unless otherwise mentioned, the elements or features may be considered optional. Each element or feature may be practiced without being combined with other elements or features. In addition, the embodiments of the present invention may be configured by combining some elements and/or features. The order of operations described in the embodiments of the present invention may be rearranged. Some configurations of any embodiment may be included in another embodiment, and may be replaced with corresponding configurations of the other embodiment. It is obvious to those skilled in the art that claims that are not explicitly cited in each other in the appended claims may be combined into an embodiment of the present invention or may be included as new claims in a modification after the filing of the present application.
Embodiments of the invention may be implemented by various means, such as hardware, firmware, software, or a combination thereof. In a hardware configuration, the method according to an exemplary embodiment of the present invention may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and the like.
In a firmware or software configuration, embodiments of the present invention may be implemented in the form of modules, procedures, functions, and the like. The software codes may be stored in memory units and executed by processors. The memory unit is located inside or outside the processor, and may transmit and receive data to and from the processor via various known means.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Although the embodiments of the present invention have been disclosed, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (16)

1. An event probability prediction model training method, applied to the field of education, comprising:
acquiring a training data set, wherein the training data set comprises an actual event probability value and a related index data value corresponding to each data unit, the actual event probability value is a real value of an event probability of each data unit, the related index data value is a numerical value of each index of which the degree of correlation with the event probability meets a degree of correlation threshold, the event probability comprises a back rate or a report continuing rate, and the indexes comprise a topic accuracy rate, a topic participation rate, an understanding rate, a course liking rate, a red packet snatching participation rate, a score, a raise number, a personal show viewing rate and a personal show praise rate;
acquiring a predicted event probability value according to the relevant index data value of each data unit by using the event probability prediction model to be trained, acquiring probability loss according to the predicted event probability value and the actual event probability value, adjusting parameters of the event probability prediction model according to the probability loss until the probability loss meets a loss threshold value, and acquiring the trained event probability prediction model and an event probability weight matrix, wherein each element of the event probability weight matrix is the weight of each corresponding index;
each index comprises pre-labeled index types, and the number of the index types is smaller than that of the indexes;
before the step of obtaining the predicted event probability according to the relevant index data value of each data unit by using the event probability prediction model to be trained, the method further includes:
performing series training on a dimensionality reduction model to be trained and the event probability prediction model to be trained by using the training data set until the dimensionality reduction model and the event probability prediction model meeting a preset training target are obtained, and acquiring a dimensionality reduction weight matrix of the dimensionality reduction model, wherein each row of the dimensionality reduction weight matrix corresponds to each index;
acquiring each index weight according to the dimensionality reduction weight matrix;
acquiring the data value of the index type of each data unit according to the index weight corresponding to the same index and the related index data value of each data unit to obtain the index type value of each data unit;
the step of obtaining the predicted event probability according to the relevant index data value of each data unit by using the event probability prediction model to be trained comprises the following steps:
and acquiring a predicted event probability according to the index category value of each data unit by using the event probability prediction model to be trained, wherein when the trained event probability prediction model is acquired, each element of the acquired event probability weight matrix is the weight of each index category.
2. The method for training the event probability prediction model according to claim 1, wherein the step of obtaining the weight of each index according to the dimensionality reduction weight matrix comprises:
adjusting the dimensionality reduction weight matrix by using an index correlation direction value and the index category corresponding to the index to obtain a weight square matrix with the number of rows and columns equal to the number of the index category, and acquiring target elements of the weight square matrix, wherein the target elements are used for representing the index category, the index correlation direction value is a numerical value of the correlation direction of the index and the event probability, each row of the weight square matrix corresponds to the index category of each index, and the number of the target elements is equal to the number of the index category;
determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in the dimension reduction weight matrix according to the position of each target element in the weight square matrix;
and obtaining each index weight by using each dimension reduction target element value and each index corresponding to the dimension reduction target element.
3. The method of training an event probability prediction model according to claim 2, wherein the training data set includes training data subsets, the number of the dimension reduction models to be trained being at least equal to the number of the training data subsets;
the step of performing series training of the dimensionality reduction model to be trained and the event probability prediction model to be trained by using the training data set until the dimensionality reduction model and the event probability prediction model meeting a preset training target are obtained, and acquiring the dimensionality reduction weight matrix of the dimensionality reduction model comprises the following steps of:
respectively utilizing each training data subset to carry out series training on a dimensionality reduction model to be trained and the event probability prediction model to be trained corresponding to the training data subset until each dimensionality reduction model and the event probability prediction model meeting a preset training target are obtained, and obtaining each dimensionality reduction weight matrix of each dimensionality reduction model;
the step of adjusting the dimensionality reduction weight matrix by using the index correlation direction value corresponding to the index and the index category to obtain a weight square matrix with the number of rows and columns equal to the number of the index category, and the step of obtaining each target element of the weight square matrix, which is used for representing the index category, includes:
adjusting each dimensionality reduction weight matrix by using the index correlation direction value corresponding to the index and the index category to obtain each weight square matrix with the number of rows and columns equal to the number of the index category, and acquiring each target element of each weight square matrix;
the step of determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in the dimension reduction weight matrix according to the row and the column of each target element in the weight matrix, and obtaining each index weight by using each dimension reduction target element value and the index corresponding to each dimension reduction target element comprises:
determining, among the weight square matrices, the largest group of weight square matrices whose target elements are located at the same positions, to obtain the consistent weight square matrices, and determining the dimension reduction weight matrix corresponding to each consistent weight square matrix;
determining each dimension reduction target element and each dimension reduction target element value corresponding to each target element in each dimension reduction weight matrix according to the position of each target element in each consistent weight matrix, obtaining the mean value of each dimension reduction target element value at the same position in each dimension reduction weight matrix to obtain the mean value of the dimension reduction target elements, and obtaining each index weight by using each dimension reduction target element mean value and the index corresponding to each dimension reduction target element.
4. The training method of the event probability prediction model according to claim 3, wherein the step of obtaining the weight of each index by using the mean value of each dimension-reducing target element and the index corresponding to each dimension-reducing target element comprises:
determining the indexes labeled with the same index category to obtain same-category indexes;
obtaining the sum of the dimension reduction target element mean values of the same-category indexes to obtain a dimension reduction index sum value;
and obtaining the index weight by using the dimension reduction target element mean value of each same-category index and the dimension reduction index sum value corresponding to that mean value.
5. The method for training the event probability prediction model according to claim 2, wherein the step of adjusting the dimensionality reduction weight matrix by using the index correlation direction value corresponding to the index and the index class to obtain a weight square matrix with the number of rows and columns equal to the number of the index classes comprises:
when the event probability is determined to be of the type needing to be reduced, performing negation operation on the index correlation direction value corresponding to the index;
and adjusting the dimensionality reduction weight matrix by utilizing the index correlation direction value which corresponds to the index and is subjected to the negation operation and the index type to obtain a weight square matrix with the number of rows and columns equal to the number of the index type.
6. The training method of the event probability prediction model according to claim 2, wherein the step of obtaining each target element of the weight matrix for representing an index class comprises:
obtaining the element with the maximum value among the elements of the weight square matrix as a target element, and obtaining the row and the column in which the target element is located;
and ignoring each element of the row and each element of the column in which the target element is located in the weight square matrix to obtain an adjusted square matrix, and taking the adjusted square matrix as a new weight square matrix from which to obtain a new target element, until all the target elements are obtained.
7. The method of training an event probability prediction model of claim 1, wherein the step of obtaining a training data set comprises:
acquiring an original training data set, wherein the original training data set comprises actual event probability values corresponding to all data units and predicted related index data values, and the predicted related index data values are values of all indexes predicted to be related to the event probability;
acquiring a target classification dimension and an index corresponding to the target classification dimension;
and acquiring a pre-classification data set classified according to the target classification dimension, and screening the predicted relevant index data values of the indexes corresponding to the target classification dimension to obtain the training data set.
8. The method of training an event probability prediction model according to claim 7, wherein the step of obtaining a target classification dimension and an index corresponding to the target classification dimension comprises:
classifying the original training data set according to each dimension to be classified to obtain a pre-classification data set;
obtaining, through a correlation calculation algorithm and by using the predicted related index data values and the actual event probability values of the data units of each pre-classification data set, the degree of correlation between each predicted related index data value and the actual event probability value, obtaining each index whose degree of correlation meets a correlation threshold, and determining the dimension to be classified that has the largest number of related indexes, together with the indexes corresponding to that dimension, to obtain the target classification dimension and the indexes corresponding to the target classification dimension, wherein the number of related indexes is the number of indexes whose degree of correlation meets the correlation threshold.
9. The method of training an event probability prediction model of claim 8, wherein the correlation calculation algorithm comprises: a Spearman rank correlation coefficient calculation algorithm and a Kendall rank correlation coefficient calculation algorithm.
10. The method for training the event probability prediction model according to claim 1, wherein the step of performing the series training of the dimension reduction model to be trained and the event probability prediction model to be trained by using the training data set until obtaining the dimension reduction model and the event probability prediction model which satisfy a predetermined training target comprises:
and obtaining each prediction event probability by using the dimension reduction model to be trained and the event probability prediction model to be trained according to the relevant index data value of each data unit of the training data set, obtaining probability loss according to each prediction event probability and each actual event probability, and adjusting parameters of the dimension reduction model and the event probability prediction model by using the probability loss until the probability loss meets a second loss threshold value to obtain the dimension reduction model and the event probability prediction model meeting a preset training target.
11. An event probability prediction method, comprising:
acquiring a prediction related index data value of a data unit to be subjected to event probability prediction;
acquiring a prediction index class value of the data unit by using the index weight obtained by the event probability prediction model training method according to any one of claims 1 to 10 and a prediction-related index data value corresponding to the index weight;
obtaining a predicted event probability according to the index class value by using the event probability prediction model obtained by the event probability prediction model training method according to any one of claims 1 to 10, and obtaining an influence factor value of each index class according to the index class value and the event probability weight matrix obtained by the event probability prediction model training method according to any one of claims 1 to 10.
12. The event probability prediction method of claim 11, further comprising:
determining that the event will occur when the predicted event probability exceeds a predetermined event probability threshold, otherwise determining that the event will not occur.
13. An event probability prediction model training device, applied to the field of education, comprising:
a data set obtaining unit adapted to obtain a training data set, wherein the training data set includes an actual event probability value corresponding to each data unit and a related index data value, the actual event probability value is a real value of an event probability of each data unit, the related index data value is a numerical value of each index whose degree of correlation with the event probability satisfies a degree of correlation threshold, wherein the actual event probability includes a back rate or a report continuing rate, and the indexes include a topic accuracy rate, a topic participation rate, an understanding rate, a course liking rate, a red packet snatching participation rate, a score, a raise number, a personal show viewing rate, and a personal show praise rate;
an event probability prediction model and event probability weight matrix obtaining unit, adapted to obtain a predicted event probability value according to the relevant index data value of each data unit by using the event probability prediction model to be trained, obtain a probability loss according to the predicted event probability value and the actual event probability value, adjust parameters of the event probability prediction model according to the probability loss until the probability loss meets a loss threshold, and obtain a trained event probability prediction model and an event probability weight matrix, wherein each element of the event probability weight matrix is a weight of each corresponding index;
each of the indexes includes a pre-labeled index category, the number of the index categories is smaller than the number of the indexes, and the device further comprises:
a dimension reduction weight matrix obtaining unit, adapted to perform series training of a dimension reduction model to be trained and the event probability prediction model to be trained by using the training data set until the dimension reduction model and the event probability prediction model satisfying a predetermined training target are obtained, and obtain a dimension reduction weight matrix of the dimension reduction model, wherein each row of the dimension reduction weight matrix corresponds to each index;
an index weight obtaining unit adapted to obtain each index weight according to the dimensionality reduction weight matrix;
an index category value obtaining unit adapted to obtain a data value of the index category of each data unit according to the index weight corresponding to the same index and the related index data value of each data unit, to obtain an index category value of each data unit;
wherein the event probability prediction model and event probability weight matrix obtaining unit being adapted to obtain the predicted event probability according to the relevant index data value of each data unit by using the event probability prediction model to be trained comprises:
and acquiring a predicted event probability according to the index category value of each data unit by using the event probability prediction model to be trained, wherein when the trained event probability prediction model is acquired, each element of the acquired event probability weight matrix is the weight of each index category.
14. An event probability prediction device, comprising:
a prediction-related index data value obtaining unit adapted to obtain a prediction-related index data value of a data unit to be subjected to event probability prediction;
a prediction index class value obtaining unit adapted to obtain a prediction index class value of the data unit by using the index weight obtained by the event probability prediction model training method according to any one of claims 1 to 10 and a prediction-related index data value corresponding to the index weight;
a prediction-related index data value and influence factor value obtaining unit adapted to obtain a predicted event probability according to the index class value using the event probability prediction model obtained by the event probability prediction model training method according to any one of claims 1 to 10, and obtain an influence factor value of each index class according to the index class value and an event probability weight matrix obtained by the event probability prediction model training method according to any one of claims 1 to 10.
15. A storage medium characterized in that the storage medium stores a program adapted for event probability prediction model training to implement the event probability prediction model training method according to any one of claims 1 to 10, or the storage medium stores a program adapted for event probability prediction to implement the event probability prediction method according to claim 11 or 12.
16. An electronic device comprising at least one memory and at least one processor; the memory stores a program that is called by the processor to perform the event probability prediction model training method according to any one of claims 1 to 10 or the event probability prediction method according to claim 11 or 12.
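
The iterative selection of target elements recited in claim 6 (take the largest element, ignore its row and column, repeat on what remains) can be illustrated with the following minimal sketch. The function name, the tuple layout of the result, and the example matrix are hypothetical; ties are broken here by Python's tuple ordering, which the claims do not address.

```python
def select_target_elements(weight_square_matrix):
    """Greedily pick one target element per row and per column.

    Returns a list of (row, column, value) tuples in the order selected.
    """
    size = len(weight_square_matrix)
    free_rows = set(range(size))
    free_cols = set(range(size))
    targets = []
    while free_rows:
        # Largest remaining element among rows and columns not yet ignored.
        value, row, col = max(
            (weight_square_matrix[r][c], r, c)
            for r in free_rows
            for c in free_cols
        )
        targets.append((row, col, value))
        # Ignore the row and column of the chosen element from now on.
        free_rows.remove(row)
        free_cols.remove(col)
    return targets


# Hypothetical 3 x 3 weight square matrix (three index categories).
example_matrix = [
    [0.9, 0.2, 0.1],
    [0.3, 0.4, 0.8],
    [0.5, 0.7, 0.6],
]
selected = select_target_elements(example_matrix)
```

For this example the selection order is the element 0.9 at row 0, column 0, then 0.8 at row 1, column 2, then 0.7 at row 2, column 1, so the number of target elements equals the number of index categories.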
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110049876.5A CN112381338B (en) 2021-01-14 2021-01-14 Event probability prediction model training method, event probability prediction method and related device

Publications (2)

Publication Number Publication Date
CN112381338A CN112381338A (en) 2021-02-19
CN112381338B true CN112381338B (en) 2021-07-27

Family

ID=74581853

Country Status (1)

Country Link
CN (1) CN112381338B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116894653B (en) * 2023-08-16 2024-02-23 广州红海云计算股份有限公司 Personnel management data processing method and system based on multi-prediction model linkage

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102075352A (en) * 2010-12-17 2011-05-25 北京邮电大学 Method and device for predicting network user behavior
CN104504460A (en) * 2014-12-09 2015-04-08 北京嘀嘀无限科技发展有限公司 Method and device for predicating user loss of car calling platform
CN106682754A (en) * 2015-11-05 2017-05-17 阿里巴巴集团控股有限公司 Event occurrence probability prediction method and device
CN108491817A (en) * 2018-03-30 2018-09-04 国信优易数据有限公司 A kind of event detection model training method, device and event detecting method
CN110956296A (en) * 2018-09-26 2020-04-03 北京嘀嘀无限科技发展有限公司 User loss probability prediction method and device
CN111564223A (en) * 2020-07-20 2020-08-21 医渡云(北京)技术有限公司 Infectious disease survival probability prediction method, and prediction model training method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106685674B (en) * 2015-11-05 2020-01-10 华为技术有限公司 Method and device for predicting network event and establishing network event prediction model
CN109711534A (en) * 2018-12-20 2019-05-03 树根互联技术有限公司 Dimensionality reduction model training method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant