CN111598338A - Method, apparatus, medium, and electronic device for updating prediction model - Google Patents
- Publication number
- CN111598338A (application number CN202010418579.9A)
- Authority
- CN
- China
- Prior art keywords
- user
- training sample
- sample set
- users
- quasi
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Economics (AREA)
- Operations Research (AREA)
- Development Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Game Theory and Decision Science (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A method, apparatus, medium, and electronic device for updating a predictive model are disclosed. The method comprises the following steps: acquiring user characteristics of a plurality of users from a training sample set of a prediction model currently used by a system; obtaining user characteristics of a plurality of users from a quasi-training sample set; determining feature distribution difference information of user features of a plurality of users from the training sample set and user features of a plurality of users from the quasi-training sample set; if the feature distribution difference information meets a preset difference condition, performing prediction model training by using user features in a quasi-training sample set; updating the current prediction model used by the system by using the prediction model obtained by training; wherein the predictive model is used to predict a probability that a user in the system will perform a target behavior within a first time range after a predetermined point in time. The technical scheme provided by the disclosure is beneficial to promoting the prediction model currently used by the system to continuously keep better prediction accuracy.
Description
Technical Field
The present disclosure relates to computer technologies, and in particular, to a method for updating a prediction model, an apparatus for updating a prediction model, a storage medium, and an electronic device.
Background
In the process of providing services for users in the system, the probability of target behaviors such as deal behaviors and the like of the users is often required to be predicted by using a prediction model so as to provide better services for the users. In order to improve the prediction accuracy of the prediction model, the prediction model is often required to be updated. How to update the prediction model in time so as to improve the prediction accuracy of the prediction model is a technical problem worthy of attention.
Disclosure of Invention
The present disclosure is proposed to solve the above technical problems. Embodiments of the present disclosure provide a method for updating a prediction model, an apparatus for updating a prediction model, a storage medium, and an electronic device.
According to an aspect of an embodiment of the present disclosure, there is provided a method for updating a prediction model, the method including: acquiring user characteristics of a plurality of users from a training sample set of a prediction model currently used by a system; obtaining user characteristics of a plurality of users from a quasi-training sample set; determining feature distribution difference information of user features of a plurality of users from the training sample set and user features of a plurality of users from the quasi-training sample set; if the feature distribution difference information meets a preset difference condition, performing prediction model training by using the user features in the quasi-training sample set; updating a prediction model currently used by the system by using the prediction model obtained by training; wherein the predictive model is used to predict a probability that a user in the system will perform a target behavior within a first time range after a predetermined point in time.
In an embodiment of the present disclosure, before obtaining user features of a plurality of users from a quasi-training sample set, the method further includes: periodically acquiring, according to service data, the user features of all active users within a second time range closest to a preset historical time, to form the quasi-training sample set.
In still another embodiment of the present disclosure, the feature distribution difference information includes: degree of difference in feature distribution; and/or ranking information of the contribution of each feature element in the user feature to the feature distribution difference.
In yet another embodiment of the present disclosure, the determining feature distribution difference information of the user features from the plurality of users in the training sample set and the user features from the plurality of users in the quasi-training sample set includes: respectively setting first version marking information for the user characteristics of each user in the training sample set; setting second version marking information for the user characteristics of each user in the quasi-training sample set respectively; respectively taking the user features in the training sample set and the user features in the quasi-training sample set as inputs to be provided to a version recognition model, and respectively carrying out version recognition processing on the input user features through the version recognition model; and determining the feature distribution difference information of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set according to the version recognition processing result output by the version recognition model, the first version marking information and the second version marking information.
In yet another embodiment of the present disclosure, the providing the user features in the training sample set and the user features in the quasi-training sample set as inputs to a version recognition model, and performing version recognition processing on the input user features through the version recognition model respectively includes: taking the user characteristics of part of users in the training sample set as a first training sample; taking the user characteristics of part of users in the quasi-training sample set as second training samples; training a version recognition model by using the first training sample and the second training sample; and respectively providing the user characteristics of the other part of users in the training sample set and the user characteristics of the other part of users in the quasi-training sample set as input to the trained version recognition model, and respectively carrying out version recognition processing on the input user characteristics through the trained version recognition model.
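The marking and splitting steps described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names `split_part` and `tag_and_split`, the labels 0 and 1 for the first and second version marking information, the 50% split fraction, and the fixed random seed are all assumptions made for the example. Each set is split separately so that both the fitting part and the held-out part contain users from both sets.

```python
import random


def split_part(items, fraction, rng):
    """Shuffle a copy of items and cut it into (first part, other part)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


def tag_and_split(training_features, quasi_features, fraction=0.5, seed=0):
    """Attach version labels and split both sets for the version recognition model.

    Label 0 stands for the first version marking information (training sample set),
    label 1 for the second version marking information (quasi-training sample set).
    Both label values and the default fraction are hypothetical choices.
    """
    rng = random.Random(seed)
    old_fit, old_held = split_part(training_features, fraction, rng)
    new_fit, new_held = split_part(quasi_features, fraction, rng)
    # Samples used to train the version recognition model.
    fit = [(f, 0) for f in old_fit] + [(f, 1) for f in new_fit]
    # Samples given to the trained model for version recognition processing.
    held = [(f, 0) for f in old_held] + [(f, 1) for f in new_held]
    return fit, held
```

The held-out part is the one whose true labels are later compared with the labels output by the trained version recognition model.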
In yet another embodiment of the present disclosure, the determining, according to the version recognition processing result output by the version recognition model, the first version marking information, and the second version marking information, the feature distribution difference information of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set includes: calculating the Matthews correlation coefficient (MCC) of the user features in the training sample set and the user features in the quasi-training sample set according to the version information of the user features output by the version recognition model and the first version marking information or second version marking information of the input user features.
In another embodiment of the present disclosure, if the feature distribution difference information satisfies a preset difference condition, performing prediction model training by using the user features in the quasi-training sample set includes: if the Matthews correlation coefficient reaches a preset threshold, providing the user features in the quasi-training sample set as input to a prediction model to be trained; performing prediction processing on the input user features through the prediction model to be trained; and adjusting the network parameters of the prediction model to be trained according to the prediction processing result output by the prediction model to be trained and the target behavior occurrence labeling information of the input user features.
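The correlation coefficient used above is read here as the Matthews correlation coefficient (MCC), computed between the true version labels and the version labels output by the version recognition model. The sketch below shows one way to compute it and to apply the threshold test. The label convention (0 for the training sample set, 1 for the quasi-training sample set), the function names, and the threshold value of 0.2 are assumptions for illustration; the disclosure does not fix them. An MCC near 0 means the model cannot tell the two sets apart (similar feature distributions), while an MCC near 1 means they are easy to separate (the distribution has shifted).

```python
import math


def matthews_corrcoef(true_versions, predicted_versions):
    """MCC for binary version labels (0 = training set, 1 = quasi-training set)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_versions, predicted_versions):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Degenerate case (e.g., the model predicts a single version): treat as no signal.
        return 0.0
    return (tp * tn - fp * fn) / denom


def needs_retraining(mcc, threshold=0.2):
    """Hypothetical difference condition: retrain once the MCC reaches the threshold."""
    return mcc >= threshold
```

For example, if the version recognition model labels every held-out sample correctly, the MCC is 1.0 and `needs_retraining` returns True; if it is right only half the time on balanced data, the MCC is 0.0 and the current prediction model is kept.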
In yet another embodiment of the present disclosure, the method further comprises: if it is detected in real time that a user in the system performs a behavior, acquiring the current user features of the user; providing the current user features to the prediction model currently used by the system, and performing online prediction processing through the prediction model; and obtaining, according to the prediction processing result of the prediction model, the probability that the current user will perform the target behavior within the first time range in the future.
In yet another embodiment of the present disclosure, the method further comprises: updating the current value of the system according to the probability that the current user performs the target behavior within the first time range in the future.
In yet another embodiment of the present disclosure, the method further comprises: determining the area under the receiver operating characteristic curve (AUC) of the prediction model according to the prediction processing results of the prediction model for a plurality of current users; and/or determining the deviation of the current value of the system according to the prediction processing results of the prediction model for a plurality of current users and the posterior value of the number of users who performed the target behavior; and/or determining the prediction deviation of the prediction model currently used by the system according to the prediction processing results of the prediction model for a plurality of current users and the posterior value of the number of users who performed the target behavior.
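The monitoring quantities named above can be sketched as follows. The AUC is computed with the standard pairwise definition (the fraction of positive/negative pairs ranked correctly, with ties counted as one half). The disclosure does not give a formula for the prediction deviation, so the relative difference between the sum of predicted probabilities and the observed (posterior) number of users who performed the target behavior is used here purely as an assumption; the function names are likewise hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def prediction_deviation(scores, actual_count):
    """Assumed definition: (expected count - observed count) / observed count."""
    if actual_count == 0:
        raise ValueError("observed count must be non-zero")
    expected_count = sum(scores)  # sum of predicted probabilities
    return (expected_count - actual_count) / actual_count
```

With labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive/negative pairs are ordered correctly, so the AUC is 0.75; the predicted probabilities sum to 1.65 against 2 observed users, a deviation of about -0.175.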
According to another aspect of an embodiment of the present disclosure, there is provided an apparatus for updating a prediction model, the apparatus including: a first acquisition module, configured to acquire user features of a plurality of users from a training sample set of a prediction model currently used by a system; a second acquisition module, configured to acquire user features of a plurality of users from a quasi-training sample set; a distribution difference determining module, configured to determine feature distribution difference information of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set; a prediction model training module, configured to perform prediction model training by using the user features in the quasi-training sample set if the feature distribution difference information meets a preset difference condition; and a model updating module, configured to update the prediction model currently used by the system by using the prediction model obtained through training by the prediction model training module; wherein the prediction model is used to predict a probability that a user in the system will perform a target behavior within a first time range after a predetermined point in time.
In an embodiment of the present disclosure, the apparatus further includes: a sample set forming module, configured to periodically acquire, according to service data, the user features of all active users within a second time range closest to a preset historical time, to form the quasi-training sample set.
In still another embodiment of the present disclosure, the feature distribution difference information includes: degree of difference in feature distribution; and/or ranking information of the contribution of each feature element in the user feature to the feature distribution difference.
In yet another embodiment of the present disclosure, the distribution difference determining module includes: a first sub-module, configured to set first version marking information for the user features of each user in the training sample set; a second sub-module, configured to set second version marking information for the user features of each user in the quasi-training sample set; a third sub-module, configured to provide the user features in the training sample set and the user features in the quasi-training sample set as inputs to a version recognition model, and to perform version recognition processing on the input user features through the version recognition model; and a fourth sub-module, configured to determine the feature distribution difference information of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set according to the version recognition processing result output by the version recognition model, the first version marking information, and the second version marking information.
In yet another embodiment of the present disclosure, the third sub-module includes: a first unit, configured to use user characteristics of a part of users in the training sample set as a first training sample; a second unit, configured to use user features of a part of users in the quasi-training sample set as a second training sample; a third unit, configured to train a version recognition model using the first training sample and the second training sample; and a fourth unit, configured to provide the user characteristics of the other part of the users in the training sample set and the user characteristics of the other part of the users in the quasi-training sample set as inputs to the trained version recognition model, and perform version recognition processing on the input user characteristics through the trained version recognition model.
In yet another embodiment of the present disclosure, the fourth sub-module is further configured to: calculate the Matthews correlation coefficient (MCC) of the user features in the training sample set and the user features in the quasi-training sample set according to the version information of the user features output by the version recognition model and the first version marking information or second version marking information of the input user features.
In yet another embodiment of the present disclosure, the prediction model training module is further configured to: if the Matthews correlation coefficient reaches a preset threshold, provide the user features in the quasi-training sample set as input to a prediction model to be trained; perform prediction processing on the input user features through the prediction model to be trained; and adjust the network parameters of the prediction model to be trained according to the prediction processing result output by the prediction model to be trained and the target behavior occurrence labeling information of the input user features.
In yet another embodiment of the present disclosure, the apparatus further includes: a third acquisition module, configured to acquire the current user features of a user if it is detected in real time that the user performs a behavior in the system; and an online prediction module, configured to provide the current user features to the prediction model currently used by the system, perform online prediction processing through the prediction model, and obtain, according to the prediction processing result of the prediction model, the probability that the current user will perform the target behavior within the first time range in the future.
In yet another embodiment of the present disclosure, the apparatus further includes: and the system value updating module is used for updating the current value of the system according to the probability that the current user executes the target behavior in the first time range in the future.
In yet another embodiment of the present disclosure, the apparatus further includes: a first monitoring module, configured to determine the area under the receiver operating characteristic curve (AUC) of the prediction model according to the prediction processing results of the prediction model for a plurality of current users; and/or a second monitoring module, configured to determine the deviation of the current value of the system according to the prediction processing results of the prediction model for a plurality of current users and the posterior value of the number of users who performed the target behavior; and/or a third monitoring module, configured to determine the prediction deviation of the prediction model currently used by the system according to the prediction processing results of the prediction model for a plurality of current users and the posterior value of the number of users who performed the target behavior.
According to yet another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the above-mentioned method for updating a prediction model.
According to still another aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instructions from the memory and execute the instructions to implement the above-described method for updating a predictive model.
Based on the method and apparatus for updating a prediction model provided by the embodiments of the present disclosure, prediction model training is performed using the user feature samples in the quasi-training sample set when the feature distribution of the user features in the quasi-training sample set and that of the user features in the training sample set are judged to satisfy the preset difference condition, which helps find a suitable time to update the model. Because changes in the feature distribution of the user features in the system often follow a trend, the updated prediction model can achieve better prediction accuracy whether it is used for online prediction or for offline prediction. Therefore, the technical solution provided by the present disclosure helps avoid the waste of resources caused by unnecessary model training and helps the prediction model currently used by the system maintain good prediction accuracy.
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of one embodiment of a suitable scenario for use with the present disclosure;
FIG. 2 is a flow diagram of one embodiment of a method for updating a predictive model of the present disclosure;
FIG. 3 is a flow diagram of one embodiment of the present disclosure for determining feature distribution variance information using a version identification model;
FIG. 4 is a flow diagram of another embodiment of the present disclosure for determining feature distribution variance information using a version identification model;
FIG. 5 is a flow diagram of one embodiment of training a predictive model according to the present disclosure;
FIG. 6 is a schematic block diagram illustrating an embodiment of an apparatus for updating a predictive model according to the present disclosure;
FIG. 7 is a block diagram of an electronic device provided in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more than two and "at least one" may refer to one, two or more than two.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure merely describes an association relationship between associated objects and indicates that three kinds of relationships may exist. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Embodiments of the present disclosure may be implemented in electronic devices such as terminal devices, computer systems, servers, etc., which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with an electronic device, such as a terminal device, computer system, or server, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Summary of the disclosure
In implementing the present disclosure, the inventors have found that, at present, the prediction model is generally updated according to a preset update period (for example, daily, weekly, monthly, etc.). The update period is usually set according to manual experience. If the update period setting is too short, it is likely to cause a resource waste phenomenon, and if the update period setting is too long, it is likely to cause a phenomenon in which the prediction accuracy of the prediction model is lowered.
Exemplary overview
One example of an application scenario for updating a prediction model provided by the present disclosure is shown in FIG. 1.
In FIG. 1, it is assumed that N users (i.e., user 100-1, user 100-2, …, and user 100-N) access a system 101 online. The system 101 may be a website that provides real estate services for users (i.e., a platform for providing real estate services for users). When any one of the N users performs a behavior, the system 101 may obtain the current user features of that user in real time and perform online prediction processing on them by using the prediction model 102 currently used by the system 101, so that the system 101 obtains, in real time, the probability that the user will perform a target behavior (e.g., a transfer delegation behavior or a deal behavior) within the next N days. Here, N in "N days" denotes the length of the prediction window and is unrelated to the number of users.
The system 101 stores a training sample set 103 of the currently used prediction model 102, and the training sample set 103 includes user features of a plurality of users. At a scheduled time (e.g., 00:00 every day), the system 101 may obtain user features of a plurality of users from the service data of day T-(N+31) through day T-(N+1), and the user features of these users form the quasi-training sample set 104. The system 101 may obtain target behavior occurrence labeling information for the user features of each user in the quasi-training sample set 104 from the service data of day T-N through day T-1, so that the user features of all users in the quasi-training sample set 104, together with their labeling information, form a plurality of training samples. Here, T may be the current day. The users in the training sample set 103 and the users in the quasi-training sample set 104 are typically not identical and may be completely different.
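The two date ranges in this scenario can be worked out with simple date arithmetic. The sketch below is illustrative only: the function name `sample_windows` is hypothetical, and both ranges are treated as inclusive of their end days. The feature window runs from day T-(N+31) to day T-(N+1), and the labeling window runs from day T-N to day T-1.

```python
from datetime import date, timedelta


def sample_windows(today, n_days):
    """Return ((feature_start, feature_end), (label_start, label_end)) for day T = today."""
    feature_start = today - timedelta(days=n_days + 31)  # day T-(N+31)
    feature_end = today - timedelta(days=n_days + 1)     # day T-(N+1)
    label_start = today - timedelta(days=n_days)         # day T-N
    label_end = today - timedelta(days=1)                # day T-1
    return (feature_start, feature_end), (label_start, label_end)


# Example with T = 2020-05-18 and N = 7 (both values chosen only for illustration).
features, labels = sample_windows(date(2020, 5, 18), 7)
print(features)  # (datetime.date(2020, 4, 10), datetime.date(2020, 5, 10))
print(labels)    # (datetime.date(2020, 5, 11), datetime.date(2020, 5, 17))
```

With N = 7, the feature window spans 31 days and is followed immediately by a 7-day labeling window, so every user feature vector in the quasi-training sample set has a full N days of observed outcomes behind it.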
After obtaining the quasi-training sample set 104, the system 101 may determine whether the currently used prediction model 102 needs to be updated by using the user features of at least some users in the training sample set 103 and the user features of at least some users in the quasi-training sample set 104. If the currently used prediction model 102 needs to be updated, the system 101 may train a prediction model by using all the user features and their labeling information in the quasi-training sample set 104, thereby forming a new prediction model 105, and replace the currently used prediction model 102 with the new prediction model 105. The system 101 may also store the quasi-training sample set 104 as the training sample set 103, that is, the training sample set 103 is replaced by the quasi-training sample set 104. If the currently used prediction model 102 does not need to be updated, the system 101 keeps the currently used prediction model 102, and the training sample set 103 is not updated.
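The update decision in this scenario can be sketched as a small control-flow routine. This is a sketch under stated assumptions: `SystemState`, `maybe_update`, and the two callables `needs_update` and `train` are hypothetical names standing in for the drift check and the model training step, neither of which is implemented here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class SystemState:
    model: Any               # prediction model currently used by the system
    training_set: List[Any]  # training sample set of that model


def maybe_update(
    state: SystemState,
    quasi_set: List[Any],
    needs_update: Callable[[List[Any], List[Any]], bool],
    train: Callable[[List[Any]], Any],
) -> Tuple[SystemState, bool]:
    """Retrain and swap in the new model only when the drift check says so."""
    if needs_update(state.training_set, quasi_set):
        new_model = train(quasi_set)
        # The quasi-training sample set becomes the stored training sample set.
        return SystemState(model=new_model, training_set=quasi_set), True
    # No update: keep both the current model and its training sample set.
    return state, False
```

Passing the drift check and the trainer in as callables keeps the scheduling logic independent of the particular model type and difference condition in use.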
Exemplary method
FIG. 2 is a flow chart of one embodiment of a method for updating a prediction model of the present disclosure. The method of the embodiment shown in FIG. 2 comprises the steps S200, S201, S202, S203, and S204. Each step is described below.
S200, obtaining user characteristics of a plurality of users from a training sample set of a prediction model currently used by the system.
The predictive model currently used by the system in the present disclosure is used to predict a probability that a user in the system will perform a target behavior within a first time range after a predetermined point in time. The predetermined time point may be a time point at which the user performs an action. For example, the predictive model currently used by the system is used to predict the probability that a user in the system will perform a target action within the next N days after the current point in time at which the action was performed. The present disclosure may also refer to the predicted probability as a user value based on the target behavior.
The prediction model in the present disclosure may be used online to predict, in real time, the probability that each user in the system will perform the target behavior within N days (N is an integer greater than 1, e.g., N = 7) after the user performs an action. The prediction model in the present disclosure may also be used offline to predict the probability that each user in the system will perform the target behavior within N days after performing an action.
The prediction model in the present disclosure may be an XGBoost (eXtreme Gradient Boosting) model, a LightGBM (Light Gradient Boosting Machine) model, a CatBoost (Categorical Boosting) model, or a deep learning-based model, etc. The deep learning-based model may be a DNN (Deep Neural Network) or an LSTM (Long Short-Term Memory) network. The present disclosure does not limit the specific form of the prediction model.
The target behavior in the present disclosure may be set according to the actual requirements of the application scenario, and in the field of real estate, the target behavior in the present disclosure may include: a delegation action, a business opportunity occurrence action, a deal action, and the like. The present disclosure is not limited to specific manifestations of the target behavior. If the target behaviors in the present disclosure are multiple, the prediction models currently used by the system of the present disclosure may also be multiple, one prediction model corresponding to one target behavior, and different prediction models corresponding to different target behaviors. Any one of the prediction models is used for predicting the probability that the user executes the target behavior corresponding to the prediction model in a first time range after a preset time point.
The user features in the present disclosure may refer to information describing at least one of the user's behavior, the user's own situation, and the situation of the user's terminal device. A user feature generally includes a plurality of feature elements, and all the feature elements included in the user feature may be set according to the actual requirements of the application scenario. For example, in the field of real estate, the user feature in the present disclosure may include feature elements such as: user access time, user login time, user terminal device information, the download channel of a predetermined application on the terminal device, the city corresponding to the service, information on pages browsed by the user, the dwell time on pages browsed by the user, user search activity, user sharing and following activity, business opportunity occurrence, delegation occurrence, accompanied-viewing occurrence, and deal occurrence. The present disclosure does not limit the specific forms of the feature elements included in the user feature.
The prediction model currently used by the system in the present disclosure is a prediction model that has been successfully trained with a training sample set. The training sample set in the present disclosure may include the user features of a plurality of users and the target behavior occurrence labeling information of each user feature. The user features of each user in the training sample set and the target behavior occurrence labeling information of each user feature are formed according to the service data. For example, for a feature element that describes user behavior, the corresponding field content in the service data over a period of time may be searched, extracted, accumulated, or mapped, taking the user as a unit, so as to form the specific value of each feature element in the user feature of each user and the target behavior occurrence labeling information of each user feature.
An example of setting the target behavior occurrence labeling information for the user features in a training sample set may be as follows: for any user in the training sample set, if it is determined according to the service data that the user performed the target behavior within the first time range after the predetermined time point, the value of the target behavior occurrence labeling information of the user feature of the user may be set to 1; if it is determined according to the service data that the user did not perform the target behavior within the first time range after the predetermined time point, the value may be set to 0.
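The labeling rule above can be sketched as a small function. This is only an illustration under stated assumptions: `label_user` and its parameters are hypothetical names, the first time range is assumed to be N days long, and the range is assumed to exclude the predetermined time point itself and include its last instant.

```python
from datetime import datetime, timedelta


def label_user(target_event_times, predetermined_time, n_days):
    """Return 1 if any target-behavior event falls within n_days after predetermined_time, else 0.

    Hypothetical helper; the interval is (predetermined_time, predetermined_time + n_days].
    """
    range_end = predetermined_time + timedelta(days=n_days)
    return int(any(predetermined_time < t <= range_end for t in target_event_times))


reference = datetime(2020, 5, 1)
label_hit = label_user([datetime(2020, 5, 3, 14, 0)], reference, n_days=7)   # inside the range
label_miss = label_user([datetime(2020, 5, 9)], reference, n_days=7)         # after the range
```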
The service data in the present disclosure may include information such as a user operation log or a user access log formed on the server side of the system.
The user characteristics of all users can be obtained from the training sample set, and the user characteristics of part of users can also be obtained from the training sample set.
S201, obtaining user characteristics of a plurality of users from the quasi-training sample set.
The quasi-training sample set in the present disclosure likewise includes the user features of a plurality of users and the target behavior occurrence labeling information of each user feature. The user features of each user in the quasi-training sample set and the target behavior occurrence labeling information of each user feature are also formed according to the service data. For example, for a feature element that describes user behavior, the corresponding field content in the service data may be searched, extracted, accumulated, or mapped, taking the user as a unit, so as to form the specific value of each feature element in the user feature of each user and the target behavior occurrence labeling information of each user feature.
The traffic data forming the user features and their labels in the quasi-training sample set generally belong to one time period (hereinafter referred to as a first time period), and the traffic data forming the user features and their labels in the training sample set generally belong to another time period (hereinafter referred to as a second time period), where the start time of the first time period is generally later than the start time of the second time period, and the end time of the first time period is generally later than the end time of the second time period.
All users in the training sample set of the present disclosure are typically not identical to all users in the quasi-training sample set, but the present disclosure does not preclude all users in the training sample set from being identical to all users in the quasi-training sample set. Even though all users in the training sample set are identical to all users in the quasi-training sample set, the user characteristics of all users in the training sample set are usually not identical to the user characteristics of all users in the quasi-training sample set.
The present disclosure may obtain the user features of all users from the quasi-training sample set, or may obtain the user features of only some users from the quasi-training sample set. In addition, the number of user features obtained from the quasi-training sample set may be substantially the same as the number of user features obtained from the training sample set.
S202, determining the feature distribution difference information of the user features of the users from the training sample set and the user features of the users from the quasi-training sample set.
The feature distribution difference information in the present disclosure may characterize the feature distribution difference situation of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set. The present disclosure may obtain feature distribution difference information of user features of a plurality of users from a training sample set and user features of a plurality of users from a quasi-training sample set using a preset model.
S203, if the feature distribution difference information satisfies the preset difference condition, performing prediction model training by using the user features in the quasi-training sample set.
The preset difference condition in the present disclosure is used to measure whether the difference in the feature distribution between the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set is large enough. The preset difference condition can be set according to actual requirements. The present disclosure does not limit the predictive model training process.
S204, updating the prediction model currently used by the system by using the prediction model obtained by training.
The present disclosure uses the successfully trained prediction model as the prediction model currently used by the system and, at the same time, stores the quasi-training sample set as the training sample set; that is, the training sample set is replaced by the quasi-training sample set. In addition, if the feature distribution difference information does not satisfy the preset difference condition, no prediction model training is performed with the user features in the quasi-training sample set; that is, neither the prediction model currently used by the system nor the training sample set is updated.
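Steps S202 to S204 can be summarized by the following sketch. It is not the disclosed implementation: `maybe_update` is a hypothetical name, the difference measure and the training routine are passed in as stand-in callables, and the preset difference condition is assumed to be "difference reaches a threshold", with 0.2 taken from the example threshold value mentioned later in this description.

```python
def maybe_update(current_model, training_set, quasi_set,
                 measure_difference, train_model, threshold=0.2):
    """Return (model_in_use, stored_training_set, was_updated).

    Hypothetical helper: measure_difference and train_model are supplied by the caller.
    """
    difference = measure_difference(training_set, quasi_set)  # S202
    if difference >= threshold:                               # preset difference condition
        new_model = train_model(quasi_set)                    # S203
        return new_model, quasi_set, True                     # S204: swap model and sample set
    return current_model, training_set, False                 # keep both unchanged


# Usage with stand-in callables (illustration only).
model, stored, updated = maybe_update(
    "model_v1", ["old samples"], ["new samples"],
    measure_difference=lambda old, new: 0.35,
    train_model=lambda samples: "model_v2",
)
```

Passing the two operations in as callables keeps the decision logic independent of how the difference is measured and of which prediction model is trained.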
The feature distribution of the user features of the users in the system typically changes over time. The feature distribution of the user feature samples used for training a prediction model generally has certain characteristics. If the difference between the feature distribution of the user features of the users currently to be predicted and the feature distribution of the user feature samples used for training the prediction model is small, the prediction accuracy of the prediction model currently used by the system can be ensured; if this difference is large, the prediction accuracy cannot be ensured. The present disclosure performs prediction model training with the user feature samples in the quasi-training sample set when it determines that the feature distribution of the user features in the quasi-training sample set and the feature distribution of the user features in the training sample set satisfy the preset difference condition, which helps to find a suitable time for updating the model. Because changes in the feature distribution of the user features in the system often follow a trend, good prediction accuracy can be obtained whether the updated prediction model is used for online prediction or for offline prediction. Therefore, the technical solution provided by the present disclosure helps to avoid the waste of resources caused by unnecessary model training and helps the prediction model currently used by the system to maintain good prediction accuracy.
In one optional example, all users in the quasi-training sample set in the present disclosure may be all active users within a second time range closest to a predetermined historical time (e.g., zero o'clock yesterday). That is, the present disclosure may first determine all active users (e.g., monthly active users) within the second time range closest to the predetermined historical time, and then obtain the user features of all these active users according to the corresponding service data (e.g., the service data from day T-(N+31) to day T-(N+1)), so as to form the user features in the quasi-training sample set. An active user in the present disclosure may refer to a user who has performed a corresponding operation (e.g., browsing a detail page) within the second time range closest to the predetermined historical time. All active users within the second time range closest to the predetermined historical time may be quarterly active users, monthly active users, or weekly active users, etc., each determined with zero o'clock yesterday as the predetermined historical time.
Optionally, the present disclosure may form the quasi-training sample set at scheduled times. For example, the present disclosure may form a quasi-training sample set at scheduled times according to the service data of the current monthly active users. In a more specific example, assuming that T is the current day and the predetermined historical time is zero o'clock of day T-1, the present disclosure may, at a predetermined time each day (e.g., zero o'clock), perform processing operations such as extraction, calculation, conversion, or mapping on the corresponding content in the service data from day T-31 to day T-2, taking each active user as a unit, so as to obtain the values of at least some feature elements (e.g., information on pages browsed by the user, the dwell time on pages browsed by the user, user search activity, etc.) in the user features of each active user. Some of the feature elements in the user feature of the present disclosure (e.g., the user's terminal device information, the download channel of a predetermined application on the terminal device, etc.) may be formed using earlier service data.
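The per-user extraction and accumulation described above might look like the following sketch. Everything about the data layout is an assumption made for illustration: the log record fields (`user_id`, `day`, `event`, `dwell_seconds`), the three feature elements, and the function name `build_user_features` are hypothetical and are not specified by the present disclosure.

```python
from collections import defaultdict
from datetime import date


def build_user_features(log_records, window_start, window_end):
    """Aggregate per-user feature elements from records dated within [window_start, window_end].

    Hypothetical helper; users with at least one record in the window are treated as active.
    """
    features = defaultdict(lambda: {"pages_viewed": 0, "dwell_seconds": 0.0, "searches": 0})
    for record in log_records:
        if not (window_start <= record["day"] <= window_end):
            continue  # outside the feature window
        user = features[record["user_id"]]
        if record["event"] == "page_view":
            user["pages_viewed"] += 1
            user["dwell_seconds"] += record.get("dwell_seconds", 0.0)
        elif record["event"] == "search":
            user["searches"] += 1
    return dict(features)


# Example with T = 2020-05-18: the window runs from day T-31 to day T-2.
logs = [
    {"user_id": "u1", "day": date(2020, 5, 1), "event": "page_view", "dwell_seconds": 12.5},
    {"user_id": "u1", "day": date(2020, 5, 2), "event": "search"},
    {"user_id": "u2", "day": date(2020, 3, 1), "event": "page_view", "dwell_seconds": 30.0},
]
features = build_user_features(logs, date(2020, 4, 17), date(2020, 5, 16))
```

In this example only user `u1` has records inside the window, so `u2` does not appear in the resulting quasi-training features.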
Because the quasi-training sample set is formed at scheduled times, the difference between the feature distribution of the user features of a plurality of users (e.g., all or some users) in the training sample set and the feature distribution of the user features of a plurality of users (e.g., all or some users) in the quasi-training sample set can be checked periodically. Forming the quasi-training sample set and judging the feature distribution difference consume few resources, so the present disclosure helps to find a suitable time for updating the prediction model currently used by the system at low resource cost, which in turn helps the prediction model currently used by the system to maintain good prediction accuracy.
In one optional example, the feature distribution difference information in the present disclosure may include the degree of difference of the feature distribution, which characterizes the size of the feature distribution difference. The feature distribution difference information in the present disclosure may also include ranking information of the contribution of each feature element in the user feature to the feature distribution difference. The ranking information may be regarded as the result of comparing the feature distribution differences of the individual feature elements in the user feature. By obtaining the degree of difference of the feature distribution and/or the ranking information of the contribution of each feature element to the feature distribution difference, the present disclosure makes it possible to set the strategy for updating the prediction model currently used by the system flexibly (for example, the preset difference condition can be set flexibly) and to understand the specific state of the user features in the training sample set and the quasi-training sample set.
In one optional example, the present disclosure may utilize a version recognition model to determine feature distribution difference information for user features of a plurality of users in a training sample set and user features of a plurality of users in a quasi-training sample set. A specific example is shown in fig. 3.
In fig. 3, S300, first version labeling information is respectively set for the user features of each user in the training sample set.
Optionally, the first version labeling information in the present disclosure may be referred to as old version labeling information, and it indicates that the user feature was formed at an earlier time. In one example, the first version labeling information may be represented by 0.
S301, setting second version marking information for the user characteristics of each user in the quasi-training sample set respectively.
Optionally, the second version labeling information in the present disclosure may be referred to as new version labeling information, and it indicates that the user feature was formed at a later time. That is, the user features in the quasi-training sample set were formed later than the user features in the training sample set; in other words, the service data used to form the training sample set is older than the service data used to form the quasi-training sample set. In one example, the second version labeling information may be represented by 1.
S302, the user features in the training sample set and the user features in the quasi-training sample set are respectively used as input to be provided to a version recognition model, and version recognition processing is respectively carried out on the input user features through the version recognition model.
Optionally, the version identification model in the present disclosure may be a classification model, for example, a model using a tree algorithm, or a neural network, etc. More specifically, the version identification model may be an XGBoost (eXtreme Gradient Boosting) model, a LightGBM (Light Gradient Boosting Machine) model, a CatBoost (Categorical Boosting) model, or a linear model (such as a logistic regression model). The present disclosure does not limit the specific form of the version identification model.
Optionally, the present disclosure may select user characteristics of a part of users from the training sample set, and select user characteristics of a part of users from the quasi-training sample set. In addition, the number of user features selected from the training sample set by the present disclosure is generally substantially the same as the number of user features selected from the quasi-training sample set. Of course, the present disclosure does not exclude the case where all the user features in the training sample set and all the user features in the quasi-training sample set are provided to the version recognition model as inputs, respectively.
S303, determining the user characteristics of a plurality of users from the training sample set and the characteristic distribution difference information of the user characteristics of the plurality of users from the quasi-training sample set according to the version recognition processing result output by the version recognition model, the input first version marking information of the user characteristics and the input second version marking information of the user characteristics.
Optionally, the version identification model in the present disclosure may output a probability value for each input user feature. In one example, for any user feature input, if the probability value output by the version identification model for the user feature reaches a predetermined value (e.g., 0.5), the user feature is considered to be from the training sample set, and if the probability value output by the version identification model for the user feature does not reach the predetermined value, the user feature is considered to be from the quasi-training sample set. In another example, for any user feature input, the user feature is considered to be from the quasi-training sample set if the probability value output by the version identification model for the user feature reaches a predetermined value, and the user feature is considered to be from the training sample set if the probability value output by the version identification model for the user feature does not reach the predetermined value.
Optionally, the version identification model in the present disclosure may also output two probability values for each input user feature, where for any input user feature, one of the probability values output by the version identification model represents a probability that the user feature is a user feature from the training sample set, and the other of the probability values output by the version identification model represents a probability that the user feature is a user feature from the quasi-training sample set. That is to say, the version identification model in the present disclosure may output, for an input user feature, a probability that the version label information of the user feature is the first version label information and a probability that the version label information of the user feature is the second version label information. The version marking information corresponding to the higher probability value of the two probability values can be used as the version marking information of the user characteristics identified by the version identification model.
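The two output conventions described above can be sketched as follows. The function names are hypothetical. In the single-probability case the sketch assumes that the probability refers to the quasi-training sample set (the second of the two examples given above), and in the two-probability case it assumes that ties are resolved in favor of the first version, a choice the present disclosure does not specify.

```python
OLD_VERSION = 0  # first version labeling information (training sample set)
NEW_VERSION = 1  # second version labeling information (quasi-training sample set)


def version_from_single_probability(p_new, predetermined_value=0.5):
    """One output per user feature: p_new is the probability of coming from the quasi-training set."""
    return NEW_VERSION if p_new >= predetermined_value else OLD_VERSION


def version_from_two_probabilities(p_old, p_new):
    """Two outputs per user feature: pick the version with the higher probability.

    Ties go to OLD_VERSION here, which is an assumption made for this sketch.
    """
    return NEW_VERSION if p_new > p_old else OLD_VERSION
```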
Optionally, for any input user feature, the present disclosure may determine whether the version identification processing result output by the version identification model is correct or wrong according to the first version label information/the second version label information of the user feature. The present disclosure may obtain the feature distribution difference information of the user features from the plurality of users in the training sample set and the user features from the plurality of users in the quasi-training sample set by calculating all correct recognition processing results and all erroneous recognition processing results.
For example, the present disclosure may calculate the Matthews correlation coefficient of the user features in the training sample set and the user features in the quasi-training sample set according to the version labeling information of each user feature output by the version identification model and the first version labeling information/second version labeling information of each user feature. The present disclosure may use the calculated Matthews correlation coefficient directly as the degree of difference of the feature distribution, or may process the calculated Matthews correlation coefficient accordingly (e.g., by mapping) and use the processing result as the degree of difference of the feature distribution.
Optionally, the present disclosure may take one of the two groups (the user features from the training sample set and the user features from the quasi-training sample set) as positive samples and the other group as negative samples. For example, if the first version labeling information is 1 and the second version labeling information is 0, the present disclosure may take the user features with the first version labeling information as positive samples and the user features with the second version labeling information as negative samples. For another example, if the first version labeling information is 0 and the second version labeling information is 1, the present disclosure may take the user features with the second version labeling information as positive samples and the user features with the first version labeling information as negative samples. Under the above assumptions, the present disclosure may calculate the MCC (Matthews Correlation Coefficient) using the following equation (1):

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (1)$$

In the above equation (1), TP (True Positive) represents the number of positive samples that the version identification model identifies as positive samples; TN (True Negative) represents the number of negative samples that the version identification model identifies as negative samples; FP (False Positive) represents the number of negative samples that the version identification model identifies as positive samples; FN (False Negative) represents the number of positive samples that the version identification model identifies as negative samples.
Optionally, the value range of the Matthews correlation coefficient in the present disclosure may be [-1, 1]. The closer the Matthews correlation coefficient is to 1, the higher the recognition accuracy of the version recognition model, that is, the larger the difference between the feature distribution of the user features of the plurality of users from the training sample set and the feature distribution of the user features of the plurality of users from the quasi-training sample set.
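The MCC calculation can be illustrated with a short worked example. The function names are hypothetical, version label 1 is assumed to mark the positive samples, and returning 0.0 when the denominator is zero is a convention chosen for this sketch rather than something stated in the present disclosure.

```python
import math


def confusion_counts(true_versions, predicted_versions, positive=1):
    """Count (TP, TN, FP, FN) by comparing true version labels with recognized ones."""
    tp = tn = fp = fn = 0
    for truth, predicted in zip(true_versions, predicted_versions):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention for a degenerate confusion matrix
    return (tp * tn - fp * fn) / denominator


true_versions = [1, 1, 1, 1, 0, 0, 0, 0]
predicted_versions = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(true_versions, predicted_versions)
score = matthews_corrcoef(*counts)
```

Here TP = 3, TN = 3, FP = 1, and FN = 1, so the MCC is (9 - 1) / sqrt(4 × 4 × 4 × 4) = 0.5.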
Optionally, after the version recognition model in the present disclosure performs version recognition processing on all the input user features, it may further output ranking information of the contribution of each feature element in the user features to the feature distribution difference.
By setting the first version labeling information for the user features in the training sample set, setting the second version labeling information for the user features in the quasi-training sample set, and using the version recognition model to perform version recognition processing on the input user features, the present disclosure makes it possible to obtain the feature distribution difference information quickly and conveniently from the version recognition processing results. In particular, calculating the Matthews correlation coefficient helps to measure, conveniently and accurately, the feature distribution difference between the user features of the users in the training sample set and the user features of the users in the quasi-training sample set.
In an optional example, each time the feature distribution difference information is determined, the present disclosure should first train the version recognition model, and then use the trained version recognition model to determine the feature distribution difference information of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set. An example of determining the feature distribution difference information, including the training process of the version recognition model, is shown in FIG. 4.
In fig. 4, S400 sets first version labeling information for the user characteristics of each user in the training sample set.
S401, second version marking information is set for the user characteristics of each user in the quasi-training sample set.
S402, selecting user characteristics of part of users from the training sample set, and taking the selected user characteristics as a first training sample.
Optionally, the present disclosure may randomly select the user features of some users from the training sample set. The selected user features usually account for only a small proportion (e.g., 10%) of the user features of all users in the training sample set.
Optionally, the present disclosure may divide all user features in the training sample set into two parts, where one part is used for training the version recognition model, and the other part is used for obtaining the feature distribution difference information.
S403, selecting user features of some users from the quasi-training sample set, and taking the selected user features as a second training sample.
Optionally, the present disclosure may randomly select the user features of some users from the quasi-training sample set. The selected user features usually account for only a small proportion (e.g., 10%) of the user features of all users in the quasi-training sample set. In addition, the number of user features selected from the training sample set is generally substantially the same as the number of user features selected from the quasi-training sample set.
Optionally, the present disclosure may divide all user features in the quasi-training sample set into two parts, where one part is used for training the version recognition model, and the other part is used for obtaining the feature distribution difference information.
S404, training the version recognition model by using the first training sample and the second training sample.
Optionally, the present disclosure may select a number of user features from the first training sample and the second training sample according to a preset batch size, and provide the currently selected user features to the version identification model as inputs. The present disclosure may then calculate a loss with a corresponding loss function according to the version recognition processing results output by the version recognition model and the first version labeling information/second version labeling information of the input user features, and use the loss to adjust the parameters of the version recognition model (such as the weight matrices of a neural network or the structural parameters of a binary tree), thereby completing one round of training of the version recognition model. After a predetermined number of training rounds (e.g., 10 rounds), the present disclosure obtains the trained version recognition model. The present disclosure places no requirement on the recognition accuracy of the trained version recognition model.
S405, the user characteristics of the other part of users in the training sample set and the user characteristics of the other part of users in the quasi-training sample set are respectively used as input and provided for the trained version recognition model, and the trained version recognition model is used for respectively carrying out version recognition processing on the input user characteristics.
S406, determining the user characteristics from the multiple users in the training sample set and the characteristic distribution difference information of the user characteristics from the multiple users in the quasi-training sample set according to the version recognition processing result output by the version recognition model, the input first version marking information of the user characteristics and the input second version marking information of the user characteristics.
In the process of determining the feature distribution difference information each time, the version recognition model is trained by using the user features in the training sample set and the quasi-training sample set, so that the version recognition capability of the version recognition model is adaptive to the user features in the training sample set and the quasi-training sample set, and the feature distribution difference information can be accurately obtained.
In an optional example, in a case where the feature distribution difference information includes a Matthews correlation coefficient, the preset difference condition in the present disclosure may be whether the Matthews correlation coefficient reaches a predetermined threshold: if the Matthews correlation coefficient reaches the predetermined threshold, the feature distribution difference information is considered to satisfy the preset difference condition; if the Matthews correlation coefficient does not reach the predetermined threshold, the feature distribution difference information is considered not to satisfy the preset difference condition. The predetermined threshold in the present disclosure may be set according to actual conditions; for example, the predetermined threshold may be 0.2.
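The threshold check above can be illustrated with the standard definition of the Matthews correlation coefficient computed from a binary confusion matrix. The sketch below assumes that version labels are encoded as 0 (training sample set) and 1 (quasi-training sample set) and that "reaches" means greater than or equal to; the function names are hypothetical.

```python
import math


def matthews_corrcoef(true_labels, predicted_labels):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp = tn = fp = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is taken as 0 when the denominator is 0.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def satisfies_difference_condition(mcc, threshold=0.2):
    """Assumed reading of 'reaches the predetermined threshold'."""
    return mcc >= threshold
```

A coefficient near 0 means the version recognition model cannot tell the two sample sets apart, i.e. the feature distributions are similar; a coefficient near 1 means the two sets are easily separated.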
Optionally, if the Matthews correlation coefficient reaches the predetermined threshold, the present disclosure determines that the training process of the prediction model should be performed using the quasi-training sample set. One example of training a prediction model in the present disclosure may be as shown in fig. 5.
In fig. 5, S500, the user features in the quasi-training sample set are provided as input to the prediction model to be trained.
Optionally, when input is provided to the prediction model to be trained for the first time, the present disclosure may use the initial prediction model (i.e., a prediction model initialized in advance) as the prediction model to be trained. The present disclosure may randomly select a certain number of user features from the quasi-training sample set according to a preset batch size, and provide the selected user features as inputs to the prediction model to be trained.
S501, input user characteristics are subjected to prediction processing through a prediction model to be trained.
Optionally, the prediction model to be trained predicts, for the input user feature of each user, the probability that the user performs the target behavior within a first time range after a predetermined time point. For example, assuming that T is the current day, the predetermined time point is zero o'clock of day T-8, and the feature elements related to user behavior in the user features of the quasi-training sample set are formed from the business data of days T-37 to T-8, the prediction model to be trained may predict, for each input user feature, the probability that the user performs the target behavior in the range from day T-7 to day T-1.
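The day offsets in this example can be made concrete with a short sketch. The function name `sample_windows` and the representation of each window as an inclusive pair of calendar dates are assumptions for illustration only.

```python
from datetime import date, timedelta


def sample_windows(today):
    """Return (feature_window, label_window) as inclusive date pairs.

    feature_window: days T-37 .. T-8, used to build behavior-related features.
    label_window:   days T-7 .. T-1, the first time range being predicted.
    """
    feature_window = (today - timedelta(days=37), today - timedelta(days=8))
    label_window = (today - timedelta(days=7), today - timedelta(days=1))
    return feature_window, label_window


if __name__ == "__main__":
    features, labels = sample_windows(date(2020, 5, 18))
    print("feature window:", features[0], "to", features[1])
    print("label window:  ", labels[0], "to", labels[1])
```

With these offsets the feature window spans 30 days and the label window spans 7 days, and the two windows do not overlap.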
S502, adjusting the network parameters of the prediction model to be trained according to the prediction processing results output by the prediction model to be trained and the target behavior occurrence labeling information of the input user features.
Optionally, the target behavior occurrence labeling information of each user feature in the quasi-training sample set in the present disclosure is formed from the business data within the first time range after the predetermined time point. Continuing the previous example, the present disclosure may determine whether each user in the quasi-training sample set performed the target behavior by using the business data from day T-7 to day T-1, set the target behavior occurrence labeling information of the user features of users who performed the target behavior to "1", and set the target behavior occurrence labeling information of the user features of users who did not perform the target behavior to "0".
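A minimal sketch of this labeling step follows. It assumes that the business data has already been reduced to `(user_id, event_date)` pairs for target behaviors; that representation and the function name `label_users` are hypothetical.

```python
from datetime import date


def label_users(user_ids, behavior_events, window_start, window_end):
    """Label each user 1 if a target behavior falls in the window, else 0.

    behavior_events: iterable of (user_id, event_date) pairs taken from the
    business data (an assumed, simplified representation).
    """
    positive = {
        uid for uid, day in behavior_events
        if window_start <= day <= window_end
    }
    return {uid: 1 if uid in positive else 0 for uid in user_ids}


if __name__ == "__main__":
    events = [("u1", date(2020, 5, 12)), ("u2", date(2020, 5, 1))]
    print(label_users(["u1", "u2", "u3"], events,
                      date(2020, 5, 11), date(2020, 5, 17)))
```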
Optionally, the present disclosure may perform a loss calculation with a corresponding loss function according to the prediction processing results output by the prediction model to be trained and the target behavior occurrence labeling information of the input user features, and adjust the network parameters of the prediction model to be trained (such as the weight matrices of a neural network or the structural parameters of a tree) using the calculated loss, thereby completing one round of training of the prediction model.
Optionally, when the training of the prediction model to be trained reaches a predetermined iteration condition, the training process ends. The predetermined iteration condition in the present disclosure may include: the accuracy of the prediction results output by the prediction model to be trained for the user features in a test sample set meets a predetermined requirement. When that accuracy meets the predetermined requirement, the prediction model to be trained has been trained successfully this time. The predetermined iteration condition in the present disclosure may further include: the number of user features used to train the prediction model to be trained has reached a predetermined number, and the like. When the number of user features used has reached the predetermined number but the accuracy of the prediction results output by the prediction model to be trained for the user features in the test sample set still does not meet the predetermined requirement, the prediction model to be trained has not been trained successfully this time. A successfully trained prediction model may be used to predict, online or offline, the probability that a user's target behavior occurs within the next N days. The present disclosure may draw a portion of the user features from the quasi-training sample set to form the test sample set.
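The two iteration conditions and the hold-out of a test sample set can be sketched as follows. The accuracy requirement of 0.9, the sample budget of 100000, the 20% test fraction, and both function names are placeholder assumptions, not values taken from the disclosure.

```python
import random


def split_test_set(quasi_training_set, test_fraction=0.2, seed=0):
    """Draw part of the quasi-training sample set as a test sample set."""
    rng = random.Random(seed)
    shuffled = list(quasi_training_set)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train part, test part)


def training_outcome(test_accuracy, samples_used,
                     required_accuracy=0.9, max_samples=100_000):
    """Evaluate the predetermined iteration conditions after a round."""
    if test_accuracy >= required_accuracy:
        return "success"      # accuracy requirement met: stop, model usable
    if samples_used >= max_samples:
        return "failed"       # budget exhausted without reaching accuracy
    return "continue"         # neither condition reached: keep training
```

Checking accuracy before the sample budget means a model that reaches the required accuracy on its final permitted round still counts as successfully trained.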
Compared with the user characteristics in the training sample set, the user characteristics in the quasi-training sample set are formed based on the updated business data, and the characteristic distribution change of the user characteristics generally has a certain trend.
In one optional example, the prediction model of the present disclosure may perform online prediction. For example, the present disclosure may detect in real time whether each user in the system performs a behavior operation; if a user is detected to have performed a behavior operation, the current user features of that user may be formed in real time and provided to the prediction model currently used by the system, which performs online prediction processing, so that the present disclosure may obtain, from the prediction processing result of the prediction model, the probability that the current user performs the target behavior within a first time range in the future (e.g., the next 7 days).
By using the prediction model for online prediction, the present disclosure can learn in time how likely a user is to perform the target behavior within a future period (e.g., within the next N days), which effectively avoids prediction delay, helps provide better service to the user in time, and thereby helps increase the likelihood that users in the system produce the target behavior.
In one optional example, the present disclosure may update the current value of the system with the probability that the current user performs the target behavior within the first time range in the future. In one example, the system of the present disclosure maintains a system current value. After the system current value is initialized to a preset initial value, each time the prediction model is used to predict the probability that a user in the system, after performing a behavior, performs the target behavior within the first time range in the future, the sum of the system current value and the predicted probability value may be calculated, and the system current value may be updated with that sum. Where there are multiple target behaviors, the present disclosure should maintain a system current value for each target behavior. For example, where the target behaviors include a business-opportunity behavior and a deal behavior, the system current values in the present disclosure may include: a business-opportunity-based system current value (which may also be referred to as the system business-opportunity value) and a deal-based system current value (which may also be referred to as the system deal value).
It should be noted that, in the process of predicting the probability of a user in the system performing a target behavior in a first time range in the future after performing a behavior and calculating the sum of the current value of the system and the probability value obtained by prediction, if a probability value (hereinafter referred to as a first probability value) predicted based on the user performing another behavior at an earlier time has accumulated in the current value of the system, the first probability value is removed from the current value of the system.
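The accumulation rule, including the removal of a probability value previously accumulated for the same user, can be sketched as below. The class name `SystemValue` and the per-user dictionary used to remember the last accumulated probability are implementation assumptions.

```python
class SystemValue:
    """Running sum of predicted probabilities, one contribution per user."""

    def __init__(self, initial_value=0.0):
        self.current = initial_value
        self._by_user = {}   # last probability accumulated for each user

    def update(self, user_id, probability):
        """Accumulate a new prediction, replacing the user's earlier one."""
        previous = self._by_user.get(user_id)
        if previous is not None:
            # Remove the first probability value predicted from an earlier
            # behavior of the same user before adding the new one.
            self.current -= previous
        self.current += probability
        self._by_user[user_id] = probability
        return self.current


if __name__ == "__main__":
    value = SystemValue()
    value.update("u1", 0.25)
    value.update("u2", 0.5)
    print(value.update("u1", 0.75))
```

Where several target behaviors are tracked, one such object per target behavior could be kept, for example in a dictionary keyed by behavior name.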
The present disclosure can use the obtained current value of the system (e.g., the current value of the system obtained every day) as a parameter for fully understanding the overall situation of all users in the system, thereby facilitating corresponding improvements to the system in order to increase the occurrence number of target behaviors in the system.
In an optional example, the present disclosure may determine the AUC (Area Under Curve) of the prediction model according to the prediction processing results of the prediction model for a plurality of current users, where the curve is the receiver operating characteristic (ROC) curve. The specific process of calculating the AUC of the prediction model is not described in detail herein. By calculating the AUC of the prediction model, the present disclosure can better understand the training effect of the prediction model.
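Since the calculation is not detailed here, the following is only one standard formulation, not necessarily the one used by the disclosure: the AUC equals the fraction of (positive, negative) user pairs in which the positive user receives the higher predicted probability, with ties counted as one half. The function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 if the user actually performed the target behavior, else 0.
    scores: predicted probabilities for the same users.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

The double loop is quadratic in the number of users; for large user populations a rank-based computation would normally be used instead.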
In one optional example, the present disclosure may determine the prediction deviation of the prediction model currently used by the system according to the prediction processing results of the prediction model for a plurality of current users and a posterior value of the number of users performing the target behavior. The posterior value of the number of users performing the target behavior is usually obtained by counting the business data actually generated afterwards. The present disclosure may obtain the actual value of the system from the posterior value of the number of users performing the target behavior and compare the actual value of the system with the current value of the system, thereby learning whether the current value of the system is overestimated or underestimated. In addition, according to the prediction processing results of the prediction model for a plurality of current users and the posterior value of the number of users performing the target behavior, the present disclosure may determine the number of users the currently used prediction model predicted correctly, the number of users it predicted incorrectly, the proportion of correctly predicted users among all users, the proportion of incorrectly predicted users among all users, and the like, which helps to understand the prediction accuracy of the prediction model more precisely.
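These monitoring quantities can be sketched as follows, under two assumptions that the text does not state: the actual value of the system is taken to be simply the posterior count of users who performed the target behavior, and a probability of at least 0.5 is treated as a positive prediction. Both function names are hypothetical.

```python
def value_deviation(system_current_value, actual_value):
    """Compare the accumulated system value with the posterior actual value."""
    diff = system_current_value - actual_value
    if diff > 0:
        return "overestimated", diff
    if diff < 0:
        return "underestimated", -diff
    return "matched", 0.0


def prediction_counts(probabilities, outcomes, cutoff=0.5):
    """Count correct and incorrect per-user predictions and their ratios.

    outcomes: 1 if the user actually performed the target behavior, else 0.
    cutoff:   assumed decision threshold on the predicted probability.
    """
    total = len(outcomes)
    correct = sum(
        1 for p, y in zip(probabilities, outcomes)
        if (p >= cutoff) == (y == 1)
    )
    wrong = total - correct
    return {
        "correct": correct,
        "wrong": wrong,
        "correct_ratio": correct / total,
        "wrong_ratio": wrong / total,
    }
```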
Exemplary devices
FIG. 6 is a schematic structural diagram illustrating an embodiment of an apparatus for updating a predictive model according to the present disclosure. The apparatus of this embodiment may be used to implement the method embodiments of the present disclosure described above.
As shown in fig. 6, the apparatus of the present embodiment may include: a first obtaining module 600, a second obtaining module 601, a distribution difference determining module 602, a predictive model training module 603, and a model updating module 604. Optionally, the apparatus of this embodiment may further include: at least one of a form set module 605, a third acquisition module 606, an online prediction module 607, an update system value module 608, a first monitoring module 609, a second monitoring module 610, and a third monitoring module 611.
The first obtaining module 600 is configured to obtain user features of a plurality of users from a training sample set of a prediction model currently used by the system. The predictive model is used to predict a probability that a user in the system will perform a target action within a first time range after a predetermined point in time.
The second obtaining module 601 is configured to obtain user characteristics of a plurality of users from a quasi-training sample set.
The distribution difference determining module 602 is configured to determine feature distribution difference information between the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set.
Optionally, the feature distribution difference information in the present disclosure may include at least one of: the degree of difference in the feature distribution, and ranking information of the contribution of each feature element in the user features to the difference in the feature distribution.
Optionally, the distribution difference determining module 602 in the present disclosure may include: a first sub-module 6021, a second sub-module 6022, a third sub-module 6023, and a fourth sub-module 6024. The first sub-module 6021 is configured to set first version labeling information for the user features of each user in the training sample set. The second sub-module 6022 is configured to set second version labeling information for the user features of each user in the quasi-training sample set. The third sub-module 6023 is configured to provide the user features in the training sample set and the user features in the quasi-training sample set as inputs to the version recognition model, and to perform version recognition processing on the input user features through the version recognition model. For example, the third sub-module 6023 may include: a first unit 60231, a second unit 60232, a third unit 60233, and a fourth unit 60234. The first unit 60231 is configured to take the user features of some users in the training sample set as the first training sample. The second unit 60232 is configured to take the user features of some users in the quasi-training sample set as the second training sample. The third unit 60233 is configured to train the version recognition model using the first training sample and the second training sample. The fourth unit 60234 is configured to provide the user features of the other users in the training sample set and the user features of the other users in the quasi-training sample set as inputs to the trained version recognition model obtained by the third unit 60233, and to perform version recognition processing on the input user features through that trained version recognition model.
The fourth sub-module 6024 is configured to determine, according to the version recognition processing results output by the version recognition model, the first version labeling information, and the second version labeling information, the feature distribution difference information between the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set. For example, the fourth sub-module 6024 may calculate the Matthews correlation coefficient between the user features in the training sample set and the user features in the quasi-training sample set according to the version information of each user feature output by the version recognition model and the first version labeling information or second version labeling information of each input user feature.
The prediction model training module 603 is configured to perform prediction model training using the user features in the quasi-training sample set if the feature distribution difference information satisfies a preset difference condition. If the feature distribution difference information obtained by the distribution difference determining module 602 does not satisfy the preset difference condition, the prediction model training module 603 may not perform the training operation of the prediction model. For example, if the Matthews correlation coefficient calculated by the fourth sub-module 6024 reaches the predetermined threshold, the prediction model training module 603 provides the user features in the quasi-training sample set as input to the prediction model to be trained, so that the prediction model to be trained performs prediction processing on the input user features, and the prediction model training module 603 may adjust the network parameters of the prediction model to be trained according to the prediction processing results output by the prediction model to be trained and the target behavior occurrence labeling information of the input user features. If the Matthews correlation coefficient calculated by the fourth sub-module 6024 does not reach the predetermined threshold, the prediction model training module 603 may not perform the training operation of the prediction model.
The model updating module 604 is used for updating the prediction model currently used by the system with the prediction model obtained by training of the prediction model training module 603. In addition, the model update module 604 should also update the training sample set with the quasi-training sample set when updating the prediction model currently used by the system.
The form set module 605 is configured to periodically obtain, according to the business data, the user features of all active users in a second time range closest to a predetermined historical time, to form the quasi-training sample set.
The third obtaining module 606 is configured to obtain the current user features of a user in the system if it is detected in real time that the user has performed a behavior operation.
The online prediction module 607 is configured to provide the current user characteristic obtained by the third obtaining module 606 to a prediction model currently used by the system, perform online prediction processing through the prediction model, and obtain a probability that the current user performs the target behavior in a first time range in the future according to a prediction processing result of the prediction model.
The update system value module 608 is configured to update the current value of the system based on a probability that the current user performed the target behavior within a first time frame in the future.
The first monitoring module 609 is configured to determine the area under the receiver operating characteristic curve (AUC) of the prediction model according to the prediction processing results of the prediction model for a plurality of current users.
The second monitoring module 610 is configured to determine a deviation of a current value of the system according to a prediction processing result of the prediction model for a plurality of current users and a posterior value of the number of users performing the target behavior.
The third monitoring module 611 is configured to determine a prediction deviation of the prediction model currently used by the system according to the prediction processing results of the prediction model for a plurality of current users and the posterior value of the number of users executing the target behavior.
The operations specifically executed by the modules and the sub-modules and units included in the modules may be referred to in the description of the method embodiments with reference to fig. 2 to 5, and are not described in detail here.
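As a rough sketch of how the distribution difference determining module 602, the prediction model training module 603, and the model updating module 604 might cooperate, the function below updates both the deployed model and the training sample set only when the difference measure reaches the threshold. The `state` dictionary, the callables `difference_of` and `train`, and the function name `maybe_update` are hypothetical stand-ins; handling of an unsuccessful training run is omitted for brevity.

```python
def maybe_update(state, quasi_training_set, difference_of, train,
                 threshold=0.2):
    """Retrain and swap in a new model only if the distributions differ.

    state:         dict with keys 'model' and 'training_set' (assumed layout).
    difference_of: callable(training_set, quasi_training_set) -> float,
                   e.g. a Matthews correlation coefficient.
    train:         callable(quasi_training_set) -> trained model.
    Returns True if the model and training sample set were updated.
    """
    difference = difference_of(state["training_set"], quasi_training_set)
    if difference < threshold:
        return False                      # condition not met: no training
    state["model"] = train(quasi_training_set)
    # The training sample set is replaced together with the model, so that
    # the next comparison is made against the data the new model was fit on.
    state["training_set"] = quasi_training_set
    return True
```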
Exemplary electronic device
An electronic device according to an embodiment of the present disclosure is described below with reference to fig. 7. FIG. 7 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure. As shown in fig. 7, the electronic device 71 includes one or more processors 711 and memory 712.
The processor 711 may be a Central Processing Unit (CPU) or other form of processing unit having the capability to update a predictive model and/or instruction execution capability, and may control other components in the electronic device 71 to perform desired functions.
Memory 712 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 711 to implement the methods for updating a predictive model of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device 71 may further include: input devices 713 and output devices 714, among other components, interconnected by a bus system and/or other form of connection mechanism (not shown). The input device 713 may also include, for example, a keyboard, a mouse, and the like. The output device 714 can output various information to the outside. The output devices 714 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.
Of course, for simplicity, only some of the components of the electronic device 71 relevant to the present disclosure are shown in fig. 7, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 71 may include any other suitable components, depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the method for updating a predictive model according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
The program code of the computer program product for carrying out operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a method for updating a predictive model according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium may include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The block diagrams of devices, apparatuses, systems referred to in this disclosure are only given as illustrative examples and are not intended to require or imply that the connections, arrangements, configurations, etc. must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects, and the like, will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.
Claims (10)
1. A method for updating a predictive model, comprising:
acquiring user characteristics of a plurality of users from a training sample set of a prediction model currently used by a system;
obtaining user characteristics of a plurality of users from a quasi-training sample set;
determining feature distribution difference information of user features of a plurality of users from the training sample set and user features of a plurality of users from the quasi-training sample set;
if the feature distribution difference information meets a preset difference condition, performing prediction model training by using the user features in the quasi-training sample set;
updating a prediction model currently used by the system by using the prediction model obtained by training;
wherein the predictive model is used to predict a probability that a user in the system will perform a target behavior within a first time range after a predetermined point in time.
2. The method of claim 1, wherein prior to obtaining user features for a plurality of users from a set of quasi-training samples, the method further comprises:
periodically acquiring, according to service data, the user features of all active users in a second time range closest to a predetermined historical time, to form the quasi-training sample set.
3. The method of claim 1 or 2, wherein the feature distribution difference information comprises:
degree of difference in feature distribution; and/or
ranking information of the contribution of each feature element in the user feature to the feature distribution difference.
4. The method of any of claims 1 to 3, wherein the determining feature distribution difference information for user features from a plurality of users in the training sample set and user features from a plurality of users in the quasi-training sample set comprises:
respectively setting first version marking information for the user characteristics of each user in the training sample set;
setting second version marking information for the user characteristics of each user in the quasi-training sample set respectively;
respectively taking the user features in the training sample set and the user features in the quasi-training sample set as inputs to be provided to a version recognition model, and respectively carrying out version recognition processing on the input user features through the version recognition model;
and determining the feature distribution difference information of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set according to the version recognition processing result output by the version recognition model, the first version marking information and the second version marking information.
5. The method of claim 4, wherein the providing the user features in the training sample set and the user features in the quasi-training sample set as inputs to a version recognition model, respectively, and performing version recognition processing on the input user features via the version recognition model, respectively, comprises:
taking the user features of some of the users in the training sample set as first training samples;
taking the user features of some of the users in the quasi-training sample set as second training samples;
training the version recognition model using the first training samples and the second training samples;
and providing the user features of the remaining users in the training sample set and the user features of the remaining users in the quasi-training sample set as inputs to the trained version recognition model, and performing version recognition processing on each input user feature through the trained version recognition model.
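A sketch of this fit-then-recognize split follows. It fits a recognizer on part of each set and applies it to the rest. The nearest-centroid classifier, the even/odd split, and all function names are assumptions chosen to keep the example dependency-free; the claim does not specify the model type or the split.

```python
def split_halves(rows):
    """Even-indexed rows fit the recognizer; odd-indexed rows are held out."""
    return rows[0::2], rows[1::2]

def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_version_recognizer(train_part, quasi_part):
    """Return a function that predicts 0 (training set) or 1 (quasi set)."""
    c0, c1 = centroid(train_part), centroid(quasi_part)
    def predict(row):
        return 0 if sq_dist(row, c0) <= sq_dist(row, c1) else 1
    return predict

def recognize_versions(train_rows, quasi_rows):
    """Fit on one part of each set and recognize versions on the other part."""
    fit_a, hold_a = split_halves(train_rows)
    fit_b, hold_b = split_halves(quasi_rows)
    predict = fit_version_recognizer(fit_a, fit_b)
    truth = [0] * len(hold_a) + [1] * len(hold_b)
    guessed = [predict(row) for row in hold_a + hold_b]
    return truth, guessed
```

Evaluating on held-out users matters here: a recognizer scored on the samples it was fitted to could appear to separate the sets even when they share a distribution.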
6. The method according to any one of claims 4 to 5, wherein the determining, according to the version recognition processing result output by the version recognition model, the first version label information and the second version label information, the feature distribution difference information of the user features from the plurality of users in the training sample set and the user features from the plurality of users in the quasi-training sample set comprises:
calculating a Matthews correlation coefficient between the user features in the training sample set and the user features in the quasi-training sample set, according to the version information output by the version recognition model for each input user feature and the first or second version label information of that user feature.
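Reading the coefficient named in this claim as the Matthews correlation coefficient (MCC), it can be computed from the four cells of the confusion matrix between the true version labels and the recognized ones. The helper name `mcc` and the convention of returning 0.0 when the denominator vanishes are assumptions of this sketch.

```python
from math import sqrt

def mcc(truth, guessed):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(truth, guessed))
    tp = sum(1 for t, g in pairs if t == 1 and g == 1)
    tn = sum(1 for t, g in pairs if t == 0 and g == 0)
    fp = sum(1 for t, g in pairs if t == 0 and g == 1)
    fn = sum(1 for t, g in pairs if t == 1 and g == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Degenerate case (a whole row or column of the matrix is empty).
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

A value near 0 means the recognizer does no better than chance, so the two sets look alike; a value near 1 means the sets are easy to tell apart.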
7. The method of claim 6, wherein the performing prediction model training using the user features in the quasi-training sample set if the feature distribution difference information satisfies a preset difference condition comprises:
if the Matthews correlation coefficient reaches a preset threshold, providing the user features in the quasi-training sample set as input to a prediction model to be trained;
performing prediction processing on the input user features through the prediction model to be trained;
and adjusting the network parameters of the prediction model to be trained according to the prediction processing result output by the prediction model to be trained and the label information indicating whether the target behavior occurred for the input user features.
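The gating and training steps of this claim might look like the sketch below. A single-layer logistic model trained by stochastic gradient descent stands in for the prediction model; the claim only speaks of "network parameters", so the model type, the function names, and the hyperparameters are all assumptions for illustration.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_if_drifted(coefficient, threshold, rows, labels, epochs=500, lr=0.1):
    """Train on the quasi-training set only if the coefficient reaches the threshold.

    rows:   feature vectors from the quasi-training sample set
    labels: 1 if the user performed the target behavior, else 0
    Returns (weights, bias), or None when no retraining is triggered.
    """
    if coefficient < threshold:
        return None
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            # Forward pass: predicted probability of the target behavior.
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict_probability(model, x):
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
```

Gating on the coefficient avoids retraining when the new samples look like the old ones, which is the cost-saving idea behind the claimed condition.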
8. An apparatus for updating a predictive model, wherein the apparatus comprises:
a first acquisition module, configured to acquire user features of a plurality of users from a training sample set of a prediction model currently used by the system;
a second acquisition module, configured to acquire user features of a plurality of users from a quasi-training sample set;
a distribution difference determining module, configured to determine feature distribution difference information of the user features of the plurality of users from the training sample set and the user features of the plurality of users from the quasi-training sample set;
a prediction model training module, configured to perform prediction model training using the user features in the quasi-training sample set if the feature distribution difference information satisfies a preset difference condition;
a model updating module, configured to update the prediction model currently used by the system with the prediction model obtained by training;
wherein the predictive model is used to predict a probability that a user in the system will perform a target behavior within a first time range after a predetermined point in time.
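One possible way to wire the modules of this apparatus together is sketched below, with each module supplied as a callable. The class name `ModelUpdater`, its field names, and the use of a single numeric difference compared against a threshold as the "preset difference condition" are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

Features = Sequence[Sequence[float]]

@dataclass
class ModelUpdater:
    load_training_features: Callable[[], Features]   # first acquisition module
    load_quasi_features: Callable[[], Features]      # second acquisition module
    distribution_difference: Callable[[Features, Features], float]
    difference_threshold: float                      # preset difference condition
    train_model: Callable[[Features], Any]           # prediction model training module
    current_model: Any = None                        # model currently used by the system

    def run_once(self) -> bool:
        """Retrain and swap the model if the distributions differ enough."""
        train = self.load_training_features()
        quasi = self.load_quasi_features()
        diff = self.distribution_difference(train, quasi)
        if diff < self.difference_threshold:
            return False  # keep the current model
        # Model updating module: replace the model in use.
        self.current_model = self.train_model(quasi)
        return True
```

Passing the modules in as callables keeps the orchestration independent of how features are stored or which model is trained.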
9. A computer-readable storage medium, the storage medium storing a computer program for performing the method of any of the preceding claims 1-7.
10. An electronic device, the electronic device comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010418579.9A CN111598338B (en) | 2020-05-18 | 2020-05-18 | Method, apparatus, medium, and electronic device for updating prediction model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010418579.9A CN111598338B (en) | 2020-05-18 | 2020-05-18 | Method, apparatus, medium, and electronic device for updating prediction model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111598338A (en) | 2020-08-28 |
CN111598338B (en) | 2021-08-31 |
Family
ID=72183468
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010418579.9A Active CN111598338B (en) | 2020-05-18 | 2020-05-18 | Method, apparatus, medium, and electronic device for updating prediction model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111598338B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734086A (en) * | 2020-12-24 | 2021-04-30 | 贝壳技术有限公司 | Method and device for updating neural network prediction model |
CN112800076A (en) * | 2021-02-02 | 2021-05-14 | 北京明略昭辉科技有限公司 | Method, device and equipment for data updating |
CN113259141A (en) * | 2021-06-11 | 2021-08-13 | 腾讯科技(深圳)有限公司 | Test method and device of group prediction model, storage medium and electronic equipment |
CN114511563A (en) * | 2022-04-19 | 2022-05-17 | 江苏智云天工科技有限公司 | Method and device for detecting abnormal picture in industrial quality inspection |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109359793A (en) * | 2018-08-03 | 2019-02-19 | 阿里巴巴集团控股有限公司 | A kind of prediction model training method and device for new scene |
CN109947782A (en) * | 2017-11-03 | 2019-06-28 | 中国移动通信有限公司研究院 | A kind of update method of big data real-time application system, apparatus and system |
US20190392494A1 (en) * | 2018-06-22 | 2019-12-26 | General Electric Company | System and method relating to part pricing and procurement |
CN110738304A (en) * | 2018-07-18 | 2020-01-31 | 科沃斯机器人股份有限公司 | Machine model updating method, device and storage medium |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109947782A (en) * | 2017-11-03 | 2019-06-28 | 中国移动通信有限公司研究院 | A kind of update method of big data real-time application system, apparatus and system |
US20190392494A1 (en) * | 2018-06-22 | 2019-12-26 | General Electric Company | System and method relating to part pricing and procurement |
CN110738304A (en) * | 2018-07-18 | 2020-01-31 | 科沃斯机器人股份有限公司 | Machine model updating method, device and storage medium |
CN109359793A (en) * | 2018-08-03 | 2019-02-19 | 阿里巴巴集团控股有限公司 | A kind of prediction model training method and device for new scene |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734086A (en) * | 2020-12-24 | 2021-04-30 | 贝壳技术有限公司 | Method and device for updating neural network prediction model |
CN112800076A (en) * | 2021-02-02 | 2021-05-14 | 北京明略昭辉科技有限公司 | Method, device and equipment for data updating |
CN113259141A (en) * | 2021-06-11 | 2021-08-13 | 腾讯科技(深圳)有限公司 | Test method and device of group prediction model, storage medium and electronic equipment |
CN113259141B (en) * | 2021-06-11 | 2021-09-24 | 腾讯科技(深圳)有限公司 | Test method and device of group prediction model, storage medium and electronic equipment |
CN114511563A (en) * | 2022-04-19 | 2022-05-17 | 江苏智云天工科技有限公司 | Method and device for detecting abnormal picture in industrial quality inspection |
CN114511563B (en) * | 2022-04-19 | 2022-08-05 | 江苏智云天工科技有限公司 | Method and device for detecting abnormal picture in industrial quality inspection |
Also Published As
Publication number | Publication date |
---|---|
CN111598338B (en) | 2021-08-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111598338B (en) | Method, apparatus, medium, and electronic device for updating prediction model | |
US20210216915A1 (en) | Systems and Methods for Predictive Coding | |
JP6588572B2 (en) | Information recommendation method and information recommendation device | |
CN110163647B (en) | Data processing method and device | |
CN112070545B (en) | Method, apparatus, medium, and electronic device for optimizing information reach | |
CN111258593B (en) | Application program prediction model building method and device, storage medium and terminal | |
US11809505B2 (en) | Method for pushing information, electronic device | |
CN111400126B (en) | Network service abnormal data detection method, device, equipment and medium | |
CN117196322B (en) | Intelligent wind control method, intelligent wind control device, computer equipment and storage medium | |
CN111626898B (en) | Method, device, medium and electronic equipment for realizing attribution of events | |
CN112116393B (en) | Method, device and equipment for realizing event user maintenance | |
CN107644042B (en) | Software program click rate pre-estimation sorting method and server | |
CN110659347B (en) | Associated document determining method, device, computer equipment and storage medium | |
CN112116397A (en) | User behavior characteristic real-time processing method and device, storage medium and electronic equipment | |
JP2015184818A (en) | Server, model application propriety determination method and computer program | |
CN116664306A (en) | Intelligent recommendation method and device for wind control rules, electronic equipment and medium | |
CN107885879A (en) | Semantic analysis, device, electronic equipment and computer-readable recording medium | |
CN113988152A (en) | User type prediction model training method, resource allocation method, medium, and apparatus | |
CN113296990A (en) | Method and device for recognizing abnormity of time sequence data | |
KR101871827B1 (en) | Content priority personalization apparatus, method and program | |
CN110083517A (en) | A kind of optimization method and device of user's portrait confidence level | |
CN113886723B (en) | Method and device for determining ordering stability, storage medium and electronic equipment | |
CN112115365B (en) | Model collaborative optimization method, device, medium and electronic equipment | |
CN116700686A (en) | Task duration prediction method, device, server and medium | |
GB2612960A (en) | Authorisation system and methods |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20201104. Address after: 100085 Floor 102-1, Building No. 35, West Second Banner Road, Haidian District, Beijing. Applicant after: Seashell Housing (Beijing) Technology Co.,Ltd. Address before: 300 457 days Unit 5, Room 1, 112, Room 1, Office Building C, Nangang Industrial Zone, Binhai New Area Economic and Technological Development Zone, Tianjin. Applicant before: BEIKE TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | ||