CN112085528A - Data processing method and device - Google Patents


Info

Publication number
CN112085528A
Authority
CN
China
Prior art keywords
data
early warning
training
warning model
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010937318.8A
Other languages
Chinese (zh)
Inventor
李见黎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenyan Intelligent Technology Co ltd
Original Assignee
Beijing Shenyan Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenyan Intelligent Technology Co ltd
Priority to CN202010937318.8A
Publication of CN112085528A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities

Abstract

The invention discloses a data processing method and a data processing device. The method comprises the following steps: acquiring sample data and predictive variable data from historical data, wherein the historical data is behavior data generated when a user browses a page; training a loss (churn) early warning model according to the sample data and the predictive variable data; classifying users according to the loss early warning model to obtain at least two types of user groups; and matching corresponding loss prevention strategies according to the at least two types of user groups. The invention solves the technical problem in the prior art that the accuracy and effectiveness of data prediction cannot be guaranteed because the rules used in data analysis are manually defined.

Description

Data processing method and device
Technical Field
The invention relates to the technical field of internet, in particular to a data processing method and device.
Background
With the development of internet technology, demand for internet-based services has gradually increased, particularly in the field of electronic commerce. Especially since the rise of Artificial Intelligence (AI), how to efficiently combine AI technology with various computing models to analyze customer traffic data on an electronic commerce platform has become a direction in which effective technical solutions are sought.
However, in the prior art, analysis of customer traffic data generally relies on rules manually defined by technicians. It is therefore uncertain whether market behavior and technical behavior can be effectively combined, that is, whether a prediction obtained by internet technology is close to the result actually produced by market behavior, and such data analysis schemes cannot guarantee the accuracy and effectiveness of data prediction.
For the above-mentioned problem that the accuracy and effectiveness of data prediction cannot be guaranteed because the rules used in the process of analyzing data in the prior art are manually defined rules, no effective solution is proposed at present.
Disclosure of Invention
The embodiment of the invention provides a data processing method and a data processing device, which at least solve the technical problem that the accuracy and the effectiveness of data prediction cannot be guaranteed because the rule used in the data analysis process in the prior art is a manually defined rule.
According to an aspect of an embodiment of the present invention, there is provided a data processing method including: acquiring sample data and predictive variable data from historical data, wherein the historical data is behavior data generated when a user browses a page; training a loss early warning model according to the sample data and the prediction variable data; classifying users according to a loss early warning model to obtain at least two types of user groups; and matching corresponding loss prevention strategies according to at least two types of user groups.
Optionally, acquiring the sample data and the predictive variable data from the historical data includes: classifying the historical data according to the historical behavior data of the user browsing the page to obtain observation period data and performance period data; generating the sample data according to the observation period data; and generating the predictive variable data according to the performance period data.
Further, optionally, the observation period data comprises: at least one of purchase transaction amount, purchase item class, purchase frequency and purchase time of at least one user when browsing the page; the predictor variable data includes: the number of users that are churned, the type of users, and the impact of predictive variables on churning.
Optionally, training the loss early warning model according to the sample data and the predictive variable data includes: segmenting the sample data to obtain at least one training set and a test set corresponding to the at least one training set; training the loss early warning model by a preset verification method according to the at least one training set and the corresponding test set to obtain trained model parameters; and correcting the loss early warning model according to the trained model parameters and the predictive variable data to obtain a corrected loss early warning model.
Further, optionally, classifying the users according to the loss early warning model to obtain at least two types of user groups includes: scoring the users according to the loss early warning model to obtain at least one scored user group; and matching the corresponding risk label according to the score of the at least one user group to obtain at least two types of user groups.
According to another aspect of the embodiments of the present invention, there is also provided a data processing apparatus, including: the acquisition module is used for acquiring sample data and predictive variable data from historical data, wherein the historical data is behavior data generated when a user browses a page; the training module is used for training the loss early warning model according to the sample data and the predictive variable data; the classification module is used for classifying the users according to the loss early warning model to obtain at least two types of user groups; and the matching module is used for matching the corresponding loss prevention strategies according to at least two types of user groups.
Optionally, the obtaining module includes: a classification unit, used for classifying the historical data according to the historical behavior data of the user browsing the page to obtain observation period data and performance period data; a first data generation unit, used for generating the sample data according to the observation period data; and a second data generation unit, used for generating the predictive variable data according to the performance period data.
Further, optionally, the observation period data comprises: at least one of purchase transaction amount, purchase item class, purchase frequency and purchase time of at least one user when browsing the page; the predictor variable data includes: the number of users that are churned, the type of users, and the impact of predictive variables on churning.
Optionally, the training module includes: the data set dividing unit is used for segmenting the sample data to obtain at least one training set and a test set corresponding to the at least one training set; the training unit is used for training the loss early warning model by a preset verification method according to at least one training set and a test set corresponding to the at least one training set to obtain model parameters after training; and the correcting unit is used for correcting the loss early warning model according to the trained model parameters and the prediction variable data to obtain the corrected loss early warning model.
Further, optionally, the classification module includes: the scoring unit is used for scoring the users according to the loss early warning model to obtain at least one scored user group; and the classification unit is used for matching the corresponding risk label according to the score of at least one user group to obtain at least two user groups.
In the embodiment of the invention, sample data and predictive variable data are acquired from historical data, wherein the historical data is behavior data generated when a user browses a page; a loss early warning model is trained according to the sample data and the predictive variable data; users are classified according to the loss early warning model to obtain at least two types of user groups; and corresponding loss prevention strategies are matched according to the at least two types of user groups. This achieves the purpose of effectively labeling and distinguishing user groups, thereby achieving the technical effect of guaranteeing the accuracy and effectiveness of data prediction, and solving the technical problem in the prior art that the accuracy and effectiveness of data prediction cannot be guaranteed because the rules used in data analysis are manually defined.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a schematic flow diagram of a data processing method according to an embodiment of the invention;
FIG. 2a is a diagram illustrating a distribution of prediction values in a data processing method according to an embodiment of the present invention;
FIG. 2b is a schematic diagram of a calibration curve in a data processing method according to an embodiment of the present invention;
FIG. 3 is a schematic illustration of different risk client classes in a data processing method according to an embodiment of the present invention;
FIG. 4a to 4c are schematic diagrams of a scheme implementation architecture in a data processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
In accordance with an embodiment of the present invention, there is provided a method embodiment of a data processing method, it being noted that the steps illustrated in the flowchart of the figure may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order different than that presented herein.
Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention, as shown in fig. 1, the method includes the following steps:
Step S102, acquiring sample data and predictive variable data from historical data, wherein the historical data is behavior data generated when a user browses a page;
In an optional implementation, acquiring the sample data and the predictive variable data from the historical data includes: classifying the historical data according to the historical behavior data of the user browsing the page to obtain observation period data and performance period data; generating the sample data according to the observation period data; and generating the predictive variable data according to the performance period data.
Further, optionally, the observation period data comprises: at least one of purchase transaction amount, purchase item class, purchase frequency and purchase time of at least one user when browsing the page; the predictor variable data includes: the number of users that are churned, the type of users, and the impact of predictive variables on churning.
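The split into an observation period and a performance period can be illustrated with a minimal Python sketch. The record layout, the cutoff dates, and the rule "no order in the performance period means churned" are assumptions made for illustration only; the text does not specify how the churn label is derived.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (user_id, order_date, amount, item_category).
ORDERS = [
    ("u1", date(2020, 1, 5), 120.0, "books"),
    ("u1", date(2020, 2, 9), 80.0, "toys"),
    ("u1", date(2020, 4, 2), 60.0, "books"),
    ("u2", date(2020, 1, 20), 300.0, "phones"),
    ("u2", date(2020, 3, 1), 45.0, "books"),
]

def split_periods(orders, cutoff, performance_end):
    """Split orders into an observation period (before cutoff) and a
    performance period (from cutoff up to, not including, performance_end)."""
    observation = [o for o in orders if o[1] < cutoff]
    performance = [o for o in orders if cutoff <= o[1] < performance_end]
    return observation, performance

def build_samples(observation):
    """Aggregate per-user features from observation-period orders."""
    per_user = defaultdict(list)
    for user_id, order_date, amount, category in observation:
        per_user[user_id].append((order_date, amount, category))
    samples = {}
    for user_id, rows in per_user.items():
        samples[user_id] = {
            "total_amount": sum(r[1] for r in rows),
            "purchase_count": len(rows),
            "categories": sorted({r[2] for r in rows}),
            "last_purchase": max(r[0] for r in rows),
        }
    return samples

def build_labels(samples, performance):
    """Assumed rule: a user is churned (1) if no order falls in the performance period."""
    active = {o[0] for o in performance}
    return {user_id: int(user_id not in active) for user_id in samples}

observation, performance = split_periods(ORDERS, date(2020, 3, 15), date(2020, 5, 1))
samples = build_samples(observation)
labels = build_labels(samples, performance)
print(samples["u1"]["purchase_count"], labels)
```

Keeping the feature window strictly before the cutoff prevents information from the performance period leaking into the sample data.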
Step S104, training a loss early warning model according to the sample data and the predictive variable data;
In an optional implementation manner, training the loss early warning model according to the sample data and the predictive variable data includes: segmenting the sample data to obtain at least one training set and a test set corresponding to the at least one training set; training the loss early warning model by a preset verification method according to the at least one training set and the corresponding test set to obtain trained model parameters; and correcting the loss early warning model according to the trained model parameters and the predictive variable data to obtain a corrected loss early warning model.
Specifically, in the embodiment of the present application, the loss early warning model may be selected as follows: considering that the basic data features are relatively dense rather than high-dimensional and sparse, and based on the inherent advantages of the algorithm, namely 1. regularization; 2. parallel training on large-scale data; 3. flexibility, missing-value handling, and the like, the XGBoost algorithm is selected. The AUC of the model ranges between 0.75 and 0.82 on different data sets.
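For reference, the AUC metric mentioned above can be computed as the probability that a randomly chosen positive (churned) user receives a higher score than a randomly chosen negative one. The following is a minimal plain-Python sketch of that definition; the labels and scores are made-up illustrative values, not results from the application.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison: the probability that
    a random positive is scored above a random negative, with ties counted
    as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Illustrative data only: 1 = churned, 0 = retained.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
print(auc(y_true, y_score))
```

The pairwise form is quadratic in the number of samples, so it suits small checks; a rank-based computation is the usual choice for large data sets.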
The optimization of the model comprises the following steps:
Step 1: Optimizing model parameters. GridSearchCV is adopted for repeated validation to optimize the model parameters; the optimal values are n_estimators = 400, max_depth = 10, min_child_weight = 5, colsample_bytree = 0.3, learning_rate = 0.1, etc.
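The idea behind a grid search is an exhaustive evaluation of every parameter combination. The sketch below shows that idea in plain Python and does not call scikit-learn's GridSearchCV or XGBoost. The candidate values other than those reported in the text are assumptions, `colsample_bytree` is used as the XGBoost spelling of the column-sampling parameter, and `toy_evaluate` is a hypothetical stand-in for a cross-validated score.

```python
from itertools import product

# Candidate values are illustrative; only 400, 10, 5, 0.3 and 0.1 are
# reported in the text as the tuned result.
PARAM_GRID = {
    "n_estimators": [200, 400],
    "max_depth": [6, 10],
    "min_child_weight": [1, 5],
    "colsample_bytree": [0.3, 0.8],
    "learning_rate": [0.1],
}

def iter_grid(grid):
    """Yield every parameter combination in the grid as a dict."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def grid_search(grid, evaluate):
    """Return the best-scoring combination and its score.
    `evaluate` maps a parameter dict to a validation score (higher is better)."""
    best_params, best_score = None, float("-inf")
    for params in iter_grid(grid):
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_evaluate(params):
    """Stand-in for a cross-validated AUC; NOT a real model evaluation.
    It simply counts how many parameters equal the values reported in the text."""
    target = {"n_estimators": 400, "max_depth": 10, "min_child_weight": 5,
              "colsample_bytree": 0.3, "learning_rate": 0.1}
    return sum(params[k] == v for k, v in target.items())

best_params, best_score = grid_search(PARAM_GRID, toy_evaluate)
print(len(list(iter_grid(PARAM_GRID))), best_params)
```

In practice `evaluate` would train the model with the given parameters and return a validation metric such as AUC; the cost grows with the product of the candidate list lengths.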
Step 2: Cross-validation. Cross-validation reuses the data: the obtained sample data is segmented and recombined into different training sets and test sets; the training sets are used to train the model, and the test sets are used to evaluate the quality of the model's predictions. Multiple different groups of training sets and test sets can be obtained in this way. Simple cross-validation, S-fold cross-validation, and leave-one-out cross-validation are commonly used. In this method, S-fold cross-validation is used to improve the generalization capability of the model and to find the optimal model parameters.
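S-fold cross-validation can be sketched as follows: the sample indices are shuffled once, divided into S folds, and each fold serves as the test set exactly once. The function names and the fixed seed are illustrative assumptions; this is a plain-Python sketch rather than the implementation used in the application.

```python
import random

def s_fold_splits(n_samples, s, seed=0):
    """Yield (train_indices, test_indices) for S-fold cross-validation.
    Every sample lands in exactly one test fold."""
    if not 2 <= s <= n_samples:
        raise ValueError("s must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every s-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::s] for i in range(s)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n_samples, s, fit_and_score):
    """Average the score returned by `fit_and_score(train, test)` over S folds."""
    scores = [fit_and_score(train, test) for train, test in s_fold_splits(n_samples, s)]
    return sum(scores) / len(scores)

splits = list(s_fold_splits(10, 5))
print([sorted(test) for _, test in splits])
```

Setting S equal to the number of samples gives leave-one-out cross-validation as a special case.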
Step 3: Model fusion. Fusing several different models can improve machine learning performance. The model fusion methods in the embodiment of the application include:
Averaging method: includes simple averaging and weighted averaging. Averaging is generally used in regression prediction models; in Boosting-series fusion models, weighted average fusion is generally adopted.
Voting method: includes absolute majority voting (more than half of the votes), relative majority voting (the most votes), and weighted voting. It is generally used for classification models and is used in bagging models.
Learning method: a more powerful combination strategy is to use "learning", that is, to combine through another learner; the individual learners are called primary learners, and the learner used for combination is called the secondary learner or meta-learner.
In the embodiment of the present application, the voting method is taken as an example; it contributes about 2% to the AUC of the model.
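The three voting variants named above can be sketched in a few lines of Python. The votes and weights below are hypothetical outputs of three or four base models for a single user; the text does not state which variant or which weights were actually used.

```python
from collections import Counter

def relative_majority_vote(predictions):
    """Return the label with the most votes (ties go to the label seen first)."""
    return Counter(predictions).most_common(1)[0][0]

def absolute_majority_vote(predictions):
    """Return the label holding more than half of the votes, else None."""
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else None

def weighted_vote(predictions, weights):
    """Return the label whose voters carry the largest total weight."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

votes = [1, 0, 1]  # hypothetical base-model outputs: 1 = churn, 0 = stay
print(relative_majority_vote(votes))
print(absolute_majority_vote([1, 0, 2, 1]))
print(weighted_vote(votes, [0.2, 0.7, 0.3]))
```

Weighted voting lets a single strong model outvote two weaker ones, as the last call shows: label 0 wins with total weight 0.7 against 0.5.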
Step 4: Calibration. Corresponding calibration is carried out according to the real data distribution and the predicted data distribution.
As shown in fig. 2a and fig. 2b, fig. 2a is a schematic diagram of the distribution of prediction values, and fig. 2b is a schematic diagram of the calibration curve (reliability curve), in the data processing method according to an embodiment of the present invention. In the first graph of fig. 2b, the vertical axis represents the positive score, the dotted line represents the ideal calibration curve (perfectly calibrated), and the solid line represents the prediction curve; in the second graph, the vertical axis represents the parameter, and the curve represents the prediction curve. In the embodiment of the application, the distribution of the prediction values is strongly related to the ratio of positive to negative samples, and corresponding calibration is carried out according to the real distribution.
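A reliability curve of the kind described for fig. 2b can be computed by grouping predictions into probability bins and comparing the mean predicted probability with the observed positive rate in each bin. The sketch below covers only this diagnostic, not the correction step itself; the bin count and the sample data are illustrative assumptions.

```python
def reliability_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed positive fraction, count)
    for each non-empty equal-width probability bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((y, p))
    curve = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            frac_pos = sum(y for y, _ in members) / len(members)
            curve.append((mean_pred, frac_pos, len(members)))
    return curve

def calibration_gap(curve):
    """Count-weighted mean absolute gap between predicted and observed rates."""
    total = sum(count for _, _, count in curve)
    return sum(abs(m - f) * count for m, f, count in curve) / total

# Illustrative data only: 1 = churned, 0 = retained.
y_true = [0, 0, 1, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.1, 0.3, 0.3, 0.7, 0.7, 0.9, 0.9]
curve = reliability_curve(y_true, y_prob)
print(curve, calibration_gap(curve))
```

A perfectly calibrated model would place every point on the diagonal, where the mean predicted probability equals the observed fraction, giving a gap of zero.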
Step S106, classifying the users according to the loss early warning model to obtain at least two types of user groups;
In an optional implementation manner, classifying the users according to the loss early warning model to obtain at least two types of user groups includes: scoring the users according to the loss early warning model to obtain at least one scored user group; and matching the corresponding risk label according to the score of the at least one user group to obtain at least two types of user groups.
Step S108, matching the corresponding loss prevention strategy according to at least two types of user groups.
Specifically, as shown in fig. 3, fig. 3 is a schematic diagram of different risk customer levels in the data processing method according to the embodiment of the present invention. In the embodiment of the application, a user portrait (user profile) service is constructed by periodically generating labels on large-scale data, and users are divided into different levels: high risk customers, medium risk customers, low risk customers, and no risk customers. Subsequently, different countermeasures can be adopted according to different value levels.
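Steps S106 and S108 can be sketched as a mapping from a model score to a risk label and from a risk label to a strategy. The four level names come from the text; the thresholds and the strategy descriptions are hypothetical, since the text does not give them.

```python
# Hypothetical score thresholds and strategies; the source does not specify them.
RISK_THRESHOLDS = [
    (0.75, "high risk"),
    (0.50, "medium risk"),
    (0.25, "low risk"),
]
STRATEGIES = {
    "high risk": "personal outreach with a retention offer",
    "medium risk": "targeted coupon",
    "low risk": "regular newsletter",
    "no risk": "no action",
}

def risk_label(churn_probability):
    """Map a predicted churn probability to one of the four risk levels."""
    for threshold, label in RISK_THRESHOLDS:
        if churn_probability >= threshold:
            return label
    return "no risk"

def match_strategies(scores):
    """Group users by risk label and attach the strategy for each group."""
    groups = {}
    for user_id, probability in scores.items():
        groups.setdefault(risk_label(probability), []).append(user_id)
    return {label: (sorted(users), STRATEGIES[label]) for label, users in groups.items()}

scores = {"u1": 0.90, "u2": 0.30, "u3": 0.95, "u4": 0.05}
for label, (users, strategy) in match_strategies(scores).items():
    print(label, users, strategy)
```

Keeping thresholds and strategies in data tables rather than in code makes it straightforward to adjust the levels after each periodic re-labeling.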
In summary, with reference to steps S102 to S108, fig. 4a to 4c are schematic diagrams of an implementation architecture of the scheme in the data processing method according to an embodiment of the present invention. As shown in fig. 4a and 4b, within the observation period window, a batch of sample data is mined from the historical data to complete the loss evaluation dimensions; the loss early warning model is constructed using the performance period window data (i.e., the predictive variable data in the present embodiment); within the prediction window, covering the coming weeks or months, the model predicts a loss probability for users who have not yet definitely churned; a loss scoring system is established, and corresponding loss labels are applied to the user groups through scoring rules to obtain at least two types of user groups; and corresponding loss prevention strategies are matched to the different user groups. In the embodiment of the present application, the loss risk classes of the user groups are divided based on the average order interval duration exceeding 95%.
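The text does not spell out how the "95%" criterion on order intervals is applied. One possible reading, shown here purely as an assumption, is that a user becomes a churn candidate once the time since the last order exceeds the 95th percentile of observed order intervals. The data, the day-number representation, and the nearest-rank percentile are all illustrative choices.

```python
import math

def percentile(values, q):
    """Nearest-rank percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def order_intervals(order_days):
    """Gaps in days between consecutive orders of one user."""
    days = sorted(order_days)
    return [b - a for a, b in zip(days, days[1:])]

def churn_threshold(orders_by_user, q=95):
    """Assumed rule: the q-th percentile of all observed order intervals."""
    gaps = [g for days in orders_by_user.values() for g in order_intervals(days)]
    return percentile(gaps, q)

def is_churn_candidate(order_days, today, threshold):
    """True when the time since the last order exceeds the threshold."""
    return today - max(order_days) > threshold

# Hypothetical order dates expressed as day numbers.
orders_by_user = {"u1": [0, 10, 20, 30], "u2": [5, 45], "u3": [0, 7, 14]}
threshold = churn_threshold(orders_by_user)
flags = {u: is_churn_candidate(d, today=80, threshold=threshold)
         for u, d in orders_by_user.items()}
print(threshold, flags)
```

Other readings are possible, for example a per-user threshold based on each user's own average interval; the sketch should be adapted once the intended rule is known.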
As shown in fig. 4c, the acquired historical data may be historical offline behavior data and real-time update data in the data lake (Datalake), based on a daily start at 0:00; an Extract-Transform-Load (ETL) process is constructed according to the historical offline behavior data and the real-time update data, and feature engineering is performed on the data lake to obtain offline features and training data, wherein the offline features are cached; the model training service is updated according to the training data; and a model file is generated from the training data, and the loss risk is estimated using the generated model file.
The reasons for user loss in the embodiment of the present application may include: 1. user reasons; 2. service and product quality; 3. competitive factors; 4. feedback. As shown in Table 1:
TABLE 1
[Table 1 appears as images in the original publication: Figure BDA0002672411160000061 and Figure BDA0002672411160000071]
In the embodiment of the invention, sample data and predictive variable data are acquired from historical data, wherein the historical data is behavior data generated when a user browses a page; a loss early warning model is trained according to the sample data and the predictive variable data; users are classified according to the loss early warning model to obtain at least two types of user groups; and corresponding loss prevention strategies are matched according to the at least two types of user groups. This achieves the purpose of effectively labeling and distinguishing user groups, thereby achieving the technical effect of guaranteeing the accuracy and effectiveness of data prediction, and solving the technical problem in the prior art that the accuracy and effectiveness of data prediction cannot be guaranteed because the rules used in data analysis are manually defined.
Example 2
According to another aspect of the embodiments of the present invention, there is also provided a data processing apparatus. Fig. 5 is a schematic diagram of the data processing apparatus according to the embodiments of the present invention; as shown in fig. 5, the apparatus includes: an obtaining module 52, configured to obtain sample data and predictive variable data from historical data, where the historical data is behavior data generated when a user browses a page; a training module 54, configured to train a loss early warning model according to the sample data and the predictive variable data; a classification module 56, configured to classify users according to the loss early warning model to obtain at least two types of user groups; and a matching module 58, configured to match corresponding loss prevention strategies according to the at least two types of user groups.
Optionally, the obtaining module 52 includes: a classification unit, used for classifying the historical data according to the historical behavior data of the user browsing the page to obtain observation period data and performance period data; a first data generation unit, used for generating the sample data according to the observation period data; and a second data generation unit, used for generating the predictive variable data according to the performance period data.
Further, optionally, the observation period data comprises: at least one of purchase transaction amount, purchase item class, purchase frequency and purchase time of at least one user when browsing the page; the predictor variable data includes: the number of users that are churned, the type of users, and the impact of predictive variables on churning.
Optionally, the training module 54 includes: the data set dividing unit is used for segmenting the sample data to obtain at least one training set and a test set corresponding to the at least one training set; the training unit is used for training the loss early warning model by a preset verification method according to at least one training set and a test set corresponding to the at least one training set to obtain model parameters after training; and the correcting unit is used for correcting the loss early warning model according to the trained model parameters and the prediction variable data to obtain the corrected loss early warning model.
Further, optionally, the classification module 56 includes: the scoring unit is used for scoring the users according to the loss early warning model to obtain at least one scored user group; and the classification unit is used for matching the corresponding risk label according to the score of at least one user group to obtain at least two user groups.
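The division into four modules can be pictured as four replaceable functions wired together in sequence. The sketch below is one hypothetical way to express that composition in Python; the class name, the callable signatures, and the toy implementations are assumptions and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class DataProcessingApparatus:
    acquire: Callable[[list], Tuple[dict, dict]]                 # obtaining module 52
    train: Callable[[dict, dict], Callable[[dict], float]]       # training module 54
    classify: Callable[[Callable, dict], Dict[str, List[str]]]   # classification module 56
    match: Callable[[Dict[str, List[str]]], Dict[str, str]]      # matching module 58

    def run(self, history):
        """Run the four modules in the order S102, S104, S106, S108."""
        samples, predictive = self.acquire(history)
        model = self.train(samples, predictive)
        groups = self.classify(model, samples)
        return self.match(groups)

# Toy stand-ins for each module (illustrative only).
def acquire(history):
    samples = {user: {"orders": n} for user, n, _ in history}
    predictive = {user: churned for user, _, churned in history}
    return samples, predictive

def train(samples, predictive):
    # Toy "model": fewer observed orders means a higher churn score.
    return lambda features: 1.0 / (1 + features["orders"])

def classify(model, samples):
    groups = {"high risk": [], "low risk": []}
    for user, features in samples.items():
        groups["high risk" if model(features) >= 0.5 else "low risk"].append(user)
    return groups

def match(groups):
    strategies = {"high risk": "retention offer", "low risk": "no action"}
    return {label: strategies[label] for label, users in groups.items() if users}

apparatus = DataProcessingApparatus(acquire, train, classify, match)
result = apparatus.run([("u1", 1, 1), ("u2", 9, 0)])
print(result)
```

Because each module is passed in as a callable, any one of them can be swapped, for example replacing the toy `train` with a real model, without changing the others.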
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A data processing method, comprising:
acquiring sample data and predictive variable data from historical data, wherein the historical data is behavior data generated when a user browses a page;
training a loss early warning model according to the sample data and the predictive variable data;
classifying users according to the loss early warning model to obtain at least two types of user groups;
and matching corresponding loss prevention strategies according to the at least two types of user groups.
2. The method of claim 1, wherein acquiring the sample data and the predictive variable data from the historical data comprises:
classifying the historical data according to the historical behavior data of the user browsing the page to obtain observation period data and performance period data;
generating the sample data according to the observation period data;
and generating the predictive variable data according to the performance period data.
3. The method of claim 2, wherein the observation period data comprises: at least one of a purchase transaction amount, a purchased item category, a purchase frequency, and a purchase time of at least one user when browsing the page; and the predictive variable data comprises: the number of lost users, the type of users, and the influence of predictive variables on user loss.
4. The method of claim 1 or 2, wherein training the loss early warning model according to the sample data and the predictive variable data comprises:
segmenting the sample data to obtain at least one training set and a test set corresponding to the at least one training set;
training the loss early warning model by a preset verification method according to the at least one training set and the test set corresponding to the at least one training set to obtain trained model parameters;
and correcting the loss early warning model according to the trained model parameters and the predictive variable data to obtain a corrected loss early warning model.
5. The method of claim 4, wherein classifying users according to the loss early warning model to obtain at least two types of user groups comprises:
scoring the users according to the loss early warning model to obtain at least one scored user group;
and matching the corresponding risk label according to the score of the at least one user group to obtain at least two types of user groups.
6. A data processing apparatus, comprising:
the acquisition module is used for acquiring sample data and predictive variable data from historical data, wherein the historical data is behavior data generated when a user browses a page;
the training module is used for training a loss early warning model according to the sample data and the predictive variable data;
the classification module is used for classifying users according to the loss early warning model to obtain at least two types of user groups;
and the matching module is used for matching the corresponding loss prevention strategies according to the at least two types of user groups.
7. The apparatus of claim 6, wherein the acquisition module comprises:
the classification unit is used for classifying the historical data according to the historical behavior data of the page browsed by the user to obtain observation period data and presentation period data;
a first data generating unit, configured to generate the sample data according to the observation period data;
and the second data generation unit is used for generating the predictive variable data according to the presentation period data.
8. The apparatus of claim 7, wherein the observation period data comprises: at least one of a purchase transaction amount, a purchased item category, a purchase frequency, and a purchase time of at least one user when browsing the page; and the predictive variable data comprises: the number of lost users, the type of users, and the influence of predictive variables on user loss.
9. The apparatus of claim 6 or 7, wherein the training module comprises:
the data set dividing unit is used for segmenting the sample data to obtain at least one training set and a test set corresponding to the at least one training set;
the training unit is used for training the loss early warning model through a preset verification method according to the at least one training set and the test set corresponding to the at least one training set to obtain model parameters after training;
and the correcting unit is used for correcting the loss early warning model according to the trained model parameters and the predictive variable data to obtain a corrected loss early warning model.
10. The apparatus of claim 9, wherein the classification module comprises:
the scoring unit is used for scoring the users according to the loss early warning model to obtain at least one scored user group;
and the classification unit is used for matching the corresponding risk label according to the score of the at least one user group to obtain at least two types of user groups.
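
For illustration only, the training, scoring, and strategy-matching steps recited in claims 1, 4, and 5 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the logistic-regression model, the use of k-fold cross-validation as the "preset verification method", the two input features, the score thresholds, and the strategy names are all hypothetical choices that the claims do not specify. The correction step of claim 4 is not modeled.

```python
import math

def k_fold_splits(samples, k):
    """Yield (training_set, test_set) pairs, one test set per training set."""
    folds = [samples[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test

def predict(weights, features):
    """Return a loss score in (0, 1); weights[0] is the bias term."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(train, epochs=200, lr=0.5):
    """Fit a logistic regression by plain stochastic gradient descent."""
    weights = [0.0] * (len(train[0][0]) + 1)
    for _ in range(epochs):
        for features, label in train:
            error = predict(weights, features) - label
            weights[0] -= lr * error
            for i, x in enumerate(features):
                weights[i + 1] -= lr * error * x
    return weights

def accuracy(weights, test):
    """Fraction of test samples whose thresholded score matches the label."""
    hits = sum((predict(weights, f) >= 0.5) == (y == 1) for f, y in test)
    return hits / len(test)

# Hypothetical score thresholds and loss prevention strategies.
RISK_BANDS = [(0.7, "high"), (0.4, "medium")]
STRATEGIES = {"high": "coupon", "medium": "reminder", "low": "none"}

def risk_label(score):
    """Map a loss score to a risk label using the assumed thresholds."""
    for threshold, label in RISK_BANDS:
        if score >= threshold:
            return label
    return "low"

def group_users(weights, users):
    """Score each user and group user ids by risk label."""
    groups = {}
    for user_id, features in users.items():
        label = risk_label(predict(weights, features))
        groups.setdefault(label, []).append(user_id)
    return groups

def match_strategies(groups):
    """Attach the assumed loss prevention strategy to each user group."""
    return {label: (STRATEGIES[label], ids) for label, ids in groups.items()}

# Synthetic samples: (purchase frequency, time since last purchase), both
# scaled to [0, 1]; label 1 means the user was lost. All values are invented.
SAMPLES = [((0.9, 0.1), 0), ((0.8, 0.2), 0), ((0.7, 0.1), 0),
           ((0.2, 0.9), 1), ((0.1, 0.8), 1), ((0.3, 0.7), 1)]
NEW_USERS = {"u1": (0.85, 0.15), "u2": (0.15, 0.85)}

fold_accuracy = [accuracy(train_logistic(tr), te)
                 for tr, te in k_fold_splits(SAMPLES, 3)]
weights = train_logistic(SAMPLES)
groups = group_users(weights, NEW_USERS)
plan = match_strategies(groups)
print(fold_accuracy, plan)
```

In this sketch each training set is paired with exactly one held-out test set, mirroring the wording of claim 4, and the risk label is derived from the score before any strategy is matched, mirroring claim 5.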

Priority Applications (1)

Application Number Publication Number Priority Date Filing Date Title
CN202010937318.8A CN112085528A (en) 2020-09-08 2020-09-08 Data processing method and device

Publications (1)

Publication Number Publication Date
CN112085528A (en) 2020-12-15

Family

ID=73732486

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202010937318.8A Pending CN112085528A (en) 2020-09-08 2020-09-08 Data processing method and device

Country Status (1)

Country Link
CN (1) CN112085528A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108989096A (en) * 2018-06-28 2018-12-11 亚信科技(成都)有限公司 A kind of broadband user's attrition prediction method and system
CN110147803A (en) * 2018-02-08 2019-08-20 北大方正集团有限公司 Customer churn early-warning processing method and device
CN110585726A (en) * 2019-09-16 2019-12-20 腾讯科技(深圳)有限公司 User recall method, device, server and computer readable storage medium
CN110889724A (en) * 2019-11-22 2020-03-17 北京明略软件系统有限公司 Customer churn prediction method, customer churn prediction device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination