CN109558962A - Device, method and storage medium for predicting telecommunication user churn - Google Patents

Device, method and storage medium for predicting telecommunication user churn

Info

Publication number
CN109558962A
CN109558962A CN201710881795.5A CN201710881795A CN109558962A CN 109558962 A CN109558962 A CN 109558962A CN 201710881795 A CN201710881795 A CN 201710881795A CN 109558962 A CN109558962 A CN 109558962A
Authority
CN
China
Prior art keywords
data
prediction
prediction model
user
lost
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710881795.5A
Other languages
Chinese (zh)
Inventor
季文海
王维东
陈海波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Shanxi Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Shanxi Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Shanxi Co Ltd
Priority to CN201710881795.5A
Publication of CN109558962A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 - Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 - Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an analysis device, method and storage medium for predicting telecommunication user churn. The device comprises: a data preprocessing module, configured to acquire feature values of the telecommunication data of telecommunication users, form a data warehouse to be analyzed, extract a data sample from the full data set of the data warehouse, and randomly divide the data sample into a training set and a test set; a model building module, configured to build a prediction model on the training set based on a split-merge mechanism and a decision tree classification algorithm; an evaluation and scoring module, configured to test the prediction model on the test set and evaluate its predictive performance according to the test results; and a prediction analysis module, configured to analyze the full data set with a prediction model whose test performance is acceptable and to predict which telecommunication users will churn. Embodiments of the invention can thereby not only realize automatic machine learning but also address data skew, eliminating the loss of accuracy that data skew causes and improving the accuracy and precision of the prediction.

Description

Device, method and storage medium for predicting telecommunication user churn
Technical Field
The invention belongs to the field of telecommunications big data, and in particular relates to an analysis device, method and storage medium for predicting telecommunication user churn.
Background
With the rapid development of network communication technology, telecommunication services are becoming more varied and telecommunication users have an ever wider choice. While actively acquiring new customers, telecom operators also try to prevent existing customers from churning, so techniques for predicting and analyzing telecommunication user churn have drawn operators' attention. Traditional churn prediction mainly relies on questionnaires or controlled comparison experiments: the information gain of each user attribute is obtained and compared with a preset threshold to identify the attributes strongly associated with churn. The churn probability derived from such a strongly associated attribute is then used to issue churn warnings.
However, the applicant has found through research that telecommunication user churn generally requires a comprehensive analysis of multiple associated attributes. Judging churn solely from the churn probability of a single strongly associated attribute is one-sided, the prediction results are not well justified, and the prediction precision is low.
How to improve the precision of telecommunication user churn prediction has therefore become a technical problem that the industry urgently needs to solve.
Summary of the Invention
To solve the problem that telecommunication user churn prediction has low precision, embodiments of the invention provide an analysis device, method and storage medium for predicting telecommunication user churn.
In a first aspect, an analysis device for predicting telecommunication user churn is provided. The device includes:
a data preprocessing module, configured to acquire feature values of the telecommunication data of telecommunication users, form a data warehouse to be analyzed, extract a data sample from the full data set of the data warehouse, and randomly divide the data sample into a training set and a test set;
a model building module, configured to build a prediction model on the training set based on a split-merge mechanism and a decision tree classification algorithm;
an evaluation and scoring module, configured to test the prediction model on the test set and evaluate the predictive performance of the prediction model according to the test results;
a prediction analysis module, configured to analyze the full data set with a prediction model whose test performance is acceptable, and to predict which telecommunication users will churn.
In a second aspect, an analysis method for predicting telecommunication user churn is provided. The method includes:
acquiring feature values of the telecommunication data of telecommunication users, forming a data warehouse to be analyzed, extracting a data sample from the full data set of the data warehouse, and randomly dividing the data sample into a training set and a test set;
building a prediction model on the training set based on a split-merge mechanism and a decision tree classification algorithm;
testing the prediction model on the test set, and evaluating the predictive performance of the prediction model according to the test results;
analyzing the full data set with a prediction model whose test performance is acceptable, and predicting which telecommunication users will churn.
In a third aspect, an analysis device for predicting telecommunication user churn is provided. The device includes:
a memory, configured to store a program;
a processor, configured to execute the program stored in the memory, the program causing the processor to perform the method of the second aspect.
In a fourth aspect, a computer-readable storage medium is provided. The storage medium includes instructions which, when run on a computer, cause the computer to perform the method of the second aspect.
In a fifth aspect, a computer program product including instructions is provided. When the product is run on a computer, it causes the computer to perform the method described in the above aspects.
In a sixth aspect, a computer program is provided. When the computer program is run on a computer, it causes the computer to perform the method described in the above aspects.
In one respect, the above embodiments can, through the data preprocessing module, acquire feature values of the telecommunication data of telecommunication users, form a data warehouse to be analyzed, extract a data sample from the full data set of the data warehouse, and randomly divide the data sample into a training set and a test set. This preprocessing extracts good-quality data from massive data; training and testing on that data reduces both the amount of computation and the time consumed, and enables automatic machine learning.
In another respect, the above embodiments can, through the model building module, build a prediction model on the training set based on the split-merge mechanism and the decision tree algorithm. This addresses data skew, eliminates the loss of accuracy that data skew causes, improves data accuracy, and thereby significantly improves prediction precision.
In yet another respect, the above embodiments can use the evaluation and scoring module to test the prediction model on the test set and evaluate its predictive performance according to the test results, and use the prediction analysis module to analyze the full data set with a prediction model whose test performance is acceptable and predict which telecommunication users will churn. This effectively reduces the overall prediction time and makes churn prediction real-time and efficient; combined with the event-based telecommunication user churn prevention method, it provides integrated control of churn prediction and prevention.
Brief Description of the Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to one embodiment of the invention;
Fig. 2 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to another embodiment of the invention;
Fig. 3 is a flow diagram of how the functional units in Fig. 2 analyze and process data;
Fig. 4 is a flow diagram of the implementation of the data preprocessing module in Fig. 1;
Fig. 5 is a flow diagram of the implementation of the model building module in Fig. 1;
Fig. 6 is a diagram of the final result of dividing a data set according to one embodiment of the invention;
Fig. 7 is a schematic diagram of the split-merge mechanism of the model building module according to one embodiment of the invention;
Fig. 8 is a flow diagram of splitting and merging data in Fig. 7;
Fig. 9 is a flowchart of prediction model evaluation and scoring according to one embodiment of the invention;
Fig. 10 is a flowchart of telecommunication user churn prevention according to one embodiment of the invention;
Fig. 11 is a flow diagram of predicting telecommunication user churn according to one embodiment of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.
It should be noted that, where no conflict arises, the embodiments of the application and the features in the embodiments may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to one embodiment of the invention.
As shown in Fig. 1, the analysis device for predicting telecommunication user churn may include a data preprocessing module 110, a model building module 120, an evaluation and scoring module 130 and a prediction analysis module 140. The data preprocessing module 110 may be used to acquire feature values of the telecommunication data of telecommunication users, form a data warehouse to be analyzed, extract a data sample from the full data set of the data warehouse, and randomly divide the data sample into a training set and a test set. The model building module 120 may be used to build a prediction model on the training set based on a split-merge mechanism and a decision tree classification algorithm. The evaluation and scoring module 130 may test the prediction model on the test set and evaluate its predictive performance according to the test results. The prediction analysis module 140 may analyze the full data set with a prediction model whose test performance is acceptable and predict which telecommunication users will churn.
In some embodiments, the data preprocessing module 110 may include an acquisition unit, a reduction unit, a derivation unit and a statistics unit. The acquisition unit may be used to acquire the original variables of the telecommunication data of telecommunication users. The reduction unit may be used to perform integrity and/or reasonableness checks on the original variables and delete the data that fail the checks, leaving the remaining variables of the telecommunication data. The derivation unit may be used to derive new values from the remaining variables, yielding the original feature values of the telecommunication data. The statistics unit may be used to analyze the original feature values statistically with the Pearson correlation coefficient, identify highly correlated feature values, and delete them to produce the feature values of the telecommunication data.
By deleting highly correlated feature values, embodiments of the invention reduce the number of highly correlated attributes and address the problem that existing approaches, which judge churn solely from the churn probability of a single strongly associated attribute, are one-sided and give poorly justified predictions.
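The correlation filtering performed by the statistics unit can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `pearson` and `drop_highly_correlated`, the dictionary layout of the features and the choice to keep the first feature of each correlated pair are all assumptions. The default threshold of 0.8 is the value the description later uses for "highly correlated".

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)

def drop_highly_correlated(features, threshold=0.8):
    """features: dict mapping feature name -> list of values (hypothetical layout).

    Walks the features in order and keeps one only if its |r| with every
    already-kept feature is below the threshold, so one member of each
    highly correlated pair is removed.
    """
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

With three illustrative columns, two of which move together, only one of the correlated pair survives; which one is kept depends purely on iteration order in this sketch.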
In some embodiments, the feature values of the telecommunication data acquired by the data preprocessing module 110 may include one or more of the following: feature data on the decision behavior of telecommunication users; feature data on the usage behavior of telecommunication users; feature data on the loyalty behavior of telecommunication users.
The decision behavior feature data may mainly include original variables such as income, monthly fee, plan type, standard tariff, plan service type and usage, and user consumption records. After integrity and/or reasonableness checks, these original variables are derived into four comprehensive decision behavior features: the mean user payment, the ratio of user payment to monthly fee, the plan usage value, and the plan usage value rate.
The usage behavior feature data describe the usage behavior a user shows when using the products or services provided by the operator. After integrity and/or reasonableness checks and derivation, the usage behavior feature data may mainly include: the average number of calls, local call duration, domestic long-distance call duration, total long-distance call duration, roaming call duration, number of short messages, the ratio of the number of short messages to the number of calls, and data traffic.
The loyalty behavior feature data mainly describe the type and quantity of operator products a user has subscribed to, and the recent trend in the user's use of those products and services. After integrity and/or reasonableness checks and derivation, the loyalty behavior feature data may mainly include: the number of subscribed products, local call trend, domestic long-distance call trend, domestic long-distance call frequency trend, roaming call trend, short message sending trend, and data usage trend.
In some embodiments, the data warehouse may be used to store the data to be analyzed. The full data set of the data warehouse may be all the data in the data warehouse. The data sample may be extracted from the full data set at random, or according to a preset ratio, among other ways. One way to randomly divide the data sample into a training set and a test set is, for example, to randomly divide the data sample into 2n small data samples, use n of them as the training set and use the other n as the test set.
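The 2n-part random division described above can be sketched as follows. The function name `split_train_test` and the `seed` parameter are illustrative assumptions; the description does not say how the random partition is drawn.

```python
import random

def split_train_test(samples, n, seed=0):
    """Shuffle the samples, cut them into 2n parts, and use n parts for
    training and the other n parts for testing."""
    if n < 1:
        raise ValueError("n must be at least 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Striding over the shuffled list yields 2n parts of near-equal size.
    parts = [shuffled[i::2 * n] for i in range(2 * n)]
    train = [row for part in parts[:n] for row in part]
    test = [row for part in parts[n:] for row in part]
    return train, test
```

Because both sets receive n parts, the resulting split is roughly half training and half test data regardless of the value of n.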
The data preprocessing module 110 can thus acquire, derive, filter and divide feature values from different sources so that they meet the requirements of the device or software system.
In some embodiments, the model building module 120 may include a parameter determination unit, a splitting unit, a building unit and an integration unit. The parameter determination unit may be used to determine a first ratio of churned users to non-churned users in the training set, and a second ratio of churned users to non-churned users in a sub-training set. The splitting unit may be used to compute a third ratio, namely the ratio of the second ratio to the first ratio, and to split the training set into multiple sub-training sets based on the third ratio. The building unit may be used to build a sub-prediction model on each sub-training set according to the decision tree classification algorithm. The integration unit may be used to integrate the sub-prediction models according to a weighted voting principle to generate the prediction model.
The model building module 120 can thus quickly build a prediction model on the training set with the decision tree classification algorithm, introduce the split-merge mechanism to address data skew, and provide a data integration mechanism that makes the prediction more trustworthy. This part is described further below.
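One possible reading of the split-merge mechanism is sketched below. It rests on assumptions that this passage does not state: that the third ratio, rounded, is used as the number of sub-training sets; that every sub-training set keeps all churned users plus one slice of the non-churned users; and that each sub-model's vote weight is supplied by the caller. The names `split_training_set` and `weighted_vote` are hypothetical.

```python
import random
from collections import defaultdict

def split_training_set(churned, retained, target_ratio, seed=0):
    """Split a skewed training set into sub-training sets.

    first ratio  = churned : retained in the whole training set
    second ratio = target_ratio, the ratio wanted inside each sub-training set
    third ratio  = second ratio / first ratio, used here (an assumption) as
                   the number of sub-training sets.
    """
    first_ratio = len(churned) / len(retained)
    third_ratio = target_ratio / first_ratio
    k = max(1, round(third_ratio))
    shuffled = list(retained)
    random.Random(seed).shuffle(shuffled)
    # Each sub-training set: all churned users plus 1/k of the retained users.
    return [list(churned) + shuffled[i::k] for i in range(k)]

def weighted_vote(predictions, weights):
    """Merge sub-model predictions: the label with the largest total weight wins."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)
```

For example, with 100 churned and 900 retained users and a target ratio of 1:3, the third ratio is 3, giving three sub-training sets of 400 users each in which churned users are no longer a small minority.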
The evaluation and scoring module 130 may include a test unit, an evaluation unit and a reporting unit. The test unit may be used to output the prediction rules of the prediction model and predict the test set according to those rules, obtaining the model lift (LIFT) and/or the prediction hit rate of the prediction model. The evaluation unit may be used to score the prediction model according to the LIFT and the hit rate. The reporting unit may be used to generate a report that the model passed evaluation when the evaluation score is greater than or equal to a threshold, and to generate a report that the model failed evaluation and that the feature values of the telecommunication data need to be re-acquired when the evaluation score is less than the threshold. The evaluation and scoring module 130 can thus output the prediction rules and score the prediction model according to its LIFT and hit rate.
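The two metrics can be computed as in the sketch below. The passage does not define them, so the definitions used here are the common ones and are assumptions: hit rate is the share of users flagged as churners who actually churned, and lift is that hit rate divided by the overall churn rate of the test set. How the two are combined into a single evaluation score is not specified and is left out.

```python
def hit_rate_and_lift(actual, predicted):
    """actual, predicted: equal-length sequences of booleans (True = churn).

    Returns (hit_rate, lift). Both are 0.0 when nobody is flagged or
    nobody actually churned, since the metrics are undefined there.
    """
    flagged = [a for a, p in zip(actual, predicted) if p]
    if not flagged or not any(actual):
        return 0.0, 0.0
    hit_rate = sum(flagged) / len(flagged)   # precision among flagged users
    base_rate = sum(actual) / len(actual)    # churn rate in the whole test set
    return hit_rate, hit_rate / base_rate
```

A lift above 1 means that users flagged by the model churn more often than a randomly chosen user would.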
The prediction analysis module 140 can analyze the full data set as a whole, record the analysis result for each user, and score the user's loyalty. When the score is lower than or equal to a threshold, the user is predicted to churn; when the score is higher than the threshold, the user is predicted not to churn.
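The threshold rule can be sketched as follows. The mapping from a predicted churn probability to a 0 to 100 loyalty score and the default threshold of 60 are illustrative assumptions; only the comparison rule (churn when the score is lower than or equal to the threshold) comes from the description.

```python
def loyalty_score(churn_probability):
    """Hypothetical scoring: a lower churn probability gives a higher loyalty score."""
    return 100.0 * (1.0 - churn_probability)

def will_churn(score, threshold=60.0):
    """Predict churn when the loyalty score is at or below the threshold."""
    return score <= threshold
```

Note that the boundary case counts as churn: a user whose score equals the threshold is predicted to leave.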
In some embodiments, after preliminary screening, the feature values of the telecommunication data of telecommunication users (for example, data associated with the users) are acquired and derived by the system and converted into high-quality data that meet the system's input conditions, forming a data warehouse to be analyzed. A data sample (including a training set and a test set) is then extracted from the warehouse, a churn prediction model is built with a data mining algorithm, and the model is tested on the test set and scored according to its prediction accuracy. A prediction model whose score meets the requirement is used to predict the full data set and score each user's loyalty, and whether the user will churn is predicted from that score.
In some embodiments, the device may further include a churn prevention module. The churn prevention module can analyze the full data set as a whole and create trigger events; record the analysis result for each user and score the user's loyalty; generate and distribute preset user retention schemes, including plan change suggestions, in a targeted way according to the trigger events; coordinate and schedule the execution of the retention schemes; and record user feedback for later evaluation of how the retention schemes were executed.
Thus, when a user is predicted to churn, the churn prevention module can execute a retention scheme to win back the user who is about to churn.
Fig. 2 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to another embodiment of the invention.
On top of the function in Fig. 1 of predicting whether a user will churn, this embodiment adds the function of executing a retention scheme for users predicted to churn, in order to win them back.
As shown in Fig. 2, the analysis device for predicting telecommunication user churn may include: a data verification unit 201, a split-merge unit 202, a prediction model building unit 203, a prediction model evaluation unit 204, a behavior detection unit 205, a user analysis unit 206, a plan matching unit 207, a scheme generation unit 208, an interaction early-warning unit 209 and a customer relationship management unit 210. The device can not only predict which users will churn but also carry out retention for users who are about to churn.
Through the above functional units, embodiments of the invention can solve the following technical problems and achieve the corresponding effects:
1. A decision tree classification algorithm lets the computer automatically mine, from massive data, the rules associated with telecommunication user churn. When user feature information is correlated with churn behavior and checked, the system automatically compares information gain across the user's multiple features, avoiding interference from human subjective factors.
2. Easily understood churn prediction rules are generated, which reflect the inner links by which the associated attributes jointly influence churn behavior.
3. The default split-merge unit addresses data skew.
4. The prediction model is quickly built and evaluated in real time on a small-scale test set, so as to cope with changes in market direction and ensure that the prediction model remains applicable.
5. The prediction model is generated quickly from the training set, solving the problem that long computation times prevent potential churners from being found in time.
6. The churn prevention module automatically produces a customer retention scheme from the prediction analysis results, including refined recommendations of plans better suited to the user.
7. Potential churners are sensed and predicted at the phone-number level, solving the problem of prediction misjudgments caused by one person holding multiple numbers.
Fig. 3 is a flow diagram of how the functional units in Fig. 2 analyze and process data.
As shown in Fig. 3, the process may include the following steps:
S301, the data verification unit 201 acquires and reduces data from different sources so that they meet the requirements of the software system in the functional modules. It can also analyze the data statistically, derive key feature data, and randomly divide out a training set and a test set.
S302, the split-merge unit 202 splits and merges the data of the training set, which reduces the influence of data skew on the correctness of the result. The prediction model building unit 203 builds the prediction model with the decision tree algorithm. (This part is described further below.)
S303, the prediction model evaluation unit 204 evaluates the prediction analysis model on the test data set, produces a performance analysis report, and records the prediction accuracy.
S304, determine whether the prediction accuracy is lower than a threshold. When the prediction accuracy is not lower than the threshold, execute step S305. When the prediction accuracy is lower than the threshold, execute step S301.
S305, when the prediction accuracy is not lower than the threshold, for a prediction analysis model whose score meets the evaluation requirement, the behavior detection unit 205 extracts the decision rules (i.e. the prediction rules) with a prediction rule extractor, performs prediction analysis on the full data set with a prediction rule execution engine, and creates trigger events, such as event 1, event 2, event 3, event 4 and event 5. A specific event may be, for example: the user is a 4G user, data usage has exceeded the plan allowance for several months, voice minutes are insufficient, or the plan is unsuitable.
S306, the user analysis unit 206 scores the user's loyalty according to the churn (off-network) probability in the prediction analysis results. The scheme generation unit 208 (retention scheme generation and dispatch unit) selects and distributes a preset user retention scheme in a targeted way according to the trigger events.
S307, the plan matching unit 207 analyzes the user's usage and other customer data to offer the user a more suitable plan, as one way of retaining the user.
S308, the interaction early-warning unit 209 dynamically selects the appropriate medium (such as the network or the telephone system) according to the retention scheme, schedules resources sensibly to notify the user and execute the retention scheme, and records the execution status of the retention scheme in the customer relationship management unit 210.
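The trigger events created in step S305 can be pictured as simple rules over a user record, as in the sketch below. Every field name, event name and numeric cut-off here is hypothetical; the description only gives the kinds of events (4G user, repeated data overage, insufficient voice minutes, unsuitable plan) without saying how they are detected.

```python
def trigger_events(user):
    """Return the trigger events raised by one user record.

    user: dict with hypothetical keys 'is_4g', 'months_over_data_allowance',
    'voice_minutes_used' and 'voice_minutes_included'.
    """
    events = []
    if user.get("is_4g"):
        events.append("4g_user")
    # Assumed cut-off: two or more months over the data allowance.
    if user.get("months_over_data_allowance", 0) >= 2:
        events.append("repeated_data_overage")
    if user.get("voice_minutes_used", 0) > user.get("voice_minutes_included", 0):
        events.append("insufficient_voice_minutes")
    return events
```

A scheme generation step could then map each event name to a preset retention scheme, for example suggesting a larger data plan for `repeated_data_overage`.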
Fig. 4 is a flow diagram of the implementation of the data preprocessing module in Fig. 1.
As shown in Fig. 4, the method may include the following steps:
S401, acquire data, including user decision behavior feature data, user usage behavior feature data and user loyalty behavior feature data.
The user decision behavior feature data mainly include original variables such as income, monthly fee, plan type, standard tariff, plan service type and usage, and user consumption records. Derivation further yields comprehensive decision behavior features such as the mean user payment, the ratio of user payment to monthly fee, the plan usage value and the plan usage value rate.
The user usage behavior feature data describe the usage behavior a user shows when using the products or services provided by the operator. After derivation they mainly include: the average number of calls, local call duration, domestic long-distance call duration, total long-distance call duration, roaming call duration, number of short messages, the ratio of the number of short messages to the number of calls, and data traffic.
The user loyalty behavior feature data mainly describe the type and quantity of operator products the user has subscribed to, and the recent trend in the user's use of those products and services. After derivation they may mainly include: the number of subscribed products, local call trend, domestic long-distance call trend, domestic long-distance call frequency trend, roaming call trend, short message sending trend, and data usage trend.
S402, check the integrity and reasonableness of the data. Specifically, check whether the data contain blank values, unreasonable values, or values outside the permitted range.
S403 analyzes data using Pearson correlation coefficient r.Wherein:
In equation 1, r indicates that Pearson correlation coefficient, X indicate a characteristic variable, and Y indicates another feature variable,WithIt is the average value of characteristic variable X and characteristic variable Y respectively.
When 0.8≤| r | when < 1, indicate that X and Y is highly relevant, the embodiment of the present invention can remove in highly relevant attribute One of them.
S404, random division data set form training set and test set.
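The integrity and reasonableness check of step S402 can be sketched as follows. The field names and value ranges are illustrative assumptions; the description only says that blank values, unreasonable values and out-of-range values are checked and that failing data are deleted.

```python
def is_valid(record, ranges):
    """ranges: dict mapping field name -> (low, high), a hypothetical schema.

    A record passes if every listed field is present, non-blank, numeric
    and inside its permitted range.
    """
    for field, (low, high) in ranges.items():
        value = record.get(field)
        if value is None or value == "":
            return False                      # blank value
        if not isinstance(value, (int, float)):
            return False                      # unreasonable (non-numeric) value
        if not (low <= value <= high):
            return False                      # outside the permitted range
    return True

def clean(records, ranges):
    """Keep only the records that pass every check."""
    return [r for r in records if is_valid(r, ranges)]
```

Deleting whole records is the simplest policy; an alternative that the description does not discuss would be to impute missing values instead.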
Fig. 5 is a flow diagram of the implementation of the model building module in Fig. 1.
As shown in Fig. 5, the steps by which the model building module performs telecommunication user churn prediction analysis with the data mining decision tree algorithm may be as follows:
S501, determine the sample set and the features to be analyzed.
Let N be the sample set, J the set of features to be analyzed, and I the set of class labels.
S502, create the root node R.
S503, determine whether the samples at root node R all belong to the same class in I.
S504, if the samples in N all belong to the same class i, return R as a leaf node and label it with class i.
S505, determine whether the feature set J is empty or the number of samples at the node is less than a given value.
S506, if J is empty or the number of samples at the node is less than the given value, return root node R as a leaf node and label it with the class that occurs most often in N.
S507, compute the information gain ratio of each feature in J (j1, j2, ..., jn) on N, and select the feature with the highest information gain ratio as the test feature.
Specific information gain-ratio calculating process is as follows:
Entropy measures the uncertainty of the predicted label value in sample set N; the larger the entropy, the higher the uncertainty. The formula is as follows:
$E(N) = -\sum_{i} p_i \log_2 p_i$ (formula 2)
In formula 2, $p_i$ denotes the probability of predicted label value $X_i$ in set N.
Conditional entropy measures the uncertainty of the predicted label value given a specific value of feature variable J. The formula is as follows:
$E(N \mid J) = \sum_{i} p_i \, E(N \mid J = j_i)$ (formula 3)
In formula 3, $p_i$ denotes the probability that feature variable $J = j_i$ in set N.
Information gain indicates the degree to which the uncertainty of N is reduced once the information of feature J is known. The formula is as follows:
$G(N \mid J) = E(N) - E(N \mid J)$ (formula 4)
Using the information gain ratio effectively solves the problem that the information gain criterion preferentially splits on features with many distinct values. The specific formula is as follows:
In formula 5, GainRatio(N | J) denotes the information gain ratio.
S509, if the test feature is a discrete feature, generate one branch for each distinct feature value and split the node.
S510, if the test feature is a continuous feature, split the node according to the segmentation threshold of the feature.
The specific algorithm for threshold segmentation of a continuous feature may be as follows: sort the values of the continuous feature in ascending order. Take the midpoint of every two adjacent feature values as a candidate split point and calculate the information gain of each split point. To speed up the algorithm, calculate the information gain only for split points at which the class label changes. The specific formulas are as follows:
$E(\text{continuous feature}) = p_{\le \text{segmentation threshold}} \, E(\le \text{segmentation threshold}) + p_{> \text{segmentation threshold}} \, E(> \text{segmentation threshold})$ (formula 7)
$G(N \mid \text{continuous feature}) = E(N) - E(\text{continuous feature})$ (formula 8)
Calculate the information gain ratio of each split point.
According to the preset maximum number of branches, select the split points with the largest information gain as segmentation thresholds.
S511, for each new node generated by a split, jump to S503 and repeat.
S512, finally, perform a pruning operation according to the classification errors of the nodes. The specific pruning strategy may be as follows:
Using the original sample set as test data, calculate the prediction accuracy of the decision tree with and without pruning; if the prediction accuracy does not decrease after a subtree is cut, cut that subtree.
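The feature-selection quantities used in S507 can be sketched as below. The entropy, conditional entropy and information gain follow the definitions given in the text; the gain ratio is written in the standard C4.5 form (information gain divided by the entropy of the feature's own value distribution), which is an assumption, since the text does not spell that formula out. All function names and the toy columns are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """E(N) = -sum(p_i * log2(p_i)) over the distinct values in `labels`."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def conditional_entropy(feature, labels):
    """E(N | J): label entropy inside each feature value, weighted by its share."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / total * entropy(g) for g in groups.values())

def information_gain(feature, labels):
    """G(N | J) = E(N) - E(N | J), as in formula 4."""
    return entropy(labels) - conditional_entropy(feature, labels)

def gain_ratio(feature, labels):
    """Assumed C4.5 form: information gain divided by the feature's own entropy."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

# Hypothetical toy data: five users, three staying and two churning.
labels = ["stay", "stay", "churn", "stay", "churn"]
network_type = ["4G", "4G", "4G", "3G", "2G"]
print(round(entropy(labels), 3))                         # 0.971
print(round(information_gain(network_type, labels), 2))  # 0.42
print(round(gain_ratio(network_type, labels), 3))
```

Dividing by the feature's own entropy penalizes features with many distinct values, which is the bias the gain ratio is meant to correct.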
The churn prediction analysis process is illustrated below with a simple use case; the case data are detailed in table (1) below:
Table (1)
In table (1), the middle four columns serve as the user feature data in the initial state, and the last column serves as the predicted label value. For ease of illustration, the maximum number of branches for continuous variables is set to 2 in this example.
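In the initial state, the number of off-network users is 2 and the number of in-network users is 3, so the initial information entropy of this column is:
$E(S) = -\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5} \approx 0.971$ (formula 9)
In formula 9, E(S) denotes the initial information entropy.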
Next, calculate the conditional entropy of the discrete features "network type" and "whether in arrears" respectively. The network type can be divided into 3 classes, namely 4G, 3G and 2G, in a quantity ratio of 3:1:1, so the conditional entropy of this feature is:
where E(x) denotes the conditional entropy.
The information gain may be calculated as follows:
$G(N \mid \text{network type}) = E(N) - E(\text{network type}) = 0.42$ (formula 11)
The feature "whether in arrears" is divided into 2 classes, with a data ratio of 3:2. Similarly, the conditional entropy of this feature is:
$G(N \mid \text{whether in arrears}) = E(N) - E(\text{whether in arrears}) = 0.02$ (formula 14)
For the continuous feature, select the possible split points and calculate the information gain ratio of each split point accordingly. The selection of split points is shown in table (2) below:
Time in network | 0 | 0.11 | 0.12 | 0.28 | 0.39
Off-network | No | No | No | Yes | Yes
Table (2)
Because the class label changes only between 0.12 and 0.28 in this example, the split point is unique and takes the value 0.2. The calculation of the information gain ratio of this split point is given below:
$G(N \mid \text{telephone fee fluctuation rate}) = E(N) - E(\text{telephone fee fluctuation rate}) = 0.971$ (formula 18)
Comparing the above formulas shows that GainRatio(telephone fee fluctuation rate) > GainRatio(network type) > GainRatio(whether in arrears).
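The split-point selection for the continuous feature can be checked with a short sketch that uses the five values and labels from table (2) and the two-sided entropy of formulas 7 and 8. The function names are hypothetical, and only split points where the class label changes are evaluated, as the text describes.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy in bits of the label distribution; an empty list yields 0."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def candidate_split_points(values, labels):
    """Midpoints between adjacent sorted values at which the class label changes."""
    pairs = sorted(zip(values, labels))
    return [
        (a_val + b_val) / 2
        for (a_val, a_lab), (b_val, b_lab) in zip(pairs, pairs[1:])
        if a_lab != b_lab and a_val != b_val
    ]

def split_gain(values, labels, threshold):
    """Formulas 7 and 8: E(N) minus the weighted entropy of the two sides."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    total = len(labels)
    split_entropy = len(left) / total * entropy(left) + len(right) / total * entropy(right)
    return entropy(labels) - split_entropy

# Values and labels taken from table (2).
values = [0, 0.11, 0.12, 0.28, 0.39]
off_network = ["no", "no", "no", "yes", "yes"]

points = candidate_split_points(values, off_network)
best = max(points, key=lambda t: split_gain(values, off_network, t))
print(points)                                          # one candidate, about 0.2
print(round(split_gain(values, off_network, best), 3))  # 0.971
```

Both sides of the split at 0.2 are pure, so the conditional entropy is zero and the gain equals the initial entropy of 0.971, in agreement with formula 18.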
Fig. 6 is a diagram of the final result of splitting the data set according to an embodiment of the present invention.
As shown in Fig. 6, the telephone fee fluctuation rate is selected as the feature for this split, the data set is split, and the process is repeated for the subsequent subtrees. The steps for obtaining the final split result may be as follows:
S601, the data set may include: in-network user ratio 60%, off-network user ratio 40%, 2 off-network users, 3 in-network users, 5 users in total.
S602, select the telephone fee fluctuation rate as the feature for this split and split the data set.
S603, when the telephone fee fluctuation rate ≤ 0.2, the sub-data set after the split may be: in-network user ratio 100%, off-network user ratio 0%, 0 off-network users, 3 in-network users, 3 users in total.
S604, when the telephone fee fluctuation rate > 0.2, the sub-data set after the split may be: in-network user ratio 0%, off-network user ratio 100%, 2 off-network users, 0 in-network users, 2 users in total.
From the prediction analysis of the case data, the following prediction rules can be obtained:
Prediction rule 1: when the fluctuation rate is less than or equal to 0.2, the user will not churn.
Prediction rule 2: when the fluctuation rate is greater than 0.2, the user will churn.
The data set in the above use case is small and serves only to illustrate the algorithm; when the algorithm is run on a massive user feature data set, more practical prediction rules can be obtained.
Fig. 7 is a schematic diagram of the split-merge mechanism of the model building module according to an embodiment of the present invention.
As shown in Fig. 7, the schematic diagram may include a splitting unit 710 and a merging unit 720. The splitting unit 710 may split training set S into training set S1, training set S2, training set S3, training set S4, and so on. Prediction model 1, prediction model 2, prediction model 3 and prediction model 4 may be obtained from training sets S1, S2, S3 and S4, respectively, according to the decision tree classification algorithm. The merging unit 720 may merge prediction models 1 to 4 into a single prediction model.
Fig. 8 is a flow diagram of splitting and merging the data in Fig. 7.
The proportion of churned users among all users is low (generally 1.5-2%), so a randomly selected training data set may be severely skewed. If this is not handled properly, it may compromise the validity of the prediction model or even make it impossible to generate one, for example when none of the users in the training set is a churned user. In the present embodiment, introducing the splitting unit and the merging unit can solve the data skew problem while avoiding data ambiguity.
As shown in Fig. 8, the process may include the following steps:
S801, determine the relevant parameters.
(1) Let S be the training set.
(2) Let N be the total number of users contained in training set S.
(3) Let the ratio of churned users to in-network users be 1:x.
(4) Let the expected ratio of churned users to in-network users in each training subset be 1:y.
S802, the split-merge unit creates multiple training subsets from training set S according to the expected data proportion: the in-network users, who account for a high proportion of all users, are randomly and evenly distributed among the training subsets, and the churned users, who account for a low proportion of all users, are copied to every training subset according to the expected data proportion. From the parameters in step S801, the number of training subsets is x/y, and each training subset has N/(1+x) churned users and N×y/(1+x) in-network users.
S803, establish a prediction model independently on each training subset.
S804, each prediction model independently predicts the user instances in the whole training set.
S805: the split-merge unit integrates all the prediction results using the weighted voting principle. Specifically, let n1 be the number of prediction models whose result is "user stays in network" and n2 the number whose result is "user churns". Weight w1 is assigned to n1 and weight w2 to n2. When w1 × n1 > w2 × n2, the prediction result of the split-merge unit is "user stays in network"; otherwise, the prediction result is "user will churn".
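Steps S802 and S805 can be sketched as follows. This is a simplified illustration under stated assumptions: the function names, the string labels "stay" and "churn", and the user identifiers are hypothetical, and the per-subset decision tree training of S803 is left out. With N = 100 users and x = 49 (2 churned, 98 in-network), choosing y = 7 gives x/y = 7 subsets, each with N/(1+x) = 2 churned users and N×y/(1+x) = 14 in-network users.

```python
import random

def split_training_set(churned, in_network, y, seed=0):
    """Build training subsets with a churned:in-network ratio of about 1:y.

    Every subset receives all churned users; the in-network users are
    shuffled and dealt out evenly across the subsets.
    """
    rng = random.Random(seed)
    shuffled = list(in_network)
    rng.shuffle(shuffled)
    per_subset = max(1, len(churned) * y)        # in-network users per subset
    n_subsets = max(1, len(shuffled) // per_subset)
    return [list(churned) + shuffled[i::n_subsets] for i in range(n_subsets)]

def weighted_vote(predictions, w_stay=1.0, w_churn=1.0):
    """Combine per-model predictions as in S805: stay only if w1*n1 > w2*n2."""
    n1 = sum(1 for p in predictions if p == "stay")
    n2 = sum(1 for p in predictions if p == "churn")
    return "stay" if w_stay * n1 > w_churn * n2 else "churn"

# Hypothetical population: N = 100, ratio 1:49, expected subset ratio 1:7.
churned = ["c0", "c1"]
in_network = [f"u{i}" for i in range(98)]
subsets = split_training_set(churned, in_network, y=7)
print(len(subsets), [len(s) for s in subsets])

print(weighted_vote(["stay", "stay", "churn"]))
print(weighted_vote(["stay", "stay", "churn"], w_churn=3.0))
```

Raising the weight on the "churn" votes makes the combined model more willing to flag a user as churning, which may be desirable when missing a churner costs more than a false alarm.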
The above embodiments may be implemented wholly or partly by software, hardware, firmware or any combination thereof. For example, two functional units may be integrated into one unit or divided into two separate modules. When implemented in software, the embodiments may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions which, when run on a computer, cause the computer to execute the methods described in the above embodiments. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired means (such as coaxial cable, optical fiber or digital subscriber line (DSL)) or wireless means (such as infrared, radio or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
Fig. 9 is a flow chart of the assessment and scoring of the prediction model according to an embodiment of the present invention.
During actual prediction, the embodiment of the present invention needs to know the accuracy of the prediction model in real time in order to assess whether feature values need to be re-selected. The specific steps of the assessment method may be as follows:
S901: determine the assessment parameters and export the prediction rules.
S902: make predictions on the test set using the prediction rules.
S903: score the prediction model according to the LIFT and the hit rate of the prediction model.
S904: compare the prediction result with the assessment threshold. If the prediction result exceeds the assessment threshold, go to step S905; otherwise, go to step S906.
S905: complete the model assessment and produce an assessment report.
S906, if the prediction result is below the assessment threshold, report that the accuracy is not up to standard, re-select the feature values, and check the training set.
In S903, the LIFT and the hit rate of the prediction model are calculated as follows:
Let A be the number of users correctly predicted to go off-network and B the number of users incorrectly predicted to go off-network; the hit rate of the prediction model is then defined as A/(A+B). LIFT (model lift) refers to the hit rate obtained with the prediction model divided by the churn rate of users when the model is not used. In the embodiment of the present invention, users can be sorted by predicted churn likelihood, and hit rate (X%) is defined as the churn hit rate among the X% of users with the highest churn likelihood; then LIFT = hit rate (X%) / churn rate without the model. For example, if the overall churn rate of telecommunication users without the model is 2%, X = 5, and hit rate (5%) = 20%, then LIFT = 20% / 2% = 10.
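The hit rate and LIFT calculation can be sketched as below. The function names and the synthetic scores are hypothetical; the data are constructed only so that they reproduce the numbers of the example in the text (overall churn rate 2%, X = 5, hit rate 20%).

```python
def hit_rate(scored_users, top_percent):
    """Share of actual churners among the top X% of users by predicted churn score.

    `scored_users` is a list of (score, churned) pairs, where `churned` is a bool.
    """
    ranked = sorted(scored_users, key=lambda u: u[0], reverse=True)
    top_n = max(1, round(len(ranked) * top_percent / 100))
    top = ranked[:top_n]
    return sum(1 for _, churned in top if churned) / top_n

def lift(scored_users, top_percent):
    """Hit rate in the top X% divided by the overall churn rate without a model."""
    base_rate = sum(1 for _, churned in scored_users if churned) / len(scored_users)
    return hit_rate(scored_users, top_percent) / base_rate

# Synthetic example: 100 users, 2 churners (2%), one of them ranked in the top 5.
scored_users = [((100 - i) / 100, i in (2, 50)) for i in range(100)]
print(hit_rate(scored_users, 5))     # 0.2
print(round(lift(scored_users, 5)))  # 10
```

A LIFT of 10 means that contacting only the top 5% of users ranked by the model reaches churners ten times more often than contacting the same number of users at random.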
Fig. 10 is a flow chart of telecommunication user churn prevention according to an embodiment of the present invention.
In the embodiment of the present invention, after a prediction model that meets the assessment requirements is obtained, the prediction model is used to analyze the target data set, and effective user retention is carried out on the basis of the analysis results, so as to prevent users from going off-network.
As shown in Fig. 10, the specific steps of the process may be as follows:
S101, extract the prediction rules.
S102, analyze the target data set using the prediction rules to obtain an analysis result.
S103, create a retention trigger event based on the analysis result.
S104, the retention scheme generation and dispatch unit uses the trigger event to generate a user retention scheme and dispatches the retention scheme to the interactive early-warning unit.
S105, the interactive early-warning unit applies different dynamic scheduling algorithms according to the attributes of each communication channel and its real-time resource utilization, plans media resources in a unified manner, guides the execution of the user retention scheme, and records the execution results.
Fig. 11 is a flow diagram of predicting telecommunication user churn according to an embodiment of the present invention.
As shown in Fig. 11, the process may include the following steps:
S111, acquire feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, draw data samples from the overall data of the data warehouse, and randomly divide the data samples into a training set and a test set.
S112, establish a prediction model on the training set based on the split-merge mechanism and the decision tree classification algorithm.
S113, test the prediction model using the test set, and assess the prediction performance of the prediction model according to the test results.
S114, analyze the overall data using a prediction model whose tested performance is qualified, and predict the users who will churn among the telecommunication users.
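The sampling and random division in S111 can be sketched as follows. The function name, the 30% test fraction and the fixed seed are assumptions made for illustration; the text does not specify a sample size or a split proportion.

```python
import random

def sample_and_split(population, sample_size, test_fraction=0.3, seed=0):
    """Draw a random sample from the overall data and divide it into two sets."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # sampling without replacement
    n_test = int(len(sample) * test_fraction)
    return sample[n_test:], sample[:n_test]       # (training set, test set)

# Hypothetical overall data: 1000 user identifiers, of which 100 are sampled.
population = list(range(1000))
training_set, test_set = sample_and_split(population, 100)
print(len(training_set), len(test_set))  # 70 30
```

Fixing the seed keeps the division reproducible, so that a prediction model assessed in S113 can be rebuilt on exactly the same training set.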
On the one hand, in the above embodiments of the invention, the data preprocessing module acquires feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, draws data samples from the overall data of the data warehouse, and randomly divides the data samples into a training set and a test set. This preprocessing extracts better data from massive data, and training and testing on that data not only reduces the amount of computation and the time consumed but also enables automatic machine learning.
On the other hand, in the embodiment of the present invention, the model building module establishes the prediction model for the training set based on the split-merge mechanism and the decision tree algorithm, which eliminates the problems caused by data skew, overcomes the poor accuracy that data skew would otherwise cause, and thereby significantly improves prediction accuracy.
In yet another aspect, in the embodiment of the present invention, the assessment and scoring module tests the prediction model using the test set and assesses its prediction performance according to the test results, and the prediction analysis module analyzes the overall data using a prediction model whose tested performance is qualified and predicts the churned users among the telecommunication users. This effectively reduces the overall prediction time and makes churn prediction real-time and efficient; the event-based telecommunication user churn prevention method further realizes integrated control of churn prediction and prevention.
In some embodiments, acquiring the feature values of the telecommunication data of telecommunication users in step S111 may include: acquiring original variables of the telecommunication data of the telecommunication users; performing integrity and/or reasonableness checks on the original variables and deleting the data that fail the checks to form the remaining variables of the telecommunication data; deriving from the remaining variables to obtain original feature values of the telecommunication data; and statistically analyzing the original feature values using the Pearson correlation coefficient to identify highly correlated feature values, deleting the highly correlated feature values, and generating the feature values of the telecommunication data.
In some embodiments, establishing the prediction model on the training set based on the split-merge mechanism and the decision tree classification algorithm in step S112 may include: splitting the training set into multiple sub-training sets; establishing a sub-prediction model on each sub-training set according to the decision tree classification algorithm; and integrating the sub-prediction models according to the weighted voting principle to generate the prediction model.
In some embodiments, testing the prediction model using the test set and assessing its prediction performance according to the test results in step S113 may include: exporting the prediction rules of the prediction model and making predictions on the test set according to the prediction rules to obtain the model lift (LIFT) and/or the prediction hit rate of the prediction model; scoring the prediction model according to the LIFT and the hit rate; when the assessment score is greater than or equal to a threshold, generating a report that the assessment is qualified; and when the assessment score is less than the threshold, generating a report that the assessment is unqualified and that the feature values of the telecommunication data of the telecommunication users need to be re-acquired.
In some embodiments, after the users who will churn among the telecommunication users are predicted, the method may further include: creating a trigger event that triggers execution of a preset retention scheme for the users predicted to churn.
In some embodiments, the feature values of the telecommunication data include one or more of the following: feature data of the decision behavior of the telecommunication users; feature data of the usage behavior of the telecommunication users; feature data of the loyalty behavior of the telecommunication users.
On the one hand, the above embodiments of the invention can use data mining technology to automatically compare the information gain across multiple features, which greatly reduces the limitations introduced by analysts' subjective judgment and makes churn prediction analysis intelligent.
On the other hand, the above embodiments can introduce an assessment mechanism for the prediction model and establish the prediction model using the training set, which effectively reduces the overall prediction time and makes churn prediction real-time and efficient.
In yet another aspect, the above embodiments can realize integrated control of churn prediction and prevention through the event-based telecommunication user churn prevention method.
In addition, the above embodiments introduce a decision tree classification algorithm with a split-merge mechanism, which can solve the data skew problem and improve prediction accuracy.
It should be noted that, provided there is no conflict, those skilled in the art may flexibly adjust the order of the above operation steps or flexibly combine the above steps according to actual needs. For brevity, the various implementations are not described again. In addition, the contents of the embodiments may be referred to one another.
In addition, the devices of the above embodiments may serve as the executing entities of the methods of the above embodiments, may implement the corresponding processes of each method, and achieve the same technical effects; for brevity, this is not described again.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software together with a necessary general-purpose hardware platform, or, of course, by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, magnetic disk or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (14)

1. An analysis device for predicting telecommunication user churn, characterized by comprising:
a data preprocessing module, configured to acquire feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, draw data samples from the overall data of the data warehouse, and randomly divide the data samples into a training set and a test set;
a model building module, configured to establish a prediction model for the training set based on a split-merge mechanism and a decision tree classification algorithm;
an assessment and scoring module, configured to test the prediction model using the test set and assess the prediction performance of the prediction model according to the test results;
a prediction analysis module, configured to analyze the overall data using a prediction model whose tested performance is qualified, and predict the churned users among the telecommunication users.
2. The device according to claim 1, characterized in that the data preprocessing module comprises:
an acquisition unit, configured to acquire original variables of the telecommunication data of the telecommunication users;
a reduction unit, configured to perform integrity and/or reasonableness checks on the original variables and delete the data that fail the checks to form the remaining variables of the telecommunication data;
a derivation unit, configured to derive from the remaining variables to obtain original feature values of the telecommunication data;
a statistics unit, configured to statistically analyze the original feature values using the Pearson correlation coefficient to identify highly correlated feature values, delete the highly correlated feature values, and generate the feature values of the telecommunication data.
3. The device according to claim 1, characterized in that the model building module comprises:
a parameter determination unit, configured to determine a first ratio of churned users to non-churned users in the training set, and to determine a second ratio of churned users to non-churned users in the sub-training sets;
a splitting unit, configured to calculate a third ratio, being the ratio of the second ratio to the first ratio, and split the training set into multiple sub-training sets based on the third ratio;
an establishing unit, configured to establish a sub-prediction model on each sub-training set according to the decision tree classification algorithm;
an integration unit, configured to integrate the sub-prediction models according to the weighted voting principle to generate the prediction model.
4. The device according to claim 1, characterized in that the assessment and scoring module comprises:
a test unit, configured to export the prediction rules of the prediction model and make predictions on the test set according to the prediction rules to obtain the model lift (LIFT) and/or the prediction hit rate of the prediction model;
an assessment unit, configured to score the prediction model according to the LIFT and the hit rate;
a reporting unit, configured to generate a report that the assessment is qualified when the assessment score is greater than or equal to a threshold, and, when the assessment score is less than the threshold, to generate a report that the assessment is unqualified and that the feature values of the telecommunication data of the telecommunication users need to be re-acquired.
5. The device according to claim 1, characterized by further comprising:
a churn prevention module, configured to create a trigger event that triggers execution of a preset retention scheme for the users predicted to churn.
6. The device according to any one of claims 1-5, characterized in that the feature values of the telecommunication data include one or more of the following feature values:
feature data of the decision behavior of the telecommunication users;
feature data of the usage behavior of the telecommunication users;
feature data of the loyalty behavior of the telecommunication users.
7. An analysis method for predicting telecommunication user churn, characterized by comprising:
acquiring feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, drawing data samples from the overall data of the data warehouse, and randomly dividing the data samples into a training set and a test set;
establishing a prediction model on the training set based on a split-merge mechanism and a decision tree classification algorithm;
testing the prediction model using the test set, and assessing the prediction performance of the prediction model according to the test results;
analyzing the overall data using a prediction model whose tested performance is qualified, and predicting the users who will churn among the telecommunication users.
8. The method according to claim 7, characterized in that acquiring the feature values of the telecommunication data of telecommunication users comprises:
acquiring original variables of the telecommunication data of the telecommunication users;
performing integrity and/or reasonableness checks on the original variables and deleting the data that fail the checks to form the remaining variables of the telecommunication data;
deriving from the remaining variables to obtain original feature values of the telecommunication data;
statistically analyzing the original feature values using the Pearson correlation coefficient to identify highly correlated feature values, deleting the highly correlated feature values, and generating the feature values of the telecommunication data.
9. The method according to claim 7, characterized in that establishing the prediction model on the training set based on the split-merge mechanism and the decision tree classification algorithm comprises:
splitting the training set into multiple sub-training sets;
establishing a sub-prediction model on each sub-training set according to the decision tree classification algorithm;
integrating the sub-prediction models according to the weighted voting principle to generate the prediction model.
10. The method according to claim 7, characterized in that testing the prediction model using the test set and assessing the prediction performance of the prediction model according to the test results comprises:
exporting the prediction rules of the prediction model and making predictions on the test set according to the prediction rules to obtain the model lift (LIFT) and/or the prediction hit rate of the prediction model;
scoring the prediction model according to the LIFT and the hit rate;
when the assessment score is greater than or equal to a threshold, generating a report that the assessment is qualified; when the assessment score is less than the threshold, generating a report that the assessment is unqualified and that the feature values of the telecommunication data of the telecommunication users need to be re-acquired.
11. The method according to claim 7, characterized in that, after predicting the users who will churn among the telecommunication users, the method further comprises:
creating a trigger event that triggers execution of a preset retention scheme for the users predicted to churn.
12. The method according to any one of claims 7-11, characterized in that the feature values of the telecommunication data include one or more of the following feature values:
feature data of the decision behavior of the telecommunication users;
feature data of the usage behavior of the telecommunication users;
feature data of the loyalty behavior of the telecommunication users.
13. An analysis device for predicting telecommunication user churn, characterized by comprising:
a memory, configured to store a program;
a processor, configured to execute the program stored in the memory, the program causing the processor to execute the method according to any one of claims 7-12.
14. A computer-readable storage medium, characterized by comprising instructions,
wherein the instructions, when run on a computer, cause the computer to execute the method according to any one of claims 7-12.
CN201710881795.5A 2017-09-26 2017-09-26 Predict device, method and storage medium that telecommunication user is lost Pending CN109558962A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710881795.5A CN109558962A (en) 2017-09-26 2017-09-26 Predict device, method and storage medium that telecommunication user is lost

Publications (1)

Publication Number Publication Date
CN109558962A true CN109558962A (en) 2019-04-02

Family

ID=65862452

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710881795.5A Pending CN109558962A (en) 2017-09-26 2017-09-26 Predict device, method and storage medium that telecommunication user is lost

Country Status (1)

Country Link
CN (1) CN109558962A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567391A (en) * 2010-12-20 2012-07-11 中国移动通信集团广东有限公司 Method and device for building classification forecasting mixed model
CN105760889A (en) * 2016-03-01 2016-07-13 中国科学技术大学 Efficient imbalanced data set classification method
CN106203679A (en) * 2016-06-27 2016-12-07 武汉斗鱼网络科技有限公司 A kind of customer loss Forecasting Methodology and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
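陈晔: "Analysis of Telecom Customer Churn Prediction Based on Combined Forecasting", China Master's Theses Full-text Database, Information Science and Technology Series *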

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033355A (en) * 2019-04-17 2019-07-19 中国联合网络通信集团有限公司 The method and system that telephone expenses set meal is recommended
CN113259144B (en) * 2020-02-07 2024-05-24 北京京东振世信息技术有限公司 Warehouse network planning method and device
CN113259144A (en) * 2020-02-07 2021-08-13 北京京东振世信息技术有限公司 Storage network planning method and device
CN113543117B (en) * 2020-04-22 2022-10-04 中国移动通信集团重庆有限公司 Prediction method and device for number portability user and computing equipment
CN113543117A (en) * 2020-04-22 2021-10-22 中国移动通信集团重庆有限公司 Prediction method and device for number portability user and computing equipment
CN112200375A (en) * 2020-10-15 2021-01-08 中国联合网络通信集团有限公司 Prediction model generation method, prediction model generation device, and computer-readable medium
CN112200375B (en) * 2020-10-15 2023-08-29 中国联合网络通信集团有限公司 Prediction model generation method, prediction model generation device, and computer-readable medium
CN112153636A (en) * 2020-10-29 2020-12-29 浙江鸿程计算机系统有限公司 Method for predicting number portability and roll-out of telecommunication industry user based on machine learning
CN112836877A (en) * 2021-02-04 2021-05-25 广西蜂鸟汽车科技有限公司 Telecommunication customer loss prediction method and system for improving multi-layer perceptron
CN113139715A (en) * 2021-03-30 2021-07-20 北京思特奇信息技术股份有限公司 Comprehensive assessment early warning method and system for loss of group customers in telecommunication industry
CN113033909A (en) * 2021-04-08 2021-06-25 中国移动通信集团陕西有限公司 Portable user analysis method, device, equipment and computer storage medium
CN113610552A (en) * 2021-06-25 2021-11-05 清华大学 User loss prediction method and device
CN114143772A (en) * 2021-11-18 2022-03-04 北京思特奇信息技术股份有限公司 Method and system for reducing user off-network rate
CN114399087A (en) * 2021-12-22 2022-04-26 中国电信股份有限公司 User data processing method and device based on Flink stream processing engine
CN114881181A (en) * 2022-07-12 2022-08-09 南昌大学第一附属医院 Feature weighting selection method, system, medium and computer based on big data

Similar Documents

Publication Publication Date Title
CN109558962A (en) Predict device, method and storage medium that telecommunication user is lost
US10896203B2 (en) Digital analytics system
CN107229708A (en) A kind of personalized trip service big data application system and method
CN110991875B (en) Platform user quality evaluation system
Asane-Otoo Carbon footprint and emission determinants in Africa
CN103796183B (en) A kind of refuse messages recognition methods and device
CN105184315A (en) Quality inspection treatment method and system
CN101620692A (en) Method for analyzing customer churn of mobile communication service
CN104217088B (en) The optimization method and system of operator's mobile service resource
CN105069025A (en) Intelligent aggregation visualization and management and control system for big data
CN101320449A (en) Power distribution network estimation method based on combination appraisement method
CN108023768A (en) Network event chain establishment method and network event chain establish system
Li et al. Enhancing telco service quality with big data enabled churn analysis: infrastructure, model, and deployment
CN103250376A (en) Method and system for carrying out predictive analysis relating to nodes of a communication network
Marques et al. Congressmen in the age of social network sites: Brazilian representatives and Twitter use
CN109919675A (en) Communication user upshift prediction probability recognition methods neural network based and system
CN107977855B (en) Method and device for managing user information
Nasiri Khiavi et al. Comparative applicability of MCDM‐SWOT based techniques for developing integrated watershed management framework
CN117911085A (en) User management system, method and terminal based on enterprise marketing
TWI662809B (en) Obstacle location system and maintenance method for image streaming service
Roets et al. Evaluation of railway traffic control efficiency and its determinants
Wang Research on bank marketing behavior based on machine learning
CN108234596A (en) Aviation information-pushing method and device
Ferwerda et al. Leveraging the power of place: A data-driven decision helper to improve the location decisions of economic immigrants
CN106817710A (en) The localization method and device of a kind of network problem

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190402