CN109558962A - Device, method and storage medium for predicting telecommunication user churn - Google Patents
- Publication number
- CN109558962A CN201710881795.5A
- Authority
- CN
- China
- Prior art keywords
- data
- prediction
- prediction model
- user
- lost
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
Abstract
The invention discloses an analysis device, method and storage medium for predicting telecommunication user churn. The device comprises: a data preprocessing module, which collects feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, extracts a data sample from the overall data of the data warehouse, and randomly divides the data sample into a training set and a test set; a model building module, which builds a prediction model on the training set based on a split-and-merge mechanism and a decision tree classification algorithm; an assessment and scoring module, which tests the prediction model on the test set and assesses its prediction performance from the test results; and a prediction analysis module, which analyzes the overall data with a prediction model whose tested performance is qualified and predicts which telecommunication users will churn. The embodiments of the invention thus not only enable automatic machine learning but also address data skew, removing the accuracy loss that skewed data causes and improving the accuracy and precision of prediction.
Description
Technical field
The invention belongs to the field of telecommunications big data, and in particular relates to an analysis device, method and storage medium for predicting telecommunication user churn.
Background technique
With the rapid development of network communication technology, telecommunication services are becoming more varied and telecommunication users have an ever wider choice. While actively developing new customers, telecom operators also work to keep existing customers from churning, so techniques for predicting and analyzing telecommunication user churn have attracted their attention. Traditional churn prediction mainly relies on questionnaires or controlled comparison experiments to obtain the information gain of telecommunication user attributes, compares that information gain with a preset threshold to identify the strongly associated attributes, and finally uses the churn probability of a strongly associated attribute to issue churn warnings.
However, the applicant has found through research that telecommunication user churn generally calls for a comprehensive analysis of multiple associated attributes. Judging churn solely from the churn probability of a single strongly associated attribute is one-sided, the prediction results lack soundness, and prediction precision is relatively low.
How to improve the precision of telecommunication user churn prediction has become a technical problem that the industry urgently needs to solve.
Summary of the invention
To solve the problem of low precision in telecommunication user churn prediction, embodiments of the invention provide an analysis device, method and storage medium for predicting telecommunication user churn.
In a first aspect, an analysis device for predicting telecommunication user churn is provided. The device includes:
a data preprocessing module, configured to collect feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, extract a data sample from the overall data of the data warehouse, and randomly divide the data sample into a training set and a test set;
a model building module, configured to build a prediction model on the training set based on a split-and-merge mechanism and a decision tree classification algorithm;
an assessment and scoring module, configured to test the prediction model on the test set and assess the prediction performance of the prediction model from the test results;
a prediction analysis module, configured to analyze the overall data with a prediction model whose tested performance is qualified and predict which telecommunication users will churn.
In a second aspect, an analysis method for predicting telecommunication user churn is provided. The method includes:
collecting feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, extracting a data sample from the overall data of the data warehouse, and randomly dividing the data sample into a training set and a test set;
building a prediction model on the training set based on a split-and-merge mechanism and a decision tree classification algorithm;
testing the prediction model on the test set and assessing the prediction performance of the prediction model from the test results;
analyzing the overall data with a prediction model whose tested performance is qualified, and predicting which telecommunication users will churn.
In a third aspect, an analysis device for predicting telecommunication user churn is provided. The device includes:
a memory, configured to store a program;
a processor, configured to execute the program stored in the memory, the program causing the processor to perform the method of the second aspect.
In a fourth aspect, a computer-readable storage medium is provided. The storage medium contains instructions that, when run on a computer, cause the computer to perform the method of the second aspect.
In a fifth aspect, a computer program product containing instructions is provided. When the product runs on a computer, it causes the computer to perform the methods of the above aspects.
In a sixth aspect, a computer program is provided. When the computer program runs on a computer, it causes the computer to perform the methods of the above aspects.
On the one hand, in the above embodiments the data preprocessing module collects feature values of the telecommunication data of telecommunication users to form the data warehouse to be analyzed, extracts a data sample from the overall data of the data warehouse, and randomly divides the sample into a training set and a test set. This preprocessing extracts good-quality data from massive data; training and testing on that data not only reduces computation and running time but also enables automatic machine learning.
On the other hand, in the above embodiments the model building module builds the prediction model on the training set based on the split-and-merge mechanism and the decision tree algorithm. This addresses data skew, removes the accuracy loss that skewed data causes, improves data accuracy, and in turn significantly improves prediction precision.
In yet another aspect, in the above embodiments the assessment and scoring module tests the prediction model on the test set and assesses its prediction performance from the test results, and the prediction analysis module analyzes the overall data with a prediction model whose tested performance is qualified and predicts which telecommunication users will churn. This effectively shortens the overall prediction time and makes churn prediction real-time and efficient; with the event-based telecommunication user churn prevention method, churn prediction and churn prevention are controlled in an integrated way.
Brief description of the drawings
To explain the technical solutions of the embodiments of the invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to one embodiment of the invention;
Fig. 2 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to another embodiment of the invention;
Fig. 3 is a flow diagram of how the functional units in Fig. 2 analyze and process data;
Fig. 4 is a flow diagram of the implementation of the data preprocessing module in Fig. 1;
Fig. 5 is a flow diagram of the implementation of the model building module in Fig. 1;
Fig. 6 shows the final result of dividing the data set according to one embodiment of the invention;
Fig. 7 is a schematic diagram of the split-and-merge mechanism of the model building module according to one embodiment of the invention;
Fig. 8 is a flow diagram of splitting and merging data in Fig. 7;
Fig. 9 is a flow chart of prediction model assessment and scoring according to one embodiment of the invention;
Fig. 10 is a flow chart of telecommunication user churn prevention according to one embodiment of the invention;
Fig. 11 is a flow diagram of predicting telecommunication user churn according to one embodiment of the invention.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.
It should be noted that, where no conflict arises, the embodiments of the application and the features in them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to one embodiment of the invention.
As shown in Fig. 1, the analysis device for predicting telecommunication user churn may include a data preprocessing module 110, a model building module 120, an assessment and scoring module 130 and a prediction analysis module 140. The data preprocessing module 110 may collect feature values of the telecommunication data of telecommunication users to form a data warehouse to be analyzed, extract a data sample from the overall data of the data warehouse, and randomly divide the data sample into a training set and a test set. The model building module 120 may build a prediction model on the training set based on a split-and-merge mechanism and a decision tree classification algorithm. The assessment and scoring module 130 may test the prediction model on the test set and assess its prediction performance from the test results. The prediction analysis module 140 may analyze the overall data with a prediction model whose tested performance is qualified and predict which telecommunication users will churn.
In some embodiments, the data preprocessing module 110 may include an acquisition unit, a reduction unit, a derivation unit and a statistics unit. The acquisition unit may collect the original variables of the telecommunication data of telecommunication users. The reduction unit may check the original variables for integrity and/or reasonableness and delete the data that fails the check, leaving the remaining variables of the telecommunication data. The derivation unit may derive the remaining variables to obtain the original feature values of the telecommunication data. The statistics unit may analyze the original feature values statistically using the Pearson correlation coefficient, identify highly correlated feature values, and delete them to produce the feature values of the telecommunication data.
By deleting highly correlated feature values, the embodiments of the invention reduce the number of highly correlated attributes. This addresses the existing problem that judging churn solely from the churn probability of a single strongly associated attribute is one-sided and yields poorly founded predictions.
In some embodiments, the feature values of the telecommunication data collected by the data preprocessing module 110 may include one or more of the following: feature data of the user's decision behavior; feature data of the user's usage behavior; feature data of the user's loyalty behavior.
The decision behavior feature data may mainly include original variables such as income, monthly fee, tariff plan type, standard tariff, plan service type and usage, and customer consumption records. After the integrity and/or reasonableness check, these original variables are derived into four composite decision behavior features: average user payment, ratio of user payment to monthly fee, plan usage value, and plan usage value rate.
The usage behavior feature data reflects how a user behaves when using the products or services provided by the operator. After the integrity and/or reasonableness check and derivation, it may mainly include: average number of calls, local call duration, domestic long-distance call duration, total long-distance call duration, roaming call duration, number of short messages, ratio of short messages to calls, and data traffic.
The loyalty behavior feature data mainly describes the types and number of operator products a user has subscribed to, and the recent trend in the user's use of those products and services. After the integrity and/or reasonableness check and derivation, it may mainly include: number of subscribed products, local call trend, domestic long-distance call trend, domestic long-distance call frequency trend, roaming call trend, short message sending trend, and data usage trend.
In some embodiments, the data warehouse stores the data to be analyzed, and its overall data is all the data it holds. The data sample may be extracted from the overall data at random, by a preset proportion, or in a similar way. One way to randomly divide the data sample into a training set and a test set is, for example, to split it at random into 2n small samples, use n of them as the training set and the other n as the test set.
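The 2n-way division described above can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name `split_train_test`, the `seed` parameter and the round-robin way of cutting the shuffled sample into parts are assumptions.

```python
import random

def split_train_test(samples, n, seed=0):
    """Shuffle the sample, cut it into 2n parts, use n parts to train and n to test."""
    rng = random.Random(seed)          # fixed seed only to make the sketch repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    # Assumption: the 2n parts are formed round-robin so their sizes differ by at most one.
    parts = [shuffled[i::2 * n] for i in range(2 * n)]
    train = [row for part in parts[:n] for row in part]
    test = [row for part in parts[n:] for row in part]
    return train, test

train_set, test_set = split_train_test(range(100), n=5)
print(len(train_set), len(test_set))   # 50 50
```

With 100 records and n = 5, each of the 10 parts holds 10 records, so both sets end up with 50 records and share none.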
The data preprocessing module 110 can thus collect, derive, filter and divide feature values from different sources to meet the requirements of the device or software system.
In some embodiments, the model building module 120 may include a parameter determination unit, a splitting unit, a building unit and an integration unit. The parameter determination unit may determine a first ratio, the ratio of churned to non-churned users in the training set, and a second ratio, the ratio of churned to non-churned users in a sub-training set. The splitting unit may compute a third ratio, the ratio of the second ratio to the first ratio, and split the training set into multiple sub-training sets based on the third ratio. The building unit may build a sub-prediction model on each sub-training set using the decision tree classification algorithm. The integration unit may integrate the sub-prediction models by weighted voting to generate the prediction model.
The model building module 120 can thus quickly build a prediction model on the training set with the decision tree classification algorithm, introduce the split-and-merge mechanism to address data skew, and provide a data integration mechanism that makes the prediction more credible. This is described further below.
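One possible reading of the split-and-merge mechanism is sketched below. The text does not spell out how the third ratio maps to sub-training sets or how vote weights are chosen, so several points are assumptions: the third ratio is rounded to give the number of sub-sets, every sub-set reuses all churn samples plus one slice of the non-churn samples, sub-models are plain callables returning 0 or 1, and a tie in the weighted vote counts as churn.

```python
import random

def split_training_set(train, target_ratio=1.0, seed=0):
    """Split an imbalanced training set into more balanced sub-training sets.

    train: list of (features, label) pairs, label 1 = churn, 0 = non-churn.
    target_ratio: desired churn:non-churn ratio in each sub-set (the "second ratio").
    """
    churn = [s for s in train if s[1] == 1]
    stay = [s for s in train if s[1] == 0]
    if not churn or not stay:
        return [list(train)]                        # nothing to balance
    first_ratio = len(churn) / len(stay)            # ratio in the full training set
    k = max(1, round(target_ratio / first_ratio))   # "third ratio", used as sub-set count
    random.Random(seed).shuffle(stay)
    # Assumption: each sub-set = all churn samples + one slice of non-churn samples.
    return [churn + stay[i::k] for i in range(k)]

def weighted_vote(models, weights, x):
    """Merge sub-model outputs: each model votes 0 or 1 with its weight."""
    score = sum(w for m, w in zip(models, weights) if m(x) == 1)
    return 1 if score >= sum(weights) / 2 else 0    # assumption: ties count as churn

# Hypothetical data: 4 churned users and 16 retained users.
train = [((i,), 1) for i in range(4)] + [((i,), 0) for i in range(4, 20)]
subsets = split_training_set(train)

# Hypothetical sub-models standing in for decision trees trained on each sub-set.
models = [lambda x: int(x > 3), lambda x: int(x > 10), lambda x: int(x > 5)]
weights = [0.5, 0.3, 0.2]
prediction = weighted_vote(models, weights, 6)
```

In this example the training set has a 1:4 churn ratio, so a 1:1 target gives four sub-sets of eight samples each. A natural choice for the weights would be each sub-model's accuracy on held-out data, but that choice is not stated in the text.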
The assessment and scoring module 130 may include a test unit, an assessment unit and a reporting unit. The test unit may output the prediction rules of the prediction model and apply them to the test set to obtain the model's lift (LIFT) and/or prediction hit rate. The assessment unit may score the prediction model according to the LIFT and the hit rate. The reporting unit may generate a report that the assessment passed when the score is greater than or equal to a threshold, and a report that the assessment failed and the feature values of the telecommunication data must be collected again when the score is below the threshold. The assessment and scoring module 130 can thus output prediction rules and score the prediction model according to its LIFT and hit rate.
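The text does not define LIFT or hit rate, so the sketch below uses common definitions as an assumption: hit rate is the share of users flagged as churners who really churned, and LIFT is that hit rate divided by the overall churn rate in the test set. The function name and the sample labels are hypothetical.

```python
def hit_rate_and_lift(actual, predicted):
    """actual, predicted: equal-length lists of 0/1 labels (1 = churn)."""
    flagged = [a for a, p in zip(actual, predicted) if p == 1]
    if not flagged or sum(actual) == 0:
        return 0.0, 0.0                       # metrics undefined; zeros by convention here
    hit_rate = sum(flagged) / len(flagged)    # correct churn predictions / all churn predictions
    base_rate = sum(actual) / len(actual)     # churn rate without any model
    return hit_rate, hit_rate / base_rate

# Hypothetical test-set labels and model predictions.
actual    = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
hit_rate, lift = hit_rate_and_lift(actual, predicted)
```

Here two of the three flagged users really churned (hit rate 2/3) against a base churn rate of 3/10, so the model finds churners about 2.2 times more often than random selection would.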
The prediction analysis module 140 may analyze the overall data as a whole, record the analysis result for each user, and score the user's loyalty. When the score is less than or equal to a threshold, the user is predicted to churn; when the score is above the threshold, the user is predicted not to churn.
In some embodiments, after preliminary screening, the feature values of the telecommunication data (for example, data associated with the telecommunication users) are collected and derived by the system and converted into high-quality data that meets the system's input conditions, forming the data warehouse to be analyzed. A data sample (comprising a training set and a test set) is then extracted from the warehouse, a churn prediction model is built with a data mining algorithm, and the model is tested on the test set and scored according to its prediction accuracy. A prediction model whose score meets the requirement is used to predict on the overall data and score each user's loyalty, and whether a user will churn is predicted from that score.
Some embodiments may further include a churn prevention module. The churn prevention module may analyze the overall data as a whole and create trigger events; record user analysis results and score user loyalty; generate and dispatch, according to the trigger events, targeted preset retention schemes for telecommunication users, including suggestions for changing the tariff plan; coordinate and schedule execution of the retention schemes; and record user feedback for later assessment of how the retention schemes were executed.
Thus, when a user is predicted to churn, the churn prevention module can execute a retention scheme to win back the user who is about to churn.
Fig. 2 is a schematic structural diagram of an analysis device for predicting telecommunication user churn according to another embodiment of the invention.
On top of the function of Fig. 1, predicting whether a user will churn, this embodiment adds the function of executing a retention scheme for users predicted to churn, so as to win them back.
As shown in Fig. 2, the analysis device for predicting telecommunication user churn may include: a data verification unit 201, a split-and-merge unit 202, a prediction model building unit 203, a prediction model assessment unit 204, a behavior detection unit 205, a user analysis unit 206, a plan matching unit 207, a scheme generation unit 208, an interactive early-warning unit 209 and a customer relationship management unit 210. The device can not only predict which users will churn but also maintain the users who are about to churn.
Through the above functional units, the embodiments of the invention can solve the following technical problems and achieve the corresponding effects:
1, the decision tree classification algorithm lets the computer automatically analyze and mine churn-related rules from massive data. When the feature information of telecommunication users is correlated with churn behavior and tested, the system automatically compares the information gain across the users' multiple features, avoiding interference from human subjective factors.
2, easily understood churn prediction rules are generated, which reflect the internal relationship in how the associated attributes jointly influence churn behavior.
3, the preset split-and-merge unit addresses data skew.
4, the prediction model is built quickly on a small-scale test set so that it can be assessed in real time, coping with changes in the direction of market development and ensuring that the prediction model remains applicable.
5, the prediction model is generated quickly from the training set, solving the problem that long computation prevents potential churners from being found in time.
6, the telecommunication user churn prevention module automatically forms a customer retention scheme from the prediction analysis results, including a fine-grained recommendation of a tariff plan better suited to the user.
7, potential churners are detected and predicted at the level of the phone number, solving the prediction misjudgment caused by one person holding multiple numbers.
Fig. 3 is a flow diagram of how the functional units in Fig. 2 analyze and process data.
As shown in Fig. 3, the flow may include the following steps:
S301, the data verification unit 201 may collect and reduce data from different sources to meet the requirements of the software system in the functional modules. It may also analyze the data statistically, derive key feature data, and randomly divide the data into a training set and a test set.
S302, the split-and-merge unit 202 splits and merges the training set data, which reduces the influence of data skew on the correctness of the result. The prediction model building unit 203 may build the prediction model using a decision tree algorithm. (This is described further below.)
S303, the prediction model assessment unit 204 assesses the prediction analysis model on the test data set, produces a performance analysis report, and records the prediction accuracy.
S304, determine whether the prediction accuracy is below a threshold. When the prediction accuracy is not below the threshold, go to step S305. When the prediction accuracy is below the threshold, go back to step S301.
S305, when the prediction accuracy is not below the threshold, for the prediction analysis model whose score meets the assessment requirement, the behavior detection unit 205 extracts the decision rules (i.e. prediction rules) using a prediction rule extractor, runs a prediction rule execution engine over the overall data for prediction analysis, and creates trigger events such as event 1, event 2, event 3, event 4 and event 5. Specific events may be: 4G user, data allowance exceeded for several months, insufficient voice minutes, unsuitable tariff plan, and so on.
S306, the user analysis unit 206 scores each user's loyalty according to the off-net (churn) probability in the prediction analysis results. The scheme generation unit (retention scheme generation and dispatch unit) 208 selects and dispatches a targeted preset retention scheme according to the trigger event.
S307, the plan matching unit 207 analyzes the user's usage and other customer data to offer the user a better-suited tariff plan as one way of retaining the user.
S308, the interactive early-warning unit 209 dynamically selects the medium that fits the retention scheme (such as the network or the telephone system), schedules resources sensibly to issue the notification and execute the retention scheme, and records how the retention scheme was executed in the customer relationship management (CRM) system 210.
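The loop formed by steps S301 to S305 (collect again whenever the assessed accuracy is below the threshold) can be sketched as below. This is only an outline of the control flow: the stage functions are stand-ins passed in by the caller, and the `max_rounds` cap is an assumption added so the sketch always terminates; the text itself sets no limit on retries.

```python
def run_until_qualified(preprocess, build_model, evaluate, threshold, max_rounds=3):
    """Repeat S301-S303 until the model passes the S304 accuracy check."""
    accuracy = None
    for round_no in range(1, max_rounds + 1):
        train, test = preprocess()             # S301: collect, reduce, derive, divide
        model = build_model(train)             # S302: split-and-merge + decision trees
        accuracy = evaluate(model, test)       # S303: assess on the test set
        if accuracy >= threshold:              # S304: qualified, continue with S305
            return model, accuracy, round_no
    return None, accuracy, max_rounds          # assumption: give up after max_rounds

# Stub stages for illustration: the first round scores 0.62, the second 0.85.
accuracies = iter([0.62, 0.85])
model, accuracy, rounds = run_until_qualified(
    preprocess=lambda: ([], []),
    build_model=lambda train: "model",
    evaluate=lambda model, test: next(accuracies),
    threshold=0.8,
)
```

With these stubs the first model is rejected and the second is accepted, so the call returns after two rounds.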
Fig. 4 is a flow diagram of the implementation of the data preprocessing module in Fig. 1.
As shown in Fig. 4, the flow may include the following steps:
S401, collect data, including user decision behavior feature data, user usage behavior feature data and user loyalty behavior feature data.
The user decision behavior feature data mainly includes original variables such as income, monthly fee, tariff plan type, standard tariff, plan service type and usage, and customer consumption records. Derivation further yields composite decision behavior features such as average user payment, ratio of user payment to monthly fee, plan usage value, and plan usage value rate.
The user usage behavior feature data reflects how a user behaves when using the products or services provided by the operator. After derivation it mainly includes: average number of calls, local call duration, domestic long-distance call duration, total long-distance call duration, roaming call duration, number of short messages, ratio of short messages to calls, and data traffic.
The user loyalty behavior feature data mainly describes the types and number of operator products a user has subscribed to, and the recent trend in the user's use of those products and services. After derivation it may mainly include: number of subscribed products, local call trend, domestic long-distance call trend, domestic long-distance call frequency trend, roaming call trend, short message sending trend, and data usage trend.
S402, check data integrity and reasonableness. Specifically, check whether the data contains blank values, unreasonable values, or values outside the allowed range.
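A minimal sketch of the S402 check follows. The field names, the allowed ranges in `VALID_RANGES` and the sample records are hypothetical; the text only says that blank, unreasonable or out-of-range values are checked for.

```python
def passes_checks(record, valid_ranges):
    """Return True if every checked field is present and inside its allowed range."""
    for field, (low, high) in valid_ranges.items():
        value = record.get(field)
        if value is None or value == "":
            return False              # integrity check: blank value
        if not (low <= value <= high):
            return False              # reasonableness check: value out of range
    return True

# Hypothetical ranges: fee in currency units, call minutes within one 31-day month.
VALID_RANGES = {"monthly_fee": (0, 10_000), "call_minutes": (0, 44_640)}

records = [
    {"monthly_fee": 58, "call_minutes": 320},
    {"monthly_fee": None, "call_minutes": 100},   # blank value
    {"monthly_fee": 88, "call_minutes": -5},      # negative duration
]
clean = [r for r in records if passes_checks(r, VALID_RANGES)]
```

Only the first record survives; the other two are the kind of data the check is meant to delete.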
S403, analyze the data using the Pearson correlation coefficient r, where:
$$r = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$ (formula 1)
In formula 1, r denotes the Pearson correlation coefficient, X denotes one feature variable, Y denotes another feature variable, and $\bar{X}$ and $\bar{Y}$ are the mean values of feature variable X and feature variable Y respectively.
When 0.8 ≤ |r| < 1, X and Y are highly correlated, and the embodiments of the invention may remove one of the two highly correlated attributes.
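The correlation filter of S403 can be sketched as below. The helper names and the feature columns are hypothetical, and two simplifications are assumptions: the first feature of each highly correlated pair is the one kept, and a perfectly correlated pair (|r| = 1) is also treated as highly correlated.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0                     # constant column: treated as uncorrelated (assumption)
    return cov / math.sqrt(var_x * var_y)

def drop_highly_correlated(features, threshold=0.8):
    """features: dict of name -> values. Keep a feature only if |r| < threshold with all kept ones."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson_r(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical feature columns for five users.
features = {
    "local_call_minutes": [10, 20, 30, 40, 50],
    "total_call_minutes": [12, 22, 33, 41, 52],   # almost a copy of the column above
    "sms_count": [5, 1, 4, 2, 3],
}
kept = drop_highly_correlated(features)
print(kept)   # ['local_call_minutes', 'sms_count']
```

Total call minutes is dropped because its correlation with local call minutes is about 0.999, while the short message count stays because its correlation with local call minutes is only -0.3.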
S404, randomly divide the data set into a training set and a test set.
Fig. 5 is a flow diagram of the implementation of the model building module in Fig. 1.
As shown in Fig. 5, the model building module's telecommunication user churn prediction analysis, based on a data mining decision tree algorithm, may proceed as follows:
S501, determine the sample set and the attributes to be analyzed.
Let N be the sample set, J the set of features to be analyzed, and I the set of class labels.
S502, create the root node R.
S503, determine whether the samples at node R all belong to the same class in I.
S504, if the samples in N all belong to the same class, return node R as a leaf node labeled with that class i.
S505, determine whether the feature set J is empty or the number of samples at the node is below a given value.
S506, if J is empty or the number of samples at the node is below the given value, return node R as a leaf node labeled with the class that occurs most often in N.
S507, compute the information gain ratio on N of each feature in J (j1, j2, ..., jn), and select the feature with the highest information gain ratio as the test feature.
The information gain ratio is computed as follows:
Entropy measures the uncertainty of the predicted label value in sample set N; the larger the entropy, the higher the uncertainty. The formula is:
$$E(N) = -\sum_{i} p_i \log_2 p_i$$ (formula 2)
In formula 2, $p_i$ is the probability of predicted label value $X_i$ in set N.
Conditional entropy measures the uncertainty of the predicted label value when the value of feature variable J is known. The formula is:
$$E(N \mid J) = \sum_{i} p_i \, E(N \mid J = j_i)$$ (formula 3)
In formula 3, $p_i$ is the probability that feature variable $J = j_i$ in set N, and $E(N \mid J = j_i)$ is the entropy of the samples for which $J = j_i$.
Information gain measures how much knowing feature J reduces the uncertainty of N. The formula is:
$$G(N \mid J) = E(N) - E(N \mid J)$$ (formula 4)
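Using the information gain ratio effectively counters the tendency of the information gain method to split first on features with many distinct values. The formula, in its standard form, is:
$$GainRatio(N \mid J) = \frac{G(N \mid J)}{-\sum_{i} p_i \log_2 p_i}$$ (formula 5)
In formula 5, $GainRatio(N \mid J)$ denotes the information gain ratio and $p_i$, as in formula 3, is the probability that $J = j_i$ in set N.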
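The quantities in formulas 2 to 5 can be computed with a few lines of code. The sketch below assumes the standard C4.5 definition of the gain ratio (information gain divided by the entropy of the feature's own value distribution). The five sample rows are hypothetical: they match the 2 churned / 3 retained and 4G:3G:2G = 3:1:1 counts used in the text's example, but which users churned is made up.

```python
import math
from collections import Counter

def entropy(labels):
    """Formula 2: uncertainty of a list of labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def conditional_entropy(feature_values, labels):
    """Formula 3: weighted entropy of the labels within each feature value."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

def information_gain(feature_values, labels):
    """Formula 4: reduction in uncertainty after learning the feature."""
    return entropy(labels) - conditional_entropy(feature_values, labels)

def gain_ratio(feature_values, labels):
    """Formula 5 (standard C4.5 form, an assumption): gain / entropy of the feature values."""
    split_info = entropy(feature_values)
    return information_gain(feature_values, labels) / split_info if split_info else 0.0

# Hypothetical joint assignment of network type and churn label.
net_type = ["4G", "4G", "4G", "3G", "2G"]
labels = ["churn", "churn", "stay", "stay", "stay"]

base_entropy = entropy(labels)          # about 0.971 for 2 churned and 3 retained users
ratio = gain_ratio(net_type, labels)
```

Dividing by the feature's own entropy is what penalizes features that fragment the data into many small groups.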
S509, if the test feature is a discrete feature, generate one branch for each distinct feature value and split the node accordingly.
S510, if the test feature is a continuous feature, split the node according to the segmentation threshold of that feature.
The segmentation threshold of a continuous feature may be found as follows. Sort the values of the continuous feature in ascending order. Take the midpoint of every two adjacent feature values as a candidate split point and compute the information gain of each split point. To speed up the algorithm, compute the information gain only for split points at which the class label changes. With $t$ denoting the segmentation threshold, the formulas are:
$$E(\text{continuous feature}) = p_{\le t}\,E(\le t) + p_{> t}\,E(> t)$$ (formula 7)
$$G(N \mid \text{continuous feature}) = E(N) - E(\text{continuous feature})$$ (formula 8)
Compute the information gain ratio of each split point.
According to the preset maximum number of branches, select the split points with the largest information gain as the segmentation thresholds.
S511: for each new node generated by the split, jump to S503 and repeat.
S512: finally, prune the tree according to the classification errors at each node. A specific pruning strategy can be as follows:
Using the original sample set as test data, compute the prediction accuracy of the decision tree with and without pruning; if the prediction accuracy does not decrease after a subtree is cut, cut that subtree.
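The pruning rule above can be sketched as follows. The tree representation (nested dictionaries with `feature`, `threshold`, `majority`, `left`, `right`, and `label` keys), the binary splits, and the sample data are all assumptions made for illustration; the patent does not specify a data structure.

```python
def predict(node, row):
    """Walk from `node` down to a leaf and return its class label."""
    while "label" not in node:
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


def accuracy(tree, rows, labels):
    """Fraction of samples the tree classifies correctly."""
    hits = sum(predict(tree, row) == label for row, label in zip(rows, labels))
    return hits / len(rows)


def prune(tree, rows, labels, node=None):
    """Bottom-up pruning: cut a subtree if accuracy on the samples does not drop."""
    node = tree if node is None else node
    if "label" in node:
        return
    prune(tree, rows, labels, node["left"])
    prune(tree, rows, labels, node["right"])
    before = accuracy(tree, rows, labels)
    saved = dict(node)
    node.clear()
    node["label"] = saved["majority"]  # tentatively replace the subtree with a leaf
    if accuracy(tree, rows, labels) < before:
        node.clear()
        node.update(saved)  # accuracy fell, so restore the subtree


# Hypothetical samples; the right-hand subtree is redundant and should be cut.
rows = [
    {"fluctuation": 0.00, "tenure": 8},
    {"fluctuation": 0.11, "tenure": 3},
    {"fluctuation": 0.12, "tenure": 6},
    {"fluctuation": 0.28, "tenure": 2},
    {"fluctuation": 0.39, "tenure": 9},
]
labels = ["stay", "stay", "stay", "churn", "churn"]
tree = {
    "feature": "fluctuation", "threshold": 0.2, "majority": "stay",
    "left": {"label": "stay"},
    "right": {
        "feature": "tenure", "threshold": 5, "majority": "churn",
        "left": {"label": "churn"}, "right": {"label": "churn"},
    },
}
prune(tree, rows, labels)
```

In this toy case the inner split on `tenure` is removed because both of its leaves predict the same class, while the root split is kept because replacing it with a single leaf would lower the accuracy.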
The customer churn prediction analysis process is illustrated below with a simple use case; the case data are given in Table (1):
Table (1)
In Table (1), the middle 4 columns serve as user feature data in the initial state, and the last column serves as the prediction label value. For ease of illustration, the maximum number of branches for a continuous variable is set to 2 in this example.
In the initial state, the number of off-network users is 2 and the number of in-network users is 3, so the initial information entropy of this column is:
In formula 9, E(S) denotes the initial information entropy.
Next, compute the conditional entropy of the discrete features network type and arrears status. The network type can be divided into 3 classes, namely 4G, 3G and 2G, in the ratio 3:1:1, so the conditional entropy of this feature is:
where E(x) denotes the conditional entropy.
The information gain is then:
G(N | network type) = E(N) - E(network type) = 0.42 (formula 11)
Arrears status is divided into 2 classes, with a data ratio of 3:2. Similarly, the conditional entropy of this feature is:
G(N | arrears status) = E(N) - E(arrears status) = 0.02 (formula 14)
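The figures quoted above can be checked numerically. Table (1) is not reproduced in the text, so the per-group class counts below are assumptions chosen to match the stated group sizes (3:1:1 and 3:2) and the overall 3 in-network and 2 off-network users; only the resulting gains of about 0.42 and 0.02 come from the text.

```python
import math


def entropy_from_counts(counts):
    """Base-2 entropy of a class distribution given as counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def conditional_entropy_from_groups(groups):
    """Weighted entropy over groups, each given as (in_network, off_network) counts."""
    total = sum(sum(group) for group in groups)
    return sum(sum(group) / total * entropy_from_counts(group) for group in groups)


# Initial state from the text: 3 in-network users, 2 off-network users.
initial_entropy = entropy_from_counts([3, 2])

# ASSUMED (in_network, off_network) counts per feature value.
network_type_groups = [(2, 1), (1, 0), (0, 1)]  # 4G, 3G, 2G
arrears_groups = [(2, 1), (1, 1)]               # the two arrears classes

gain_network_type = initial_entropy - conditional_entropy_from_groups(network_type_groups)
gain_arrears = initial_entropy - conditional_entropy_from_groups(arrears_groups)

print(round(initial_entropy, 3), round(gain_network_type, 2), round(gain_arrears, 2))
```

Under these assumed counts the initial entropy is about 0.971 and the two gains round to 0.42 and 0.02, which agrees with formulas 11 and 14.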
For the continuous feature, select the possible split points and compute the information gain ratio of each split point in turn. The selection of split points is shown in Table (2):
Time on network | 0 | 0.11 | 0.12 | 0.28 | 0.39 |
Off-network | No | No | No | Yes | Yes |
Table (2)
Because the class label changes only between 0.12 and 0.28 in this example, the choice of split point is unique, with a value of 0.2. The calculation of the information gain ratio for this split point is given below:
G(N | call-charge fluctuation rate) = E(N) - E(call-charge fluctuation rate) = 0.971 (formula 18)
Comparing the above formulas gives: GainRatio(call-charge fluctuation rate) > GainRatio(network type) > GainRatio(arrears status).
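The threshold search for the continuous feature can be sketched with the five values and labels of Table (2). Only boundaries where the class label changes are evaluated, as the procedure describes. The function names are illustrative, and the sketch reports the information gain of formula 8 rather than a gain ratio.

```python
import math
from collections import Counter


def entropy(labels):
    """Base-2 entropy of a list of class labels (0 for an empty list)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best midpoint split, or None if no label change."""
    pairs = sorted(zip(values, labels))
    base = entropy([label for _, label in pairs])
    best = None
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if l1 == l2:
            continue  # skip boundaries where the class label does not change
        threshold = (v1 + v2) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        conditional = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - conditional
        if best is None or gain > best[1]:
            best = (threshold, gain)
    return best


# Values and labels as listed in Table (2).
values = [0, 0.11, 0.12, 0.28, 0.39]
off_network = ["no", "no", "no", "yes", "yes"]
threshold, gain = best_threshold(values, off_network)
```

With this data the only candidate is the midpoint of 0.12 and 0.28, and the split leaves both sides pure, so the gain equals the initial entropy of about 0.971.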
Fig. 6 is a diagram of the final result of partitioning the data set in one embodiment of the invention.
As shown in Fig. 6, the call-charge fluctuation rate is selected as the feature for this split and the data set is partitioned; the process is then repeated for the subsequent subtrees. The steps that produce the final partition result can be as follows:
S601: the data set may comprise: in-network user ratio 60%, off-network user ratio 40%, 2 off-network users, 3 in-network users, and 5 users in total.
S602: select the call-charge fluctuation rate as the feature for this split and partition the data set.
S603: when the call-charge fluctuation rate ≤ 0.2, the sub data set after the split may be: in-network user ratio 100%, off-network user ratio 0%, 0 off-network users, 3 in-network users, and 3 users in total.
S604: when the call-charge fluctuation rate > 0.2, the sub data set after the split may be: in-network user ratio 0%, off-network user ratio 100%, 2 off-network users, 0 in-network users, and 2 users in total.
Prediction analysis on the case data thus yields the following prediction rules:
Prediction rule 1: when the fluctuation rate is less than or equal to 0.2, the user will not churn.
Prediction rule 2: when the fluctuation rate is greater than 0.2, the user will churn.
The data set in the above use case is small and serves only to illustrate the algorithm; running the algorithm on a massive user feature data set yields more practical prediction rules.
Fig. 7 is a schematic diagram of the split-and-merge mechanism of the model building module of one embodiment of the invention.
As shown in Fig. 7, the schematic may include a split unit 710 and a merge unit 720. The split unit 710 can split training set S into training set S1, training set S2, training set S3, training set S4, and so on. From training sets S1, S2, S3 and S4, the decision tree classification algorithm yields prediction model 1, prediction model 2, prediction model 3 and prediction model 4. The merge unit 720 can merge prediction models 1, 2, 3 and 4 into a single prediction model.
Fig. 8 is a flow diagram of splitting and merging the data in Fig. 7.
Churned users make up a small share of all users (generally 1.5-2%), so a randomly selected training data set may be severely skewed. If this is not handled properly, it may undermine the validity of the prediction model, or even make it impossible to generate one, for example when none of the users in the training set is a churned user. In this embodiment, introducing the split unit and the merge unit can solve the data skew problem while avoiding data ambiguity.
As shown in Fig. 8, the flow may include the following steps:
S801: determine the relevant parameters.
(1) Let S be the training set.
(2) Let N be the total number of users contained in training set S.
(3) Let the ratio of churned users to in-network users be 1:x.
(4) Let the expected ratio of churned users to in-network users in each training subset be 1:y.
S802: the split-merge unit creates multiple training subsets from training set S according to the expected data ratio. That is, the in-network users, who make up the larger share of all users, are distributed randomly and evenly among the training subsets, and the churned users, who make up the smaller share, are copied into every training subset according to the expected data ratio. With the parameters of step S801, the number of training subsets is x/y, and each training subset has N/(1+x) churned users and N × y/(1+x) in-network users.
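Step S802 can be sketched as follows. The function name, the use of plain lists of user IDs, the fixed random seed, and the handling of a remainder (any in-network users left over after the last full subset are dropped) are assumptions for illustration.

```python
import random


def build_training_subsets(churned, in_network, y, seed=0):
    """Copy all churned users into every subset and spread in-network users evenly.

    churned, in_network: lists of user records (here, just IDs)
    y: expected number of in-network users per churned user in each subset
    """
    if not churned:
        raise ValueError("at least one churned user is required")
    rng = random.Random(seed)
    shuffled = in_network[:]
    rng.shuffle(shuffled)  # random assignment of in-network users
    chunk = len(churned) * y  # in-network users per subset, N*y/(1+x)
    n_subsets = max(1, len(shuffled) // chunk)  # x/y when it divides evenly
    subsets = []
    for i in range(n_subsets):
        part = shuffled[i * chunk:(i + 1) * chunk]
        subsets.append(churned + part)
    return subsets


# Hypothetical example: N = 100 users, ratio 1:49 (x = 49), target ratio 1:7 (y = 7).
churned_users = [0, 1]
in_network_users = list(range(2, 100))
subsets = build_training_subsets(churned_users, in_network_users, y=7)
```

With N = 100, x = 49 and y = 7 this yields x/y = 7 subsets, each holding N/(1+x) = 2 churned users and N × y/(1+x) = 14 in-network users, matching the counts given in S802.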
S803: build a prediction model independently on each training subset.
S804: each prediction model separately predicts the user instances in the whole training set.
S805: the split-merge unit integrates the overall prediction results using the weighted voting principle. Specifically, let n1 be the number of prediction models whose result is "user stays on the network" and n2 the number of prediction models whose result is "user churns". Assign weight w1 to n1 and weight w2 to n2. When w1 × n1 > w2 × n2, the prediction result of the split-merge unit is "user stays on the network"; otherwise, the prediction result is "user will churn".
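The weighted vote in S805 reduces to a single comparison. In this sketch the label strings, the function name, and the default weights are illustrative choices, not values from the patent.

```python
def weighted_vote(predictions, w_stay=1.0, w_churn=1.0):
    """Combine per-model predictions: 'stay' wins only if w1*n1 > w2*n2."""
    n1 = sum(1 for p in predictions if p == "stay")   # models predicting in-network
    n2 = sum(1 for p in predictions if p == "churn")  # models predicting churn
    return "stay" if w_stay * n1 > w_churn * n2 else "churn"


# Hypothetical outputs of three sub-models for one user.
votes = ["stay", "stay", "churn"]
equal_weights = weighted_vote(votes)
churn_favoured = weighted_vote(votes, w_churn=3.0)
```

Raising the churn weight lets a minority of churn votes decide the outcome, and a tie falls to "churn" because the rule uses a strict inequality.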
The above embodiments can be implemented wholly or partly in software, hardware, firmware, or any combination thereof. For example, two functional units may be integrated into one unit, or divided into two separate modules. When implemented in software, the embodiments can be realized wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions which, when run on a computer, cause the computer to execute the methods described in the above embodiments. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are produced wholly or partly. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium can be any usable medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more usable media. The usable medium can be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).
Fig. 9 is a flow chart of prediction model evaluation and scoring in one embodiment of the invention.
During actual prediction, the embodiment of the present invention needs to track the accuracy of the prediction model in real time in order to assess whether feature values need to be re-selected. The specific steps of the evaluation method can be as follows:
S901: determine the evaluation parameters and export the prediction rules.
S902: use the prediction rules to predict on the test set.
S903: score the prediction model according to its LIFT and its hit rate.
S904: compare the prediction result with the evaluation threshold. If the prediction result exceeds the evaluation threshold, go to step S905; otherwise, go to step S906.
S905: complete the model evaluation and generate an evaluation report.
S906: if the prediction result is below the evaluation threshold, report that the accuracy is below standard, re-select the feature values, and check the training set.
In S903, LIFT and the hit rate of the prediction model are calculated as follows:
Let A be the number of users correctly predicted to leave the network and B the number of users incorrectly predicted to leave the network; the hit rate of the prediction model is then defined as A/(A+B). LIFT (model lift) is the hit rate obtained when using the prediction model divided by the user churn rate obtained without using the model. Specifically, in the embodiment of the present invention, users can be sorted by predicted churn likelihood, and hit rate (X%) is defined as the churn hit rate among the top X% of users with the highest churn likelihood. Then LIFT = hit rate (X%) / churn rate without the model. For example, if the overall churn rate of telecom users without the model is 2%, X = 5, and hit rate (5%) = 20%, then LIFT = 20%/2% = 10.
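The LIFT definition can be reproduced with a small script. The data below are hypothetical and constructed so that the overall churn rate is 2% and one of the top 5 ranked users churns, mirroring the numeric example in the text; the function names are illustrative.

```python
def hit_rate_at(scores, churned, top_percent):
    """Share of actual churners among the top X% of users ranked by churn score."""
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * top_percent / 100))
    return sum(1 for _, is_churn in ranked[:k] if is_churn) / k


def lift(scores, churned, top_percent):
    """hit rate (X%) divided by the overall churn rate without the model."""
    base_rate = sum(churned) / len(churned)
    return hit_rate_at(scores, churned, top_percent) / base_rate


# Hypothetical population of 100 users with 2 churners (2% base rate).
scores = [100 - i for i in range(100)]  # user 0 has the highest churn score
churned = [False] * 100
churned[2] = True    # ranked inside the top 5%
churned[60] = True   # ranked outside the top 5%

top5_hit_rate = hit_rate_at(scores, churned, 5)
model_lift = lift(scores, churned, 5)
```

Here the top 5% contains 5 users, one of whom churns, so the hit rate is 20% and the lift is 20% / 2% = 10.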
Fig. 10 is a flow chart of telecom user churn prevention in one embodiment of the invention.
In the embodiment of the present invention, once a prediction model that meets the evaluation requirements has been obtained, it is used to analyze the target data set, and effective user retention is carried out based on the analysis results, so as to prevent users from leaving the network.
As shown in Fig. 10, the specific steps of the flow can be as follows:
S101: extract the prediction rules.
S102: analyze the target data set using the prediction rules to obtain the analysis results.
S103: create retention trigger events based on the analysis results.
S104: the retention scheme generation and dispatch unit uses the trigger events to generate user retention schemes and dispatches them to the interactive early-warning unit.
S105: according to the attributes of each communication channel and its real-time resource utilization, the interactive early-warning unit applies different dynamic scheduling algorithms, plans media resources in a unified manner, guides the execution of the user retention schemes, and records the execution results.
Fig. 11 is a flow diagram of predicting telecom user churn in one embodiment of the invention.
As shown in Fig. 11, the flow may include the following steps:
S111: collect the feature values of the telecom data of telecom users to form a data bin to be analyzed, draw a data sample from the overall data of the data bin, and randomly divide the data sample into a training set and a test set.
S112: build a prediction model on the training set based on the split-and-merge mechanism and the decision tree classification algorithm.
S113: test the prediction model using the test set, and evaluate the predictive performance of the prediction model according to the test result.
S114: analyze the overall data using a prediction model whose test performance is qualified, and predict the users who will churn among the telecom users.
On the one hand, in the above embodiments the data preprocessing module collects the feature values of the telecom data of telecom users, forms a data bin to be analyzed, draws a data sample from the overall data of the data bin, and randomly divides the data sample into a training set and a test set. This preprocessing extracts better-quality data from massive data; training and testing on better-quality data not only reduces the amount of computation and the time required, but also enables automatic machine learning.
On the other hand, in the embodiments of the present invention the model building module builds the prediction model on the training set based on the split-and-merge mechanism and the decision tree algorithm. This eliminates the problems caused by data skew, overcomes the poor accuracy that skewed data produce, improves the accuracy of the data, and thereby significantly improves prediction accuracy.
In yet another aspect, in the embodiments of the present invention the evaluation scoring module tests the prediction model using the test set and evaluates its predictive performance according to the test result, and the prediction analysis module analyzes the overall data using a prediction model whose test performance is qualified and predicts the users who will churn among the telecom users. This effectively reduces the overall prediction time and makes churn prediction real-time and efficient; the event-based telecom user churn prevention method realizes integrated control of churn prediction and prevention.
In some embodiments, collecting the feature values of the telecom data of telecom users in step S111 may include: collecting the raw variables of the telecom data of telecom users; performing completeness and/or reasonableness checks on the raw variables and deleting the data that fail the checks, to form the remaining variables of the telecom data; deriving from the remaining variables to obtain the original feature values of the telecom data; and statistically analyzing the original feature values using the Pearson correlation coefficient to identify highly correlated feature values, then deleting the highly correlated feature values to generate the feature values of the telecom data.
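The correlation-based filtering step can be sketched as follows. The patent does not state a cut-off or which member of a correlated pair is deleted, so the 0.9 threshold, the rule of keeping the first feature seen, and the feature names are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (std_x * std_y)


def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| <= threshold against every feature already kept."""
    kept = {}
    for name, column in features.items():
        if all(abs(pearson(column, other)) <= threshold for other in kept.values()):
            kept[name] = column
    return kept


# Hypothetical derived features for five users.
features = {
    "monthly_fee": [10, 20, 30, 40, 50],
    "monthly_fee_cents": [1000, 2000, 3000, 4000, 5000],  # redundant copy
    "complaints": [3, 0, 2, 0, 1],
}
selected = drop_correlated(features)
```

The redundant `monthly_fee_cents` column is perfectly correlated with `monthly_fee` and is removed, while `complaints` is only weakly correlated and is kept.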
In some embodiments, building the prediction model on the training set based on the split-and-merge mechanism and the decision tree classification algorithm in step S112 may include: splitting the training set into multiple sub training sets; building a sub prediction model on each sub training set using the decision tree classification algorithm; and integrating the sub prediction models according to the weighted voting principle to generate the prediction model.
In some embodiments, testing the prediction model using the test set and evaluating its predictive performance according to the test result in step S113 may include: exporting the prediction rules of the prediction model and predicting on the test set according to the prediction rules, to obtain the model lift (LIFT) and/or the prediction hit rate of the prediction model; scoring the prediction model according to the LIFT and the hit rate; when the evaluation score is greater than or equal to a threshold, generating a report that the evaluation is passed; and when the evaluation score is less than the threshold, generating a report that the evaluation is failed and that the feature values of the telecom data of telecom users need to be re-collected.
In some embodiments, after the users who will churn among the telecom users are predicted, the method may further include: creating a trigger event that triggers the execution of a preset retention scheme for the users predicted to churn.
In some embodiments, the feature values of the telecom data include one or more of the following: feature data of the decision behavior of telecom users; feature data of the usage behavior of telecom users; feature data of the loyalty behavior of telecom users.
On the one hand, the above embodiments can use data mining technology to compare the information gain across multiple features automatically, which greatly reduces the limitations that arise from analysts' subjective factors and makes churn prediction analysis intelligent.
On the other hand, the above embodiments introduce an evaluation mechanism for the prediction model and use the training set to build the prediction model, which effectively reduces the overall prediction time and makes churn prediction real-time and efficient.
In yet another aspect, the above embodiments use the event-based telecom user churn prevention method to realize integrated control of churn prediction and prevention.
In addition, the above embodiments introduce a decision tree classification algorithm with a split-and-merge mechanism, which can solve the data skew problem and improve prediction accuracy.
It should be noted that, where no conflict arises, those skilled in the art can flexibly adjust the order of the above operating steps, or flexibly combine the above steps, according to actual needs. For brevity, the various implementations are not described further. In addition, the contents of the embodiments may be cross-referenced.
In addition, the devices of the above embodiments can serve as the executing entities of the methods of the above embodiments, can implement the corresponding processes of each method, and achieve the same technical effects; for brevity, these are not repeated here.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the essence of the above technical solution, or the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in each embodiment or in certain parts of an embodiment.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (14)
1. An analysis device for predicting telecom user churn, characterized by comprising:
a data preprocessing module, configured to collect feature values of the telecom data of telecom users, form a data bin to be analyzed, draw a data sample from the overall data of the data bin, and randomly divide the data sample into a training set and a test set;
a model building module, configured to build a prediction model on the training set based on a split-and-merge mechanism and a decision tree classification algorithm;
an evaluation scoring module, configured to test the prediction model using the test set and evaluate the predictive performance of the prediction model according to the test result;
a prediction analysis module, configured to analyze the overall data using a prediction model whose test performance is qualified, and predict the users who will churn among the telecom users.
2. The device according to claim 1, characterized in that the data preprocessing module comprises:
an acquisition unit, configured to collect the raw variables of the telecom data of the telecom users;
a reduction unit, configured to perform completeness and/or reasonableness checks on the raw variables and delete the data that fail the checks, to form the remaining variables of the telecom data;
a derivation unit, configured to derive from the remaining variables to obtain the original feature values of the telecom data;
a statistics unit, configured to statistically analyze the original feature values using the Pearson correlation coefficient to obtain highly correlated feature values, delete the highly correlated feature values, and generate the feature values of the telecom data.
3. The device according to claim 1, characterized in that the model building module comprises:
a parameter setting unit, configured to determine a first ratio of churned users to non-churned users in the training set, and to determine a second ratio of churned users to non-churned users in the sub training sets;
a split unit, configured to calculate a third ratio, being the ratio of the second ratio to the first ratio, and to split the training set into multiple sub training sets based on the third ratio;
a building unit, configured to build a sub prediction model on each sub training set using the decision tree classification algorithm;
an integration unit, configured to integrate the sub prediction models according to the weighted voting principle to generate the prediction model.
4. The device according to claim 1, characterized in that the evaluation scoring module comprises:
a test unit, configured to export the prediction rules of the prediction model and predict on the test set according to the prediction rules, to obtain the model lift (LIFT) and/or the prediction hit rate of the prediction model;
an evaluation unit, configured to score the prediction model according to the LIFT and the hit rate;
a reporting unit, configured to generate a report that the evaluation is passed when the evaluation score is greater than or equal to a threshold, and to generate a report that the evaluation is failed and that the feature values of the telecom data of the telecom users need to be re-collected when the evaluation score is less than the threshold.
5. The device according to claim 1, characterized by further comprising:
a churn prevention module, configured to create a trigger event that triggers the execution of a preset retention scheme for the users predicted to churn.
6. The device according to any one of claims 1-5, characterized in that the feature values of the telecom data include one or more of the following feature values:
feature data of the decision behavior of the telecom users;
feature data of the usage behavior of the telecom users;
feature data of the loyalty behavior of the telecom users.
7. An analysis method for predicting telecom user churn, characterized by comprising:
collecting feature values of the telecom data of telecom users, forming a data bin to be analyzed, drawing a data sample from the overall data of the data bin, and randomly dividing the data sample into a training set and a test set;
building a prediction model on the training set based on a split-and-merge mechanism and a decision tree classification algorithm;
testing the prediction model using the test set, and evaluating the predictive performance of the prediction model according to the test result;
analyzing the overall data using a prediction model whose test performance is qualified, and predicting the users who will churn among the telecom users.
8. The method according to claim 7, characterized in that collecting the feature values of the telecom data of telecom users comprises:
collecting the raw variables of the telecom data of the telecom users;
performing completeness and/or reasonableness checks on the raw variables and deleting the data that fail the checks, to form the remaining variables of the telecom data;
deriving from the remaining variables to obtain the original feature values of the telecom data;
statistically analyzing the original feature values using the Pearson correlation coefficient to obtain highly correlated feature values, deleting the highly correlated feature values, and generating the feature values of the telecom data.
9. The method according to claim 7, characterized in that building the prediction model on the training set based on the split-and-merge mechanism and the decision tree classification algorithm comprises:
splitting the training set into multiple sub training sets;
building a sub prediction model on each sub training set using the decision tree classification algorithm;
integrating the sub prediction models according to the weighted voting principle to generate the prediction model.
10. The method according to claim 7, characterized in that testing the prediction model using the test set and evaluating the predictive performance of the prediction model according to the test result comprises:
exporting the prediction rules of the prediction model and predicting on the test set according to the prediction rules, to obtain the model lift (LIFT) and/or the prediction hit rate of the prediction model;
scoring the prediction model according to the LIFT and the hit rate;
when the evaluation score is greater than or equal to a threshold, generating a report that the evaluation is passed; when the evaluation score is less than the threshold, generating a report that the evaluation is failed and that the feature values of the telecom data of the telecom users need to be re-collected.
11. The method according to claim 7, characterized in that, after predicting the users who will churn among the telecom users, the method further comprises:
creating a trigger event that triggers the execution of a preset retention scheme for the users predicted to churn.
12. The method according to any one of claims 7-11, characterized in that the feature values of the telecom data include one or more of the following feature values:
feature data of the decision behavior of the telecom users;
feature data of the usage behavior of the telecom users;
feature data of the loyalty behavior of the telecom users.
13. An analysis device for predicting telecom user churn, characterized by comprising:
a memory, configured to store a program;
a processor, configured to execute the program stored in the memory, the program causing the processor to execute the method according to any one of claims 7-12.
14. A computer-readable storage medium, characterized by comprising instructions which, when run on a computer, cause the computer to execute the method according to any one of claims 7-12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710881795.5A CN109558962A (en) | 2017-09-26 | 2017-09-26 | Predict device, method and storage medium that telecommunication user is lost |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710881795.5A CN109558962A (en) | 2017-09-26 | 2017-09-26 | Predict device, method and storage medium that telecommunication user is lost |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109558962A true CN109558962A (en) | 2019-04-02 |
Family
ID=65862452
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710881795.5A Pending CN109558962A (en) | 2017-09-26 | 2017-09-26 | Predict device, method and storage medium that telecommunication user is lost |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109558962A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110033355A (en) * | 2019-04-17 | 2019-07-19 | 中国联合网络通信集团有限公司 | The method and system that telephone expenses set meal is recommended |
CN112153636A (en) * | 2020-10-29 | 2020-12-29 | 浙江鸿程计算机系统有限公司 | Method for predicting number portability and roll-out of telecommunication industry user based on machine learning |
CN112200375A (en) * | 2020-10-15 | 2021-01-08 | 中国联合网络通信集团有限公司 | Prediction model generation method, prediction model generation device, and computer-readable medium |
CN112836877A (en) * | 2021-02-04 | 2021-05-25 | 广西蜂鸟汽车科技有限公司 | Telecommunication customer loss prediction method and system for improving multi-layer perceptron |
CN113033909A (en) * | 2021-04-08 | 2021-06-25 | 中国移动通信集团陕西有限公司 | Portable user analysis method, device, equipment and computer storage medium |
CN113139715A (en) * | 2021-03-30 | 2021-07-20 | 北京思特奇信息技术股份有限公司 | Comprehensive assessment early warning method and system for loss of group customers in telecommunication industry |
CN113259144A (en) * | 2020-02-07 | 2021-08-13 | 北京京东振世信息技术有限公司 | Storage network planning method and device |
CN113543117A (en) * | 2020-04-22 | 2021-10-22 | 中国移动通信集团重庆有限公司 | Prediction method and device for number portability user and computing equipment |
CN113610552A (en) * | 2021-06-25 | 2021-11-05 | 清华大学 | User loss prediction method and device |
CN114143772A (en) * | 2021-11-18 | 2022-03-04 | 北京思特奇信息技术股份有限公司 | Method and system for reducing user off-network rate |
CN114399087A (en) * | 2021-12-22 | 2022-04-26 | 中国电信股份有限公司 | User data processing method and device based on Flink stream processing engine |
CN114881181A (en) * | 2022-07-12 | 2022-08-09 | 南昌大学第一附属医院 | Feature weighting selection method, system, medium and computer based on big data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102567391A (en) * | 2010-12-20 | 2012-07-11 | 中国移动通信集团广东有限公司 | Method and device for building classification forecasting mixed model |
CN105760889A (en) * | 2016-03-01 | 2016-07-13 | 中国科学技术大学 | Efficient imbalanced data set classification method |
CN106203679A (en) * | 2016-06-27 | 2016-12-07 | 武汉斗鱼网络科技有限公司 | Customer churn prediction method and system |
2017-09-26: CN application CN201710881795.5A filed, published as CN109558962A; status: Pending
Non-Patent Citations (1)
Title |
---|
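Chen Ye (陈晔): "Telecom customer churn prediction analysis based on combined forecasting" (基于组合预测的电信客户流失预测分析), China Master's Theses Full-text Database, Information Science and Technology * |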
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110033355A (en) * | 2019-04-17 | 2019-07-19 | 中国联合网络通信集团有限公司 | Method and system for recommending phone tariff packages |
CN113259144B (en) * | 2020-02-07 | 2024-05-24 | 北京京东振世信息技术有限公司 | Warehouse network planning method and device |
CN113259144A (en) * | 2020-02-07 | 2021-08-13 | 北京京东振世信息技术有限公司 | Storage network planning method and device |
CN113543117B (en) * | 2020-04-22 | 2022-10-04 | 中国移动通信集团重庆有限公司 | Prediction method and device for number portability user and computing equipment |
CN113543117A (en) * | 2020-04-22 | 2021-10-22 | 中国移动通信集团重庆有限公司 | Prediction method and device for number portability user and computing equipment |
CN112200375A (en) * | 2020-10-15 | 2021-01-08 | 中国联合网络通信集团有限公司 | Prediction model generation method, prediction model generation device, and computer-readable medium |
CN112200375B (en) * | 2020-10-15 | 2023-08-29 | 中国联合网络通信集团有限公司 | Prediction model generation method, prediction model generation device, and computer-readable medium |
CN112153636A (en) * | 2020-10-29 | 2020-12-29 | 浙江鸿程计算机系统有限公司 | Method for predicting number portability and roll-out of telecommunication industry user based on machine learning |
CN112836877A (en) * | 2021-02-04 | 2021-05-25 | 广西蜂鸟汽车科技有限公司 | Telecommunication customer loss prediction method and system based on an improved multi-layer perceptron |
CN113139715A (en) * | 2021-03-30 | 2021-07-20 | 北京思特奇信息技术股份有限公司 | Comprehensive assessment early warning method and system for loss of group customers in telecommunication industry |
CN113033909A (en) * | 2021-04-08 | 2021-06-25 | 中国移动通信集团陕西有限公司 | Portable user analysis method, device, equipment and computer storage medium |
CN113610552A (en) * | 2021-06-25 | 2021-11-05 | 清华大学 | User loss prediction method and device |
CN114143772A (en) * | 2021-11-18 | 2022-03-04 | 北京思特奇信息技术股份有限公司 | Method and system for reducing user off-network rate |
CN114399087A (en) * | 2021-12-22 | 2022-04-26 | 中国电信股份有限公司 | User data processing method and device based on Flink stream processing engine |
CN114881181A (en) * | 2022-07-12 | 2022-08-09 | 南昌大学第一附属医院 | Feature weighting selection method, system, medium and computer based on big data |
Similar Documents
Publication | Title |
---|---|
CN109558962A (en) | Predict device, method and storage medium that telecommunication user is lost |
US10896203B2 (en) | Digital analytics system |
CN107229708A (en) | Personalized travel service big data application system and method |
CN110991875B (en) | Platform user quality evaluation system |
Asane-Otoo | Carbon footprint and emission determinants in Africa |
CN103796183B (en) | Spam message recognition method and device |
CN105184315A (en) | Quality inspection treatment method and system |
CN101620692A (en) | Method for analyzing customer churn of mobile communication service |
CN104217088B (en) | Optimization method and system for operator mobile service resources |
CN105069025A (en) | Intelligent aggregation visualization and management and control system for big data |
CN101320449A (en) | Power distribution network estimation method based on combination appraisement method |
CN108023768A (en) | Network event chain establishment method and network event chain establishment system |
Li et al. | Enhancing telco service quality with big data enabled churn analysis: infrastructure, model, and deployment |
CN103250376A (en) | Method and system for carrying out predictive analysis relating to nodes of a communication network |
Marques et al. | Congressmen in the age of social network sites: Brazilian representatives and Twitter use |
CN109919675A (en) | Neural-network-based method and system for identifying communication user upshift prediction probability |
CN107977855B (en) | Method and device for managing user information |
Nasiri Khiavi et al. | Comparative applicability of MCDM‐SWOT based techniques for developing integrated watershed management framework |
CN117911085A (en) | User management system, method and terminal based on enterprise marketing |
TWI662809B (en) | Obstacle location system and maintenance method for image streaming service |
Roets et al. | Evaluation of railway traffic control efficiency and its determinants |
Wang | Research on bank marketing behavior based on machine learning |
CN108234596A (en) | Aviation information-pushing method and device |
Ferwerda et al. | Leveraging the power of place: A data-driven decision helper to improve the location decisions of economic immigrants |
CN106817710A (en) | Method and device for locating network problems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 2019-04-02 |