CN109886756A

CN109886756A - Communication user upshift prediction probability recognition method and system based on an integrated model

Info

Publication number: CN109886756A
Application number: CN201910161182.3A
Authority: CN
Inventors: 周洪峰; 雷奥林; 邹秋艳
Original assignee: Shenzhen Microproducts To Mdt Infotech Ltd
Current assignee: Shenzhen Microproducts To Mdt Infotech Ltd
Priority date: 2019-03-04
Filing date: 2019-03-04
Publication date: 2019-06-14

Abstract

The present invention provides a communication user upshift prediction probability recognition method and system based on an integrated model. The method comprises the following steps: step S1, storing user basic information data in a database to obtain a user basic information data set S; step S2, performing data detection on the user basic information data set S and obtaining a data set S_t after standardization; step S3, taking the data set S_t as input and the data y_n on whether the user handles a set meal upshift in the following month as output, and training the integrated model; step S4, inputting the updated user basic information data set S_x into the trained integrated model and outputting a prediction data set S_y on whether the user handles a set meal upshift in the following month. By performing data detection and analysis on the user basic information data set S and combining them with the training of the integrated model, the invention outputs the prediction data set S_y; the prediction is highly targeted, relevant and accurate, and the result is clear at a glance.

Description

Communication user upshift prediction probability recognition method and system based on an integrated model

Technical field

The present invention relates to a communication user upshift prediction method, in particular to a communication user upshift prediction probability recognition method based on an integrated model, and further to a communication user upshift prediction probability recognition system that uses this method.

Background art

Intelligent terminals such as smartphones are now ubiquitous, and as digitization advances, the set meals (service packages) of communication users change accordingly. However, existing recognition of user upshift prediction probability is only weakly related to the user's current communication set meal; that is, there is currently no upshift prediction probability recognition system that predicts and analyzes on the basis of the user's existing set meal. As a result, it is difficult for users to adjust their set meal promptly and in a targeted way, the operator's management difficulty increases, and the operator's marketing effectiveness and marketing accuracy suffer.

Summary of the invention

The technical problem to be solved by the present invention is to provide a communication user upshift prediction probability recognition method based on an integrated model that lets the user view the prediction data set directly and in digital form, so that set meals can be adjusted promptly and in a targeted way or different marketing measures can be taken, thereby reducing the operator's management difficulty and effectively improving the operator's marketing effectiveness and marketing accuracy. A communication user upshift prediction probability recognition system using this method is further provided.

To this end, the present invention provides a communication user upshift prediction probability recognition method based on an integrated model, comprising the following steps:

Step S1, storing user basic information data in a database to obtain a user basic information data set S;

Step S2, performing data detection on the user basic information data set S and obtaining a data set S_t after standardization;

Step S3, taking the data set S_t as input and the data y_n on whether the user handles a set meal upshift in the following month as output, training the integrated model, and obtaining a trained integrated model;

Step S4, inputting the updated user basic information data set S_x into the integrated model trained in step S3, and outputting and displaying the prediction data set S_y on whether the user handles a set meal upshift in the following month.

In a further improvement of the present invention, step S2 comprises the following sub-steps:

Step S201, detecting outliers and missing values in the user basic information data set S;

Step S202, setting the outliers as missing values;

Step S203, filling the missing values with zero to obtain the cleaned data source S_;

Step S204, standardizing the data source S_ to obtain the data set S_t.

In a further improvement of the present invention, in step S204 the data source S_ is standardized by the formula $x_n' = \frac{x_n - \bar{x}}{\sigma}$ to obtain the data set S_t, where x_n denotes the n-th data item of the data set in the data source S_, n denotes the sample size of the data, $\bar{x}$ denotes the mean of all sample data, and σ denotes the standard deviation of all sample data.

In a further improvement of the present invention, the mean $\bar{x}$ of all sample data is calculated by the formula $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, and the standard deviation σ of all sample data is calculated by the formula $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$.

In a further improvement of the present invention, in step S3 the integrated model comprises a logistic regression model, a decision tree model and a naive Bayes model. With the data set S_t as input and the data y_n on whether the user handles a set meal upshift in the following month as output, the logistic regression model, the decision tree model and the naive Bayes model are trained simultaneously, and the weight of each of the three models within the integrated model is optimized by means of verification.

In a further improvement of the present invention, step S3 comprises the following sub-steps:

Step S301, splitting the data set S_t into a training data set S_tA and a test data set S_tB;

Step S302, taking the training data set S_tA as input and the corresponding data y_n1 on whether the user handles a set meal upshift in the following month as output, and training the logistic regression model, the decision tree model and the naive Bayes model simultaneously;

Step S303, during training, taking the test data set S_tB and the corresponding data y_n2 on whether the user handles a set meal upshift in the following month as verification data, and verifying the training results of the logistic regression model, the decision tree model and the naive Bayes model respectively; if a model passes verification, its weight is divided by the weight constant d_t, otherwise its weight is multiplied by the weight constant d_t.

In a further improvement of the present invention, in step S301 the ratio between the training data set S_tA and the test data set S_tB is set to 7:3.

In a further improvement of the present invention, in step S3 the weight ratio between the logistic regression model, the decision tree model and the naive Bayes model is 1:1:1 when training starts, and the weight constant d_t is a random number between 0 and 1.

In a further improvement of the present invention, in step S1 the user basic information data set S includes any one or more of: the user's unique ID, set meal fee, data traffic fee, phone bill amounts for the previous three months, number of times the set meal allowance was exceeded, fee for data traffic beyond the set meal allowance, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for the previous three months, mean phone bill over the previous three months, and mean data traffic over the previous three months.

The present invention also provides a communication user upshift prediction probability recognition system based on an integrated model, which uses the communication user upshift prediction probability recognition method based on an integrated model described above.

Compared with the prior art, the beneficial effects of the present invention are as follows. Data detection and analysis are performed on the user basic information data set S and combined with the training of the integrated model to output the prediction data set S_y. Training (learning) the integrated model yields a better prediction effect and stronger generalization ability, so the prediction is highly targeted, relevant and accurate. The prediction data set S_y can be viewed intuitively through the user upshift prediction system, which makes prediction systematic and gives results that are clear at a glance, timely and convenient. This provides a sound data basis for more targeted set meal upshift suggestions, makes it easier to adjust the set meals of different customer groups promptly and in a targeted way or to take different targeted marketing measures, reduces the operator's management difficulty, and effectively improves the operator's marketing effectiveness and marketing accuracy.

Brief description of the drawings

Fig. 1 is a schematic diagram of the workflow of an embodiment of the present invention;

Fig. 2 is a schematic diagram of the operating principle of an embodiment of the present invention;

Fig. 3 is a schematic diagram of the data flow of an embodiment of the present invention.

Detailed description of the embodiments

Preferred embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

As shown in Fig. 1 to Fig. 3, this example provides a communication user upshift prediction probability recognition method based on an integrated model, comprising steps S1 to S4 as set out above.

Steps S1 and S2 in this example preprocess the user basic information data set S. Step S3 uses existing historical data: the user data set S_t is taken as input and the historical data y_n on whether the user handled a set meal upshift in the following month is taken as output, so as to train the integrated model required for prediction. Step S4 then uses the integrated model trained in step S3: the updated user basic information data set S_x is input into the trained integrated model for prediction, and the prediction data set S_y on whether the user will handle a set meal upshift in the following month is output and displayed, which achieves more accurate communication user upshift prediction probability recognition. In this example, communication user upshift prediction probability recognition means predicting a communication user's demand for a set meal upshift; that is, this example achieves more accurate prediction of communication users' set meal upshifts.

In step S1 of this example, the user basic information data set S includes any one or more of: the user's unique ID, set meal fee, data traffic fee, phone bill amounts for the previous three months, number of times the set meal allowance was exceeded, fee for data traffic beyond the set meal allowance, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for the previous three months, mean phone bill over the previous three months, and mean data traffic over the previous three months.

The updated user basic information data set S_x in this example likewise includes any one or more of: the user's unique ID, set meal fee, data traffic fee, phone bill amounts for the previous three months, number of times the set meal allowance was exceeded, fee for data traffic beyond the set meal allowance, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for the previous three months, mean phone bill over the previous three months, and mean data traffic over the previous three months. Preferably, the updated user basic information data is preprocessed by steps S1 and S2 before being input into the integrated model trained in step S3, which improves prediction accuracy.

More specifically, this example first stores the required data of the user basic information data set S in the corresponding database. The basic information in the user basic information data set S covers the following dimensions: the user's unique ID, set meal fee, data traffic fee, last month's phone bill amount, the phone bill amount of the month before last, the phone bill amount of three months ago, number of times the set meal allowance was exceeded, fee for data traffic beyond the set meal allowance, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, last month's GPRS traffic, the GPRS traffic of the month before last, the GPRS traffic of three months ago, mean phone bill over the previous three months, mean data traffic over the previous three months, and so on. This example refers to the user basic information data set comprising these dimensions as S.

Next, the data for the dimension "whether the user handles a set meal upshift in the following month" is defined as data set A. Data set A is associated with the user basic information data set S through the user's unique ID; likewise, the prediction data set S_y in this example is associated with the user basic information data set S through the user's unique ID. The associated data is then stored in the database. The total number of selected user data samples is n.

Here s_1, s_2, ..., s_n ∈ S, where s_n denotes the basic information data of a given user; for example, for the user numbered 1, the corresponding set meal fee, data traffic fee, last month's phone bill amount, the phone bill amount of the month before last, the phone bill amount of three months ago, number of times the set meal allowance was exceeded, fee for data traffic beyond the set meal allowance, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, last month's GPRS traffic, the GPRS traffic of the month before last, the GPRS traffic of three months ago, mean phone bill over the previous three months, mean data traffic over the previous three months, and so on.

y_1, y_2, ..., y_n ∈ A, where y_n denotes the data, in a given user's historical data, on whether the user handled a set meal upshift in the following month. This is known and is used to train the integrated model: in y_n, 0 means the user does not need to handle a set meal upshift (upgrade) and 1 means the user needs to handle a set meal upshift (upgrade). That is, A is the data set in the historical data on whether users handled a set meal upshift in the following month. An example of the data finally stored in the database is shown below:

User's unique ID	Age	Time on network	Points	....	Handles set meal upshift in following month
1	22	3	324	....	1
2	24	2	456	....	0
3	56	1	245	....	0
4	15	6	786	....	0
....	....	....	....	....	....
n	76	11	1025	....	1
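The association between data set A and data set S by the user's unique ID can be sketched in Python as follows. This is a minimal illustration only: the function name `join_by_user_id`, the field names and the in-memory dictionaries are assumptions made for the example, whereas the source only states that the associated data is stored in a database.

```python
def join_by_user_id(features, labels):
    """Associate feature records (S) with labels (A) on the user's unique ID.

    features: dict mapping user ID -> dict of basic information fields
    labels:   dict mapping user ID -> 0 or 1 (upshift handled in the following month)
    Users without a label are skipped.
    """
    joined = []
    for user_id, fields in features.items():
        if user_id in labels:
            joined.append({"user_id": user_id, **fields, "upshift_next_month": labels[user_id]})
    return joined


# Hypothetical records taken from the first rows of the example table.
S = {
    1: {"age": 22, "time_on_network": 3, "points": 324},
    2: {"age": 24, "time_on_network": 2, "points": 456},
    3: {"age": 56, "time_on_network": 1, "points": 245},
}
A = {1: 1, 2: 0}  # user 3 has no label yet

records = join_by_user_id(S, A)
```

Only users present in both sets appear in `records`, so a user without a known label is left out of training rather than given a guessed value.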

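Step S2 in this example comprises the following sub-steps:

Step S201, detecting outliers and missing values in the user basic information data set S;

Step S202, setting the outliers as missing values;

Step S203, filling the missing values with zero to obtain the cleaned data source S_;

Step S204, standardizing the data source S_ to obtain the data set S_t.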

More specifically, data detection is performed on the stored user basic information data set S, mainly to detect outliers and missing values in the data set S. An outlier is a value that contradicts common sense: because of systematic error, human error or variation inherent in the data, it differs from the overall behavioural characteristics, structure or correlations of the data. This example preferably identifies outliers by drawing a box plot, which only requires the maximum, the minimum, the upper quartile, the median and the lower quartile. Data above the upper quartile or below the lower quartile is regarded as an outlier; for example, a negative age is judged to be an outlier. After step S201 finds the outliers, step S202 sets them as missing values, which are then handled by the missing-value treatment of step S203: zero filling, i.e. every missing value is filled with 0. A field that is empty or has no value is regarded as a missing value.
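Sub-steps S201 to S203 can be sketched in Python as below. Two points are assumptions rather than the method of the source: the whisker bounds use the conventional box-plot rule of 1.5 times the interquartile range beyond the quartiles (the text above states the quartiles themselves as bounds), and the `valid_min` parameter is a hypothetical domain rule standing in for the "negative age" example. The function names are illustrative.

```python
import statistics


def box_plot_bounds(values, k=1.5):
    """Return (lower, upper) whisker bounds computed from the non-missing values."""
    present = sorted(v for v in values if v is not None)
    q1, _median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clean_column(values, k=1.5, valid_min=None):
    """S201: detect outliers; S202: mark them as missing; S203: zero-fill."""
    lower, upper = box_plot_bounds(values, k)
    if valid_min is not None:
        lower = max(lower, valid_min)  # domain rule, e.g. an age cannot be negative
    marked = [None if v is None or v < lower or v > upper else v for v in values]
    return [0 if v is None else v for v in marked]


# Hypothetical age column with one missing value and two implausible values.
ages = [22, 24, 56, 15, None, -3, 76, 30, 41, 500]
cleaned = clean_column(ages, valid_min=0)
```

In this sketch the missing entry, the negative age and the age of 500 all end up as 0, matching the zero-filling rule of step S203.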

In step S204 of this example, the data source S_ is standardized by the formula $x_n' = \frac{x_n - \bar{x}}{\sigma}$ to obtain the data set S_t, where x_n denotes the n-th data item of the data set in the data source S_, n denotes the sample size of the data, $\bar{x}$ denotes the mean of all sample data, and σ denotes the standard deviation of all sample data.

In this example the mean $\bar{x}$ of all sample data is calculated by the formula $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, and the standard deviation σ of all sample data is calculated by the formula $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$. The data set S_t is obtained after standardization in this way; the data in S_t conforms to a standard normal distribution, with mean 0 and standard deviation 1.
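The standardization of step S204 can be illustrated with a short Python sketch. It assumes the population form of the standard deviation (division by n); the source does not reproduce the formula, so this choice, the function name `standardize` and the sample values are assumptions.

```python
import math


def standardize(column):
    """Z-score one feature column: subtract the mean, divide by the standard deviation."""
    n = len(column)
    mean = sum(column) / n
    # Population standard deviation (divide by n): an assumption of this sketch.
    sigma = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if sigma == 0:
        return [0.0] * n  # a constant column carries no information
    return [(x - mean) / sigma for x in column]


# Points column from the example table above.
points = [324, 456, 245, 786, 1025]
z = standardize(points)
```

After this step the column has mean 0 and standard deviation 1, which puts features measured in different units (fees, durations, traffic) on a comparable scale.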

In step S3 of this example, the integrated model comprises a logistic regression model, a decision tree model and a naive Bayes model. With the data set S_t as input and the data y_n on whether the user handles a set meal upshift in the following month as output, the logistic regression model, the decision tree model and the naive Bayes model are trained simultaneously, and the weight of each of the three models within the integrated model is optimized by means of verification.

The reason this example uses an integrated model is that, since this example uses supervised learning, training and learning through an integrated model and predicting by combining several different models can effectively improve accuracy.

The integrated model comprises a logistic regression model, a decision tree model and a naive Bayes model, which are combined by multi-model voting: each model is assigned its own weight, and this weight is adjusted in real time according to the training results in order to improve prediction accuracy.
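The weighted voting described above can be sketched in Python as follows. The source does not specify how votes are turned into a probability, so reading the upshift probability as the share of total weight that votes for class 1 is an assumption of this sketch, as are the function names and the sample weights.

```python
def upshift_probability(predictions, weights):
    """Share of the total model weight that votes for class 1 (upshift)."""
    total = sum(weights[name] for name in predictions)
    in_favour = sum(weights[name] for name, label in predictions.items() if label == 1)
    return in_favour / total


def weighted_vote(predictions, weights):
    """Return 1 if the weight voting for class 1 exceeds the weight voting for class 0."""
    return 1 if upshift_probability(predictions, weights) > 0.5 else 0


# Hypothetical votes of the three models for one user, and hypothetical weights.
predictions = {"logistic_regression": 1, "decision_tree": 0, "naive_bayes": 0}
weights = {"logistic_regression": 2.0, "decision_tree": 0.5, "naive_bayes": 0.5}

probability = upshift_probability(predictions, weights)
decision = weighted_vote(predictions, weights)
```

Here a single model with a large weight outvotes two models with small weights, which is the intended effect of adjusting weights according to verification results.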

In this example the data set S_t is taken as input and the data y_n on whether the user handles a set meal upshift in the following month is taken as output for training and learning. Let x = S_t and y = y_n, where y takes a value in {0, 1}.

Step S3 in this example comprises the following sub-steps:

Step S301, splitting the data set S_t into a training data set S_tA and a test data set S_tB;

Step S302, taking the training data set S_tA as input and the corresponding data y_n1 on whether the user handles a set meal upshift in the following month as output, and training the logistic regression model, the decision tree model and the naive Bayes model simultaneously;

Step S303, during training, taking the test data set S_tB and the corresponding data y_n2 on whether the user handles a set meal upshift in the following month as verification data, and verifying the results that the logistic regression model, the decision tree model and the naive Bayes model obtained from the training data set S_tA and the data y_n1 respectively; if a model passes verification, its weight is divided by the weight constant d_t, otherwise its weight is multiplied by the weight constant d_t.

In step S301 of this example, the ratio between the training data set S_tA and the test data set S_tB into which the data set S_t is split is preferably set to 7:3. When step S3 starts training, the weight ratio between the logistic regression model, the decision tree model and the naive Bayes model is 1:1:1, and the weight constant d_t is a random number between 0 and 1.
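A 7:3 split as in step S301 could look like the following Python sketch. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the source only fixes the ratio. The function name `split_dataset` is illustrative.

```python
import random


def split_dataset(rows, labels, train_ratio=0.7, seed=0):
    """Shuffle row indices and split rows and labels into training and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded shuffle: an assumption of this sketch
    cut = int(round(len(indices) * train_ratio))
    train_idx, test_idx = indices[:cut], indices[cut:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


# Ten hypothetical samples; the label is derived from the row so pairing can be checked.
rows = list(range(10))
labels = [r % 2 for r in rows]
(S_tA, y_n1), (S_tB, y_n2) = split_dataset(rows, labels)
```

Each row stays paired with its own label on both sides of the split, which is what allows S_tB and y_n2 to serve as verification data in step S303.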

The training data set S_tA in this example is the data set used to train the integrated model, and the test data set S_tB is the data set used to verify it. The default initial values of the weight W_A of the logistic regression model, the weight W_B of the decision tree model and the weight W_C of the naive Bayes model are preferably all 1; during subsequent training they are adjusted and updated in real time by means of the weight constant d_t. The weight constant d_t is a preset constant, preferably a random number between 0 and 1. These parameter choices are the preferred values of this example; in practical applications they can be customized and modified as the situation requires.

That is, in this example the data set S_t is preferably split into a training data set S_tA and a test data set S_tB in the ratio S_tA : S_tB = 7:3. The training data set S_tA is then used to train the logistic regression model, the decision tree model and the naive Bayes model. At the start of training each of the three models is assigned an initial weight of 1; the weight of the logistic regression model is denoted W_A, that of the decision tree model W_B, and that of the naive Bayes model W_C. During training, if any one of the three models classifies incorrectly, i.e. its result is inconsistent with the corresponding data of the test data set S_tB, the model is judged to have made a classification error and its weight is multiplied by the weight constant d_t, which reduces its weight in the integrated model. If any one of the three models classifies correctly, i.e. its result is consistent with the corresponding data of the test data set S_tB, its weight is divided by the weight constant d_t, which increases its weight. Because d_t lies between 0 and 1, multiplication shrinks a weight and division enlarges it.
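The weight update rule of step S303 can be sketched in Python as follows. Applying the update once per verification sample is an assumption, since the source does not say whether the check is made per sample or per model as a whole; the function name and the value of `d_t` are likewise illustrative.

```python
def update_weights(weights, predictions, y_true, d_t):
    """Divide a model's weight by d_t when it is right, multiply by d_t when it is wrong.

    weights:     dict mapping model name -> current weight
    predictions: dict mapping model name -> predicted label for one verification sample
    y_true:      known label of that sample
    d_t:         weight constant, strictly between 0 and 1
    """
    if not 0 < d_t < 1:
        raise ValueError("d_t must lie strictly between 0 and 1")
    updated = {}
    for name, weight in weights.items():
        if predictions[name] == y_true:
            updated[name] = weight / d_t  # verification passed: weight grows
        else:
            updated[name] = weight * d_t  # verification failed: weight shrinks
    return updated


# Initial weights W_A, W_B, W_C in the ratio 1:1:1 and a hypothetical d_t.
weights = {"logistic_regression": 1.0, "decision_tree": 1.0, "naive_bayes": 1.0}
weights = update_weights(
    weights,
    predictions={"logistic_regression": 1, "decision_tree": 0, "naive_bayes": 1},
    y_true=1,
    d_t=0.5,
)
```

The check on `d_t` matters in practice: a value of exactly 0 would make the division undefined, so a randomly drawn constant should exclude the endpoints.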

The logistic regression model is a binary classification model, i.e. the output y of a sample is 0 or 1. For a given data point, the probabilities of class 0 and class 1 are calculated separately, and the class with the larger probability is selected as the predicted class. Let m be the number of dimensions in the data set S_t (such as the user's unique ID, set meal fee, data traffic fee, last month's phone bill amount, the phone bill amount of the month before last, the phone bill amount of three months ago, number of times the set meal allowance was exceeded, fee for data traffic beyond the set meal allowance, points, etc.; see the definition of the data set above), and let x_m denote the value, for a given user, of the m-th dimension of the input data set S_t. The logistic regression decision formula is: z = w_1*x_1 + w_2*x_2 + ... + w_m*x_m + b.

Finally, the result is output through the sigmoid function, and the weight w_m of the m-th dimension and the bias b are calculated accordingly. Because y is known, i.e. whether the user handled a set meal upshift in the following month is known in the input data, substituting two or more samples into the formula z = w_1*x_1 + w_2*x_2 + ... + w_m*x_m + b gives a set of equations from which the weight w_m of the m-th dimension and the bias b can be solved.
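The prediction side of the logistic regression model, i.e. the linear score z followed by the sigmoid function, can be sketched in Python as below. The weights `w` and bias `b` are hypothetical values chosen for illustration; how they are fitted is not shown here.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, w, b):
    """Probability of class 1: sigmoid(w_1*x_1 + ... + w_m*x_m + b)."""
    z = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return sigmoid(z)


def predict(x, w, b):
    """Pick the class with the larger probability."""
    return 1 if predict_proba(x, w, b) > 0.5 else 0


# Hypothetical standardized features of one user and hypothetical parameters.
x = [0.8, -0.2, 1.5]
w = [1.0, 0.5, 0.4]
b = -0.3
p_upshift = predict_proba(x, w, b)
```

With these numbers z = 0.8 - 0.1 + 0.6 - 0.3 = 1.0, so the predicted class is 1 and the probability is sigmoid(1.0).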

The decision tree model in this example obtains a machine learning classifier through learning, and this classifier can give the correct classification for newly appearing objects. When making decisions, the decision tree has to select among different attributes; attribute selection here is defined according to information theory by the formula $Info(S_t) = -\sum_{i=1}^{m} p_i \log_2 p_i$, where m is the number of values of y above, so m is preferably 2; p_i is the probability of the i-th class in the data set S_t, calculated as p_i = (number of records in S_t belonging to the i-th class) / |S_t|; and Info(S_t) denotes the amount of information needed to separate the different classes of the data set S_t.
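The information measure Info(S_t) used for attribute selection is the Shannon entropy of the class labels, which can be computed with a few lines of Python. The function name `info` and the sample label lists are illustrative.

```python
import math
from collections import Counter


def info(labels):
    """Entropy of the class labels: -sum(p_i * log2(p_i)) over the classes present."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())


balanced = info([0, 1, 0, 1])   # two classes, equally likely
skewed = info([0, 0, 0, 1])     # one class dominates
pure = info([1, 1, 1])          # a single class
```

A balanced two-class set has the maximum entropy of 1 bit, while a pure set has entropy 0; a decision tree prefers the attribute whose split moves the subsets closest to pure.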

The naive Bayes model in this example uses the data set S_t = {s_t1, s_t2, ..., s_tn} to judge whether the users in it handle a set meal upshift y in the following month, i.e. it computes the probability of this group of features under each class. Here a_i denotes a value of y and is the prediction result, so a_i is 0 or 1. The class probability is expressed by the formula $P(a_i \mid s_{t1}, \ldots, s_{tn}) = \frac{P(s_{t1}, \ldots, s_{tn} \mid a_i)\, P(a_i)}{P(s_{t1}, \ldots, s_{tn})}$.

According to the idea of naive Bayes, each feature of a given sample is assumed to be independent of the other features. The class of this group of features is obtained by maximization: $classify(f_1, f_2, \ldots, f_n) = \arg\max_{a_i} P(a_i) \prod_{j=1}^{n} P(f_j \mid a_i)$, where classify(f_1, f_2, ..., f_n) is the classification result for the given group of sample data features, f_n refers to the user basic information data features in the data set S_t, and S_ti is the i-th data item of the data set S_t.
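A compact Python sketch of naive Bayes classification by maximizing the prior times the per-feature likelihoods is given below. It assumes categorical (here binary) features and adds Laplace smoothing through the hypothetical parameters `alpha` and `n_values`; neither detail comes from the source, and standardized continuous features would instead call for a Gaussian likelihood.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and, per class and feature position, value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> Counter of values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[(label, j)][value] += 1
    return class_counts, feature_counts


def classify(row, class_counts, feature_counts, alpha=1.0, n_values=2):
    """Return the class a maximizing P(a) * product over j of P(f_j | a)."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(a)
        for j, value in enumerate(row):
            # Smoothed likelihood P(f_j | a); smoothing is an assumption of this sketch.
            score *= (feature_counts[(label, j)][value] + alpha) / (count + alpha * n_values)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical binary features: (roaming user, heavy data usage).
rows = [(1, 1), (1, 0), (0, 1), (0, 0), (0, 0), (1, 1)]
labels = [1, 1, 0, 0, 0, 1]
class_counts, feature_counts = train_naive_bayes(rows, labels)
prediction_a = classify((1, 1), class_counts, feature_counts)
prediction_b = classify((0, 0), class_counts, feature_counts)
```

The denominator of Bayes' rule is the same for every class, so it can be dropped when only the maximizing class is needed.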

That is, step S3 in this example performs upshift prediction probability recognition, based on the integrated model, for different set meal groups according to users' past behaviour: the data set S_t is taken as input and the data y_n on whether the user handles a set meal upshift in the following month is taken as output to train the integrated model; through continuous weight updating the training is optimized and made more effective, and the trained integrated model is output.

Step S4 in this example inputs the updated user basic information data set S_x into the integrated model trained in step S3, and outputs and displays the prediction data set S_y on whether the user handles a set meal upshift in the following month. More specifically, this example performs user upshift prediction probability recognition based on the integrated model, using the integrated model trained in step S3 to make predictions for the updated user basic information data set S_x, which comes from the data set associated with the updated user basic information. For example, if the model was previously trained on user basic information data for April, or on historical data from February to April, then for prediction the updated user basic information data for May is input into the integrated model trained in step S3, which outputs the prediction result (prediction data set S_y) on whether a set meal upshift is needed at the May time node. Based on this result, business marketing can then be directed at the users predicted to handle a set meal upshift.

Step S4 in this example preferably comprises two sub-steps: step S401, inputting the updated user basic information data set S_x into the integrated model trained in step S3 and outputting the prediction data set S_y on whether the user handles a set meal upshift in the following month; step S402, displaying the prediction data set S_y through the user upshift prediction system. Step S402 implements the display function of the user upshift prediction probability recognition system: by presenting the prediction data set S_y through the user upshift prediction system, the operator can market directly to users likely to handle a set meal upshift in the following month, and users can change their own set meal in a targeted way.
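Sub-steps S401 and S402 can be illustrated with the Python sketch below, which scores each user in S_x with the weighted models and sorts the result for display. The three lambda rules are hypothetical stand-ins for the trained logistic regression, decision tree and naive Bayes models, and the weights, user IDs and feature values are invented for the example; interpreting the weight share as the upshift probability is also an assumption.

```python
def predict_upshift(users, models, weights):
    """Return (user ID, upshift probability, decision) tuples, highest probability first.

    users:   dict mapping user ID -> feature vector from S_x
    models:  dict mapping model name -> callable returning 0 or 1
    weights: dict mapping model name -> weight learned in step S3
    """
    total = sum(weights.values())
    result = []
    for user_id, x in users.items():
        share = sum(weights[name] for name, model in models.items() if model(x) == 1) / total
        result.append((user_id, share, 1 if share > 0.5 else 0))
    return sorted(result, key=lambda item: item[1], reverse=True)


# Hypothetical stand-ins for the three trained models.
models = {
    "logistic_regression": lambda x: 1 if x[0] > 0 else 0,
    "decision_tree": lambda x: 1 if x[1] > 0 else 0,
    "naive_bayes": lambda x: 1 if x[0] + x[1] > 0 else 0,
}
weights = {"logistic_regression": 2.0, "decision_tree": 0.5, "naive_bayes": 1.5}
S_x = {101: (1.2, -0.3), 102: (-0.8, -0.1), 103: (-0.2, 0.9)}

S_y = predict_upshift(S_x, models, weights)
```

Because every entry of `S_y` carries the user's unique ID, the result can be associated back to the user basic information data set S and listed in the display system with the most likely upshift candidates first.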

This example also provides a communication user upshift prediction probability recognition system based on an integrated model, which uses the communication user upshift prediction probability recognition method based on an integrated model described above.

In summary, this example performs data detection and analysis on the user basic information data set S and combines them with the training of the integrated model to output the prediction data set S_y. Training (learning) the integrated model yields a better prediction effect and stronger generalization ability, so the prediction is highly targeted, relevant and accurate. The prediction data set S_y can be viewed intuitively through the user upshift prediction system, which makes prediction systematic and gives results that are clear at a glance, timely and convenient. This provides a sound data basis for more targeted set meal upshift suggestions, makes it easier to adjust the set meals of different customer groups promptly and in a targeted way or to take different targeted marketing measures, reduces the operator's management difficulty, and effectively improves the operator's marketing effectiveness and marketing accuracy.

The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the specific implementation of the invention shall not be regarded as limited to this description. A person of ordinary skill in the art to which the present invention belongs may make a number of simple deductions or substitutions without departing from the inventive concept, and all of these shall be regarded as falling within the protection scope of the present invention.

Claims

1. A communication user upshift prediction probability recognition method based on an integrated model, characterized by comprising the following steps:

Step S1, storing user basic information data in a database to obtain a user basic information data set S;

Step S2, performing data detection on the user basic information data set S and obtaining a data set S_t after standardization;

Step S3, taking the data set S_t as input and the data y_n on whether the user handles a set meal upshift in the following month as output, training the integrated model, and obtaining a trained integrated model;

Step S4, inputting the updated user basic information data set S_x into the integrated model trained in step S3, and outputting and displaying the prediction data set S_y on whether the user handles a set meal upshift in the following month.

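2. The communication user upshift prediction probability recognition method based on an integrated model according to claim 1, characterized in that step S2 comprises the following sub-steps:

Step S201, detecting outliers and missing values in the user basic information data set S;

Step S202, setting the outliers as missing values;

Step S203, filling the missing values with zero to obtain the cleaned data source S_;

Step S204, standardizing the data source S_ to obtain the data set S_t.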

3. The communication user upshift prediction probability recognition method based on an integrated model according to claim 2, characterized in that in step S204 the data source S_ is standardized by the formula $x_n' = \frac{x_n - \bar{x}}{\sigma}$ to obtain the data set S_t, where x_n denotes the n-th data item of the data set in the data source S_, n denotes the sample size of the data, $\bar{x}$ denotes the mean of all sample data, and σ denotes the standard deviation of all sample data.

4. The communication user upshift prediction probability recognition method based on an integrated model according to claim 3, characterized in that the mean $\bar{x}$ of all sample data is calculated by the formula $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, and the standard deviation σ of all sample data is calculated by the formula $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$.

5. The communication user upshift prediction probability recognition method based on an integrated model according to any one of claims 1 to 4, characterized in that in step S3 the integrated model comprises a logistic regression model, a decision tree model and a naive Bayes model; with the data set S_t as input and the data y_n on whether the user handles a set meal upshift in the following month as output, the logistic regression model, the decision tree model and the naive Bayes model are trained simultaneously, and the weight of each of the three models within the integrated model is optimized by means of verification.

6. The communication user upshift prediction probability recognition method based on an integrated model according to claim 5, characterized in that step S3 comprises the following sub-steps:

Step S301, splitting the data set S_t into a training data set S_tA and a test data set S_tB;

Step S302, taking the training data set S_tA as input and the corresponding data y_n1 on whether the user handles a set meal upshift in the following month as output, and training the logistic regression model, the decision tree model and the naive Bayes model simultaneously;

Step S303, during training, taking the test data set S_tB and the corresponding data y_n2 on whether the user handles a set meal upshift in the following month as verification data, and verifying the training results of the logistic regression model, the decision tree model and the naive Bayes model respectively; if a model passes verification, its weight is divided by the weight constant d_t, otherwise its weight is multiplied by the weight constant d_t.

7. The communication user upshift prediction probability recognition method based on an integrated model according to claim 6, characterized in that in step S301 the ratio between the training data set S_tA and the test data set S_tB is set to 7:3.

8. The communication user upshift prediction probability recognition method based on an integrated model according to claim 6, characterized in that in step S3 the weight ratio between the logistic regression model, the decision tree model and the naive Bayes model is 1:1:1 when training starts, and the weight constant d_t is a random number between 0 and 1.

9. The communication user upshift prediction probability recognition method based on an integrated model according to any one of claims 1 to 4, characterized in that in step S1 the user basic information data set S includes any one or more of: the user's unique ID, set meal fee, data traffic fee, phone bill amounts for the previous three months, number of times the set meal allowance was exceeded, fee for data traffic beyond the set meal allowance, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for the previous three months, mean phone bill over the previous three months, and mean data traffic over the previous three months.

10. A communication user upshift prediction probability recognition system based on an integrated model, characterized by using the communication user upshift prediction probability recognition method based on an integrated model according to any one of claims 1 to 9.