CN109886756A - Method and system for identifying the upshift prediction probability of communication users based on an integrated model - Google Patents
- Publication number: CN109886756A
- Application number: CN201910161182.3A
- Authority: CN (China)
- Prior art keywords: data set, model, data, integrated model, user
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The present invention provides a method and system for identifying the upshift prediction probability of communication users based on an integrated model. The method comprises the following steps: step S1, storing user basic information data in a database to obtain a user basic information data set S; step S2, performing data detection on the user basic information data set S and obtaining a data set S_t after standardization; step S3, training an integrated model with the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output; step S4, inputting an updated user basic information data set S_x into the trained integrated model and outputting a predictive data set S_y indicating whether the user will subscribe to a package upshift in the following month. By detecting and analyzing the user basic information data set S and combining this with the training of the integrated model, the invention outputs the predictive data set S_y; the prediction is targeted, relevant, and accurate, and the results are easy to read.
Description
Technical field
The present invention relates to a method for predicting communication user upshifts, in particular to a method for identifying the upshift prediction probability of communication users based on an integrated model, and further relates to a system for identifying the upshift prediction probability of communication users that uses this method.
Background art
Intelligent terminals such as smartphones are now widespread, and as digitization advances, the service packages of communication users change accordingly. However, existing approaches to identifying a user's upshift prediction probability are only weakly related to the user's current communication package; that is, there is currently no upshift prediction probability identification system that predicts and analyzes on the basis of the user's existing package. As a result, users cannot easily adjust their packages in a timely and targeted manner, and the operator's management difficulty increases, which harms the effectiveness and accuracy of the operator's marketing.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for identifying the upshift prediction probability of communication users based on an integrated model that lets the predictive data set be viewed directly, achieves digitization, makes it easy to adjust packages or adopt different marketing measures in a timely and targeted manner, reduces the operator's management difficulty, and effectively improves the effectiveness and accuracy of the operator's marketing. A further need is to provide a system for identifying the upshift prediction probability of communication users that uses this method.
To this end, the present invention provides a method for identifying the upshift prediction probability of communication users based on an integrated model, comprising the following steps:
Step S1: store user basic information data in a database to obtain a user basic information data set S;
Step S2: perform data detection on the user basic information data set S and obtain a data set S_t after standardization;
Step S3: train the integrated model with the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output, to obtain a trained integrated model;
Step S4: input the updated user basic information data set S_x into the integrated model trained in step S3, and output and display the predictive data set S_y indicating whether the user will subscribe to a package upshift in the following month.
In a further improvement of the present invention, step S2 comprises the following sub-steps:
Step S201: detect the outliers and missing values in the user basic information data set S;
Step S202: set the outliers to missing values;
Step S203: fill the missing values with zeros to obtain the cleaned data source S_;
Step S204: standardize the data source S_ to obtain the data set S_t.
In a further improvement of the present invention, in step S204 the data source S_ is standardized by the formula $x_n^{*} = \frac{x_n - \bar{x}}{\sigma}$ to obtain the data set S_t, where $x_n$ is the n-th data item of the data set in the data source S_, n is the sample size of the data, $\bar{x}$ is the mean of all sample data, and $\sigma$ is the standard deviation of all sample data.
In a further improvement of the present invention, the mean of all sample data is calculated by the formula $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, and the standard deviation of all sample data is calculated by the formula $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$.
In a further improvement of the present invention, in step S3 the integrated model comprises a logistic regression model, a decision tree model, and a naive Bayes model. With the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output, the logistic regression model, the decision tree model, and the naive Bayes model are trained simultaneously, and the weight of each of the three models within the integrated model is optimized through validation.
In a further improvement of the present invention, step S3 comprises the following sub-steps:
Step S301: split the data set S_t into a training data set S_tA and a test data set S_tB;
Step S302: with the training data set S_tA as input and the corresponding data y_n1, which indicates whether the user subscribes to a package upshift in the following month, as output, train the logistic regression model, the decision tree model, and the naive Bayes model simultaneously;
Step S303: during training, use the test data set S_tB and its corresponding data y_n2, which indicates whether the user subscribes to a package upshift in the following month, as validation data to verify the training results of the logistic regression model, the decision tree model, and the naive Bayes model respectively; if a model passes verification, divide its weight by the weight constant d_t; otherwise, multiply its weight by the weight constant d_t.
In a further improvement of the present invention, in step S301 the ratio of the training data set S_tA to the test data set S_tB is set to 7:3.
In a further improvement of the present invention, in step S3, when training starts, the weight ratio of the logistic regression model, the decision tree model, and the naive Bayes model is 1:1:1, and the weight constant d_t is a random number between 0 and 1.
In a further improvement of the present invention, in step S1 the user basic information data set S includes any one or more of the following: user unique ID, package fee, data fee, phone bill amounts for the previous three months, number of package overages, package overage data fee, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for the previous three months, mean phone bill over the previous three months, and mean data traffic over the previous three months.
The present invention also provides a system for identifying the upshift prediction probability of communication users based on an integrated model, which uses the method described above.
Compared with the prior art, the beneficial effects of the present invention are as follows. Data detection and analysis are performed on the user basic information data set S and combined with the training of the integrated model to output the predictive data set S_y. Training (learning) with the integrated model yields a better prediction effect and stronger generalization ability, so the prediction is more targeted, relevant, and accurate. The predictive data set S_y can be viewed directly through the user upshift prediction system, which makes the prediction systematic and the results clear, timely, and convenient. This provides a sound data basis for more targeted package upshift suggestions, makes it easy to adjust the packages of different customer groups or to adopt different marketing measures in a timely and targeted manner, reduces the operator's management difficulty, and effectively improves the effectiveness and accuracy of the operator's marketing.
Brief description of the drawings
Fig. 1 is a schematic workflow diagram of an embodiment of the present invention;
Fig. 2 is a schematic diagram of the working principle of an embodiment of the present invention;
Fig. 3 is a schematic diagram of the data flow of an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1 to Fig. 3, this example provides a method for identifying the upshift prediction probability of communication users based on an integrated model, comprising the following steps:
Step S1: store user basic information data in a database to obtain a user basic information data set S;
Step S2: perform data detection on the user basic information data set S and obtain a data set S_t after standardization;
Step S3: train the integrated model with the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output, to obtain a trained integrated model;
Step S4: input the updated user basic information data set S_x into the integrated model trained in step S3, and output and display the predictive data set S_y indicating whether the user will subscribe to a package upshift in the following month.
In this example, steps S1 and S2 preprocess the user basic information data set S. Step S3 uses existing historical data: the user data set S_t serves as input and the historical data y_n, which indicates whether the user subscribed to a package upshift in the following month, serves as output, so that the integrated model needed for prediction is trained. Step S4 then uses the integrated model trained in step S3: the updated user basic information data set S_x is input into the trained integrated model for prediction, the model outputs the predictive data set S_y indicating whether the user will subscribe to a package upshift in the following month, and the result is displayed. This achieves more accurate identification of the communication user upshift prediction probability. In this example, identifying the upshift prediction probability of communication users means predicting their demand for a package upshift; that is, this example achieves more accurate prediction of communication user package upshifts.
In step S1 of this example, the user basic information data set S includes any one or more of the following: user unique ID, package fee, data fee, phone bill amounts for the previous three months, number of package overages, package overage data fee, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for the previous three months, mean phone bill over the previous three months, and mean data traffic over the previous three months.
The updated user basic information data set S_x in this example likewise includes any one or more of the following: user unique ID, package fee, data fee, phone bill amounts for the previous three months, number of package overages, package overage data fee, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for the previous three months, mean phone bill over the previous three months, and mean data traffic over the previous three months. Preferably, the updated user basic information data is first preprocessed by steps S1 and S2 to form S_x and only then input into the integrated model trained in step S3, which improves prediction accuracy.
More specifically, this example first stores the data required for the user basic information data set S in the corresponding database. The basic information in the data set S covers the following dimensions: user unique ID, package fee, data fee, phone bill amount for last month, phone bill amount for the month before last, phone bill amount for three months ago, number of package overages, package overage data fee, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for last month, GPRS traffic for the month before last, GPRS traffic for three months ago, mean phone bill over the previous three months, mean data traffic over the previous three months, and so on. This example refers to the user basic information data set containing these dimensions as S.
The data for the dimension "whether the user subscribes to a package upshift in the following month" is then set as data set A. Data set A is associated with the user basic information data set S through the user unique ID; likewise, the predictive data set S_y in this example is associated with the user basic information data set S through the user unique ID. The associated data is then stored in the database. The total number of selected user data samples is n.
Here s_1, s_2, ..., s_n ∈ S, where s_n denotes the basic information data of a given user. For example, for the user numbered 1, this is that user's package fee, data fee, phone bill amount for last month, phone bill amount for the month before last, phone bill amount for three months ago, number of package overages, package overage data fee, points, user level, age, gender, call level, time on network, mean voice duration over the previous three months, roaming user status, GPRS traffic for last month, GPRS traffic for the month before last, GPRS traffic for three months ago, mean phone bill over the previous three months, mean data traffic over the previous three months, and so on.
y_1, y_2, ..., y_n ∈ A, where y_n denotes, in a given user's historical data, whether the user subscribed to a package upshift in the following month. This value is known and is used to train the integrated model: y_n = 0 means no package upshift (upgrade) is needed, and y_n = 1 means a package upshift (upgrade) is needed. In other words, A is the data set of historical following-month package upshift outcomes. An example of the data finally stored in the database is shown below:
User unique ID | Age | Time on network | Points | .... | Package upshift in the following month |
1 | 22 | 3 | 324 | .... | 1 |
2 | 24 | 2 | 456 | .... | 0 |
3 | 56 | 1 | 245 | .... | 0 |
4 | 15 | 6 | 786 | .... | 0 |
.... | .... | .... | .... | .... | .... |
n | 76 | 11 | 1025 | .... | 1 |
Step S2 of this example comprises the following sub-steps:
Step S201: detect the outliers and missing values in the user basic information data set S;
Step S202: set the outliers to missing values;
Step S203: fill the missing values with zeros to obtain the cleaned data source S_;
Step S204: standardize the data source S_ to obtain the data set S_t.
More specifically, data detection is performed on the stored user basic information data set S, mainly to find the outliers and missing values in S. An outlier here is a value that contradicts common sense: because of systematic error, human error, or variation inherent in the data, it differs from the overall behavior, structure, or correlations of the data. This example preferably identifies outliers by drawing a box plot, which only requires the maximum, the minimum, the upper quartile, the median, and the lower quartile. Data above the upper quartile and below the lower quartile are regarded as outliers; for example, a negative age is judged to be an outlier. After step S201 finds the outliers, step S202 sets them to missing values, and they are then handled by the missing-value treatment of step S203, which is zero filling: every missing value is filled with 0. A field that is empty or does not contain a numerical value is regarded as a missing value.
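The cleaning sub-steps S201 to S203 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `clean_column` is hypothetical, and the outlier rule uses the conventional box-plot fences at 1.5 times the interquartile range, which is an assumption, since the text above only states that outliers are read off the box plot.

```python
import statistics

def clean_column(values, k=1.5):
    """Steps S201-S203 for one numeric column (illustrative sketch).

    Assumption: outliers are values outside the box-plot fences
    [Q1 - k*IQR, Q3 + k*IQR]; the patent does not fix k.
    """
    # S201: a field that is empty or non-numeric counts as missing.
    present = [v for v in values if isinstance(v, (int, float))]
    q1, _median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    cleaned = []
    for v in values:
        is_missing = not isinstance(v, (int, float))
        # S202: an outlier is treated exactly like a missing value.
        is_outlier = (not is_missing) and (v < low or v > high)
        # S203: zero filling.
        cleaned.append(0 if (is_missing or is_outlier) else v)
    return cleaned

# Hypothetical "points" column with one empty field and one extreme value.
points = [324, 456, 245, 786, None, 1025, 99999]
cleaned_points = clean_column(points)
```

Note that a fence-based rule only catches values that are extreme relative to the column; a domain check such as "age must not be negative" would need to be added separately.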
In step S204 of this example, the data source S_ is standardized by the formula $x_n^{*} = \frac{x_n - \bar{x}}{\sigma}$ to obtain the data set S_t, where $x_n$ is the n-th data item of the data set in the data source S_, n is the sample size of the data, $\bar{x}$ is the mean of all sample data, and $\sigma$ is the standard deviation of all sample data.
This example calculates the mean of all sample data by the formula $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and the standard deviation of all sample data by the formula $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$. After this standardization the data set S_t is obtained; the data in S_t is on a standardized scale, with a mean of 0 and a standard deviation of 1.
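The standardization of step S204 can be sketched as below. This is an illustrative z-score implementation under two assumptions that are not stated in the source: the standard deviation is the population form (dividing by n), and a constant column (standard deviation 0) is mapped to zeros to avoid division by zero. The name `standardize` is hypothetical.

```python
import math

def standardize(xs):
    """Return the z-scores (x - mean) / sigma for one column of S_."""
    n = len(xs)
    mean = sum(xs) / n
    # Assumption: population standard deviation (divide by n).
    sigma = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if sigma == 0:
        # Assumption: a constant column carries no information; map it to 0.
        return [0.0 for _ in xs]
    return [(x - mean) / sigma for x in xs]

# Ages taken from the example table above.
z_ages = standardize([22, 24, 56, 15, 76])
```

After this transformation the column has mean 0 and (population) standard deviation 1, which puts dimensions with very different units, such as age and points, on a comparable scale.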
In step S3 of this example, the integrated model comprises a logistic regression model, a decision tree model, and a naive Bayes model. With the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output, the logistic regression model, the decision tree model, and the naive Bayes model are trained simultaneously, and the weight of each of the three models within the integrated model is optimized through validation.
This example uses an integrated model because it learns from feature data: training and learning through the integrated model, and predicting by combining several different models, can effectively improve accuracy.
The integrated model comprises a logistic regression model, a decision tree model, and a naive Bayes model, which are combined by multi-model voting. Each model is assigned its own weight, and this weight is adjusted in real time according to the training results in order to improve prediction accuracy.
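One way to read "multi-model voting with per-model weights" is a weighted majority vote, sketched below. The function name `weighted_vote`, the model keys, and the tie-breaking rule (ties resolve to 0, "no upshift") are assumptions made for illustration; the source does not specify how votes are combined or how ties are handled.

```python
def weighted_vote(predictions, weights):
    """Combine 0/1 predictions from several models by weighted voting.

    predictions: dict mapping model name -> predicted label (0 or 1)
    weights:     dict mapping model name -> current weight (W_A, W_B, W_C)
    """
    score = {0: 0.0, 1: 0.0}
    for name, label in predictions.items():
        score[label] += weights[name]
    # Assumption: a tie is resolved conservatively as 0 (no upshift).
    return 1 if score[1] > score[0] else 0

# Hypothetical predictions for one user from the three base models.
preds = {"logistic": 1, "tree": 0, "bayes": 0}
equal = weighted_vote(preds, {"logistic": 1.0, "tree": 1.0, "bayes": 1.0})
skewed = weighted_vote(preds, {"logistic": 3.0, "tree": 1.0, "bayes": 1.0})
```

With equal weights the two models voting 0 win; once the logistic regression model has earned a weight larger than the other two combined, its vote decides the outcome.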
This example trains and learns with the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output. Let x = S_t and y = y_n, where y takes a value in {0, 1}.
Step S3 of this example comprises the following sub-steps:
Step S301: split the data set S_t into a training data set S_tA and a test data set S_tB;
Step S302: with the training data set S_tA as input and the corresponding data y_n1, which indicates whether the user subscribes to a package upshift in the following month, as output, train the logistic regression model, the decision tree model, and the naive Bayes model simultaneously;
Step S303: during training, use the test data set S_tB and its corresponding data y_n2, which indicates whether the user subscribes to a package upshift in the following month, as validation data to verify, for the logistic regression model, the decision tree model, and the naive Bayes model respectively, the results trained on the training data set S_tA and the data y_n1; if a model passes verification, divide its weight by the weight constant d_t; otherwise, multiply its weight by the weight constant d_t.
In step S301 of this example, the data set S_t is split into the training data set S_tA and the test data set S_tB, preferably at a ratio of 7:3. When training starts in step S3, the weight ratio of the logistic regression model, the decision tree model, and the naive Bayes model is 1:1:1, and the weight constant d_t is a random number between 0 and 1.
In this example, the training data set S_tA is the data set used to train the integrated model, and the test data set S_tB is the data set used to validate the integrated model. The weight W_A of the logistic regression model, the weight W_B of the decision tree model, and the weight W_C of the naive Bayes model preferably all default to an initial value of 1 and are adjusted and updated in real time through the weight constant d_t during subsequent training. The weight constant d_t is a preset constant, preferably a random number between 0 and 1. These parameter choices are the preferred values of this example; in practical applications they can be customized and modified according to the actual situation.
That is, in this example the data set S_t is preferably split into the training data set S_tA and the test data set S_tB at a ratio of S_tA : S_tB = 7:3. The training data set S_tA is then used to train the logistic regression model, the decision tree model, and the naive Bayes model. At the start of training, each of the three models is assigned an initial weight of 1; the weight of the logistic regression model is denoted W_A, the weight of the decision tree model W_B, and the weight of the naive Bayes model W_C. During training, if any one of the three models classifies incorrectly, i.e., its training result is inconsistent with the corresponding data of the test data set S_tB, the model is judged to have made a classification error and its weight is multiplied by the weight constant d_t; since d_t lies between 0 and 1, this reduces the model's weight in training. If any one of the three models classifies correctly, i.e., its training result is consistent with the corresponding data of the test data set S_tB, its weight is divided by the weight constant d_t, which increases the model's weight in training.
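The 7:3 split and the multiplicative weight update can be sketched as follows. This is a simplified illustration: the names `split_7_3` and `update_weight` are hypothetical, the shuffle before splitting is an assumption (the source does not say how rows are assigned to S_tA and S_tB), and d_t is passed in explicitly rather than drawn at random so that the behavior is reproducible.

```python
import random

def split_7_3(rows, seed=0):
    """Split rows into a training part (70%) and a test part (30%).

    Assumption: rows are shuffled first; integer arithmetic avoids
    floating-point rounding when computing the cut point.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) * 7 // 10
    return rows[:cut], rows[cut:]

def update_weight(weight, correct, d_t):
    """Apply the rule of step S303 with 0 < d_t < 1.

    A correct classification divides the weight by d_t (weight grows);
    an incorrect one multiplies it by d_t (weight shrinks).
    """
    if not 0 < d_t < 1:
        raise ValueError("d_t must lie strictly between 0 and 1")
    return weight / d_t if correct else weight * d_t

train_rows, test_rows = split_7_3(range(10))

# Start from weight 1 and replay a hypothetical validation history.
w_a = 1.0
for was_correct in [True, True, False]:
    w_a = update_weight(w_a, was_correct, d_t=0.5)
```

With d_t = 0.5, two correct checks followed by one incorrect check take the weight from 1 to 2, then 4, then back to 2.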
The logistic regression model is a binary classification model, i.e., the output y of a sample is 0 or 1. For a given data point, the probabilities of class 0 and class 1 are calculated separately, and the class with the larger probability is selected as the predicted class. Let m be the number of dimensions in the data set S_t (such as user unique ID, package fee, data fee, phone bill amount for last month, phone bill amount for the month before last, phone bill amount for three months ago, number of package overages, package overage data fee, points, and so on; see the definition of the data set given above), and let $x_m$ be the value of a given user in the m-th dimension of the input data set S_t. The logistic regression decision function is $z = w_1 x_1 + w_2 x_2 + \dots + w_m x_m + b$.
The result is finally output through the sigmoid function $\sigma(z) = \frac{1}{1 + e^{-z}}$, from which the weight $w_m$ of the m-th dimension and the bias b are calculated. Because y is known, i.e., the following-month outcome of each user in the input data is known, setting up two or more equations of the form $z = w_1 x_1 + w_2 x_2 + \dots + w_m x_m + b$ allows the weight $w_m$ of the m-th dimension and the bias b to be solved for.
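The forward computation of the logistic regression model described above, the linear score z followed by the sigmoid function, can be sketched as below. The function names and the example weights are hypothetical; fitting w and b (for example by gradient descent) is outside the scope of this sketch.

```python
import math

def predict_proba(x, w, b):
    """Probability of class 1: sigmoid(w_1*x_1 + ... + w_m*x_m + b)."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(x, w, b):
    """Pick the class with the larger probability (0 or 1)."""
    return 1 if predict_proba(x, w, b) > 0.5 else 0

# Hypothetical standardized features and already-fitted parameters.
w, b = [0.8, -0.3], 0.1
p_up = predict_proba([1.2, 0.4], w, b)
label = predict_label([1.2, 0.4], w, b)
```

Because the sigmoid is monotonic, comparing the probability with 0.5 is equivalent to checking whether z is positive.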
The decision tree model in this example obtains a machine learning classifier through learning; this classifier can give the correct classification for newly appearing objects. When making a decision, the decision tree needs to choose among different attributes. Attribute selection here is defined according to information theory, with the following formula: $Info(S_t) = -\sum_{i=1}^{m} p_i \log_2 p_i$, where m is the number of values of y above, so m is preferably 2; $p_i$ is the probability of the i-th class in the data set S_t, calculated as $p_i$ = (number of records in S_t belonging to the i-th class) / $|S_t|$; and $Info(S_t)$ is the amount of information needed to separate the different classes of the data set S_t.
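The information measure used for attribute selection is the Shannon entropy of the class labels, which can be computed as in the sketch below. The function name `info` mirrors the symbol Info(S_t) but is otherwise an illustrative choice.

```python
import math
from collections import Counter

def info(labels):
    """Entropy in bits of a list of class labels: -sum(p_i * log2(p_i))."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

# Labels of the five example rows shown in the table above: 1, 0, 0, 0, 1.
table_entropy = info([1, 0, 0, 0, 1])
```

A perfectly balanced two-class set gives 1 bit and a pure set gives 0 bits; a decision tree prefers the attribute whose split reduces this quantity the most.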
The naive Bayes model in this example uses the data set S_t = {s_t1, s_t2, ..., s_tn} to judge whether the user subscribes to a package upshift in the following month (y), i.e., it evaluates the probability of this group of features under each class. Here $a_i$ represents the value of y; $a_i$ is the prediction result, i.e., $a_i$ is 0 or 1. The class probability is expressed as $P(a_i \mid f_1, f_2, \dots, f_n) = \frac{P(a_i)\,P(f_1, f_2, \dots, f_n \mid a_i)}{P(f_1, f_2, \dots, f_n)}$.
Following the idea of naive Bayes, each feature of a given sample is assumed to be independent of the other features. The class of this group of features is obtained by maximization: $classify(f_1, f_2, \dots, f_n) = \arg\max_{a_i} P(a_i) \prod_{j=1}^{n} P(f_j \mid a_i)$,
where $classify(f_1, f_2, \dots, f_n)$ is the classification result for the data features of the given sample, $f_n$ refers to the user basic information data features in the data set S_t, and $S_{ti}$ is the i-th data item in the data set S_t.
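The maximization step of naive Bayes can be sketched as below. The source does not say how the per-feature likelihoods are modeled; this sketch assumes a Gaussian likelihood per feature, which is a common choice for standardized continuous inputs, and works in log space to avoid numerical underflow. The names `fit` and `classify` and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def fit(X, y):
    """Estimate, per class, the prior and a (mean, variance) per feature."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    model = {}
    for label, rows in groups.items():
        stats = []
        for column in zip(*rows):
            mu = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mu) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mu, var))
        model[label] = (len(rows) / len(X), stats)
    return model

def classify(model, features):
    """Return argmax over classes of log P(a) + sum_j log P(f_j | a)."""
    def log_posterior(label):
        prior, stats = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (f - mu) ** 2 / (2 * var)
            for f, (mu, var) in zip(features, stats)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)

# Toy standardized data: two features, label 1 = upshift, 0 = no upshift.
X = [[1.0, 2.0], [1.2, 1.8], [-1.0, -2.0], [-1.1, -2.2]]
y = [1, 1, 0, 0]
nb_model = fit(X, y)
```

The denominator P(f_1, ..., f_n) is the same for every class, so it can be dropped when only the most probable class is needed.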
That is, step S3 of this example uses the integrated model to identify the upshift prediction probability for different package groups on the basis of users' past behavior: with the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output, the integrated model is trained, its weights are updated continuously until the model is optimized, and the trained integrated model is output.
Step S4 of this example inputs the updated user basic information data set S_x into the integrated model trained in step S3, and outputs and displays the predictive data set S_y indicating whether the user will subscribe to a package upshift in the following month. More specifically, this example identifies the user upshift prediction probability on the basis of the integrated model: the integrated model trained in step S3 is used to make predictions on the updated user basic information data set S_x, which is derived from the updated user basic information data. For example, if the model was trained on the historical user basic information data for April, or for February to April, then at prediction time the updated user basic information data for May is input into the integrated model trained in step S3, which outputs the prediction result (the predictive data set S_y) of whether a package upshift is needed at the May time node. Based on this result, business marketing can then be directed at the users predicted to subscribe to a package upshift.
Step S4 of this example preferably comprises two sub-steps. Step S401: input the updated user basic information data set S_x into the integrated model trained in step S3 and output the predictive data set S_y indicating whether the user will subscribe to a package upshift in the following month. Step S402: display the predictive data set S_y through the user upshift prediction system. Step S402 implements the display function of the user upshift prediction probability identification system: by presenting the predictive data set S_y in the user upshift prediction system, it lets the operator market directly to the users likely to subscribe to a package upshift in the following month, and also makes it easier for users to change their own packages in a targeted way.
This example also provides a system for identifying the upshift prediction probability of communication users based on an integrated model, which uses the method described above.
In summary, this example performs data detection and analysis on the user basic information data set S and combines this with the training of the integrated model to output the predictive data set S_y. Training (learning) with the integrated model yields a better prediction effect and stronger generalization ability, so the prediction is more targeted, relevant, and accurate. The predictive data set S_y can be viewed directly through the user upshift prediction system, which makes the prediction systematic and the results clear, timely, and convenient. This provides a sound data basis for more targeted package upshift suggestions, makes it easy to adjust the packages of different customer groups or to adopt different marketing measures in a timely and targeted manner, reduces the operator's management difficulty, and effectively improves the effectiveness and accuracy of the operator's marketing.
The above is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the present invention should not be regarded as limited to these descriptions. For those of ordinary skill in the art to which the present invention belongs, a number of simple deductions or substitutions may be made without departing from the concept of the present invention, and all of these shall be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A method for identifying the upshift prediction probability of communication users based on an integrated model, characterized by comprising the following steps:
Step S1: storing user basic information data in a database to obtain a user basic information data set S;
Step S2: performing data detection on the user basic information data set S and obtaining a data set S_t after standardization;
Step S3: training the integrated model with the data set S_t as input and the data y_n, which indicates whether the user subscribes to a package upshift in the following month, as output, to obtain a trained integrated model;
Step S4: inputting the updated user basic information data set S_x into the integrated model trained in step S3, and outputting and displaying the predictive data set S_y indicating whether the user will subscribe to a package upshift in the following month.
2. The communication user upshift prediction probability recognition method based on an integrated model according to claim 1, characterized in that step S2 comprises the following sub-steps:
Step S201: detecting outliers and missing values in the user basic information data set S;
Step S202: setting the outliers to missing values;
Step S203: filling the missing values with zeros to obtain the cleaned data source S_;
Step S204: standardizing the data source S_ to obtain the data set S_t.
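Sub-steps S201 to S203 can be illustrated with a minimal sketch for a single numeric column. The claim does not say how outliers are detected, so the rule used here (a value further than `z_limit` standard deviations from the column mean) is only an assumption, as are the function name `clean_column` and the use of `None` or NaN to represent missing values.

```python
import math
import statistics


def clean_column(values, z_limit=3.0):
    """Sketch of S201-S203 for one column: detect, blank out, then zero-fill."""
    def is_missing(v):
        # Assumption: missing values arrive as None or NaN.
        return v is None or math.isnan(v)

    present = [v for v in values if not is_missing(v)]
    if not present:
        return [0.0] * len(values)

    mean = statistics.fmean(present)
    std = statistics.pstdev(present)

    cleaned = []
    for v in values:
        missing = is_missing(v)
        # S201: hypothetical outlier rule, not taken from the patent.
        outlier = (not missing) and std > 0 and abs(v - mean) > z_limit * std
        # S202: an outlier is treated as missing. S203: missing becomes 0.
        cleaned.append(0.0 if (missing or outlier) else float(v))
    return cleaned
```

With small samples a single extreme value inflates the standard deviation, so a stricter `z_limit` (or a different detection rule altogether) may be needed in practice.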
3. The communication user upshift prediction probability recognition method based on an integrated model according to claim 2, characterized in that, in step S204, the data source S_ is standardized by the formula $\frac{x_n - \bar{x}}{\sigma}$ to obtain the data set S_t, where x_n denotes the n-th data item of the data set in the data source S_, n denotes the sample size of the data, $\bar{x}$ denotes the mean of all sample data, and σ denotes the standard deviation of all sample data.
4. The communication user upshift prediction probability recognition method based on an integrated model according to claim 3, characterized in that the mean $\bar{x}$ of all sample data is calculated by the formula $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, and the standard deviation σ of all sample data is calculated by the formula […].
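The standardization of step S204 amounts to subtracting the mean and dividing by the standard deviation. The sketch below shows this for one cleaned column. The text here does not show the exact formula for σ, so the population form (division by n, as computed by `statistics.pstdev`) is an assumption, and the function name `standardize` is hypothetical.

```python
import statistics


def standardize(column):
    """Return (x - mean) / sigma for every x in the column."""
    mean = statistics.fmean(column)
    # Assumption: population standard deviation (divide by n).
    sigma = statistics.pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / sigma for x in column]
```

After this step each column has mean 0 and, under the stated assumption, standard deviation 1, which keeps features measured in different units (fees, durations, traffic volumes) on a comparable scale.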
5. The communication user upshift prediction probability recognition method based on an integrated model according to any one of claims 1 to 4, characterized in that, in step S3, the integrated model comprises a logistic regression model, a decision tree model and a naive Bayes model; with the data set S_t as input and the data y_n, which indicates whether the user number subscribed to a package upshift in the current month, as output, training is carried out simultaneously in the logistic regression model, the decision tree model and the naive Bayes model, and the weights of the logistic regression model, the decision tree model and the naive Bayes model in the integrated model are each optimized by means of validation.
6. The communication user upshift prediction probability recognition method based on an integrated model according to claim 5, characterized in that step S3 comprises the following sub-steps:
Step S301: splitting the data set S_t into a training data set S_tA and a test data set S_tB;
Step S302: with the training data set S_tA as input and the data y_n1 corresponding to the training data set S_tA, which indicates whether the user number subscribed to a package upshift in the current month, as output, carrying out training simultaneously in the logistic regression model, the decision tree model and the naive Bayes model;
Step S303: during training, using the test data set S_tB and the data y_n2 corresponding to the test data set S_tB, which indicates whether the user number subscribed to a package upshift in the current month, as validation data to verify the training results of the logistic regression model, the decision tree model and the naive Bayes model respectively; if the verification passes, the weight of the model is divided by a weight constant d_t, and otherwise the weight of the model is multiplied by the weight constant d_t.
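The weight rule of step S303 can be sketched as follows. The claim does not define what "verification passes" means, so this sketch assumes a per-sample check (the model's prediction equals the validation label). The final normalization and the weighted-vote function are also additions for illustration. The models are represented by plain callables that return 0 or 1, standing in for trained logistic regression, decision tree and naive Bayes models; the names `update_weights` and `ensemble_probability` are hypothetical.

```python
def update_weights(models, weights, validation_rows, validation_labels, d_t):
    """Divide a model's weight by d_t on a passed check, multiply otherwise."""
    if not 0 < d_t < 1:
        raise ValueError("d_t is expected to lie strictly between 0 and 1")
    new_weights = list(weights)
    for i, model in enumerate(models):
        for row, label in zip(validation_rows, validation_labels):
            if model(row) == label:   # assumed meaning of "verification passes"
                new_weights[i] /= d_t  # d_t < 1, so the weight grows
            else:
                new_weights[i] *= d_t  # the weight shrinks
    total = sum(new_weights)
    # Normalization is an added assumption, used to keep weights bounded.
    return [w / total for w in new_weights]


def ensemble_probability(models, weights, row):
    """Weighted share of the models that predict an upshift (1) for one row."""
    voted = sum(w for m, w in zip(models, weights) if m(row) == 1)
    return voted / sum(weights)
```

Because d_t lies between 0 and 1, dividing by it rewards a model and multiplying by it penalizes a model, so models that agree with the validation labels more often end up with a larger share of the final vote.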
7. The communication user upshift prediction probability recognition method based on an integrated model according to claim 6, characterized in that, in step S301, the ratio between the training data set S_tA and the test data set S_tB is set to 7:3.
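A 7:3 split of this kind can be sketched in a few lines. Shuffling before the split and the `seed` parameter are assumptions, since the claim only fixes the ratio; the function name `split_70_30` is hypothetical.

```python
import random


def split_70_30(rows, labels, seed=0):
    """Split rows and labels into a 70% training part and a 30% test part."""
    indices = list(range(len(rows)))
    # Assumption: rows are shuffled first so both parts are representative.
    random.Random(seed).shuffle(indices)
    cut = len(indices) * 7 // 10  # integer arithmetic avoids rounding surprises
    train_idx, test_idx = indices[:cut], indices[cut:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test
```

Each row stays paired with its own label on both sides of the split, which is what lets the test part serve as validation data later.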
8. The communication user upshift prediction probability recognition method based on an integrated model according to claim 6, characterized in that, in step S3, when training starts, the weight ratio between the logistic regression model, the decision tree model and the naive Bayes model is 1:1:1, and the weight constant d_t is a random number between 0 and 1.
9. The communication user upshift prediction probability recognition method based on an integrated model according to any one of claims 1 to 4, characterized in that, in step S1, the user basic information data set S includes any one or more of: unique user ID, package fee, data traffic fee, telephone charges for the previous three months, number of times the package allowance was exceeded, over-package data traffic fee, points, user level, age, gender, call class, time on network, average voice duration over the previous three months, roaming user, GPRS traffic over the previous three months, average telephone charges over the previous three months, and average data traffic over the previous three months.
10. A communication user upshift prediction probability recognition system based on an integrated model, characterized in that it uses the communication user upshift prediction probability recognition method based on an integrated model according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910161182.3A CN109886756A (en) | 2019-03-04 | 2019-03-04 | Communication user upshift prediction probability recognition methods and system based on integrated model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109886756A true CN109886756A (en) | 2019-06-14 |
Family
ID=66930582
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910161182.3A Pending CN109886756A (en) | 2019-03-04 | 2019-03-04 | Communication user upshift prediction probability recognition methods and system based on integrated model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109886756A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105389713A (en) * | 2015-10-15 | 2016-03-09 | 南京大学 | Mobile data traffic package recommendation algorithm based on user historical data |
CN106022505A (en) * | 2016-04-28 | 2016-10-12 | 华为技术有限公司 | Method and device of predicting user off-grid |
CN106779079A (en) * | 2016-11-23 | 2017-05-31 | 北京师范大学 | A kind of forecasting system and method that state is grasped based on the knowledge point that multimodal data drives |
CN106845731A (en) * | 2017-02-20 | 2017-06-13 | 重庆邮电大学 | A kind of potential renewal user based on multi-model fusion has found method |
CN107480687A (en) * | 2016-06-08 | 2017-12-15 | 富士通株式会社 | Information processor and information processing method |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114449569A (en) * | 2020-11-02 | 2022-05-06 | 中国移动通信集团广东有限公司 | User traffic usage processing method, network device and service processing system |
CN114449569B (en) * | 2020-11-02 | 2024-01-16 | 中国移动通信集团广东有限公司 | User traffic usage processing method, network equipment and service processing system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190614 |