CN108090787A - Method for deep mining of call detail record data and user behavior prediction based on the Apriori algorithm - Google Patents

Method for deep mining of call detail record data and user behavior prediction based on the Apriori algorithm

Info

Publication number
CN108090787A
Authority
CN
China
Prior art keywords
user
apriori algorithm
excavated
ticket
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711368254.9A
Other languages
Chinese (zh)
Inventor
曹万鹏
罗云彬
李鹏
李�浩
徐青
史辉
林绍福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201711368254.9A
Publication of CN108090787A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G06Q30/0202: Market predictions or forecasting for commercial activities

Abstract

The present invention discloses a method for deep mining of call detail record (CDR) data and user behavior prediction based on the Apriori algorithm, comprising the following steps. Step 1: from the relevant information in the CDRs and the corresponding user behavior features, find the frequency of association between certain distinctive CDR information and certain subsequent user behavior features. Step 2: rank these associations by how frequently they occur, pick out the top-ranked items with the strongest correlation, and mine and count the behavior of different users according to these rules. Step 3: predict user-related behavior based on the Apriori algorithm.

Description

Method for deep mining of call detail record data and user behavior prediction based on the Apriori algorithm
Technical field
The invention belongs to the field of big data analysis and machine learning methods, and in particular relates to a method for deep mining of call detail record (CDR) data and user behavior prediction based on the Apriori algorithm.
Background
The call detail record (CDR) big data held by telecom operators includes, for each call, the dialed number, the dialing time, the call duration, the serving base station, and similar information. This information is often associated with the user's subsequent mobile phone usage or operations. Deep mining of it enables predictions about end users and yields useful information that indirectly reflects user behavior features and needs. Such information is an important reference for guiding operators' service operations and supporting the decisions of service providers, and it has considerable economic value and social benefit.
The Apriori algorithm is widely used in mobile communications. Mobile value-added services are becoming the most dynamic, most promising, and most closely watched business in the mobile communications market. With the recovery of the industry, more and more value-added services show strong growth momentum, characterized by diversified applications, branded marketing, centralized management, and deeper cooperation. In response to this trend, many companies apply the widely used Apriori algorithm to association rule data mining.
The Apriori algorithm is a frequent itemset algorithm for mining association rules. Its core idea is to mine frequent itemsets in two stages: candidate set generation and downward-closure checking. The basic procedure is as follows. First, find all frequent itemsets, that is, itemsets that occur at least as often as a predefined minimum support. Then generate strong association rules from the frequent itemsets; these rules must satisfy both the minimum support and the minimum confidence. The desired rules are generated from the frequent itemsets found, producing all rules that contain only items of those sets and whose right-hand side has exactly one item (the so-called middle rule definition is used here). Once these rules are generated, only those whose confidence exceeds the user-specified minimum confidence are kept. A recursive method is used to generate all frequent itemsets.
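The two stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `apriori_frequent_itemsets`, the item labels, and the threshold are hypothetical, and the sketch uses an iterative loop over itemset sizes rather than the recursion the text mentions.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Size 1: every distinct item is a candidate.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Counting stage: keep the candidates that reach the minimum support.
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        frequent.update(level)

        # Generation stage: join frequent k-itemsets into (k+1)-candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Downward-closure check: drop a candidate if any of its
        # (k-1)-item subsets is not frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical CDR-derived transactions (item labels are illustrative only).
records = [
    {"heavy_data_use", "top_up_call"},
    {"heavy_data_use", "top_up_call", "download"},
    {"heavy_data_use", "download"},
    {"top_up_call"},
]
result = apriori_frequent_itemsets(records, min_support=0.5)
```

The pruning step is what keeps the search tractable: a three-item candidate is never counted against the data if one of its two-item subsets already failed the support threshold.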
If an association rule has both a support and a confidence greater than the predefined minimum support and minimum confidence, we call it a strong association rule. Strong association rules can be used to understand hidden relationships between items. The main purpose of association analysis is therefore to find strong association rules, and the Apriori algorithm is mainly used to help find them.
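For reference, the two measures can be written out using the standard textbook definitions. The symbols below are notation assumed here for illustration: D is the set of transactions, T a single transaction, and s_min and c_min the two thresholds. Whether the comparison with the thresholds is strict or inclusive is a convention; the inclusive form is shown.

```latex
\[
\operatorname{supp}(X) = \frac{\lvert \{\, T \in D : X \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{supp}(A \Rightarrow B) = \operatorname{supp}(A \cup B),
\qquad
\operatorname{conf}(A \Rightarrow B) = \frac{\operatorname{supp}(A \cup B)}{\operatorname{supp}(A)}
\]
\[
A \Rightarrow B \ \text{is strong} \iff
\operatorname{supp}(A \Rightarrow B) \ge s_{\min}
\ \text{and}\
\operatorname{conf}(A \Rightarrow B) \ge c_{\min}
\]
```

As a small worked case: if 3 of 4 transactions contain A and 2 of 4 contain both A and B, then the support of A ⇒ B is 0.5 and its confidence is 0.5 / 0.75 ≈ 0.67.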
Based on the above principle, this document proposes a method and system for deep mining of CDR data and user behavior prediction based on the Apriori algorithm. From massive user CDR big data, it mines and counts the frequent itemsets of user behavior that occurs before and after different information appears in the CDRs, provides useful information that indirectly reflects user behavior features and needs, and thereby predicts the user's related behavior accurately.
Summary of the invention
For massive CDR big data and the frequent itemsets it contains, a method for deep mining of CDR data and user behavior prediction based on the Apriori algorithm is proposed. From massive user CDR big data, it mines and counts frequent itemsets that link the related items in the CDRs, and the implicit user information associated with them, to the behavior features that occur before and after different information appears in the CDRs, and it predicts user-related behavior accurately based on the Apriori algorithm.
A method for deep mining of CDR data and user behavior prediction based on the Apriori algorithm comprises the following steps:
Step 1: from the relevant information in the CDRs and the corresponding user behavior features, find the frequency of association between certain distinctive CDR information and certain subsequent user behavior features (for example: after heavy data consumption, users more often call the top-up number to recharge; after a call with one or more particular numbers, the user always goes on to call some other particular person or number; users who log in to certain websites also invariably log in to certain other specific websites in the near term; users who download certain resources also download other resources at the same time; some people, once they appear at a certain time and place, call a certain class of people or numbers; and so on);
Step 2: rank the above associations by how frequently they occur. Pick out the top-ranked items with the strongest correlation, then mine and count the behavior of different users according to these rules so that accurate predictions can be made in advance;
Step 3: predict user-related behavior based on the Apriori algorithm. The different users in the CDR big data are first processed and classified, with a preliminary classification based on attributes such as gender, age, region, and consumption habits. This avoids inaccurate predictions caused by the inconsistency that data diversity produces and eliminates the effect of too many data types on the accuracy of the analysis.
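The preliminary classification in Step 3 can be pictured as grouping each user's transactions by an attribute key before any mining is done, so that each group can be mined separately. The sketch below is an assumption-laden illustration: the function names, the profile fields, and the age bands are hypothetical (the source does not specify them), and consumption habits are left out for brevity.

```python
from collections import defaultdict


def age_band(age):
    """Hypothetical banding; the source does not specify age ranges."""
    if age < 25:
        return "under_25"
    if age < 45:
        return "25_to_44"
    return "45_plus"


def segment_transactions(users, transactions_by_user):
    """Group per-user item sets under a (gender, age band, region) key."""
    segments = defaultdict(list)
    for user_id, profile in users.items():
        key = (profile["gender"], age_band(profile["age"]), profile["region"])
        # Users without any recorded transactions contribute nothing.
        segments[key].extend(transactions_by_user.get(user_id, []))
    return dict(segments)


# Illustrative data only.
users = {
    "u1": {"gender": "F", "age": 23, "region": "Beijing"},
    "u2": {"gender": "F", "age": 24, "region": "Beijing"},
    "u3": {"gender": "M", "age": 50, "region": "Shanghai"},
}
transactions_by_user = {
    "u1": [{"heavy_data_use", "top_up_call"}],
    "u2": [{"download"}, {"top_up_call"}],
    "u3": [{"heavy_data_use"}],
}
segments = segment_transactions(users, transactions_by_user)
```

Each value in `segments` is a list of transactions that could then be passed to a frequent itemset miner on its own, which keeps rules learned from one user group from being diluted by another.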
Compared with the prior art, the present invention has the following clear advantages and beneficial effects:
(1) This document proposes a method for deep mining of CDR data and user behavior prediction based on the Apriori algorithm. From massive user CDR big data, it mines and counts frequent itemsets that link the related items in the CDRs, and the implicit user information associated with them, to the behavior features that users show in and after the CDRs, and it predicts user-related behavior based on the Apriori algorithm.
(2) Because telecom data is diverse, the data are first classified and screened to eliminate interfering items before the CDRs are mined with the Apriori algorithm, which avoids the problem of insufficiently accurate prediction caused by the diversity of telecom data.
Brief description of the drawings
Fig. 1 is a flow chart of the method for deep mining of CDR data and user behavior prediction based on the Apriori algorithm proposed by the present invention.
Detailed description
The present invention is further described below with reference to the accompanying drawing and a specific embodiment.
The flow chart of the method of the present invention is shown in Fig. 1.
For massive CDR big data and the frequent itemsets it contains, a method for deep mining of CDR data and user behavior prediction based on the Apriori algorithm is proposed. From massive user CDR big data, it mines and counts the frequent itemsets of user behavior that occurs in and after the CDRs and predicts user-related behavior.
The method comprises the following steps:
1. From the relevant information in the CDRs and the corresponding user behavior features, use the Apriori algorithm to find the frequency of association between certain distinctive CDR information and certain subsequent user behavior features (for example: after heavy data consumption, users more often call the top-up number to recharge; after a call with one or more particular numbers, the user always goes on to call some other particular person or number; users who log in to certain websites also invariably log in to certain other specific websites in the near term; users who download certain resources also download other resources at the same time; some people, once they appear at a certain time and place, call a certain class of people or numbers; and so on);
2. On this basis, generate the candidate 1-itemsets C1 = { {heavy data consumption}, {user calls the top-up number to recharge}, {calls a particular person or number}, {logs in to certain specific websites}, {downloads certain resources}, {appears at a certain time and place}, {calls a certain class of people or numbers} }.
3. Scan the transaction database, denoted D, in which each record T is a set of items drawn from C1. Every association rule is an expression of the form A -> B, where A and B are sets of items drawn from C1 and the intersection of A and B is empty. The support of the rule is defined as the probability P(A ∪ B), that is, the fraction of records in D that contain every item of both A and B. Compute the support in D of each itemset in the candidate 1-itemsets C1; the frequent 1-itemsets L1 are then obtained according to the minimum support.
4. Similarly, generate the candidate 2-itemsets C2 from L1.
5. Scan the transaction database D and compute the support in D of each itemset in the candidate 2-itemsets C2 to obtain the frequent 2-itemsets L2.
Rank the above associations by how frequently they occur. Pick out the top-ranked items with the strongest correlation, then mine and count the behavior of different users according to these rules so that accurate predictions can be made in advance;
6. Similarly, generate the candidate 3-itemsets C3 from L2.
7. Scan the transaction database D and compute the support in D of each itemset in the candidate 3-itemsets C3 to obtain the frequent 3-itemsets L3.
8. Repeat the above steps until no further frequent k-itemsets can be found.
Let L = L1 ∪ L2 ∪ L3 ∪ … ∪ Lk-1. Computing the confidence of each association rule over L then yields the frequent association rules, and thus the associations between different items.
9. To further improve the accuracy of the Apriori algorithm's analysis of the CDRs and the precision of user behavior prediction, the present invention first processes and classifies the different users in the CDR big data, performing a preliminary classification based on attributes such as gender, age, region, and consumption habits. This avoids inaccurate predictions caused by the inconsistency that data diversity produces and eliminates the effect of too many data types on the accuracy of the analysis.
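The confidence calculation and ranking that follow the itemset search can be sketched as below. This is a hedged illustration rather than the patent's procedure: `rank_rules`, the item labels, and the thresholds are hypothetical, each rule has a single item on its right-hand side, and the tie-break by support is a design choice made here for the example.

```python
def rank_rules(transactions, frequent_itemsets, min_confidence):
    """Derive rules A -> b from frequent itemsets and sort them by strength.

    Returns tuples (antecedent, consequent, support, confidence).
    """
    transactions = [frozenset(t) for t in transactions]
    if not transactions:
        return []

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / len(transactions)

    rules = []
    for itemset in map(frozenset, frequent_itemsets):
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        rule_support = support(itemset)
        for consequent in itemset:
            antecedent = itemset - {consequent}
            antecedent_support = support(antecedent)
            if antecedent_support == 0:
                continue  # itemset never occurs; confidence is undefined
            confidence = rule_support / antecedent_support
            if confidence >= min_confidence:
                rules.append(
                    (set(antecedent), consequent, rule_support, confidence)
                )
    # Strongest associations first: by confidence, then by support.
    return sorted(rules, key=lambda r: (r[3], r[2]), reverse=True)


# Illustrative data only.
records = [
    {"heavy_data_use", "top_up_call"},
    {"heavy_data_use", "top_up_call", "download"},
    {"heavy_data_use", "download"},
    {"top_up_call"},
]
pairs = [{"heavy_data_use", "top_up_call"}, {"heavy_data_use", "download"}]
ranked = rank_rules(records, pairs, min_confidence=0.6)
```

In this toy data every record containing `download` also contains `heavy_data_use`, so that rule comes out on top with confidence 1.0, while the reverse direction only reaches about 0.67.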

Claims (1)

1. A method for deep mining of call detail record (CDR) data and user behavior prediction based on the Apriori algorithm, characterized by comprising the following steps:
Step 1: from the relevant information in the CDRs and the corresponding user behavior features, find the frequency of association between certain distinctive CDR information and certain subsequent user behavior features;
Step 2: rank the above associations by how frequently they occur, pick out the top-ranked items with the strongest correlation, and mine and count the behavior of different users according to these rules;
Step 3: predict user-related behavior based on the Apriori algorithm.
CN201711368254.9A 2017-12-18 2017-12-18 A kind of call bill data depth based on Apriori algorithm is excavated and the method for user's behavior prediction Pending CN108090787A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711368254.9A CN108090787A (en) 2017-12-18 2017-12-18 A kind of call bill data depth based on Apriori algorithm is excavated and the method for user's behavior prediction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711368254.9A CN108090787A (en) 2017-12-18 2017-12-18 A kind of call bill data depth based on Apriori algorithm is excavated and the method for user's behavior prediction

Publications (1)

Publication Number Publication Date
CN108090787A true CN108090787A (en) 2018-05-29

Family

ID=62176017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711368254.9A Pending CN108090787A (en) 2017-12-18 2017-12-18 A kind of call bill data depth based on Apriori algorithm is excavated and the method for user's behavior prediction

Country Status (1)

Country Link
CN (1) CN108090787A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105590232A (en) * 2014-11-11 2016-05-18 中国移动通信集团广东有限公司 Client relation generation method and apparatus, and electronic device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
S. GANESHMOORTHY: "An Improved Intellectual Analysis Precedence and Storage for Business Intelligence from Web Uses Access Data", Computational Advancement in Communication Circuits and Systems *
林湘粤 et al.: "基于海量数据的用户点击模式识别" (User click pattern recognition based on massive data), 《电信网技术》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635006A (en) * 2018-12-17 2019-04-16 山大地纬软件股份有限公司 Social security business association rule digging and recommendation apparatus and method based on Apriori
CN109784721A (en) * 2019-01-15 2019-05-21 东莞市友才网络科技有限公司 A kind of plateform system of employment data analysis and data mining analysis
CN109784721B (en) * 2019-01-15 2021-01-26 广东度才子集团有限公司 Employment data analysis and data mining analysis platform system
CN112215646A (en) * 2020-10-12 2021-01-12 四川长虹电器股份有限公司 Brand promotion method based on improved Aprion algorithm
CN113935787A (en) * 2021-12-15 2022-01-14 山东柏源技术有限公司 Financial information management system based on association rule mining algorithm

Similar Documents

Publication Publication Date Title
CN109615116B (en) Telecommunication fraud event detection method and system
CN108090787A (en) A kind of call bill data depth based on Apriori algorithm is excavated and the method for user's behavior prediction
CN104076944B (en) A kind of method and apparatus of chatting facial expression input
CN106778876A (en) User classification method and system based on mobile subscriber track similitude
CN102722709B (en) Method and device for identifying garbage pictures
CN102110170B (en) System with information distribution and search functions and information distribution method
CN109902216A (en) A kind of data collection and analysis method based on social networks
CN102083010B (en) Method and equipment for screening user information
CN104572688A (en) Information push method and device
CN105869035A (en) Mobile user credit evaluation method and apparatus
CN106022708A (en) Method for predicting employee resignation
CN104463603A (en) Credit assessment method and system
EP2652909B1 (en) Method and system for carrying out predictive analysis relating to nodes of a communication network
CN103957516A (en) Junk short message filtering method and engine
CN109711746A (en) A kind of credit estimation method and system based on complex network
Guo et al. GroupMe: Supporting group formation with mobile sensing and social graph mining
CN109118155A (en) A kind of method and device generating operation model
CN107644106A (en) The internuncial method of automatic mining business, terminal device and storage medium
CN109389501A (en) A kind of calculating equipment, computing system
CN110502702A (en) User's behavior prediction method and device
Manley et al. New forms of data for understanding urban activity in developing countries
CN108776857A (en) NPS short messages method of investigation and study, system, computer equipment and storage medium
Maji et al. Data warehouse based analysis on CDR to retain and acquire customers by targeted marketing
CN113850630A (en) Satisfaction degree prediction method and device, storage medium and electronic equipment
CN106293354B (en) Shortcut menu self-adaptive display control method, server and portable terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180529