CN109117873A - A kind of user behavior analysis method based on Bayesian Classification Arithmetic - Google Patents

Info

Publication number
CN109117873A
Authority
CN
China
Prior art keywords
user
analysis method
behavior analysis
bayesian classification
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810819701.6A
Other languages
Chinese (zh)
Inventor
杨斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Fumin Bank Co Ltd
Original Assignee
Chongqing Fumin Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-07-24
Filing date
2018-07-24
Publication date
2019-01-01
Application filed by Chongqing Fumin Bank Co Ltd
Priority to CN201810819701.6A
Publication of CN109117873A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155: Bayesian classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G06Q30/0203: Market surveys; Market polls

Abstract

The invention discloses a user behavior analysis method based on a Bayesian classification algorithm, comprising the following steps: S1, collect user operation records; S2, analyze the user operation records and calculate the occurrence probability of each user operation and the conditional probabilities between user operations, so as to train a Bayes classifier that judges the user's next operation from the current user operation; S3, input the user's current operation into the Bayes classifier to judge the user's next operation; S4, take the user's next operation as the user's current operation and repeat step S3 until the user's next operation is a designated end operation, thereby obtaining a user behavior sequence. The method exploits the advantage of massive data samples and improves both the accuracy of the analysis results and the efficiency of the analysis.

Description

A kind of user behavior analysis method based on Bayesian Classification Arithmetic
Technical field
The present invention relates to the field of data analysis technology, and in particular to a user behavior analysis method based on a Bayesian classification algorithm.
Background technique
The service industry currently attaches great importance to optimizing service processes, and the results of user behavior analysis are often used as the basis for that optimization.
Traditionally, most user behavior analysis has been done manually, which is inefficient. When the volume of user behavior data is small, manual analysis can still meet demand, but it also brings the problem of inaccurate results. With the development of e-commerce in China, more and more users receive services through convenient channels such as web pages and mobile apps, so their behavior data is easier to collect and record. As a result, both the sources and the volume of user behavior data have grown massively, and manual analysis cannot make good use of such massive data samples. What is needed is a user behavior analysis method that takes full advantage of massive data samples and improves analysis efficiency and the accuracy of analysis results.
Summary of the invention
The invention aims to provide a user behavior analysis method that exploits the advantage of massive data samples and improves analysis efficiency and the accuracy of analysis results.
The user behavior analysis method based on a Bayesian classification algorithm in the present invention comprises the following:
S1, collect user operation records;
S2, analyze the user operation records and calculate the occurrence probability of each user operation and the conditional probabilities between user operations, so as to train a Bayes classifier that judges the user's next operation from the current user operation;
S3, input the user's current operation into the Bayes classifier to judge the user's next operation;
S4, take the user's next operation as the user's current operation and repeat step S3 until the user's next operation is a designated end operation, thereby obtaining a user behavior sequence.
With this method, known user operation behavior is used as samples to calculate the occurrence probability of each user operation and the conditional probabilities between operations. These two kinds of probability are the basis on which the Bayes classifier predicts the user's next operation. In this patent, "training" means obtaining the Bayes classifier through probability calculation over all samples. Given the probabilities for the current operation, the classifier calculates the probability of every operation that may happen next and outputs the one with the highest probability as its prediction. With the trained classifier, a starting user operation can be input and its next operation predicted; that next operation is then used as the current operation for a further prediction, and so on. Prediction stops when the predicted next operation is one that can be regarded as an end operation, which yields a complete user behavior sequence. Taking different start and end points yields multiple user behavior sequences, which support work that depends on user behavior prediction, such as service process optimization.
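The training and single-step prediction described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes operations are plain strings, estimates the prior of each candidate next operation and the likelihood of the current operation given that next operation by simple counting, and picks the candidate with the highest product. The names `train`, `predict_next` and `SAMPLE_QUEUES` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def train(queues):
    """Estimate class priors P(next) and likelihoods P(current | next) by counting."""
    next_counts = Counter()             # how often each operation occurs as a "next" operation
    pair_counts = defaultdict(Counter)  # pair_counts[next_op][current_op]
    for queue in queues:
        for current, following in zip(queue, queue[1:]):
            next_counts[following] += 1
            pair_counts[following][current] += 1
    total = sum(next_counts.values())
    prior = {op: n / total for op, n in next_counts.items()}
    likelihood = {
        nxt: {cur: n / next_counts[nxt] for cur, n in currents.items()}
        for nxt, currents in pair_counts.items()
    }
    return prior, likelihood


def predict_next(prior, likelihood, current):
    """Return the next operation with the highest posterior score, or None if nothing fits."""
    scores = {nxt: prior[nxt] * likelihood[nxt].get(current, 0.0) for nxt in prior}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0.0:
        return None
    return best


# Hypothetical operation queues, one per user session (illustrative data only).
SAMPLE_QUEUES = [
    ["login", "my_account", "transfer", "exit"],
    ["login", "my_account", "exit"],
    ["login", "transfer", "exit"],
]
prior, likelihood = train(SAMPLE_QUEUES)
print(predict_next(prior, likelihood, "login"))
```

With a single input feature (the current operation), this score is proportional to the observed frequency of each next operation after the current one, so the sketch stays close to the "occurrence probability plus conditional probability" description in the text.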
By training a Bayes classifier, the invention exploits the advantage of massive data samples and improves the accuracy of analysis results. The method can be executed automatically by a computer system, the algorithm is simple and easy to implement, and manual intervention is reduced, which improves analysis efficiency.
Further, in step S1, the interfaces called when users use the application program are obtained through an application performance management system and are used to characterize the user operation records.
An application performance management (APM) system comprehensively monitors applications running online and, in particular, records the interface called by each user operation. Each interface corresponds to a specific operating function in the application, so identifying which interface was called reveals which operation the user performed. This is a convenient and efficient way to obtain user operation records.
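As an illustration of how called interfaces might be translated into operation records, here is a small sketch. The interface paths, the record fields (`group_id`, `timestamp`, `interface`) and the mapping table are all assumptions made for the example; the patent does not specify the APM record layout.

```python
# Hypothetical mapping from interface paths to operation names.
INTERFACE_TO_OPERATION = {
    "/api/auth/login": "login",
    "/api/account/summary": "my_account",
    "/api/transfer/submit": "transfer",
    "/api/auth/logout": "exit",
}


def to_operation_records(apm_calls):
    """Translate raw APM call entries into operation records, skipping unmapped interfaces."""
    records = []
    for call in apm_calls:
        operation = INTERFACE_TO_OPERATION.get(call["interface"])
        if operation is not None:
            records.append(
                {
                    "group_id": call["group_id"],
                    "timestamp": call["timestamp"],
                    "operation": operation,
                }
            )
    return records
```

Interfaces that do not correspond to a user-visible operation (health checks, static assets and the like) are simply dropped in this sketch.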
Further, in step S2, the user operation records within the data of the same group are analyzed to obtain a user operation behavior queue, and the user operation probabilities are calculated from each operation in the queue and the operation that follows it.
The method needs the user's consecutive operations to train the classifier more accurately. In the APM system, a user's consecutive interface calls are identified by the same group (Group id), so obtaining the user operation behavior queue by group is reliable and benefits the training of the classifier.
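A possible way to build the per-group queues and the (operation, next operation) pairs is sketched below. It assumes each record is a dictionary with hypothetical `group_id`, `timestamp` and `operation` fields, and that ordering by timestamp reflects the order of the user's operations.

```python
from collections import defaultdict


def build_queues(records):
    """Group operation records by group id and order each group by timestamp."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["group_id"]].append(record)
    return {
        group_id: [r["operation"] for r in sorted(rows, key=lambda r: r["timestamp"])]
        for group_id, rows in grouped.items()
    }


def transition_pairs(queue):
    """Pair every operation in a queue with the operation that follows it."""
    return list(zip(queue, queue[1:]))
```

The pairs returned by `transition_pairs` are the raw material from which the operation probabilities are counted.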
Further, the Bayesian algorithm is the Gaussian Bayesian algorithm.
The Gaussian Bayesian algorithm, also called the Gaussian form of the Bayesian algorithm, predicts continuous probability values more accurately and suits the application scenario of the invention.
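For reference, the usual textbook form of Gaussian naive Bayes is shown below. Here y stands for a candidate next operation and x_1, ..., x_n for numeric features describing the current state; how operations would be encoded as numeric features is not specified in this patent, so that reading is an assumption.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_{y,i}^{2}}}
\exp\!\left( -\frac{(x_i - \mu_{y,i})^{2}}{2\sigma_{y,i}^{2}} \right),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The mean \mu_{y,i} and variance \sigma_{y,i}^{2} are estimated from the training samples of class y, and the predicted class \hat{y} is the one with the highest score, which matches the "highest probability is the output" rule described in the text.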
Further, the user operation behavior queue is stored in one or more of the MYSQL, HADOOP, HIVE and REDIS databases.
An APM system usually uses its own database, often a non-relational one such as ES. Transferring the data to one or more of MYSQL, HADOOP, HIVE and REDIS, which this patent treats as relational-type databases, makes it easier to retrieve the data for calculation.
Further, S2 also includes removing user operation records that are predefined as invalid.
Some abnormal user operations tend to be collected as well, for example leaving right after logging in. Such operations hinder accurate probability calculation; removing them before the probabilities are calculated yields a more accurate classifier.
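A filtering step of this kind could look like the sketch below. The rule set is hypothetical: it treats a queue consisting only of login, or of login followed directly by exit, as invalid, which mirrors the "leaving right after logging in" example; a real deployment would define its own rules.

```python
# Hypothetical set of queue patterns predefined as invalid.
INVALID_QUEUES = {
    ("login",),
    ("login", "exit"),
}


def remove_invalid(queues):
    """Drop queues that match a predefined invalid pattern or are too short to yield a pair."""
    return [
        queue
        for queue in queues
        if len(queue) >= 2 and tuple(queue) not in INVALID_QUEUES
    ]
```

Queues with fewer than two operations are also dropped here because they contribute no (operation, next operation) pair to the probability calculation.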
Further, the method also includes the step of continuously collecting user operation records and updating the Bayes classifier with the expanded set of user operation records.
As the data volume grows, the predicted probabilities become more accurate.
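One simple way to support such updates, assuming the classifier is built from counts, is to keep the counts and add newly collected queues to them instead of retraining from scratch. The class name `TransitionCounts` and its methods are hypothetical and only illustrate the idea.

```python
from collections import Counter


class TransitionCounts:
    """Running counts of (operation, next operation) pairs that can be extended over time."""

    def __init__(self):
        self.pairs = {}  # current operation -> Counter of next operations

    def update(self, queues):
        """Add newly collected operation queues to the existing counts."""
        for queue in queues:
            for current, following in zip(queue, queue[1:]):
                self.pairs.setdefault(current, Counter())[following] += 1

    def probability(self, current, following):
        """Estimated probability that `following` comes right after `current`."""
        nexts = self.pairs.get(current)
        if not nexts:
            return 0.0
        return nexts[following] / sum(nexts.values())
```

Calling `update` on each new batch of records keeps the estimates in step with the growing data set.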
Description of the drawings
Fig. 1 is a flow diagram of the user behavior analysis method based on a Bayesian classification algorithm in an embodiment of the present invention.
Specific embodiment
The invention is further described below through a specific embodiment:
The embodiment is substantially as shown in Fig. 1:
First, the interfaces called when users use the application program are obtained through the application performance management system and are used to characterize the user operation records. An application performance management (APM) system comprehensively monitors applications running online and, in particular, records the interface called by each user operation. Each interface corresponds to a specific operating function in the application, so identifying which interface was called reveals which operation the user performed. This is a convenient and efficient way to obtain user operation records.
The user operation records are extracted from the ES database of the APM system, and the records within the data of the same group are analyzed to obtain user operation behavior queues. This embodiment needs the user's consecutive operations to train the classifier more accurately. In the APM system, a user's consecutive interface calls are identified by the same group (Group id), so obtaining the user operation behavior queues by group is reliable and benefits the training of the classifier.
After all user operation behavior queues are obtained, user operation records predefined as invalid are removed first. Some abnormal user operations tend to be collected as well, for example leaving right after logging in. Such operations hinder accurate probability calculation; removing them before the probabilities are calculated yields a more accurate classifier.
The processed user operation behavior queues are converted into a relational data format and stored in a relational-type database; one or more of MYSQL, HADOOP, HIVE and REDIS may be used.
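The storage step might be sketched as follows. The example uses Python's built-in `sqlite3` module purely as a stand-in for a relational store such as MYSQL; the table name `operation_queue` and its columns are assumptions, not something the patent defines.

```python
import sqlite3


def store_queues(conn, queues):
    """Write each queue as ordered rows of (group_id, position, operation)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS operation_queue ("
        "group_id TEXT NOT NULL, "
        "position INTEGER NOT NULL, "
        "operation TEXT NOT NULL, "
        "PRIMARY KEY (group_id, position))"
    )
    rows = [
        (group_id, position, operation)
        for group_id, operations in queues.items()
        for position, operation in enumerate(operations)
    ]
    conn.executemany("INSERT OR REPLACE INTO operation_queue VALUES (?, ?, ?)", rows)
    conn.commit()


def load_queue(conn, group_id):
    """Read one queue back in its original order."""
    cursor = conn.execute(
        "SELECT operation FROM operation_queue WHERE group_id = ? ORDER BY position",
        (group_id,),
    )
    return [row[0] for row in cursor]
```

Storing the position explicitly keeps the order of operations recoverable, which matters because the probabilities are computed from consecutive pairs.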
These known operations are used as samples to calculate the occurrence probability of each user operation and the conditional probabilities between user operations. These two kinds of probability are the basis on which the Bayes classifier predicts the user's next operation. The Bayes classifier is obtained through probability calculation over all samples. Given the probabilities for the current operation, the classifier calculates the probability of every operation that may happen next and outputs the one with the highest probability as its prediction.
The Bayesian algorithm in this embodiment is the Gaussian Bayesian algorithm.
The Gaussian Bayesian algorithm, also called the Gaussian form of the Bayesian algorithm, predicts continuous probability values more accurately and suits the application scenario of the invention.
For prediction, the user's current operation is input into the trained Bayes classifier to judge the user's next operation. The user's next operation is then taken as the current operation and step S3 is repeated until the user's next operation is the designated end operation, which yields the user's behavior sequence.
With this method, known user behavior is used as samples to calculate the occurrence probability of each user operation and the conditional probabilities, which are the basis on which the Bayes classifier predicts the user's next operation. With the trained classifier, a starting user operation can be input and its next operation predicted; that next operation is then used as the current operation for a further prediction, and so on. Prediction stops when the predicted next operation is one that can be regarded as an end operation, which yields a complete user behavior sequence. Taking different start and end points yields multiple user behavior sequences, which support work that depends on user behavior prediction, such as service process optimization.
For example, taking a mobile banking APP, with the user's "login" operation as the starting point, the predicted next operation is "my account", the one after that is "transfer", and the one after that is "exit". "Exit" is the end point designated in advance for this prediction, so the behavior sequence "login" → "my account" → "transfer" → "exit" is obtained. As described above, these operations are identified from the specific interfaces they call.
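The repeated prediction in this example can be sketched as a loop that follows the most probable next operation until a designated end operation is reached. The probability table below is made up for illustration, and the `max_steps` guard is an added assumption to stop the loop if the predictions ever cycle.

```python
def predict_sequence(next_op_probs, start, end_ops, max_steps=20):
    """Follow the most probable next operation from `start` until an end operation."""
    sequence = [start]
    current = start
    for _ in range(max_steps):
        if current in end_ops:
            break
        candidates = next_op_probs.get(current)
        if not candidates:
            break  # no known successor for this operation
        current = max(candidates, key=candidates.get)
        sequence.append(current)
    return sequence


# Hypothetical next-operation probabilities for a mobile banking app.
NEXT_OP_PROBS = {
    "login": {"my_account": 0.6, "transfer": 0.3, "exit": 0.1},
    "my_account": {"transfer": 0.5, "exit": 0.4, "login": 0.1},
    "transfer": {"exit": 0.7, "my_account": 0.3},
}

print(" -> ".join(predict_sequence(NEXT_OP_PROBS, "login", {"exit"})))
```

Choosing a different start operation or a different set of end operations yields other sequences from the same table.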
As the application program continues to be used, user operation records must also be collected continuously and the Bayes classifier updated with the expanded set of records. As the data volume grows, the predicted probabilities become more accurate. Setting an update time ensures that each prediction is made with the latest classifier.
Other embodiments of the invention further include a step of applying the prediction results during user operation. Taking a mobile banking APP again as an example, after the user clicks an operation, relevant information such as account information must be filled in. At this point the previously predicted user behavior sequence is read, together with the idle state of the current device's resources. If the resource idle ratio of the current device is higher than a set ratio, the interface of the next step is called directly in the background and the data it needs is read in advance, including data that the user does not need to enter manually, so that preparation is complete before the user clicks into the next operation. When the user clicks the next operation and it is exactly the predicted operation, everything is already prepared. Because the operation is prepared in advance, using the device's idle resources and the time the user spends entering information, the user's waiting time is greatly shortened compared with the original approach of waiting for the user to click the next operation before responding, and the resource utilization of the device is also improved.
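The decision of whether to prepare the next step ahead of time could be sketched like this. The function name, the `idle_ratio` input (a value between 0 and 1 that the caller is assumed to measure on the device) and the default threshold are hypothetical; the patent only states that the idle degree is compared with a set ratio.

```python
def next_step_to_prefetch(sequence, current_op, idle_ratio, threshold=0.5):
    """Return the predicted next operation to prepare in the background, or None."""
    if idle_ratio <= threshold:
        return None  # device is too busy, do not prefetch
    try:
        index = sequence.index(current_op)
    except ValueError:
        return None  # current operation is not part of the predicted sequence
    if index + 1 >= len(sequence):
        return None  # current operation is already the last one
    return sequence[index + 1]
```

The caller would then invoke the interface that corresponds to the returned operation and cache whatever data does not depend on manual user input.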
The multiple user behavior sequences predicted in advance can be stored on the user terminal and updated regularly. Because a user's first operation is usually login, the first user behavior sequence used is necessarily the one that starts from the login operation. When the user's next operation is no longer the next operation saved in this sequence, the sequence that starts from the current operation is read instead, which corrects the deviation in time.
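The correction step can be sketched as follows, assuming the stored sequences are kept in a dictionary keyed by their starting operation. The function name `resync` and the fallback for an operation with no stored sequence are assumptions made for the example.

```python
def resync(stored_sequences, active_sequence, position, actual_op):
    """Return the (sequence, position) to use after the user performed `actual_op`."""
    expected_index = position + 1
    if expected_index < len(active_sequence) and active_sequence[expected_index] == actual_op:
        return active_sequence, expected_index  # prediction was right, keep following it
    replacement = stored_sequences.get(actual_op)
    if replacement is None:
        return [actual_op], 0  # nothing stored for this starting point
    return replacement, 0  # switch to the sequence that starts at the actual operation
```

For instance, if the active sequence predicts "my_account" after "login" but the user opens "transfer" instead, the sequence stored under "transfer" becomes the active one.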
What has been described above is only an embodiment of the present invention; common knowledge such as well-known specific structures and characteristics is not described in detail here. A person of ordinary skill in the art knows all the ordinary technical knowledge in the technical field of the invention before the filing date or priority date, can learn all the prior art in the field, and is able to apply the routine experimental means available before that date. Under the teaching of this application, such a person can refine and implement this scheme in combination with their own abilities, and some typical known structures or known methods should not become obstacles to implementing this application. It should be noted that those skilled in the art can also make several modifications and improvements without departing from the structure of the invention; these should also be regarded as within the protection scope of the invention and do not affect the effect of implementing the invention or the practicality of the patent. The scope of protection claimed by this application is determined by the content of the claims, and the records in the description, such as the specific embodiment, may be used to interpret the content of the claims.

Claims (7)

1. A user behavior analysis method based on a Bayesian classification algorithm, characterized by comprising the following steps:
S1, collect user operation records;
S2, analyze the user operation records and calculate the occurrence probability of each user operation and the conditional probabilities between user operations, so as to train a Bayes classifier that judges the user's next operation from the current user operation;
S3, input the user's current operation into the Bayes classifier to judge the user's next operation;
S4, take the user's next operation as the user's current operation and repeat step S3 until the user's next operation is a designated end operation, thereby obtaining a user behavior sequence.
2. The user behavior analysis method based on a Bayesian classification algorithm according to claim 1, characterized in that: in step S1, the interfaces called when users use the application program are obtained through an application performance management system and are used to characterize the user operation records.
3. The user behavior analysis method based on a Bayesian classification algorithm according to claim 2, characterized in that: in step S2, the user operation records within the data of the same group are analyzed to obtain a user operation behavior queue, and the user operation probabilities are calculated from each operation in the queue and the operation that follows it.
4. The user behavior analysis method based on a Bayesian classification algorithm according to claim 1, characterized in that: the Bayesian algorithm is the Gaussian Bayesian algorithm.
5. The user behavior analysis method based on a Bayesian classification algorithm according to claim 1, characterized in that: the user operation behavior queue is stored in one or more of the MYSQL, HADOOP, HIVE and REDIS databases.
6. The user behavior analysis method based on a Bayesian classification algorithm according to claim 1, characterized in that: S2 further includes removing user operation records that are predefined as invalid.
7. The user behavior analysis method based on a Bayesian classification algorithm according to claim 1, characterized in that: S2 further includes removing user operation records that are predefined as invalid.
CN201810819701.6A 2018-07-24 2018-07-24 A kind of user behavior analysis method based on Bayesian Classification Arithmetic Pending CN109117873A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810819701.6A CN109117873A (en) 2018-07-24 2018-07-24 A kind of user behavior analysis method based on Bayesian Classification Arithmetic

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810819701.6A CN109117873A (en) 2018-07-24 2018-07-24 A kind of user behavior analysis method based on Bayesian Classification Arithmetic

Publications (1)

Publication Number Publication Date
CN109117873A true CN109117873A (en) 2019-01-01

Family

ID=64862983

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810819701.6A Pending CN109117873A (en) 2018-07-24 2018-07-24 A kind of user behavior analysis method based on Bayesian Classification Arithmetic

Country Status (1)

Country Link
CN (1) CN109117873A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117874A (en) * 2018-07-25 2019-01-01 北京小米移动软件有限公司 Operation behavior prediction technique and device
CN111797861A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Information processing method, information processing apparatus, storage medium, and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101008907A (en) * 2007-01-26 2007-08-01 清华大学 Load-aware IO performance optimization methods based on Bayesian decision
CN102087713A (en) * 2009-12-04 2011-06-08 索尼公司 Information processing device, information processing method, and program
CN102737037A (en) * 2011-04-07 2012-10-17 北京搜狗科技发展有限公司 Webpage pre-reading method, device and browser
CN105589914A (en) * 2015-07-20 2016-05-18 广州市动景计算机科技有限公司 Webpage pre-reading method and apparatus and intelligent terminal device
US20160379268A1 (en) * 2013-12-10 2016-12-29 Tencent Technology (Shenzhen) Company Limited User behavior data analysis method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
张国印等: "《基于贝叶斯网络的Android恶意行为检测方法》", 《计算机工程与应用》 *
黄文茜等: "《基于用户行为分析的智能终端应用管理优化》", 《计算机系统应用》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190101