CN109086816A - User behavior analysis system based on a Bayesian classification algorithm - Google Patents

User behavior analysis system based on a Bayesian classification algorithm

Info

Publication number
CN109086816A
CN109086816A CN201810821761.1A CN201810821761A CN109086816A CN 109086816 A CN109086816 A CN 109086816A CN 201810821761 A CN201810821761 A CN 201810821761A CN 109086816 A CN109086816 A CN 109086816A
Authority
CN
China
Prior art keywords
user
module
analysis system
probability
behavior analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810821761.1A
Other languages
Chinese (zh)
Inventor
杨斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Fumin Bank Co Ltd
Original Assignee
Chongqing Fumin Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Fumin Bank Co Ltd
Priority to CN201810821761.1A
Publication of CN109086816A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155: Bayesian classification

Abstract

The invention discloses a user behavior analysis system based on a Bayesian classification algorithm, comprising: a data acquisition module for collecting user operation records; a training module that calculates the occurrence probability of each user operation and the conditional probabilities between user operations, in order to train a Bayes classifier that judges the user's next operation from the current user operation; a prediction module that uses the Bayes classifier to judge the user's next operation from the input current operation; and a recurrence module that feeds the user's next operation back into the classification module as the current operation until the user's next operation is a designated end operation, saving each operation in turn to form a user behavior sequence. The system exploits the advantages of massive data samples and improves both the accuracy of the analysis results and the efficiency of the analysis.

Description

User behavior analysis system based on a Bayesian classification algorithm
Technical field
The present invention relates to the field of data analysis technology, and in particular to a user behavior analysis system based on a Bayesian classification algorithm.
Background
The service industry currently attaches great importance to optimizing service processes, and the results of user behavior analysis are often used as the basis for that optimization.
Traditionally, user behavior analysis has mostly been done manually, which is inefficient. When the volume of user behavior data is small, manual analysis can still meet demand, although it also tends to produce inaccurate results. With the development of e-commerce in China, however, more and more users receive services through convenient channels such as web pages and mobile apps, and their behavior data is easier to collect and record. As a result, both the sources and the volume of user behavior data have grown massively, and manual analysis cannot make good use of such large data samples. A user behavior analysis system is therefore needed that can take full advantage of massive data samples and improve both analysis efficiency and the accuracy of analysis results.
Summary of the invention
The invention aims to provide a user behavior analysis system that exploits the advantages of massive data samples and improves analysis efficiency and the accuracy of analysis results.
The user behavior analysis system based on a Bayesian classification algorithm of the present invention comprises:
A data acquisition module, for collecting user operation records;
A training module, which calculates the occurrence probability of each user operation and the conditional probabilities between user operations, in order to train a Bayes classifier that judges the user's next operation from the current user operation;
A prediction module, which uses the Bayes classifier to judge the user's next operation from the input current operation;
A recurrence module, which feeds the user's next operation back into the classification module as the current operation until the user's next operation is a designated end operation, and which saves in turn the operation obtained from each judgment to form a user behavior sequence.
In this scheme, known user operations serve as samples from which the occurrence probability of each user operation and the conditional probabilities between operations are calculated. These two kinds of probability are the basis on which the Bayes classifier predicts a user's next operation. "Training" in this patent means computing these probabilities over all samples to obtain the Bayes classifier. Once the current probabilities are known, the classifier can calculate the probability of every operation that may occur next, and it outputs the operation with the highest probability as its prediction. With the trained classifier, one can input a starting user operation and predict its next operation, then treat that next operation as the current operation and predict again, and so on. Prediction stops when the predicted next operation is one that can be regarded as an end. This yields a complete user behavior sequence. Choosing different start and end points yields multiple user behavior sequences, which support work that depends on user behavior prediction, such as service process optimization.
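One way to read the two kinds of probability, offered as a sketch rather than as the patent's stated formula: let $a$ be the current operation and $b$ a candidate next operation. Bayes' rule combines the occurrence probability $P(b)$ with the conditional probability $P(a \mid b)$, and the classifier outputs the candidate with the highest posterior. The symbols $a$, $b$, and $\hat{b}$ are notation introduced here and do not appear in the original text.

```latex
P(b \mid a) = \frac{P(a \mid b)\,P(b)}{\sum_{b'} P(a \mid b')\,P(b')},
\qquad
\hat{b} = \arg\max_{b} P(b \mid a)
```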
By training a Bayes classifier, the invention exploits the advantages of massive data samples and improves the accuracy of analysis results. The analysis can be carried out automatically by a computer system; the algorithm is simple and easy to implement and reduces manual intervention, which improves analysis efficiency.
Further, the data acquisition module is configured to obtain, from an application performance management system, the interfaces called when users use the application.
An application performance management (APM) system comprehensively monitors applications running online. In particular, it records the interface called by each user operation, and each interface corresponds to a specific operating function in the application. Identifying the called interface therefore reveals which operation the user performed, which makes this a convenient and efficient way to obtain user operation records.
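The interface-to-operation lookup described above might be sketched as follows. The interface paths, the record layout (a dict with an "interface" key), and the function name are all hypothetical; the original does not specify how the APM system exposes its call records.

```python
# Hypothetical mapping from APM-recorded interface paths to operation names.
INTERFACE_TO_OPERATION = {
    "/api/login": "login",
    "/api/account/summary": "my account",
    "/api/transfer": "transfer",
    "/api/logout": "exit",
}


def to_operation_records(apm_calls):
    """Turn raw APM call records into user operation records.

    Each call is a dict with a hypothetical "interface" key; calls to
    interfaces that are not in the mapping are skipped.
    """
    records = []
    for call in apm_calls:
        operation = INTERFACE_TO_OPERATION.get(call["interface"])
        if operation is not None:
            records.append({**call, "operation": operation})
    return records
```

Calls to interfaces with no user-facing meaning, such as health checks, simply drop out because they have no entry in the mapping.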
Further, the data acquisition module is also configured to obtain user operation behavior queues by analyzing the user operation records that belong to the same group;
The training module is configured to take each operation in a behavior queue together with the operation that follows it, and to calculate the occurrence probability of each user operation and the conditional probabilities between user operations.
This scheme needs consecutive operations by the same user in order to train the classifier more accurately. In an APM system, a user's consecutive interface calls are marked with the same group identifier (Group id), so user operation behavior queues obtained by group are reliable, which benefits classifier training.
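Grouping records into behavior queues could look like the sketch below. It assumes each record carries a "group_id", an "operation", and a sortable "timestamp"; the timestamp field is an assumption added here to order the operations within a group, since the original only mentions the Group id.

```python
from collections import defaultdict


def build_behavior_queues(records):
    """Group operation records by group id and order each group by time.

    Each record is a dict with hypothetical keys "group_id", "timestamp",
    and "operation". Returns {group_id: [operation, ...]}.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record["group_id"]].append(record)
    return {
        group_id: [r["operation"] for r in sorted(rows, key=lambda r: r["timestamp"])]
        for group_id, rows in grouped.items()
    }
```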
Further, the Bayes classifier is a Bayes classifier that uses the Gaussian Bayes algorithm.
The Gaussian Bayes algorithm, also called the Gaussian form of the Bayes algorithm, predicts continuous probability values more accurately and is well suited to the application scenario of the invention.
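For reference, Gaussian naive Bayes in its standard textbook form models a continuous feature $x$ within class $c$ with a normal density. The original text does not say which features are treated as continuous, so the symbols below are generic, with $\mu_c$ and $\sigma_c^{2}$ denoting the mean and variance of $x$ estimated from the samples of class $c$.

```latex
p(x \mid c) = \frac{1}{\sqrt{2\pi\sigma_c^{2}}}\,
\exp\!\left(-\frac{(x-\mu_c)^{2}}{2\sigma_c^{2}}\right)
```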
Further, the system includes a database module for storing the user operation behavior queues in one or more of the MySQL, Hadoop, Hive, and Redis data formats.
An APM system usually stores data in its own database form, often a non-relational database such as ES. Transferring the data to one or more relational-type databases among MySQL, Hadoop, Hive, and Redis makes it easier to retrieve the data for computation.
Further, the data acquisition module is also configured to remove user operation records that have been predefined as invalid.
Some abnormal user operations are often collected as well, for example a user leaving right after logging in. Such operations interfere with accurate probability calculation; removing them before the probabilities are calculated yields a more accurate classifier.
Further, the data acquisition module is also configured to collect user operation records continuously;
The training module is also configured to update the Bayes classifier using the expanded set of user operation records.
As the volume of data grows, the predicted probabilities become more accurate.
Brief description of the drawings
Fig. 1 is a schematic block diagram of the user behavior analysis system based on a Bayesian classification algorithm in an embodiment of the present invention.
Detailed description of the embodiments
The invention is described in further detail below through specific embodiments:
The user behavior analysis system based on a Bayesian classification algorithm in this embodiment is essentially as shown by the part inside the dashed box in Fig. 1, and comprises a data acquisition module, a database module, a training module, a prediction module, and a recurrence module.
The specific working process of this embodiment is as follows:
First, the data acquisition module obtains from the application performance management system the interfaces called when users use the application, which characterize the user operation records. An application performance management (APM) system comprehensively monitors applications running online. In particular, it records the interface called by each user operation, and each interface corresponds to a specific operating function in the application. Identifying the called interface therefore reveals which operation the user performed, which makes this a convenient and efficient way to obtain user operation records.
User operation records are extracted from the ES database of the APM system, and user operation behavior queues are obtained by analyzing the records that belong to the same group. This embodiment needs consecutive operations by the same user in order to train the classifier more accurately. In the APM system, a user's consecutive interface calls are marked with the same group identifier (Group id), so behavior queues obtained by group are reliable, which benefits classifier training.
After the data acquisition module has obtained all user operation behavior queues, it first removes the user operation records predefined as invalid. Some abnormal user operations are often collected as well, for example a user leaving right after logging in. Such operations interfere with accurate probability calculation; removing them before the probabilities are calculated yields a more accurate classifier.
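The cleaning step can be sketched as a filter over behavior queues. The rule below, which treats a queue containing only a login (optionally followed by an exit) as invalid, is a hypothetical reading of the "leaving right after logging in" example; the original does not define the invalidity rules.

```python
def remove_invalid_queues(queues, is_invalid):
    """Drop behavior queues that match a predefined invalidity rule."""
    return [queue for queue in queues if not is_invalid(queue)]


def login_then_leave(queue):
    """Hypothetical rule: the user logged in and did nothing else."""
    return queue in (["login"], ["login", "exit"])
```

Passing the rule in as a function keeps the filter independent of any particular definition of "invalid".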
The processed user operation behavior queues are converted into a relational data format and stored in a relational database; the database module may use one or more of MySQL, Hadoop, Hive, and Redis.
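A minimal storage sketch follows. It uses Python's built-in `sqlite3` purely as a stand-in for the databases named above, and the table and column names (`behavior_queue`, `group_id`, `position`, `operation`) are hypothetical. Each queue is flattened into ordered rows so that it can be read back in sequence.

```python
import sqlite3


def store_queues(conn, queues):
    """Store each queue as ordered rows of (group_id, position, operation)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS behavior_queue ("
        "group_id TEXT, position INTEGER, operation TEXT, "
        "PRIMARY KEY (group_id, position))"
    )
    rows = [
        (group_id, position, operation)
        for group_id, operations in queues.items()
        for position, operation in enumerate(operations)
    ]
    conn.executemany("INSERT OR REPLACE INTO behavior_queue VALUES (?, ?, ?)", rows)
    conn.commit()


def load_queue(conn, group_id):
    """Read one queue back in its original order."""
    cursor = conn.execute(
        "SELECT operation FROM behavior_queue WHERE group_id = ? ORDER BY position",
        (group_id,),
    )
    return [row[0] for row in cursor]
```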
The training module uses these known operations as samples to calculate the occurrence probability of each user operation and the conditional probabilities between user operations. These two kinds of probability are the basis on which the Bayes classifier predicts a user's next operation. Computing the probabilities over all samples yields the Bayes classifier, which, once the current probabilities are known, can calculate the probability of every operation that may occur next and outputs the most probable one as its prediction.
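The probability calculation can be illustrated with a simplified frequency-count sketch. This is not the Gaussian variant used in the embodiment: it assumes the only feature is the current operation, in which case the posterior for the next operation reduces to normalized transition counts. The function names `train` and `predict_next` are introduced here for illustration.

```python
from collections import Counter, defaultdict


def train(queues):
    """Estimate occurrence and conditional probabilities from behavior queues.

    Returns (occurrence, conditional) where occurrence[op] is the share of all
    recorded operations that are op, and conditional[cur][nxt] is the share of
    transitions out of cur that go to nxt.
    """
    op_counts = Counter()
    pair_counts = defaultdict(Counter)
    for queue in queues:
        op_counts.update(queue)
        for current, following in zip(queue, queue[1:]):
            pair_counts[current][following] += 1

    total = sum(op_counts.values())
    occurrence = {op: n / total for op, n in op_counts.items()}
    conditional = {
        current: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for current, nexts in pair_counts.items()
    }
    return occurrence, conditional


def predict_next(conditional, current):
    """Return the most probable next operation, or None if current is unseen."""
    candidates = conditional.get(current)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

An operation that never appears as a "current" operation in the samples, such as a terminal exit, has no outgoing transitions, so `predict_next` returns `None` for it.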
The Bayes algorithm in this embodiment is the Gaussian Bayes algorithm.
The Gaussian Bayes algorithm, also called the Gaussian form of the Bayes algorithm, predicts continuous probability values more accurately and is well suited to the application scenario of the invention.
At prediction time, the prediction module feeds the received current user operation into the trained Bayes classifier. The Bayes classifier judges the user's next operation according to the probabilities and sends it to the recurrence module. The recurrence module takes this next operation as the user's current operation, feeds it back into the prediction module, and saves the operation. When the user's next operation is the designated end operation, the recurrence module outputs the operations saved in order as the user behavior sequence.
In this embodiment, known user behavior serves as the sample from which the occurrence probability of each user operation and the conditional probabilities between operations are calculated. These two kinds of probability are the basis on which the Bayes classifier predicts a user's next operation. Computing the probabilities over all samples yields the Bayes classifier, which, once the current probabilities are known, can calculate the probability of every operation that may occur next and outputs the most probable one as its prediction. With the trained classifier, one can input a starting user operation and predict its next operation, then treat that next operation as the current operation and predict again, and so on. Prediction stops when the predicted next operation is one that can be regarded as an end. This yields a complete user behavior sequence. Choosing different start and end points yields multiple user behavior sequences, which support work that depends on user behavior prediction, such as service process optimization.
For example, take a mobile banking app. Starting from the user's "login" operation, the next operation is predicted to be "my account", the one after that "transfer", and the one after that "exit". Since "exit" is the end operation designated in advance for this run, the behavior sequence "login" → view "my account" → "transfer" → "exit" is obtained. As described above, these operations are defined by the specific interfaces they call.
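The recurrence loop behind the mobile banking example can be sketched as follows. The predictor is passed in as a plain function, and the lookup table here is a hypothetical stand-in for the trained classifier. The `max_steps` guard against cycles is an addition of this sketch and is not part of the original description.

```python
def predict_sequence(next_operation, start, end_operations, max_steps=50):
    """Chain next-operation predictions until an end operation is reached.

    next_operation: callable mapping the current operation to the predicted
    next one (or None if there is no prediction).
    max_steps guards against cycles; it is an addition of this sketch.
    """
    sequence = [start]
    current = start
    for _ in range(max_steps):
        if current in end_operations:
            break
        following = next_operation(current)
        if following is None:
            break
        sequence.append(following)
        current = following
    return sequence


# Hypothetical predictions mirroring the mobile banking example.
BANKING_PREDICTIONS = {
    "login": "my account",
    "my account": "transfer",
    "transfer": "exit",
}

banking_sequence = predict_sequence(BANKING_PREDICTIONS.get, "login", {"exit"})
```

With these stand-in predictions, `banking_sequence` holds the four operations of the example in order.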
As the application continues to be used, the data acquisition module also needs to keep collecting user operation records, and the training module updates the Bayes classifier with the expanded set of user operation records. As the data volume grows, the predicted probabilities become more accurate. Setting an update interval ensures that each prediction is made with the latest classifier.
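One way to update the classifier without recomputing everything is to keep running counts and fold in each new batch of queues. The sketch below assumes the simplified count-based model rather than the Gaussian variant, and the class name `TransitionCounts` is hypothetical.

```python
from collections import Counter, defaultdict


class TransitionCounts:
    """Running transition counts that can absorb newly collected queues."""

    def __init__(self):
        self.pair_counts = defaultdict(Counter)

    def update(self, queues):
        """Add the transitions found in newly collected behavior queues."""
        for queue in queues:
            for current, following in zip(queue, queue[1:]):
                self.pair_counts[current][following] += 1

    def probability(self, current, following):
        """Current estimate of P(following | current); 0.0 if current is unseen."""
        nexts = self.pair_counts.get(current)
        if not nexts:
            return 0.0
        return nexts[following] / sum(nexts.values())
```

A scheduled job could call `update` at the chosen update interval with the queues collected since the previous run.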
In other embodiments of the invention, the prediction results are applied while the user is operating the application, and the system further includes a pre-operation module located on the client. Take the mobile banking app again as an example: after the user selects an operation, relevant information such as account details must be filled in. At this point the pre-operation module looks up the previously predicted user behavior sequence and reads the resource idle status of the current device. If the idle degree of the device's resources is higher than a set ratio, the module calls the interface of the next step directly in the background and reads the required data in advance, that is, the data that does not have to be entered manually by the user, so that preparation is complete before the user clicks through to the next operation. When the user clicks the next operation and it is exactly the predicted one, all preparation is already done. Because this pre-operation uses the device's idle resources and the time the user spends entering information, it greatly shortens the user's waiting time compared with the original approach of waiting for the user to click the next operation before responding, and it also improves the device's resource utilization.
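The pre-operation decision can be sketched as a single check. How the idle ratio is measured and how the next interface is called are left to the caller; the parameter names `idle_ratio`, `idle_threshold`, and `fetch` are hypothetical.

```python
def maybe_prefetch(sequence, current, idle_ratio, idle_threshold, fetch):
    """Prefetch data for the predicted next operation when the device is idle.

    sequence: predicted user behavior sequence (list of operation names).
    idle_ratio: share of device resources currently idle, between 0 and 1.
    fetch: callable that calls the next step's interface and returns its data.
    Returns (next_operation, data), or None when nothing was prefetched.
    """
    if idle_ratio <= idle_threshold or current not in sequence:
        return None
    index = sequence.index(current)
    if index + 1 >= len(sequence):
        return None
    following = sequence[index + 1]
    return following, fetch(following)
```

If the user then selects an operation other than the one returned here, the prefetched data is simply discarded.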
Multiple user behavior sequences predicted in advance can be stored in the pre-operation module on the user terminal and updated periodically. Since a user's first operation is usually login, the first user behavior sequence used is necessarily the one that starts with the login operation. When the user's next operation is no longer the next operation stored in that sequence, the module reads the sequence that starts with the current operation instead, so that deviations are corrected in time.
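The deviation-correction behavior can be sketched as a small tracker that follows the active sequence and re-reads a new one when the user departs from it. The class name and the dictionary keyed by starting operation are hypothetical; falling back to a one-element sequence when no stored sequence starts with the observed operation is also a choice of this sketch.

```python
class SequenceTracker:
    """Follow a stored behavior sequence and switch when the user deviates."""

    def __init__(self, sequences_by_start, start="login"):
        # sequences_by_start maps a starting operation to its predicted sequence.
        self.sequences_by_start = sequences_by_start
        self.sequence = sequences_by_start.get(start, [start])
        self.position = 0

    def expected_next(self):
        """Operation the active sequence predicts next, or None at its end."""
        if self.position + 1 < len(self.sequence):
            return self.sequence[self.position + 1]
        return None

    def observe(self, operation):
        """Record the user's actual operation, re-reading a sequence on deviation."""
        if operation == self.expected_next():
            self.position += 1
        else:
            self.sequence = self.sequences_by_start.get(operation, [operation])
            self.position = 0
```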
The above is only an embodiment of the present invention. Common knowledge such as well-known specific structures and characteristics of the scheme is not described in detail here. A person of ordinary skill in the art knows all the ordinary technical knowledge in the technical field of the invention before the filing date or priority date, has access to all the prior art in that field, and is able to apply the routine experimental means available before that date. Guided by this application, such a person can refine and implement the scheme using their own abilities, and some typical well-known structures or methods should not be an obstacle to implementing this application. It should be noted that a person skilled in the art can make several modifications and improvements without departing from the structure of the invention; these should also be regarded as falling within the protection scope of the invention, and they affect neither the effect of implementing the invention nor the practicability of the patent. The scope of protection claimed by this application is determined by the content of the claims, and the specific embodiments and other records in the description may be used to interpret the content of the claims.

Claims (7)

1. A user behavior analysis system based on a Bayesian classification algorithm, characterized in that it comprises:
A data acquisition module, for collecting user operation records;
A training module, which calculates the occurrence probability of each user operation and the conditional probabilities between user operations, in order to train a Bayes classifier that judges the user's next operation from the current user operation;
A prediction module, which uses the Bayes classifier to judge the user's next operation from the input current operation;
A recurrence module, which feeds the user's next operation back into the classification module as the current operation until the user's next operation is a designated end operation, and which saves in turn the operation obtained from each judgment to form a user behavior sequence.
2. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the data acquisition module is configured to obtain, from an application performance management system, the interfaces called when users use the application, in order to characterize user operation records.
3. The user behavior analysis system based on a Bayesian classification algorithm according to claim 2, characterized in that: the data acquisition module is also configured to obtain user operation behavior queues by analyzing the user operation records that belong to the same group;
The training module is configured to take each operation in a behavior queue together with the operation that follows it, and to calculate the occurrence probability of each user operation and the conditional probabilities between user operations.
4. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the Bayes classifier is a Bayes classifier that uses the Gaussian Bayes algorithm.
5. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: it further comprises a database module for storing the user operation behavior queues in one or more of the MySQL, Hadoop, Hive, and Redis data formats.
6. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the data acquisition module is also configured to remove user operation records that have been predefined as invalid.
7. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the data acquisition module is also configured to collect user operation records continuously;
The training module is also configured to update the Bayes classifier using the expanded set of user operation records.
CN201810821761.1A 2018-07-24 2018-07-24 A kind of user behavior analysis system based on Bayesian Classification Arithmetic Pending CN109086816A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810821761.1A CN109086816A (en) 2018-07-24 2018-07-24 A kind of user behavior analysis system based on Bayesian Classification Arithmetic

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810821761.1A CN109086816A (en) 2018-07-24 2018-07-24 A kind of user behavior analysis system based on Bayesian Classification Arithmetic

Publications (1)

Publication Number Publication Date
CN109086816A (en) 2018-12-25

Family

ID=64838320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810821761.1A Pending CN109086816A (en) 2018-07-24 2018-07-24 A kind of user behavior analysis system based on Bayesian Classification Arithmetic

Country Status (1)

Country Link
CN (1) CN109086816A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117874A (en) * 2018-07-25 2019-01-01 北京小米移动软件有限公司 Operation behavior prediction technique and device
CN111737101A (en) * 2020-06-24 2020-10-02 平安科技(深圳)有限公司 User behavior monitoring method, device, equipment and medium based on big data
CN111797861A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Information processing method, information processing apparatus, storage medium, and electronic device
CN113836370A (en) * 2021-11-25 2021-12-24 上海观安信息技术股份有限公司 User group classification method and device, storage medium and computer equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101008907A (en) * 2007-01-26 2007-08-01 清华大学 Load-aware IO performance optimization methods based on Bayesian decision
CN102087713A (en) * 2009-12-04 2011-06-08 索尼公司 Information processing device, information processing method, and program
CN102737037A (en) * 2011-04-07 2012-10-17 北京搜狗科技发展有限公司 Webpage pre-reading method, device and browser
CN105589914A (en) * 2015-07-20 2016-05-18 广州市动景计算机科技有限公司 Webpage pre-reading method and apparatus and intelligent terminal device
US20160379268A1 (en) * 2013-12-10 2016-12-29 Tencent Technology (Shenzhen) Company Limited User behavior data analysis method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
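张国印 et al.: "Android malicious behavior detection method based on Bayesian networks", Computer Engineering and Applications (《计算机工程与应用》) *
黄文茜 et al.: "Intelligent terminal application management optimization based on user behavior analysis", Computer Systems & Applications (《计算机系统应用》) *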

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117874A (en) * 2018-07-25 2019-01-01 北京小米移动软件有限公司 Operation behavior prediction technique and device
CN111797861A (en) * 2019-04-09 2020-10-20 Oppo广东移动通信有限公司 Information processing method, information processing apparatus, storage medium, and electronic device
CN111737101A (en) * 2020-06-24 2020-10-02 平安科技(深圳)有限公司 User behavior monitoring method, device, equipment and medium based on big data
CN111737101B (en) * 2020-06-24 2022-05-03 平安科技(深圳)有限公司 User behavior monitoring method, device, equipment and medium based on big data
CN113836370A (en) * 2021-11-25 2021-12-24 上海观安信息技术股份有限公司 User group classification method and device, storage medium and computer equipment
CN113836370B (en) * 2021-11-25 2022-03-01 上海观安信息技术股份有限公司 User group classification method and device, storage medium and computer equipment

Similar Documents

Publication Publication Date Title
CN109086816A (en) A kind of user behavior analysis system based on Bayesian Classification Arithmetic
US6542881B1 (en) System and method for revealing necessary and sufficient conditions for database analysis
CN108959004B (en) Disk failure prediction method, device, equipment and computer readable storage medium
CN110806954B (en) Method, device, equipment and storage medium for evaluating cloud host resources
CN107247811B (en) SQL statement performance optimization method and device based on Oracle database
CN109461023B (en) Loss user retrieval method and device, electronic equipment and storage medium
CN113268403B (en) Time series analysis and prediction method, device, equipment and storage medium
CN110399377A (en) Optimization method, device, electronic equipment and the computer readable storage medium of SQL
CN104391879A (en) Method and device for hierarchical clustering
CN117391292A (en) Carbon emission energy-saving management analysis system and method
CN110232130B (en) Metadata management pedigree generation method, apparatus, computer device and storage medium
CN111639902A (en) Data auditing method based on kafka, control device, computer equipment and storage medium
CN109117873A (en) A kind of user behavior analysis method based on Bayesian Classification Arithmetic
CN112765463B (en) Data management method for big data and user requirements and cloud computing server
CN110602207A (en) Method, device, server and storage medium for predicting push information based on off-network
CN112883066A (en) Multidimensional range query cardinality estimation method on database
CN109101395A (en) A kind of High Performance Computing Cluster application monitoring method and system based on LSTM
CN115174686B (en) Method and device for dynamically adjusting weights of multiple service channels based on service efficiency
CN111090585A (en) Crowd-sourcing task closing time automatic prediction method based on crowd-sourcing process
CN110502495A (en) A kind of log collecting method and device of application server
CN112396313B (en) Method for optimizing telephone sales performance by using smart watch
CN115309638A (en) Method and device for assisting model optimization
CN114116908A (en) Data management method and device and electronic equipment
CN112800035A (en) GIS (geographic information System) -based power grid data communication sharing system
CN105897503A (en) Hadoop cluster bottleneck detection algorithm based on resource information gain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181225
