CN109086816A - User behavior analysis system based on a Bayesian classification algorithm - Google Patents
- Publication number
- CN109086816A (application number CN201810821761.1A)
- Authority
- CN
- China
- Prior art keywords
- user
- module
- analysis system
- probability
- behavior analysis
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/24155—Bayesian classification
Abstract
The invention discloses a user behavior analysis system based on a Bayesian classification algorithm, comprising: a data acquisition module for collecting user operation records; a training module that calculates the occurrence probability of each user operation and the conditional probabilities between user operations, in order to train a Bayesian classifier that predicts the user's next operation from the current user operation; a prediction module that uses the Bayesian classifier to predict the user's next operation from the input current operation; and a recursion module that inputs the user's next operation into the classification module as the current operation until the user's next operation is a designated end operation, and that saves each operation in order to form a user behavior sequence. The system takes advantage of large data samples and thereby improves both the accuracy and the efficiency of the analysis.
Description
Technical field
The present invention relates to the technical field of data analysis, and in particular to a user behavior analysis system based on a Bayesian classification algorithm.
Background art
The service industry currently attaches great importance to optimizing service processes, and the results of user behavior analysis are often used as the basis for that optimization.
Traditionally, most user behavior analysis has been done manually, which is inefficient. When the volume of user behavior data is small, manual analysis can still meet demand, although it also brings the problem of inaccurate results. With the development of e-commerce in China, however, more and more users receive services through convenient channels such as web pages and mobile apps, where their behavior data is easier to collect and record. As a result, both the sources and the volume of user behavior data have grown enormously, and manual analysis cannot make good use of such large data samples. A user behavior analysis system is therefore needed that takes full advantage of large data samples and improves both analysis efficiency and the accuracy of the analysis results.
Summary of the invention
The invention aims to provide a user behavior analysis system that takes advantage of large data samples and improves analysis efficiency and the accuracy of the analysis results.
The user behavior analysis system based on a Bayesian classification algorithm according to the invention comprises:
a data acquisition module for collecting user operation records;
a training module that calculates the occurrence probability of each user operation and the conditional probabilities between user operations, in order to train a Bayesian classifier that predicts the user's next operation from the current user operation;
a prediction module that uses the Bayesian classifier to predict the user's next operation from the input current operation;
a recursion module that inputs the user's next operation into the classification module as the current operation until the user's next operation is a designated end operation, and that saves, in order, the operation obtained from each prediction to form a user behavior sequence.
In this scheme, known user operation behavior serves as the sample from which the occurrence probability of each user operation and the conditional probabilities between operations are calculated. These two kinds of probability are the basis on which the Bayesian classifier predicts the user's next operation. "Training" in this patent means computing these probabilities over all samples to obtain the Bayesian classifier. Given the probabilities for the current operation, the classifier calculates the probability of every operation that may occur next and outputs the one with the highest probability as its prediction. With the trained classifier, a starting user operation can be input and its next operation predicted; that next operation is then treated as the current operation and predicted from in turn, and so on. Prediction stops when the predicted next operation is one that can be regarded as an end, which yields a complete user behavior sequence. Choosing different start and end points yields multiple user behavior sequences, which support work that depends on user behavior prediction, such as service process optimization.
By training a Bayesian classifier, the invention takes advantage of large data samples and improves the accuracy of the analysis. The analysis can be executed automatically by a computer system, the algorithm is simple and easy to implement, and manual intervention is reduced, which improves analysis efficiency.
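The prediction rule described above can be written compactly. The patent gives no formulas, so the notation here is an assumption introduced for illustration: $a$ is the current operation, $b$ is a candidate next operation, $P(b)$ is the occurrence probability of $b$, and $P(a \mid b)$ is the conditional probability that $a$ immediately precedes $b$.

```latex
\hat{b} \;=\; \arg\max_{b} P(b \mid a)
        \;=\; \arg\max_{b} \frac{P(a \mid b)\,P(b)}{P(a)}
        \;=\; \arg\max_{b} P(a \mid b)\,P(b)
```

The denominator $P(a)$ is the same for every candidate $b$, so it can be dropped when only the most probable next operation is needed.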
Further, the data acquisition module is configured to obtain, from an application performance management system, the interfaces called when the user uses the application program.
An application performance management (APM) system comprehensively monitors applications running online and, in particular, records the interface called by each user operation. Each interface corresponds to a specific operating function in the application, so identifying the called interface reveals which operation the user performed. This is a convenient and efficient way to obtain user operation records.
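As a minimal sketch of this idea, the snippet below translates interface-call records into named operations. The interface paths, the record fields (`interface`, `timestamp`) and the function name are hypothetical; the patent does not describe the APM record layout.

```python
# Hypothetical mapping from called interface paths to named operations.
INTERFACE_TO_OPERATION = {
    "/api/login": "login",
    "/api/account": "my_account",
    "/api/transfer": "transfer",
    "/api/logout": "exit",
}


def to_operation_records(call_records):
    """Translate interface-call records into user operation records.

    Calls to interfaces that have no known operation are skipped.
    """
    operations = []
    for record in call_records:
        operation = INTERFACE_TO_OPERATION.get(record["interface"])
        if operation is not None:
            operations.append(
                {"timestamp": record["timestamp"], "operation": operation}
            )
    return operations


# Illustrative input; "/api/health" stands for a call with no user operation.
sample_calls = [
    {"timestamp": 1, "interface": "/api/login"},
    {"timestamp": 2, "interface": "/api/health"},
    {"timestamp": 3, "interface": "/api/transfer"},
]
sample_operations = to_operation_records(sample_calls)
```

Keeping the mapping in a plain dictionary makes it easy to extend when the application gains new interfaces.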
Further, the data acquisition module is also configured to obtain user operation behavior queues by analyzing the user operation records within the same group of data;
the training module is configured to take each operation in the behavior queue together with its next operation, and to calculate the occurrence probability of each user operation and the conditional probabilities between user operations.
This scheme needs to collect multiple consecutive operations of a user in order to train the classifier more accurately. In an APM system, a user's consecutive interface calls can be identified by a shared group identifier (Group id), so user operation behavior queues obtained by group are reliable and benefit the training of the classifier.
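Grouping by group identifier might look like the following sketch. The field names `group_id`, `timestamp` and `operation` are assumptions, as is the choice to order each queue by timestamp.

```python
from collections import defaultdict


def build_behavior_queues(operation_records):
    """Group operation records by group ID and order each group by time."""
    grouped = defaultdict(list)
    for record in operation_records:
        grouped[record["group_id"]].append(record)

    queues = {}
    for group_id, records in grouped.items():
        # Assumption: the timestamp gives the order of operations in a group.
        records.sort(key=lambda r: r["timestamp"])
        queues[group_id] = [r["operation"] for r in records]
    return queues


records = [
    {"group_id": "g1", "timestamp": 2, "operation": "my_account"},
    {"group_id": "g2", "timestamp": 1, "operation": "login"},
    {"group_id": "g1", "timestamp": 1, "operation": "login"},
    {"group_id": "g2", "timestamp": 5, "operation": "exit"},
]
queues = build_behavior_queues(records)
```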
Further, the Bayesian classifier is a Bayesian classifier using the Gaussian Bayes algorithm.
The Gaussian Bayes algorithm, also called the Gaussian form of the Bayes algorithm, predicts continuous probability values more accurately and is therefore suited to the application scenario of the invention.
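For reference, the standard Gaussian naive Bayes model fits a mean and a variance per class for each continuous feature. The patent does not state which features are fed to the classifier, so the symbols below are generic notation rather than the patent's own: $x_i$ is the $i$-th feature, $c$ a class (here, a candidate next operation), and $\mu_{c,i}$, $\sigma_{c,i}^{2}$ the per-class mean and variance.

```latex
P(x_i \mid c) = \frac{1}{\sqrt{2\pi\sigma_{c,i}^{2}}}
  \exp\!\left(-\frac{(x_i - \mu_{c,i})^{2}}{2\sigma_{c,i}^{2}}\right),
\qquad
\hat{c} = \arg\max_{c}\; P(c)\prod_{i} P(x_i \mid c)
```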
It further, further include database module, for passing through MYSQL, HADOOP to the user's operation behavior queue,
One of HIVE, REDIS or a variety of data formats are stored.
APM systems usually use their own database format, often a non-relational database such as ES. Transferring the data into one or more of the relational-type databases MYSQL, HADOOP, HIVE and REDIS makes it easier to retrieve for calculation.
Further, the data acquisition module is also configured to remove user operation records that are predefined as invalid.
Some abnormal user operations also tend to be collected, for example logging in and then leaving immediately. Such operations hinder the accurate calculation of probabilities; removing them before the probability calculation yields a more accurate classifier.
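A simple way to apply such a filter is sketched below. How "invalid" is defined is not specified in the patent; here it is assumed to be a predefined set of whole queues, with login followed directly by exit as the only entry.

```python
# Hypothetical definition of "invalid": queues that match a predefined pattern.
INVALID_QUEUES = {("login", "exit")}


def remove_invalid_queues(queues, invalid=INVALID_QUEUES):
    """Drop behavior queues that are predefined as invalid."""
    return [queue for queue in queues if tuple(queue) not in invalid]


raw_queues = [
    ["login", "exit"],
    ["login", "my_account", "transfer", "exit"],
]
cleaned_queues = remove_invalid_queues(raw_queues)
```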
Further, the data acquisition module is also configured to collect user operation records continuously;
the training module is also configured to update the Bayesian classifier using the expanded user operation records.
As the data volume increases, the predicted probabilities become more accurate.
Brief description of the drawings
Fig. 1 is a schematic block diagram of the user behavior analysis system based on a Bayesian classification algorithm in an embodiment of the invention.
Specific embodiment
The invention is further described below through a specific embodiment:
The user behavior analysis system based on a Bayesian classification algorithm in this embodiment is essentially the part shown inside the dashed box in Fig. 1, and comprises a data acquisition module, a database module, a training module, a prediction module and a recursion module.
The specific working process of this embodiment is as follows:
First, the data acquisition module obtains from the application performance management system the interfaces called when the user uses the application program, which characterize the user operation records. The application performance management (APM) system comprehensively monitors applications running online and, in particular, records the interface called by each user operation. Each interface corresponds to a specific operating function in the application, so identifying the called interface reveals which operation the user performed. This is a convenient and efficient way to obtain user operation records.
User operation records are extracted from the ES database of the APM system, and user operation behavior queues are obtained by analyzing the user operation records within the same group of data. This embodiment needs to collect multiple consecutive operations of a user in order to train the classifier more accurately. In the APM system, a user's consecutive interface calls can be identified by a shared group identifier (Group id), so user operation behavior queues obtained by group are reliable and benefit the training of the classifier.
After the data acquisition module has obtained all user operation behavior queues, it first removes the user operation records predefined as invalid. Some abnormal user operations also tend to be collected, for example logging in and then leaving immediately. Such operations hinder the accurate calculation of probabilities; removing them before the probability calculation yields a more accurate classifier.
The processed user operation behavior queues are converted into a relational data format and stored in a relational database; the database module may use one or more of the MYSQL, HADOOP, HIVE and REDIS databases.
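The conversion to a relational layout could be sketched as follows. Python's built-in `sqlite3` module is used here only as a stand-in so the example is self-contained; the patent names MYSQL, HADOOP, HIVE and REDIS, and the table and column names below are hypothetical.

```python
import sqlite3


def store_queues(connection, queues):
    """Store each behavior queue as ordered rows of one relational table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS behavior_queue ("
        "group_id TEXT NOT NULL, "
        "position INTEGER NOT NULL, "
        "operation TEXT NOT NULL, "
        "PRIMARY KEY (group_id, position))"
    )
    rows = [
        (group_id, position, operation)
        for group_id, operations in queues.items()
        for position, operation in enumerate(operations)
    ]
    connection.executemany(
        "INSERT INTO behavior_queue VALUES (?, ?, ?)", rows
    )
    connection.commit()


def load_queue(connection, group_id):
    """Read one behavior queue back in its original order."""
    cursor = connection.execute(
        "SELECT operation FROM behavior_queue "
        "WHERE group_id = ? ORDER BY position",
        (group_id,),
    )
    return [row[0] for row in cursor]


connection = sqlite3.connect(":memory:")
store_queues(connection, {"g1": ["login", "my_account", "exit"]})
restored = load_queue(connection, "g1")
```

Storing the position explicitly keeps the order of operations recoverable, which the training step depends on.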
The training module uses these known operations as samples and calculates the occurrence probability of each user operation and the conditional probabilities between user operations. These two kinds of probability are the basis on which the Bayesian classifier predicts the user's next operation. The Bayesian classifier is obtained by computing these probabilities over all samples. Given the probabilities for the current operation, the classifier calculates the probability of every operation that may occur next and outputs the one with the highest probability as its prediction.
The Bayes algorithm in this embodiment is the Gaussian Bayes algorithm.
The Gaussian Bayes algorithm, also called the Gaussian form of the Bayes algorithm, predicts continuous probability values more accurately and is therefore suited to the application scenario of the invention.
At prediction time, the prediction module inputs the received current user operation into the trained Bayesian classifier. The Bayesian classifier determines the user's next operation according to the probabilities and sends it to the recursion module. The recursion module takes this next operation as the user's current operation, inputs it into the prediction module again, and saves the operation. When the user's next operation is the designated end operation, the recursion module outputs the operations saved in order as the user behavior sequence.
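The loop formed by the prediction and recursion modules can be sketched as below. The `max_steps` guard is an added assumption, not part of the patent; it prevents an endless loop if the predictions cycle without reaching an end operation. The lookup table stands in for a trained classifier.

```python
def predict_sequence(predict_next, start, end_operations, max_steps=50):
    """Repeatedly predict the next operation until an end operation appears.

    predict_next: callable mapping the current operation to the predicted
    next operation, or None when no prediction is available.
    """
    sequence = [start]
    current = start
    for _ in range(max_steps):
        following = predict_next(current)
        if following is None:
            break
        sequence.append(following)
        if following in end_operations:
            break
        # Feed the prediction back in as the new current operation.
        current = following
    return sequence


# Hypothetical stand-in for a trained classifier.
most_likely_next = {
    "login": "my_account",
    "my_account": "transfer",
    "transfer": "exit",
}
sequence = predict_sequence(most_likely_next.get, "login", {"exit"})
```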
In this embodiment, known user behavior serves as the sample from which the occurrence probability of each user operation and the conditional probabilities between operations are calculated. These two kinds of probability are the basis on which the Bayesian classifier predicts the user's next operation, and the Bayesian classifier is obtained by computing them over all samples. Given the probabilities for the current operation, the classifier calculates the probability of every operation that may occur next and outputs the one with the highest probability as its prediction. With the trained classifier, a starting user operation can be input and its next operation predicted; that next operation is then treated as the current operation and predicted from in turn, and so on. Prediction stops when the predicted next operation is one that can be regarded as an end, which yields a complete user behavior sequence. Choosing different start and end points yields multiple user behavior sequences, which support work that depends on user behavior prediction, such as service process optimization.
For example, in a mobile banking app, starting from the user's "login" operation, the next operation is predicted to be "my account", the one after that "transfer", and the one after that "exit". Since "exit" is the end point designated in advance for this prediction, the behavior sequence "login" → view "my account" → "transfer" → "exit" is obtained. As described above, these operations are defined according to the specific interfaces they call.
As the application program continues to be used, the data acquisition module also needs to keep collecting user operation records, and the training module updates the Bayesian classifier using the expanded user operation records. As the data volume increases, the predicted probabilities also become more accurate. Setting an update interval ensures that every prediction is made with the most recently updated classifier.
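One way to update the classifier without recomputing everything is to keep the raw transition counts and add new queues to them, as in this sketch. The class name and its methods are hypothetical, and the count-based model is a simplification of the Gaussian classifier named in the embodiment.

```python
from collections import Counter, defaultdict


class TransitionCounts:
    """Keeps raw transition counts so new records can be added incrementally."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def update(self, queues):
        """Add the transitions found in newly collected behavior queues."""
        for queue in queues:
            for current, following in zip(queue, queue[1:]):
                self.counts[current][following] += 1

    def predict_next(self, current):
        """Return the most frequent next operation, or None if unseen."""
        candidates = self.counts.get(current)
        if not candidates:
            return None
        return candidates.most_common(1)[0][0]


model = TransitionCounts()
model.update([["login", "my_account", "exit"]])
before = model.predict_next("login")

# Newly collected records shift the most likely next operation.
model.update([["login", "transfer", "exit"], ["login", "transfer", "exit"]])
after = model.predict_next("login")
```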
In other embodiments of the invention, the prediction results are applied while the user is operating, and the system further comprises a pre-operation module arranged in the client. Taking a mobile banking app as an example again: after the user taps an operation, relevant information such as account information needs to be filled in. At this point the previously predicted user behavior sequence is read, together with the current resource idle state of the device. If the idle degree of the device's resources is higher than a set ratio, the interface of the next step is called directly in the background and the required data, namely data that does not need to be entered manually by the user, is read in advance, so that everything is prepared before the user taps into the next operation. When the user taps the next operation and it is exactly the predicted one, all preparation is already complete. Because the pre-operation uses the device's idle resources and the time the user spends entering information, it greatly shortens the user's waiting time compared with the original approach of waiting for the user to tap the next operation before responding, and it also improves the device's resource utilization.
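The decision made by the pre-operation module can be sketched as a single function. How the idle ratio is measured and what each interface returns are not described in the patent, so `idle_ratio`, `threshold`, `fetchers` and the sample payload are all hypothetical.

```python
def maybe_prefetch(predicted_next, idle_ratio, threshold, fetchers, cache):
    """Call the predicted next interface in advance when resources are idle.

    Returns True when data was prefetched into the cache, False otherwise.
    """
    if predicted_next is None or idle_ratio <= threshold:
        return False
    fetch = fetchers.get(predicted_next)
    if fetch is None:
        return False
    # Read, ahead of time, data that the user does not need to enter manually.
    cache[predicted_next] = fetch()
    return True


# Hypothetical fetcher for the data needed by the "transfer" operation.
fetchers = {"transfer": lambda: {"payee_list": ["alice", "bob"]}}

cache = {}
done_when_idle = maybe_prefetch("transfer", 0.8, 0.5, fetchers, cache)
done_when_busy = maybe_prefetch("transfer", 0.2, 0.5, fetchers, {})
```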
Multiple user behavior sequences predicted in advance can be stored in the pre-operation module on the user terminal and updated periodically. Since a user's first operation is usually logging in, the first user behavior sequence used is necessarily the one that starts with the login operation. When the user's next operation is no longer the next operation saved in this sequence, the sequence starting from the current operation is read instead, so that deviations are corrected in time.
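The correction step might be implemented as a small tracker that follows the stored sequence and switches to another one on deviation. The class, its methods and the stored sequences are illustrative assumptions; the patent only states the behavior.

```python
class SequenceTracker:
    """Follows a stored behavior sequence and re-reads one on deviation."""

    def __init__(self, sequences_by_start, start="login"):
        self.sequences_by_start = sequences_by_start
        self.sequence = sequences_by_start.get(start, [start])
        self.position = 0

    def expected_next(self):
        """Return the next operation saved in the current sequence, if any."""
        if self.position + 1 < len(self.sequence):
            return self.sequence[self.position + 1]
        return None

    def observe(self, operation):
        """Advance along the sequence, or switch sequences on deviation."""
        if operation == self.expected_next():
            self.position += 1
        else:
            # Deviation: re-read the sequence that starts at this operation.
            self.sequence = self.sequences_by_start.get(operation, [operation])
            self.position = 0


stored = {
    "login": ["login", "my_account", "transfer", "exit"],
    "settings": ["settings", "exit"],
}
tracker = SequenceTracker(stored)
tracker.observe("my_account")
on_track = tracker.expected_next()

tracker.observe("settings")
after_deviation = tracker.expected_next()
```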
The above is only an embodiment of the invention. Common knowledge such as well-known specific structures and characteristics is not described in detail here. A person of ordinary skill in the art knows all the ordinary technical knowledge of the technical field of the invention before the filing date or priority date, can learn all the prior art in the field, and is able to apply the routine experimental means available before that date. Under the teaching of this application, such a person can improve and implement this scheme in combination with their own abilities, and some typical well-known structures or methods should not be an obstacle to implementing this application. It should be noted that a person skilled in the art can make several variations and improvements without departing from the structure of the invention; these should also be regarded as falling within the scope of protection of the invention and will not affect the effect of implementing the invention or the practical applicability of the patent. The scope of protection claimed by this application shall be determined by the content of the claims, and the specific embodiments and other records in the description may be used to interpret the content of the claims.
Claims (7)
1. A user behavior analysis system based on a Bayesian classification algorithm, characterized by comprising:
a data acquisition module for collecting user operation records;
a training module that calculates the occurrence probability of each user operation and the conditional probabilities between user operations, in order to train a Bayesian classifier that predicts the user's next operation from the current user operation;
a prediction module that uses the Bayesian classifier to predict the user's next operation from the input current operation;
a recursion module that inputs the user's next operation into the classification module as the current operation until the user's next operation is a designated end operation, and that saves, in order, the operation obtained from each prediction to form a user behavior sequence.
2. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the data acquisition module is configured to obtain, from an application performance management system, the interfaces called when the user uses the application program, which characterize the user operation records.
3. The user behavior analysis system based on a Bayesian classification algorithm according to claim 2, characterized in that: the data acquisition module is also configured to obtain user operation behavior queues by analyzing the user operation records within the same group of data;
the training module is configured to take each operation in the behavior queue together with its next operation, and to calculate the occurrence probability of each user operation and the conditional probabilities between user operations.
4. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the Bayesian classifier is a Bayesian classifier using the Gaussian Bayes algorithm.
5. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: it further comprises a database module for storing the user operation behavior queues in one or more of the data formats of MYSQL, HADOOP, HIVE and REDIS.
6. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the data acquisition module is also configured to remove user operation records that are predefined as invalid.
7. The user behavior analysis system based on a Bayesian classification algorithm according to claim 1, characterized in that: the data acquisition module is also configured to collect user operation records continuously;
the training module is also configured to update the Bayesian classifier using the expanded user operation records.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810821761.1A CN109086816A (en) | 2018-07-24 | 2018-07-24 | A kind of user behavior analysis system based on Bayesian Classification Arithmetic |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810821761.1A CN109086816A (en) | 2018-07-24 | 2018-07-24 | A kind of user behavior analysis system based on Bayesian Classification Arithmetic |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109086816A (en) | 2018-12-25 |
Family
ID=64838320
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810821761.1A Pending CN109086816A (en) | 2018-07-24 | 2018-07-24 | A kind of user behavior analysis system based on Bayesian Classification Arithmetic |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109086816A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101008907A (en) * | 2007-01-26 | 2007-08-01 | 清华大学 | Load-aware IO performance optimization methods based on Bayesian decision |
CN102087713A (en) * | 2009-12-04 | 2011-06-08 | 索尼公司 | Information processing device, information processing method, and program |
CN102737037A (en) * | 2011-04-07 | 2012-10-17 | 北京搜狗科技发展有限公司 | Webpage pre-reading method, device and browser |
US20160379268A1 (en) * | 2013-12-10 | 2016-12-29 | Tencent Technology (Shenzhen) Company Limited | User behavior data analysis method and device |
CN105589914A (en) * | 2015-07-20 | 2016-05-18 | 广州市动景计算机科技有限公司 | Webpage pre-reading method and apparatus and intelligent terminal device |
Non-Patent Citations (2)
Title |
---|
张国印等: "《基于贝叶斯网络的Android恶意行为检测方法》", 《计算机工程与应用》 * |
黄文茜等: "《基于用户行为分析的智能终端应用管理优化》", 《计算机系统应用》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117874A (en) * | 2018-07-25 | 2019-01-01 | 北京小米移动软件有限公司 | Operation behavior prediction technique and device |
CN111797861A (en) * | 2019-04-09 | 2020-10-20 | Oppo广东移动通信有限公司 | Information processing method, information processing apparatus, storage medium, and electronic device |
CN111737101A (en) * | 2020-06-24 | 2020-10-02 | 平安科技(深圳)有限公司 | User behavior monitoring method, device, equipment and medium based on big data |
CN111737101B (en) * | 2020-06-24 | 2022-05-03 | 平安科技(深圳)有限公司 | User behavior monitoring method, device, equipment and medium based on big data |
CN113836370A (en) * | 2021-11-25 | 2021-12-24 | 上海观安信息技术股份有限公司 | User group classification method and device, storage medium and computer equipment |
CN113836370B (en) * | 2021-11-25 | 2022-03-01 | 上海观安信息技术股份有限公司 | User group classification method and device, storage medium and computer equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109086816A (en) | A kind of user behavior analysis system based on Bayesian Classification Arithmetic | |
US6542881B1 (en) | System and method for revealing necessary and sufficient conditions for database analysis | |
CN108959004B (en) | Disk failure prediction method, device, equipment and computer readable storage medium | |
CN110806954B (en) | Method, device, equipment and storage medium for evaluating cloud host resources | |
CN107247811B (en) | SQL statement performance optimization method and device based on Oracle database | |
CN109461023B (en) | Loss user retrieval method and device, electronic equipment and storage medium | |
CN113268403B (en) | Time series analysis and prediction method, device, equipment and storage medium | |
CN110399377A (en) | Optimization method, device, electronic equipment and the computer readable storage medium of SQL | |
CN104391879A (en) | Method and device for hierarchical clustering | |
CN117391292A (en) | Carbon emission energy-saving management analysis system and method | |
CN110232130B (en) | Metadata management pedigree generation method, apparatus, computer device and storage medium | |
CN111639902A (en) | Data auditing method based on kafka, control device, computer equipment and storage medium | |
CN109117873A (en) | A kind of user behavior analysis method based on Bayesian Classification Arithmetic | |
CN112765463B (en) | Data management method for big data and user requirements and cloud computing server | |
CN110602207A (en) | Method, device, server and storage medium for predicting push information based on off-network | |
CN112883066A (en) | Multidimensional range query cardinality estimation method on database | |
CN109101395A (en) | A kind of High Performance Computing Cluster application monitoring method and system based on LSTM | |
CN115174686B (en) | Method and device for dynamically adjusting weights of multiple service channels based on service efficiency | |
CN111090585A (en) | Crowd-sourcing task closing time automatic prediction method based on crowd-sourcing process | |
CN110502495A (en) | A kind of log collecting method and device of application server | |
CN112396313B (en) | Method for optimizing telephone sales performance by using smart watch | |
CN115309638A (en) | Method and device for assisting model optimization | |
CN114116908A (en) | Data management method and device and electronic equipment | |
CN112800035A (en) | GIS (geographic information System) -based power grid data communication sharing system | |
CN105897503A (en) | Hadoop cluster bottleneck detection algorithm based on resource information gain |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181225 |