Disclosure of Invention
The invention aims to overcome at least one deficiency of the prior art by providing a user behavior analysis method and a resource recommendation method based on that analysis method. The methods analyze user behavior, use the decision tree generated by the analysis to reduce the difficulty of analyzing the behavior of a new user, and mine the fuzzy requirements of the new user.
The technical scheme adopted by the invention is as follows:
a user behavior analysis method, comprising:
generating a directed behavior graph according to behavior data of a certain user;
calculating the behavior edge weight of the directed behavior graph, and extracting an effective path;
dividing the behavior data into directional requirements and fuzzy requirements, forming directional requirement characteristics by the mapping relation between the resource characteristics corresponding to the directional requirements and the user characteristics of the user, forming fuzzy requirement characteristics by the mapping relation between the resource characteristics corresponding to the fuzzy requirements and the user characteristics of the user, and putting the directional requirement characteristics of a plurality of users into a user requirement relation library;
performing cluster analysis on the fuzzy demand characteristics of a plurality of similar users to obtain a similar user demand characteristic set and/or performing cluster analysis on the similar fuzzy demand characteristics of a plurality of users to obtain a similar demand user characteristic set, and putting the similar user demand characteristic set and/or the similar demand user characteristic set into a user demand relation library;
generating a resource role framework according to the directed behavior graph;
and generating a decision tree according to the resource role framework and the user demand relation library.
The method samples the behavior data of a plurality of users, divides the sampled behavior data into directional demands and fuzzy demands, and analyzes and processes the two separately to generate a resource role framework and a user demand relation library, from which a decision tree is constructed. Analyzing user requirements through the decision tree reduces the difficulty of requirement analysis and fully mines the fuzzy requirements of users, so that resources and services that better match those requirements can be recommended.
Further, generating a directed behavior graph according to behavior data of a certain user specifically includes:
recording the transaction type, operation type, action time length and resource characteristics of a certain user action;
forming a growth tree by taking the transaction type as a root node and the operation type as a child node;
establishing a cell pool on the child node according to the action time length, wherein the resource characteristics are used as cells in the cell pool;
and generating a directed behavior graph by taking the cell pool as an edge.
Further, generating a directed behavior graph according to behavior data of a certain user specifically includes:
judging whether a user is a directional demand user or a fuzzy demand user according to behavior data of the user;
if the user is the directional demand user, generating a directed behavior graph according to the behavior data of the user;
and if the user is the fuzzy demand user, generating a directed behavior graph according to the behavior data of the user and the behavior data of similar users of the user.
When the behavior data of a user is recorded, the recording can vary according to the type of demand of the user, so that the resulting directed behavior graph allows the subsequently generated user demand relation library, resource role framework and decision tree to better analyze the behavior of the user.
Further, judging whether a user is a directional demand user or a fuzzy demand user according to behavior data of the user specifically includes:
judging whether the user is a directional demand user or a fuzzy demand user according to the behavior similarity between a behavior data structure of the user and a user behavior logic model, and/or the resource similarity between the resource characteristics in the user behavior data and the resource roles in the pre-established resource role characteristic library, and/or the user similarity between the user characteristics of the user and the user roles in the pre-established user role characteristic library.
Further, calculating the behavior edge weight of the directed behavior graph, and extracting an effective path specifically includes:
performing cluster analysis on the behavior data on a time scale, and calculating the time weight of the directed behavior graph to form a time behavior data graph;
performing clustering analysis on the behavior data on a spatial scale, and calculating the spatial weight of the directed behavior graph to form a spatial behavior data graph;
and calculating the behavior edge weight of the directed behavior graph according to the time weight and the space weight, and respectively extracting effective paths of the time behavior data graph and the space behavior data graph.
Further, dividing the behavior data into a directional requirement and a fuzzy requirement specifically includes:
combining the shortest paths of the time behavior data graph and the space behavior data graph to obtain the direct demand characteristics of the user;
analyzing the loop intersection points in the effective paths, extracting the loop-free paths, performing similarity analysis between the resource characteristics of the loop-free paths and the direct demand characteristics of the user, and determining, according to the similarity analysis result, whether the behavior data corresponding to the loop-free paths is a directional requirement or a fuzzy requirement.
Further, generating a resource role framework according to the directed behavior graph specifically includes:
extracting edges with weights higher than a preset value in the directed behavior graph as a sub-graph a;
traversing a user demand relation library, and screening a sub-graph b with the support degree higher than a support degree threshold;
setting the resource characteristic of the sub-graph a as A and the resource characteristic of the sub-graph b as B, and calculating confidence coefficients from A to B and from B to A;
and screening out the resource features with the confidence coefficient higher than the confidence coefficient threshold value, and forming a resource role framework by the screened-out resource features.
Further, generating a decision tree according to the resource role framework and the user requirement relation library, specifically comprising:
sorting the resource characteristics of the resource role framework by confidence coefficient using bubble sort, and taking the resulting sequence as a main decision rule; deriving an auxiliary decision rule by induction from the user requirement relation library;
and generating a decision tree according to the main decision rule and the auxiliary decision rule.
A resource recommendation method, comprising:
acquiring fuzzy demand characteristics of a user;
acquiring a demand characteristic with the highest relevance to the fuzzy demand characteristic and a resource role corresponding to the demand characteristic according to the decision tree;
and recommending resources to the user according to the resource roles.
When a user inputs a fuzzy requirement, the requirement is analyzed through the decision tree, which reduces the difficulty of requirement analysis and fully mines the fuzzy requirement of the user, so that resources and services that better match the requirement can be recommended to the user.
Further, the method further comprises:
acquiring a user role according to the user characteristics of the user;
and recommending resources to the user according to the user role and the resource role.
Compared with the prior art, the invention has the following beneficial effects: by defining a recording mode for user behavior data, generating a directed behavior graph from the user behavior data, dividing the user behavior data into directional requirements and fuzzy requirements, forming a resource role framework and a user requirement relation library, and constructing a decision tree, the invention reduces the difficulty of analyzing the behavior data of new users, mines the fuzzy requirements of new users, and provides more user-friendly resource recommendation.
Example 1
As shown in fig. 1, the present embodiment provides a user behavior analysis method, including:
A1. generating a directed behavior graph according to behavior data of a certain user;
A2. calculating the behavior edge weight of the directed behavior graph, and extracting an effective path;
A3. dividing the behavior data into directional requirements and fuzzy requirements, forming directional requirement characteristics by the mapping relation between the resource characteristics corresponding to the directional requirements and the user characteristics of the user, forming fuzzy requirement characteristics by the mapping relation between the resource characteristics corresponding to the fuzzy requirements and the user characteristics of the user, and putting the directional requirement characteristics of a plurality of users into a user requirement relation library;
A4. performing cluster analysis on the fuzzy demand characteristics of a plurality of similar users to obtain a similar user demand characteristic set and/or performing cluster analysis on the similar fuzzy demand characteristics of a plurality of users to obtain a similar demand user characteristic set, and putting the similar user demand characteristic set and/or the similar demand user characteristic set into a user demand relation library;
A5. generating a resource role framework according to the directed behavior graph;
A6. and generating a decision tree according to the resource role framework and the user demand relation library.
The method samples the behavior data of a plurality of users, divides the sampled behavior data into directional demands and fuzzy demands, and analyzes and processes the two separately to generate a resource role framework and a user demand relation library, from which a decision tree is constructed. Analyzing user requirements through the decision tree reduces the difficulty of requirement analysis and fully mines the fuzzy requirements of users, so that resources and services that better match those requirements can be recommended.
Step A1 specifically includes:
A11. recording the transaction type, operation type, action time length and resource characteristics of a certain user action;
A12. forming a growth tree by taking the transaction type as a root node and the operation type as a child node;
A13. establishing a cell pool on the child node according to the action time length, wherein the resource characteristics are used as cells in the cell pool;
A14. and generating a directed behavior graph by taking the cell pool as an edge.
The transaction type is a classification of the nature of the user's behavior, such as browsing, searching, intent confirmation, consultation, transaction, etc.
The operation type is a classification of the data interaction actions within a user behavior. The data interaction actions may include all keyboard and mouse operations and all data information in the course of the behavior; for example, when the transaction type is search, the operation types may be keyword input, backspace, delete, confirm and the like during the search.
The start time and end time of each behavior of the user are recorded to calculate the duration of the whole behavior, which is recorded as the behavior time length. In implementation, a tracking point can be embedded at the ending action of each behavior to confirm that the behavior has ended. Taking the transaction type 'search' as an example, after the user performs a series of operations in the search box and the filter list, a tracking point is embedded on the 'confirm' or 'filter' button, and when the tracking point is triggered, the 'search' behavior is marked as ended.
The resource features involved in each behavior of the user are recorded as the resource characteristics of that behavior.
In summary, the behavior data of the user can be recorded in the form "transaction code + operation code + data packet + end code". When the complete behavior data is recorded, a growth tree is formed with the transaction type as the root node and the operation type as a child node; the behavior time length is quantized; a cell pool for data collection is set under the child node of the growth tree according to the quantized behavior time length; and the resource characteristics are stored in the cell pool. A directed behavior graph with the cell pools as edges can thus be generated.
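As a minimal illustrative sketch, and not the claimed implementation, the recording scheme described above could be modelled as follows. The names `BehaviorRecord`, `quantize` and `build_behavior_graph`, the fixed-width 10-second quantization of the behavior time length, and the sample data are all assumptions introduced for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BehaviorRecord:
    transaction: str         # transaction type (root node), e.g. "search"
    operation: str           # operation type (child node), e.g. "keyword_input"
    duration_s: float        # behavior time length in seconds
    resource_features: list  # resource features involved in the behavior


def quantize(duration_s, bin_s=10.0):
    """Map a behavior time length to a cell-pool index (assumed fixed-width bins)."""
    return int(duration_s // bin_s)


def build_behavior_graph(records):
    """Return {(transaction, operation, pool_index): [resource features]}.

    Each key stands for a directed edge from the transaction root to an
    operation child; the value is the cell pool attached to that edge.
    """
    graph = defaultdict(list)
    for r in records:
        key = (r.transaction, r.operation, quantize(r.duration_s))
        graph[key].extend(r.resource_features)
    return dict(graph)


records = [
    BehaviorRecord("search", "keyword_input", 12.0, ["spectrometer"]),
    BehaviorRecord("search", "keyword_input", 15.0, ["mass analysis"]),
    BehaviorRecord("browse", "scroll", 3.0, ["spectrometer"]),
]
graph = build_behavior_graph(records)
```

In this sketch the two search records fall into the same quantized time bin and therefore share one cell pool, while the browse record forms a separate edge.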
Step A11 specifically includes:
A111. judging whether a user is a directional demand user or a fuzzy demand user according to behavior data of the user;
A112. if the user is the directional demand user, generating a directed behavior graph according to the behavior data of the user;
A113. and if the user is the fuzzy demand user, generating a directed behavior graph according to the behavior data of the user and the behavior data of similar users of the user.
When the behavior data of a user is recorded, the recording can vary according to the demand and intention of the user. If the demand intention expressed by the behavior data of the user is a fuzzy demand, the behavior data of similar users can also be recorded, and a directed behavior graph is generated from the behavior data of both the user and the similar users. A directed behavior graph generated in this way allows the subsequently generated user demand relation library, resource role framework and decision tree to better analyze the behavior of the user.
Step A111 specifically includes: judging whether the user is a directional demand user or a fuzzy demand user according to the behavior similarity between a behavior data structure of the user and a user behavior logic model, and/or the resource similarity between the resource characteristics in the user behavior data and the resource roles in the pre-established resource role characteristic library, and/or the user similarity between the user characteristics of the user and the user roles in the pre-established user role characteristic library.
Similarity analysis is carried out between the behavior data structure of the user and the user behavior logic model, and the resulting behavior similarity X serves as user demand intention evaluation index I. Similarity analysis is carried out between the resource characteristics in the user behavior data and the resource roles in the pre-established resource role characteristic library, and the resulting resource similarity Y serves as user demand intention evaluation index II. Similarity analysis is carried out between the user characteristics of the user and the user roles in the pre-established user role characteristic library, and the resulting user similarity Z serves as user demand intention evaluation index III; when the user role cannot be obtained, Z is set to 1. When the evaluation index or indexes used (I, II and/or III) are lower than a lower limit value, the user is judged to be a fuzzy demand user. When all three evaluation indexes are used together, their weights are preferably ordered II > I > III.
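One possible way to combine the three evaluation indexes is sketched below, purely for illustration. The text above does not prescribe how the indexes are combined; the weighted sum, the weight values (chosen only to respect II > I > III), the lower limit of 0.5 and the function name `classify_user` are assumptions.

```python
def classify_user(x, y, z=None, weights=(0.3, 0.5, 0.2), lower_limit=0.5):
    """Classify a user as a 'fuzzy' or 'directional' demand user.

    x: behavior similarity (index I), y: resource similarity (index II),
    z: user similarity (index III). The weights are given in the order
    (I, II, III) and are assumed values satisfying II > I > III.
    """
    if z is None:
        z = 1.0  # user role could not be obtained, so Z is taken as 1
    w1, w2, w3 = weights
    score = w1 * x + w2 * y + w3 * z  # assumed combination: weighted sum
    return "fuzzy" if score < lower_limit else "directional"


print(classify_user(0.8, 0.9))       # user role unavailable, Z defaults to 1
print(classify_user(0.2, 0.1, 0.3))
```

A stricter variant could instead compare each index with its own lower limit; the weighted sum is only one reading of the preferred weight ordering.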
When the user is judged to be a fuzzy demand user, user behavior data is recorded according to rule 1 shown in fig. 1: the behavior data of similar users is recorded together with that of the user, and a directed behavior graph is generated from the behavior data of the user and the similar users. When the user is judged to be a directional demand user, user behavior data is recorded according to rule 2 shown in fig. 1: only the behavior data of the user is recorded, and a directed behavior graph is generated from it.
The behavior logic model may be an empirical model obtained from statistics on a large number of user behaviors. Alternatively, the calculation rule of the behavior logic model may be set according to the operator's own market research. An example of such a calculation rule is as follows. If a user first performs a search transaction and then browses, correlation analysis is performed on the search keywords against the user demand relation library: a high degree of correlation indicates a directional demand, and a low degree of correlation indicates a fuzzy demand. The subsequent behavior of the user continues to be monitored. If the records show that, while the user skims through the browsing transaction as a whole, the dwell time on a page is lower than the mean page dwell time of that user, the first and last resource data items appearing on the page are recorded and the demand degree remains unchanged; once the user enters a resource page and thereby advances the browsing transaction, the demand degree changes to directional demand. In a specific implementation, the demand degree may be initially set to 0 and then updated according to the user behavior as the subsequent behavior is monitored.
The pre-building process of the resource role feature library can be as follows: extracting resource features according to the provided resources; and performing clustering analysis on the resource characteristics to establish a resource role characteristic library. The resources may specifically be scientific and technological resources, and may include instruments, core technical theories, method systems, and the like; the resource characteristics may include a resource name, a data interface, a data supplier characteristic, a resource application object, a background to which the resource belongs, a resource utility performance characteristic, and the like. The extraction of resource features may be based on a resource semantic analysis method.
The pre-building process of the user role feature library can be as follows: extracting user characteristics according to the basic data information of each user, and performing cluster analysis on the user characteristics to establish the user role characteristic library. The user role characteristic library comprises user characteristic relations and similar-user resource characteristic relations. Extracting the user roles according to the basic data information of each user may specifically be: analyzing the basic information of the user and the resource occupation of the user to obtain the user role.
Step A2 specifically includes:
A21. performing cluster analysis on the behavior data on a time scale, and calculating the time weight of the directed behavior graph to form a time behavior data graph;
A22. performing clustering analysis on the behavior data on a spatial scale, and calculating the spatial weight of the directed behavior graph to form a spatial behavior data graph;
A23. and calculating the behavior edge weight of the directed behavior graph according to the time weight and the space weight, and respectively extracting effective paths of the time behavior data graph and the space behavior data graph.
In step A21, cluster analysis is performed on the resource features in the cell pools on a time scale to obtain behavior resource feature clusters per unit time. A Fourier transform is then applied to analyze the distribution of these clusters along the time axis, yielding a feature frequency distribution, which constitutes the weight of the time behavior data graph.
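The frequency analysis in this step can be sketched as follows, assuming that a feature cluster has already been reduced to a series of occurrence counts per unit time. The plain discrete Fourier transform, the normalization of the magnitudes into weights, and the names `dft_magnitudes` and `time_weights` are assumptions made for illustration; the text does not fix these details.

```python
import cmath


def dft_magnitudes(series):
    """Magnitude of each discrete Fourier coefficient of a real-valued series."""
    n = len(series)
    return [
        abs(sum(series[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) / n
        for k in range(n)
    ]


def time_weights(counts_per_unit):
    """Normalize the magnitude spectrum so that the weights sum to 1."""
    mags = dft_magnitudes(counts_per_unit)
    total = sum(mags)
    return [m / total for m in mags] if total else [0.0] * len(mags)


# A feature that appears in every second time unit: the weight is split
# between the constant component (k = 0) and the alternating component (k = 2).
weights = time_weights([1, 0, 1, 0])  # roughly [0.5, 0.0, 0.5, 0.0]
```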
In step A22, cluster analysis is performed on the resource features in the cell pools on a spatial scale, that is, over the resource features of the entire behavior data sample, to obtain a plurality of clusters. The data in the different clusters is traced back in time to obtain the distribution of similar behavior data within the behavior flow. This distribution is sharpened to remove low-frequency values, leaving a high-frequency distribution. The temporal continuity of the data features in the high-frequency distribution is then analyzed to obtain a high-frequency feature linear distribution value, from which the behavior resource correlation distribution is obtained. The high-frequency feature linear distribution value is used as the weight of the spatial behavior data graph.
In step A23, calculating the behavior edge weight of the directed behavior graph according to the time weight and the space weight specifically includes the following. First, the resource features in the user behavior data are classified according to the resource field and the user characteristics of the user, each class of resource features having a corresponding weight; the resource features with the same transaction type are counted and grouped by weight to obtain the quantity of resource feature data for each weight; and the behavior time length, the data quantity and the weight are superposed to obtain the basic resource demand proportion of the user behavior. Second, statistics are taken on the resource features in the user behavior data, data features are extracted and weighted according to the basic resource demand proportion to obtain a weighted probability distribution curve of the data features, and the curve is sharpened to obtain the core demand weight of the user behavior, which is the behavior edge weight of the directed behavior graph.
Step A3 specifically includes:
A31. combining the shortest paths of the time behavior data graph and the space behavior data graph to obtain the direct demand characteristics of the user;
A32. analyzing the loop intersection points in the effective paths, extracting the loop-free paths, extracting the edges whose cell pools are larger than a threshold value, performing similarity analysis between the resource characteristics corresponding to the extracted edges and the direct demand characteristics of the user, and determining, according to the similarity analysis result, whether the behavior data corresponding to each edge is a directional requirement or a fuzzy requirement.
The specific implementation process of step A32 may be as follows: the loop intersection points in the effective paths are analyzed and the loop-free paths are extracted; the edges with larger cell pools are extracted and their weights are counted; if the weight of an edge is lower than the mean of the overall weights, the resource features corresponding to that edge are retrieved and compared with the direct demand characteristics of the user to obtain a similarity; behavior data corresponding to edges whose similarity is lower than the mean is classified as fuzzy requirements, and behavior data corresponding to edges whose similarity is higher than the mean is classified as directional requirements.
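The edge classification just described can be sketched as follows, assuming the loop-free paths have already been extracted and flattened into a list of edges. The Jaccard coefficient as the similarity measure, the dictionary layout of an edge, the treatment of a similarity exactly equal to the mean as directional, and the names `jaccard` and `classify_edges` are assumptions introduced for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two feature collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def classify_edges(edges, direct_features, pool_threshold):
    """edges: list of dicts with 'id', 'weight' and 'features' (the cell pool)."""
    if not edges:
        return {}
    mean_w = sum(e["weight"] for e in edges) / len(edges)
    # Keep edges with a large cell pool whose weight is below the overall mean.
    candidates = [
        e for e in edges
        if len(e["features"]) > pool_threshold and e["weight"] < mean_w
    ]
    if not candidates:
        return {}
    sims = {e["id"]: jaccard(e["features"], direct_features) for e in candidates}
    mean_s = sum(sims.values()) / len(sims)
    # Below-mean similarity is a fuzzy requirement; ties count as directional.
    return {eid: ("fuzzy" if s < mean_s else "directional") for eid, s in sims.items()}


edges = [
    {"id": "e1", "weight": 0.9, "features": ["a", "b", "c"]},
    {"id": "e2", "weight": 0.2, "features": ["a", "b", "x"]},
    {"id": "e3", "weight": 0.1, "features": ["x", "y", "z"]},
]
labels = classify_edges(edges, direct_features=["a", "b"], pool_threshold=2)
# labels == {"e2": "directional", "e3": "fuzzy"}
```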
In step A4, performing cluster analysis on the fuzzy demand characteristics of a plurality of similar users to obtain a similar user demand characteristic set specifically includes:
A41. in a directed behavior graph generated according to similar user behavior data, acquiring an edge which describes the direct demand characteristics of similar users and has a weight higher than a weight threshold value in a shortest path as a subgraph, and calculating the support degree of the subgraph;
A42. and taking the resource features in the subgraph with the support degree higher than the support degree threshold value as the similar user requirement features to form a similar user requirement feature set.
The weight threshold may be a weight mean; the support threshold may be a support mean.
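A minimal sketch of steps A41 and A42 with mean-valued thresholds is given below. It assumes that the shortest-path edges of each similar user are available as (resource feature, weight) pairs and that the support of a feature is the fraction of similar users for whom it appears on an above-threshold edge; this definition of support, the data layout and the name `similar_user_demand_features` are assumptions.

```python
from collections import Counter


def similar_user_demand_features(user_edges):
    """user_edges: {user: [(feature, weight), ...]} for shortest-path edges."""
    all_w = [w for edges in user_edges.values() for _, w in edges]
    if not all_w:
        return set()
    w_thr = sum(all_w) / len(all_w)  # weight threshold = weight mean
    counts = Counter()
    for edges in user_edges.values():
        # Count each feature at most once per user.
        counts.update({f for f, w in edges if w > w_thr})
    if not counts:
        return set()
    support = {f: c / len(user_edges) for f, c in counts.items()}
    s_thr = sum(support.values()) / len(support)  # support threshold = support mean
    return {f for f, s in support.items() if s > s_thr}


user_edges = {
    "u1": [("laser", 0.9), ("lens", 0.2)],
    "u2": [("laser", 0.8), ("mirror", 0.7)],
    "u3": [("laser", 0.6), ("lens", 0.1)],
}
demand_set = similar_user_demand_features(user_edges)  # {"laser"}
```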
Step A5 specifically includes:
A51. extracting edges with weights higher than a preset value in the directed behavior graph as a sub-graph a;
A52. traversing a user demand relation library, and screening a sub-graph b with the support degree higher than a support degree threshold;
A53. setting the resource characteristic of the sub-graph a as A and the resource characteristic of the sub-graph b as B, and calculating confidence coefficients from A to B and from B to A;
A54. and screening out the resource features with the confidence coefficient higher than the confidence coefficient threshold value, and forming a resource role framework by the screened-out resource features.
The support threshold may be a support mean; the confidence threshold may be a confidence mean.
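The confidence calculation in step A53 can be illustrated with the standard association-rule definition, confidence(A to B) = support(A and B together) / support(A). The text does not state which definition is used, so this choice, the representation of the user demand relation library as a list of feature sets, and the function names are assumptions.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(a, b, transactions):
    """Confidence of the rule a -> b under the standard association-rule definition."""
    sa = support(a, transactions)
    return support(set(a) | set(b), transactions) / sa if sa else 0.0


# Hypothetical demand relation library: one feature set per user record.
library = [{"laser", "lens"}, {"laser"}, {"laser", "lens"}, {"mirror"}]
conf_ab = confidence({"laser"}, {"lens"}, library)  # 2/3
conf_ba = confidence({"lens"}, {"laser"}, library)  # 1.0
```

The two directions generally differ, which is why both A to B and B to A are computed before the confidence threshold is applied.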
Step A6 specifically includes:
A61. sorting the resource characteristics of the resource role framework by confidence coefficient using bubble sort, and taking the resulting sequence as a main decision rule;
A62. deriving an auxiliary decision rule by induction from the user requirement relation library;
A63. and generating a decision tree according to the main decision rule and the auxiliary decision rule.
The auxiliary decision rule does not affect the ordering given by the main decision rule; it is used only to calculate the resource demand proportion. A resource role funnel tree is first obtained according to the main decision rule, the resource demand proportion is then calculated according to the auxiliary decision rule, and the weight of the resource characteristics at each node of the funnel tree is adjusted according to that proportion, thereby generating the decision tree.
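The interplay of the two rules can be sketched as follows. For brevity the funnel tree is simplified to a ranked list; this simplification, the multiplication of confidence by the demand proportion as the weight adjustment, and the names `bubble_sort_by_confidence` and `build_decision_list` are assumptions made for illustration.

```python
def bubble_sort_by_confidence(features):
    """features: list of (name, confidence); returns a new list, highest first."""
    items = list(features)
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if items[j][1] < items[j + 1][1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:  # already sorted, stop early
            break
    return items


def build_decision_list(features, demand_proportion):
    """Rank by the main rule, then adjust node weights by the auxiliary rule."""
    ranked = bubble_sort_by_confidence(features)
    # The order is fixed by the main rule; only the node weights change.
    return [(name, conf * demand_proportion.get(name, 1.0)) for name, conf in ranked]


features = [("lens", 0.4), ("laser", 0.9), ("mirror", 0.6)]
decision = build_decision_list(features, {"lens": 2.0})
```

In this example the weight of the lens node is doubled by the demand proportion, yet it stays last in the order, reflecting that the auxiliary rule does not reorder the main rule.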