CN116167829B - Multidimensional and multi-granularity user behavior analysis method - Google Patents

Multidimensional and multi-granularity user behavior analysis method

Publication number
CN116167829B
Authority
CN
China
Prior art keywords
user
webpage
software
behavior
click
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310461608.3A
Other languages
Chinese (zh)
Other versions
CN116167829A (en)
Inventor
王琨
刘滔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Weike Technology Group Co ltd
Original Assignee
Hunan Weike Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Weike Technology Group Co ltd
Priority to CN202310461608.3A
Publication of CN116167829A
Application granted
Publication of CN116167829B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment; monitoring of user actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of user behavior analysis and discloses a multidimensional, multi-granularity user behavior analysis method comprising the following steps: collecting online user behavior data and constructing it into a user multidimensional behavior track; inputting the user multidimensional behavior track into a user group identification model to identify the group category of the user; constructing a multi-granularity user click intention model to obtain the intention degree of each category of users for different pages; and, according to the category, ranking the intention degrees of the users for different pages from high to low and recommending the pages to the users in that order. The invention constructs the user multidimensional behavior track by combining the click frequency features and time-series features of the user's webpage and software clicks, thereby realizing user group identification based on user behavior, and obtains the intention degree of users for different webpages through recursive message propagation and aggregation over the webpage click time-series matrices of users in the same category, so as to recommend webpages.

Description

Multidimensional and multi-granularity user behavior analysis method
Technical Field
The invention relates to the technical field of user behavior analysis, in particular to a multidimensional and multi-granularity user behavior analysis method.
Background
As intelligent devices, 5G, and related industries mature, the volume of online user behavior data is expanding rapidly, and problems such as personal information leakage and the difficulty of identifying anonymous users are becoming increasingly prominent. Traditional online user behavior analysis models, however, are usually built for one specific kind of data, cannot combine multidimensional data for analysis, and require substantial manpower and material resources for feature extraction. The extracted features are highly subjective and the data are noisy, which degrades the user's online shopping experience and prevents good commodity recommendations. To address this problem, the invention provides a multidimensional, multi-granularity user behavior analysis method that analyzes user behavior by combining multiple types of online data and provides technical support for commodity recommendation.
Disclosure of Invention
In view of this, the present invention provides a multidimensional and multi-granularity user behavior analysis method, which aims to: 1) screen candidate webpage behavior track points and candidate software behavior track points using the click frequency and total click count of different webpages or software as screening conditions, wherein the higher the total click count and the click frequency, the higher the probability that an item is judged to be a track point; determine the confidence of a candidate behavior track by calculating the probability that any two of its track points are clicked simultaneously, wherein a higher confidence indicates a stronger association between any two track points in the candidate behavior track and a lower confidence indicates incidental track points; and thereby construct, from the click frequency features and time-series features of the webpages or software, a user multidimensional behavior track comprising the user webpage behavior track and the user software behavior track; 2) construct a multi-granularity user click intention model by initializing the user parameters and webpage parameters of users in the same category, building a message propagation system, and recursively propagating embeddings of the user parameters and webpage parameters; aggregate the coded representations of the users' webpage click time-series matrices and click frequencies; obtain inner product representations of the category of users for different webpages, that is, a measure of how closely the time-series coded representation of the category's click frequencies on different webpages, after multiple rounds of message propagation, aligns in angle with each webpage in the vector space; use the inner products as the intention degrees of the same category of users for different webpages; map each webpage to a commodity page; and use the intention degree of the webpage as the intention degree of the mapped commodity page, providing technical support for commodity recommendation.
The invention provides a multidimensional multi-granularity user behavior analysis method, which comprises the following steps:
S1: collecting online user behavior data and constructing the collected data into a user multidimensional behavior track, wherein the online user behavior data comprises webpage click data and software click data;
S2: constructing and training a user group identification model, inputting the user multidimensional behavior track into the user group identification model, and identifying the group category of the user, wherein the user group identification model takes the user multidimensional behavior track as input and takes the maximized user group distribution as its training objective function;
S3: constructing a multi-granularity user click intention model, wherein the model takes the set of user multidimensional behavior tracks of the same category as input and outputs the intention degree of that category of users for different pages;
S4: according to the identified user group category, ranking the intention degrees of that category of users for different pages from high to low and recommending the pages to the user in that order.
As a further improvement of the present invention:
Optionally, collecting the online user behavior data in step S1 comprises:
collecting online user behavior data comprising webpage click data and software click data, wherein the collection flow is as follows:
constructing a webpage click statistics table that lists a number of common webpages, and acquiring the click time-series data of the user for each common webpage in the table,
wherein:
the click time-series data of the user for the i-th common webpage records, for each time period within the acquisition time range of the online user behavior data, whether the user clicked the i-th common webpage in that period: one value marks a period in which the user clicked the webpage, and the other marks a period in which the user did not click it;
counting the total number of user clicks for each common webpage in the webpage click statistics table,
wherein:
the total is the number of times the i-th common webpage in the webpage click statistics table was clicked within the acquisition time range;
taking the total click count and the click time-series data of each common webpage in the webpage click statistics table as the webpage click data;
constructing a software click statistics table that lists a number of common software items, and acquiring the click time-series data of the user for each common software item in the table,
wherein:
the click time-series data of the user for the j-th common software item records, for each time period within the acquisition time range, whether the user clicked the j-th common software item in that period: one value marks a period in which the user clicked the software, and the other marks a period in which the user did not click it;
counting the total number of user clicks for each common software item in the software click statistics table,
wherein:
the total is the number of times the j-th common software item in the software click statistics table was clicked within the acquisition time range;
taking the total click count and the click time-series data of each common software item in the software click statistics table as the software click data;
and combining the webpage click data and the software click data into the online user behavior data. In the embodiment of the invention, if the user operates and clicks the interface of the j-th common software item or common webpage during a time period, the user is marked as having clicked that software item or webpage in the period; the total number of times the user operates and clicks the interface of a common software item or common webpage within the acquisition time range is its total click count.
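The collection flow above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: click events are assumed to arrive as (item, time period) pairs, time periods are assumed to be integer indices in `range(T)`, and the names `build_click_series`, `events`, `items`, and `T` are hypothetical.

```python
from collections import defaultdict

def build_click_series(events, items, T):
    """Build per-item binary click time series and total click counts.

    events: iterable of (item, period) pairs, one per click (assumed format)
    items:  the common webpages or software listed in the statistics table
    T:      number of time periods in the acquisition time range
    """
    series = {item: [0] * T for item in items}
    totals = defaultdict(int)
    for item, period in events:
        if item in series and 0 <= period < T:
            series[item][period] = 1   # clicked at least once in this period
            totals[item] += 1          # every click counts toward the total
    return series, {item: totals[item] for item in items}

# Hypothetical events: "news" is clicked twice in period 0 and once in period 2.
events = [("news", 0), ("news", 0), ("shop", 1), ("news", 2)]
series, totals = build_click_series(events, ["news", "shop", "video"], T=3)
```

The same function can be applied once to the webpage click statistics table and once to the software click statistics table, and the two results together form the online user behavior data.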
Optionally, constructing the collected online user behavior data into a user multidimensional behavior track in step S1 comprises:
constructing the collected online user behavior data into a user multidimensional behavior track, wherein the construction flow is as follows:
S11: calculating the click frequency of the user on each webpage and each software item within the acquisition time range,
wherein:
a click frequency is obtained for the i-th common webpage and for the j-th common software item;
S12: setting minimum click frequency values for webpages and software respectively, one minimum click frequency value for webpages and one for software;
S13: retaining the webpages whose click frequency is greater than the minimum click frequency value for webpages; retaining the software whose click frequency is greater than the minimum click frequency value for software;
S14: combining the webpages retained in step S13 in pairs, calculating the click frequency with which the user clicks each webpage combination within the acquisition time range, and returning to step S13 until no new webpage combination can be retained;
combining the software retained in step S13 in pairs, calculating the click frequency with which the user clicks each software combination within the acquisition time range, and returning to step S13 until no new software combination can be retained;
S15: constructing the webpage click time-series matrix A and the software click time-series matrix B of the user;
calculating the probability that the user also clicks the (i+1)-th webpage in a time period in which the user clicks any i-th webpage, and taking this probability as the confidence between the i-th webpage and the (i+1)-th webpage;
calculating the probability that the user also clicks the (j+1)-th software item in a time period in which the user clicks any j-th software item, and taking this probability as the confidence between the j-th software item and the (j+1)-th software item;
S16: setting a confidence threshold;
calculating the confidence of every two webpages in each webpage combination retained in step S14, taking the mean confidence as the confidence of that combination, selecting the webpage combinations whose confidence is greater than or equal to the confidence threshold as candidate webpage behavior tracks, and selecting the candidate track containing the most webpages as the user webpage behavior track;
calculating the confidence of every two software items in each software combination retained in step S14, taking the mean confidence as the confidence of that combination, selecting the software combinations whose confidence is greater than or equal to the confidence threshold as candidate software behavior tracks, and selecting the candidate track containing the most software items as the user software behavior track;
S17: filtering out repeated webpages in the user webpage behavior track; filtering out repeated software in the user software behavior track; and obtaining the user multidimensional behavior track, which consists of the filtered user webpage behavior track and the filtered user software behavior track.
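Steps S11 to S17 resemble frequent-itemset mining with a confidence filter, and can be sketched as follows. The sketch rests on assumptions that are not taken from the patent: the click frequency of a combination is taken as the fraction of time periods in which every item of the combination was clicked, confidence is estimated as a conditional co-click probability, and the names `frequency`, `confidence`, `behavior_track`, `min_freq`, and `min_conf` are hypothetical.

```python
from itertools import combinations

def frequency(series, combo, T):
    """Fraction of periods in which every item of the combination was clicked."""
    hits = sum(all(series[x][t] for x in combo) for t in range(T))
    return hits / T

def confidence(series, a, b, T):
    """Estimated probability that b is clicked in a period in which a is clicked."""
    a_hits = sum(series[a])
    both = sum(series[a][t] and series[b][t] for t in range(T))
    return both / a_hits if a_hits else 0.0

def behavior_track(series, T, min_freq, min_conf):
    # S13: keep single items whose click frequency exceeds the minimum.
    level = [(x,) for x in sorted(series) if frequency(series, (x,), T) > min_freq]
    candidates = []
    # S14: grow combinations until no new combination can be kept.
    while level:
        items = sorted({x for combo in level for x in combo})
        size = len(level[0]) + 1
        level = [c for c in combinations(items, size)
                 if frequency(series, c, T) > min_freq]
        candidates.extend(level)

    # S16: mean pairwise confidence, then pick the longest qualifying combination.
    def combo_conf(combo):
        pairs = list(combinations(combo, 2))
        return sum(confidence(series, a, b, T) for a, b in pairs) / len(pairs)

    good = [c for c in candidates if combo_conf(c) >= min_conf]
    # S17: combinations hold distinct items, so no duplicates remain to filter.
    return list(max(good, key=len)) if good else []

# Hypothetical binary click series over four time periods.
web = {"a": [1, 1, 1, 0], "b": [1, 1, 0, 0], "c": [0, 0, 0, 1]}
web_track = behavior_track(web, T=4, min_freq=0.3, min_conf=0.6)
```

Running the same function on the software click series gives the software behavior track, and the pair of the two tracks plays the role of the user multidimensional behavior track.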
Optionally, constructing the user group identification model in step S2 comprises: constructing a user group identification model whose input is the user multidimensional behavior track of the user to be identified and whose output is the group category of the user, wherein the group category identification formula of the model is defined in terms of the following quantities:
the probability that the user multidimensional behavior track X belongs to the y-th group category;
s, a track point in the user webpage behavior track, that is, a webpage in that track, together with the weight of track point s and the number of times track point s appears in the y-th group category;
k, a track point in the user software behavior track, that is, a software item in that track, together with the weight of track point k and the number of times track point k appears in the y-th group category;
the total number of group categories;
the weights of the webpage track points and of the software track points, which are the parameters to be solved by training.
Optionally, training the constructed user group identification model in step S2 comprises:
training the constructed user group identification model, wherein the training process is as follows:
S21: acquiring the online user behavior data of M users and extracting their user multidimensional behavior tracks, wherein the acquired user multidimensional behavior tracks are not repeated and are obtained by the flow of step S1; in the embodiment of the invention, the acquisition time ranges of the M user multidimensional behavior tracks have the same length;
S22: labeling each of the M user multidimensional behavior tracks with a user group category, wherein in the embodiment of the invention the user group categories include game users, news users, movie users, variety-show users, and the like;
S23: building the training objective function of the user group identification model,
wherein:
the objective uses the probability that the y-th group category occurs among the collected M user multidimensional behavior tracks;
r denotes any track point in the M user multidimensional behavior tracks, and each track point has a weight;
the objective also uses the frequency with which track point r occurs in the M user multidimensional behavior tracks and the probability that the y-th group category and track point r occur together;
obtaining these probabilities and frequencies from the M acquired user multidimensional behavior tracks, and obtaining the weight of each track point by minimizing the training objective function.
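The statistics that feed the training objective (category probability, track point frequency, and joint probability of a category with a track point) can be estimated from the M labeled tracks. The sketch below assumes plain relative frequencies, which the text does not specify; `empirical_statistics`, `tracks`, and `labels` are hypothetical names. The optimization of the objective function itself is not sketched, since its formula is not given here.

```python
from collections import Counter

def empirical_statistics(tracks, labels):
    """Estimate category, track-point, and joint probabilities from labeled tracks.

    tracks: list of M track-point lists (web and software points merged)
    labels: list of M group category labels, one per track
    """
    M = len(tracks)
    class_prob = {y: n / M for y, n in Counter(labels).items()}
    point_count = Counter()
    joint_count = Counter()
    for track, y in zip(tracks, labels):
        for r in set(track):            # each track counts a point at most once
            point_count[r] += 1
            joint_count[(y, r)] += 1
    point_prob = {r: n / M for r, n in point_count.items()}
    joint_prob = {key: n / M for key, n in joint_count.items()}
    return class_prob, point_prob, joint_prob

# Hypothetical labeled tracks for three users.
tracks = [["news", "mail"], ["game"], ["news"]]
labels = ["news_user", "game_user", "news_user"]
class_prob, point_prob, joint_prob = empirical_statistics(tracks, labels)
```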
Optionally, inputting the user multidimensional behavior track into the user group identification model in step S2 and identifying the group category of the user comprises:
inputting the user multidimensional behavior track X into the user group identification model to obtain the probabilities that X belongs to the different group categories, and selecting the group category with the highest probability as the output of the model, which is the identified group category of the user.
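The identification step can be illustrated with a small scoring function. The exact identification formula is not available in this text, so the sketch assumes a softmax over the weighted counts of the track points in each group category; this form, the data layout, and the names `group_probabilities`, `identify_group`, `weights`, and `counts` are assumptions made for illustration.

```python
import math

def group_probabilities(web_track, soft_track, weights, counts, num_groups):
    """Score each group category for a multidimensional behavior track.

    weights[p]   : trained weight of track point p (assumed layout)
    counts[p][y] : number of times track point p appears in group category y
    """
    scores = []
    for y in range(num_groups):
        score = sum(weights.get(p, 0.0) * counts.get(p, {}).get(y, 0)
                    for p in list(web_track) + list(soft_track))
        scores.append(score)
    top = max(scores)                     # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

def identify_group(probs):
    """Return the index of the group category with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical weights and per-category counts for two track points.
weights = {"news": 1.0, "game_app": 1.0}
counts = {"news": {0: 0, 1: 3}, "game_app": {0: 2, 1: 0}}
probs = group_probabilities(["news"], ["game_app"], weights, counts, num_groups=2)
group = identify_group(probs)
```

Unknown track points fall back to a weight of zero here, so they do not influence the result; a real system would need to decide how to treat unseen webpages or software.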
Optionally, constructing the multi-granularity user click intention model in step S3 comprises:
constructing a multi-granularity user click intention model that takes the set of user multidimensional behavior tracks of the same category as input and outputs the intention degree of that category of users for different pages;
the constructed multi-granularity user click intention model comprises an embedding layer, a propagation layer, and an output layer, wherein the embedding layer initializes the user parameters and webpage parameters of users in the same category; the propagation layer builds a message propagation system and propagates the embeddings of the user parameters and webpage parameters of the same category; and the output layer evaluates the intention degree of the same category of users for different webpages, maps each webpage to a commodity page, and uses the intention degree of the webpage as the intention degree of the mapped commodity page; in the embodiment of the invention, the content described in a webpage is the same as the commodity displayed in the mapped commodity page;
extracting the webpage click time-series matrices of the M users in step S2, grouping the matrices of users in the same category, and constructing one multi-granularity user click intention model for each group, wherein the construction flow of the model corresponding to the y-th group is as follows:
S31: the embedding layer initializes the webpage click time-series matrices of the users in the y-th group as user parameters and initializes the webpage ID names appearing in the user parameters as webpage parameters, wherein the y-th group contains a fixed number of webpage click time-series matrices and a fixed number of webpage ID names;
S32: the embedding layer constructs the initial codes of the user parameters and webpage parameters,
wherein:
one initial code is constructed for the u-th webpage click time-series matrix and one for the c-th webpage parameter;
S33: the propagation layer recursively propagates messages based on the user parameters, the webpage parameters, and the initial codes from the embedding layer,
wherein:
the result is a coded representation of each webpage click time-series matrix and of each webpage parameter after the D-th round of propagation, and an activation function is applied during propagation;
S34: the output layer aggregates the propagation results of the propagation layer,
wherein:
the aggregation yields an aggregated result for the user parameter propagation results and an aggregated result for the c-th webpage parameter;
S35: calculating the inner product of the aggregated user representation and the aggregated representation of each webpage parameter, and normalizing the inner products over all webpage parameters, wherein the normalized result is the intention degree for the corresponding webpage parameter; and mapping each webpage to a commodity page and taking the intention degree of the webpage as the intention degree of the mapped commodity page.
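Steps S31 to S35 follow the general shape of embedding propagation on a user-webpage click graph. The sketch below is one possible reading, not the patent's formulas: initial codes are assumed to be given as plain vectors, each propagation round adds the mean of a node's neighbor codes to its own code and applies ReLU, aggregation sums the codes of all rounds, the category-level user representation is the mean of the aggregated user codes, and normalization is a softmax. All function and parameter names are hypothetical.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def mean(vectors, dim):
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def intention_degrees(user_emb, page_emb, clicks, rounds=2):
    """user_emb: {u: vector}, page_emb: {c: vector}, clicks: set of (u, c) pairs."""
    dim = len(next(iter(page_emb.values())))
    users, pages = dict(user_emb), dict(page_emb)
    user_sum, page_sum = dict(user_emb), dict(page_emb)   # round-0 terms
    for _ in range(rounds):
        # S33: each node combines its own code with the mean of its neighbors' codes.
        new_users = {u: relu(add(e, mean([pages[c] for (v, c) in clicks if v == u], dim)))
                     for u, e in users.items()}
        new_pages = {c: relu(add(e, mean([users[u] for (u, d) in clicks if d == c], dim)))
                     for c, e in pages.items()}
        users, pages = new_users, new_pages
        # S34: aggregate the propagation results of all rounds by summation.
        user_sum = {u: add(user_sum[u], users[u]) for u in users}
        page_sum = {c: add(page_sum[c], pages[c]) for c in pages}
    # Category-level user representation: mean of the aggregated user codes.
    group = mean(list(user_sum.values()), dim)
    # S35: inner product per webpage, then softmax normalization.
    raw = {c: sum(x * y for x, y in zip(group, e)) for c, e in page_sum.items()}
    top = max(raw.values())
    exps = {c: math.exp(s - top) for c, s in raw.items()}
    total = sum(exps.values())
    return {c: v / total for c, v in exps.items()}

# Hypothetical two-user, three-webpage group; "p3" is never clicked.
user_emb = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
page_emb = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [0.0, 0.0]}
clicks = {("u1", "p1"), ("u2", "p1"), ("u2", "p2")}
degrees = intention_degrees(user_emb, page_emb, clicks)
```

In this toy group, the webpage clicked by both users ends up with the highest intention degree and the unclicked webpage with the lowest, which is the qualitative behavior the inner product score is meant to capture.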
Optionally, ranking the intention degrees of the category of users for different pages from high to low according to the identified user group category in step S4, and recommending pages to the user in that order, comprises:
taking the user group category identified in step S2 and the intention degrees of that category of users for different pages output by the multi-granularity user click intention model, ranking the pages by intention degree from high to low, and recommending the corresponding commodity pages to the user in that order.
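The ranking in step S4 amounts to a descending sort followed by the webpage-to-commodity-page mapping. A minimal sketch, assuming the intention degrees are held in a dictionary keyed by webpage and the mapping in a second dictionary; `recommend_pages`, `page_to_product`, and `top_n` are hypothetical names.

```python
def recommend_pages(intentions, page_to_product, top_n=None):
    """Rank commodity pages by the intention degree of their mapped webpages."""
    ranked = sorted(intentions.items(), key=lambda kv: kv[1], reverse=True)
    products = [page_to_product[c] for c, _ in ranked if c in page_to_product]
    return products if top_n is None else products[:top_n]

# Hypothetical intention degrees for the user's identified group category.
intentions = {"p1": 0.7, "p2": 0.2, "p3": 0.1}
page_to_product = {"p1": "shoes", "p2": "bags", "p3": "hats"}
recommended = recommend_pages(intentions, page_to_product)
```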
In order to solve the above-described problems, the present invention provides an electronic apparatus including:
a memory storing at least one instruction;
a communication interface for realizing the communication of the electronic apparatus; and
a processor executing the instructions stored in the memory to implement the multidimensional multi-granularity user behavior analysis method.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium having at least one instruction stored therein, the at least one instruction being executed by a processor in an electronic device to implement the above-mentioned multidimensional multi-granularity user behavior analysis method.
Compared with the prior art, the invention provides a multidimensional multi-granularity user behavior analysis method with the following advantages. First, the scheme provides a construction flow for the user multidimensional behavior track: calculating the click frequency of the user on each webpage and each software item within the acquisition time range, a click frequency being obtained for the i-th common webpage and for the j-th common software item; setting minimum click frequency values for webpages and software respectively, one for webpages and one for software; retaining the webpages whose click frequency is greater than the minimum click frequency value for webpages and the software whose click frequency is greater than the minimum click frequency value for software; combining the retained webpages in pairs and calculating the click frequency with which the user clicks each webpage combination within the acquisition time range, until no new webpage combination can be retained; combining the retained software in pairs and calculating the click frequency with which the user clicks each software combination within the acquisition time range, until no new software combination can be retained; and constructing the webpage click time-series matrix A and the software click time-series matrix B of the user.
The probability that the user also clicks the (i+1)-th webpage in a time period in which the user clicks any i-th webpage is calculated and taken as the confidence between the i-th webpage and the (i+1)-th webpage; the probability that the user also clicks the (j+1)-th software item in a time period in which the user clicks any j-th software item is calculated and taken as the confidence between the j-th software item and the (j+1)-th software item. A confidence threshold is set. The confidence of every two webpages in each retained webpage combination is calculated, the mean confidence is taken as the confidence of that combination, the webpage combinations whose confidence is greater than or equal to the confidence threshold are selected as candidate webpage behavior tracks, and the candidate track containing the most webpages is selected as the user webpage behavior track. The confidence of every two software items in each retained software combination is calculated, the mean confidence is taken as the confidence of that combination, the software combinations whose confidence is greater than or equal to the confidence threshold are selected as candidate software behavior tracks, and the candidate track containing the most software items is selected as the user software behavior track. Repeated webpages in the user webpage behavior track and repeated software in the user software behavior track are filtered out, giving the user multidimensional behavior track, which consists of the filtered user webpage behavior track and the filtered user software behavior track.
The scheme screens candidate webpage behavior track points and candidate software behavior track points using the click frequency and total click count of different webpages or software as screening conditions, wherein the higher the total click count and the click frequency, the higher the probability that an item is judged to be a track point. The confidence of a candidate behavior track is determined by calculating the probability that any two of its track points are clicked simultaneously: a higher confidence indicates a stronger association between any two track points in the candidate behavior track, and a lower confidence indicates incidental track points. The user multidimensional behavior track, comprising the user webpage behavior track and the user software behavior track, is thereby constructed from the click frequency features and time-series features of the webpages or software.
Meanwhile, the scheme provides a page recommendation method in which a multi-granularity user click intention model is built. The model takes the set of user multidimensional behavior tracks of the same category as input and outputs the intention degree of users of that category toward different pages. The embedding layer initializes the webpage click timing matrices of the users in the y-th group as user parameters, and initializes the webpage ID names appearing in the user parameters as webpage parameters. The embedding layer then constructs the initial encoding of the u-th webpage click timing matrix and the c-th webpage parameter. Based on the user parameters, the webpage parameters and the initial encoding of the embedding layer, the propagation layer recursively propagates messages, applying an activation function in each round and yielding the encoded representation after D rounds of propagation. The output layer aggregates the propagation results of the propagation layer, yielding an aggregation result for the user parameters and an aggregation result for the c-th webpage parameter. The inner product of the aggregation result of the user parameters with the aggregation result of each webpage parameter is calculated and normalized, and the normalized value is the intention degree of the corresponding webpage parameter. Each webpage is mapped to a commodity page, and the intention degree of the webpage is taken as the intention degree of the mapped commodity page. By initializing the user parameters and webpage parameters of users of the same category and constructing a message propagation system, the scheme recursively propagates the embeddings of the user parameters and webpage parameters, and then aggregates the encoded representations derived from the users' webpage click timing matrices. The resulting inner products, which reflect the angle in the vector space between the click-timing encodings of users of that category and the encoding of a given webpage after multiple rounds of message propagation, are used as the intention degrees of users of the same category toward different webpages.
Drawings
FIG. 1 is a flow chart of a method for analyzing multi-dimensional and multi-granularity user behavior according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an electronic device for implementing a multidimensional multi-granularity user behavior analysis method according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The embodiment of the application provides a multidimensional and multi-granularity user behavior analysis method. The execution subject of the method includes, but is not limited to, at least one of a server, a terminal and the like that can be configured to execute the method provided by the embodiment of the application. In other words, the method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to, a single server, a server cluster, a cloud server, a cloud server cluster, and the like.
Example 1
S1: Collect online user behavior data and construct the collected data into a user multidimensional behavior track, wherein the online user behavior data comprises webpage click data and software click data.
In step S1, collecting the online user behavior data comprises:
collecting online user behavior data comprising webpage click data and software click data, wherein the collection flow is as follows:
constructing a webpage click statistics table that lists the user's common webpages, and acquiring the user's click timing data for each common webpage in the table, wherein the click timing data for the i-th common webpage contains one entry per time period of the acquisition time range, and each entry indicates whether or not the user clicked the i-th common webpage in that time period;
counting, for each common webpage in the webpage click statistics table, the user's total number of clicks on that webpage within the acquisition time range;
taking the total number of clicks and the click timing data of each common webpage in the webpage click statistics table as the webpage click data;
constructing a software click statistics table that lists the user's common software, and acquiring the user's click timing data for each common software in the table, wherein the click timing data for the j-th common software contains one entry per time period of the acquisition time range, and each entry indicates whether or not the user clicked the j-th common software in that time period;
counting, for each common software in the software click statistics table, the user's total number of clicks on that software within the acquisition time range;
taking the total number of clicks and the click timing data of each common software in the software click statistics table as the software click data;
combining the webpage click data and the software click data into the online user behavior data. In the embodiment of the present invention, if the user operates and clicks the interface of the j-th common software or common webpage during a time period, the user is marked as having clicked the j-th common software or common webpage in that time period; the total number of times the user operates and clicks the interface of a common software or common webpage within the acquisition time range is the user's total number of clicks.
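The collection flow above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `click_timing`, the use of numeric timestamps, the fixed period length, and the 1/0 encoding of "clicked" and "not clicked" are all hypothetical choices, since the original formulas are not reproduced in this text.

```python
def click_timing(click_times, start, period_length, num_periods):
    """Build click timing data and a total click count for one item.

    Assumption: timing entries are 1 if the item was clicked at least once
    in a period and 0 otherwise; the total counts every click in range.
    """
    timing = [0] * num_periods
    total = 0
    for t in click_times:
        index = int((t - start) // period_length)
        if 0 <= index < num_periods:  # ignore clicks outside the acquisition range
            timing[index] = 1
            total += 1
    return timing, total


# Hypothetical raw click timestamps (seconds) for two common webpages.
raw_clicks = {"news_page": [3, 5, 61, 200], "shop_page": [130]}

# One row per common webpage: (click timing data, total number of clicks).
webpage_click_data = {
    name: click_timing(times, start=0, period_length=60, num_periods=4)
    for name, times in raw_clicks.items()
}
```

The same helper could be applied to a software click statistics table, since the text describes both tables in the same way.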
In step S1, constructing the collected online user behavior data into a user multidimensional behavior track comprises the following construction flow:
S11: calculating the user's click frequency for each common webpage and each common software within the acquisition time range, which gives the click frequency of the i-th common webpage and the click frequency of the j-th common software;
S12: setting a minimum click frequency value for webpages and a minimum click frequency value for software;
S13: retaining the webpages whose click frequency is greater than the minimum click frequency value for webpages, and retaining the software whose click frequency is greater than the minimum click frequency value for software;
S14: combining the webpages retained in step S13 in pairs, calculating the click frequency with which the user clicks each webpage combination within the acquisition time range, and returning to step S13 until no new webpage combination can be retained;
combining the software retained in step S13 in pairs, calculating the click frequency with which the user clicks each software combination within the acquisition time range, and returning to step S13 until no new software combination can be retained;
S15: constructing a webpage click timing matrix A and a software click timing matrix B for the user;
calculating, for any i-th webpage, the probability that the (i+1)-th webpage is also clicked in the time periods in which the user clicks the i-th webpage, and taking this probability as the confidence between the i-th webpage and the (i+1)-th webpage;
calculating, for any j-th software, the probability that the (j+1)-th software is also clicked in the time periods in which the user clicks the j-th software, and taking this probability as the confidence between the j-th software and the (j+1)-th software;
S16: setting a confidence threshold;
calculating the confidence of every two webpages in each webpage combination retained in step S14, taking the mean of these confidences as the confidence of the webpage combination, selecting the webpage combinations whose confidence is greater than or equal to the confidence threshold as candidate webpage behavior tracks, and selecting the candidate track containing the largest number of webpages as the user webpage behavior track;
calculating the confidence of every two software items in each software combination retained in step S14, taking the mean of these confidences as the confidence of the software combination, selecting the software combinations whose confidence is greater than or equal to the confidence threshold as candidate software behavior tracks, and selecting the candidate track containing the largest number of software items as the user software behavior track;
S17: filtering out repeated webpages in the user webpage behavior track; filtering out repeated software in the user software behavior track; and obtaining the user multidimensional behavior track X, which consists of the filtered user webpage behavior track and the filtered user software behavior track.
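Steps S11 to S17 resemble a frequent-itemset search followed by a confidence filter, and can be sketched as below for the webpage side. The sketch rests on several assumptions that the text does not confirm: click frequency is taken to be the fraction of time periods in which all items of a combination are clicked, confidence is computed in one direction only (in sorted item order), and all function names and example data are hypothetical.

```python
from itertools import combinations


def click_frequency(rows):
    """Fraction of periods in which every row has a click (assumed definition)."""
    periods = len(rows[0])
    joint = sum(1 for t in range(periods) if all(row[t] for row in rows))
    return joint / periods


def frequent_combinations(timing, min_freq):
    """S13-S14: retain items above min_freq, then grow combinations level by level."""
    kept = [frozenset([i]) for i in timing if click_frequency([timing[i]]) > min_freq]
    all_kept = []
    while kept:
        all_kept.extend(kept)
        # Join retained sets that differ by one item into sets one item larger.
        candidates = {a | b for a, b in combinations(kept, 2) if len(a | b) == len(a) + 1}
        kept = [c for c in candidates
                if click_frequency([timing[i] for i in c]) > min_freq]
    return all_kept


def confidence(timing, a, b):
    """S15: share of the periods with a click on `a` in which `b` is also clicked."""
    a_periods = [t for t, v in enumerate(timing[a]) if v]
    if not a_periods:
        return 0.0
    return sum(1 for t in a_periods if timing[b][t]) / len(a_periods)


def behavior_track(timing, min_freq, min_conf):
    """S16-S17: largest combination whose mean pairwise confidence passes the threshold."""
    best = frozenset()
    for combo in frequent_combinations(timing, min_freq):
        if len(combo) < 2:
            continue
        pairs = list(combinations(sorted(combo), 2))
        mean_conf = sum(confidence(timing, a, b) for a, b in pairs) / len(pairs)
        if mean_conf >= min_conf and len(combo) > len(best):
            best = combo
    return sorted(best)  # a set holds no repeated items, which covers S17


# Hypothetical click timing data: one 0/1 entry per time period.
web_timing = {
    "news":  [1, 1, 0, 1, 1, 0],
    "shop":  [1, 1, 0, 1, 0, 0],
    "video": [0, 1, 0, 1, 1, 0],
    "mail":  [0, 0, 1, 0, 0, 0],
}
web_track = behavior_track(web_timing, min_freq=0.3, min_conf=0.6)
```

Running the same functions on the software click timing data would give the software behavior track, and the pair of tracks would then form the multidimensional behavior track X.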
S2: Construct and train a user group identification model, input the user multidimensional behavior track into the model, and identify the group category of the user, wherein the model takes the user multidimensional behavior track as input and takes the maximized user group distribution as its training objective function.
In step S2, constructing the user group identification model comprises: constructing a user group identification model whose input is the user multidimensional behavior track X of the user to be identified and whose output is the group category of the user. The group category identification formula of the model gives the probability that the user multidimensional behavior track X belongs to the y-th group category, in terms of the following quantities:
s denotes a track point in the user webpage behavior track, that is, one webpage in that track, together with the weight of the track point s and the number of times the track point s appears in the y-th group category;
k denotes a track point in the user software behavior track, that is, one software item in that track, together with the weight of the track point k and the number of times the track point k appears in the y-th group category;
the total number of group categories;
the weights of the track points s and k are the parameters to be solved by training.
In step S2, training the constructed user group identification model comprises the following training process:
S21: acquiring online user behavior data of M users and extracting the user multidimensional behavior tracks, wherein the acquired user multidimensional behavior tracks are not repeated and the acquisition flow of the user multidimensional behavior tracks is step S1; in the embodiment of the present invention, the acquisition time ranges of the M user multidimensional behavior tracks have the same length;
S22: marking the M user multidimensional behavior tracks with user group categories, wherein in the embodiment of the present invention the user group categories include game users, news users, movie users, variety-show users and the like;
S23: building the training objective function of the user group identification model, which is expressed in terms of the following quantities:
the probability that the y-th group category occurs in the collected M user multidimensional behavior tracks;
r, any track point in the M user multidimensional behavior tracks, together with its track point weight;
the frequency with which the track point r occurs in the M user multidimensional behavior tracks, and the probability that the y-th group category and the track point r occur together;
these probabilities and frequencies are obtained from the collected M user multidimensional behavior tracks, and the weight of each track point is obtained by minimizing the training objective function.
In step S2, inputting the user multidimensional behavior track into the user group identification model and identifying the group category of the user comprises:
inputting the user multidimensional behavior track X into the user group identification model to obtain the probabilities that X belongs to the different group categories, and selecting the group category with the highest probability as the output value of the model, thereby identifying the group category of the user.
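The identification step can be illustrated with a small Python sketch. Because the identification formula itself is not reproduced in this text, the sketch assumes a log-linear (softmax) form in which each group's score is the weighted sum of the counts of the track points in that group; this form, the function names and the example weights and counts are all assumptions made for illustration.

```python
import math


def group_probabilities(track_points, weights, counts, num_groups):
    """Assumed softmax over weighted per-group occurrence counts of track points."""
    scores = [
        sum(weights.get(p, 0.0) * counts.get((p, y), 0) for p in track_points)
        for y in range(num_groups)
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def identify_group(track_points, weights, counts, num_groups):
    """Select the group category with the highest probability."""
    probs = group_probabilities(track_points, weights, counts, num_groups)
    return max(range(num_groups), key=probs.__getitem__)


# Hypothetical trained weights and per-group occurrence counts.
weights = {"game_site": 1.0, "game_client": 0.5, "news_site": 1.0}
counts = {("game_site", 0): 3, ("game_client", 0): 2, ("news_site", 1): 4}

# Track X: webpage track points followed by software track points.
track_x = ["game_site", "game_client"]
group_probs = group_probabilities(track_x, weights, counts, num_groups=2)
group = identify_group(track_x, weights, counts, num_groups=2)
```

Here group 0 scores 1.0 * 3 + 0.5 * 2 = 4 and group 1 scores 0, so group 0 is selected.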
S3: Construct a multi-granularity user click intention model, wherein the model takes the set of user multidimensional behavior tracks of the same category as input and outputs the intention degree of users of that category toward different pages.
In step S3, constructing the multi-granularity user click intention model comprises:
constructing a multi-granularity user click intention model comprising an embedding layer, a propagation layer and an output layer, wherein the embedding layer initializes the user parameters and webpage parameters of users of the same category; the propagation layer constructs a message propagation system and propagates the embeddings of the user parameters and webpage parameters of the same category; and the output layer evaluates the intention degree of users of the same category toward different webpages, maps the webpages to commodity pages, and takes the intention degree of each webpage as the intention degree of the mapped commodity page; in the embodiment of the present invention, the content described in a webpage is the same as the commodity displayed on the mapped commodity page;
extracting the webpage click timing matrices of the M users in step S2, taking the webpage click timing matrices of users of the same category as one group, and constructing a multi-granularity user click intention model for each group, wherein the construction flow of the model corresponding to the y-th group is as follows:
S31: the embedding layer initializes the webpage click timing matrices of the users in the y-th group as user parameters, and initializes the webpage ID names appearing in the user parameters as webpage parameters, the webpage click timing matrices being indexed by u and the webpage parameters by c;
S32: the embedding layer constructs the initial encoding of the u-th webpage click timing matrix and the c-th webpage parameter;
S33: based on the user parameters, the webpage parameters and the initial encoding of the embedding layer, the propagation layer recursively propagates messages, applying an activation function in each round and yielding the encoded representation after D rounds of propagation;
S34: the output layer aggregates the propagation results of the propagation layer, yielding an aggregation result for the user parameters and an aggregation result for the c-th webpage parameter;
S35: calculating the inner product of the aggregation result of the user parameters with the aggregation result of each webpage parameter, and normalizing the inner products, the normalized value being the intention degree of the corresponding webpage parameter; mapping each webpage to a commodity page, and taking the intention degree of the webpage as the intention degree of the mapped commodity page.
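The layer structure of steps S31 to S35 can be sketched as a small message-passing model in Python. The encoding, propagation and aggregation formulas are not reproduced in this text, so the sketch substitutes generic choices that should be read as assumptions: users are represented by the set of pages they clicked rather than by full click timing matrices, a message is the mean of neighbor vectors, the activation is ReLU, aggregation averages each node over all rounds, and normalization is a softmax. All names and initial vectors are hypothetical.

```python
import math


def relu(vec):
    return [max(0.0, v) for v in vec]


def mean(vectors, dim):
    """Element-wise mean of equal-length vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def propagate(user_emb, page_emb, clicks, rounds):
    """S33 stand-in: each node adds the mean of its neighbors, then applies ReLU."""
    dim = len(next(iter(page_emb.values())))
    user_layers, page_layers = [user_emb], [page_emb]
    for _ in range(rounds):
        users, pages = user_layers[-1], page_layers[-1]
        new_users, new_pages = {}, {}
        for u, vec in users.items():
            msg = mean([pages[c] for c in clicks[u]], dim)
            new_users[u] = relu([a + b for a, b in zip(vec, msg)])
        for c, vec in pages.items():
            msg = mean([users[u] for u in users if c in clicks[u]], dim)
            new_pages[c] = relu([a + b for a, b in zip(vec, msg)])
        user_layers.append(new_users)
        page_layers.append(new_pages)
    return user_layers, page_layers, dim


def intention_degrees(user_emb, page_emb, clicks, rounds=2):
    """S34-S35 stand-in: aggregate over rounds, take inner products, normalize."""
    user_layers, page_layers, dim = propagate(user_emb, page_emb, clicks, rounds)
    per_user = [mean([layer[u] for layer in user_layers], dim) for u in user_emb]
    group_vec = mean(per_user, dim)  # one vector for the whole user group
    scores = {}
    for c in page_emb:
        page_vec = mean([layer[c] for layer in page_layers], dim)
        scores[c] = sum(g * p for g, p in zip(group_vec, page_vec))
    peak = max(scores.values())
    exps = {c: math.exp(s - peak) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


# Hypothetical initial encodings (S31-S32) and click relations for one group.
user_emb = {"u1": [1.0, 0.0], "u2": [0.8, 0.2]}
page_emb = {"page_a": [1.0, 0.0], "page_b": [0.0, 1.0]}
clicks = {"u1": {"page_a"}, "u2": {"page_a", "page_b"}}
degrees = intention_degrees(user_emb, page_emb, clicks, rounds=2)
```

In this toy group both users clicked `page_a` and only one clicked `page_b`, so `page_a` receives the higher intention degree.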
S4: According to the identified user group category, sort the pages by the intention degree of users of that category from high to low, and recommend the pages to the user in that order.
In step S4, this comprises:
according to the user group category identified in step S2, and based on the intention degrees toward different pages that the multi-granularity user click intention model outputs for users of that category, sorting the pages by intention degree from high to low and recommending the corresponding commodity pages to the user in that order.
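The ranking in step S4 amounts to a sort by intention degree followed by the webpage-to-commodity-page mapping. A minimal Python sketch is given below; the function name `recommend_pages`, the `top_n` cut-off and the example data are hypothetical and serve only to illustrate the ordering.

```python
def recommend_pages(group, intention_by_group, page_to_commodity, top_n=3):
    """Return commodity pages for a group, ordered by intention degree (high to low)."""
    degrees = intention_by_group[group]
    ranked = sorted(degrees, key=degrees.get, reverse=True)
    # Map each webpage to its commodity page, skipping webpages with no mapping.
    mapped = [page_to_commodity[p] for p in ranked if p in page_to_commodity]
    return mapped[:top_n]


# Hypothetical intention degrees for one group and a webpage-to-commodity mapping.
intention_by_group = {"game users": {"page_a": 0.2, "page_b": 0.7, "page_c": 0.1}}
page_to_commodity = {
    "page_a": "commodity_1",
    "page_b": "commodity_2",
    "page_c": "commodity_3",
}
recommended = recommend_pages("game users", intention_by_group, page_to_commodity, top_n=2)
```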
Example 2
Fig. 2 is a schematic structural diagram of an electronic device for implementing a multidimensional and multi-granularity user behavior analysis method according to an embodiment of the present invention.
The electronic device 1 may comprise a processor 10, a memory 11, a communication interface 13 and a bus, and may further comprise a computer program, such as program 12, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only for storing application software installed in the electronic device 1 and various types of data, such as codes of the program 12, but also for temporarily storing data that has been output or is to be output.
The processor 10 may be comprised of integrated circuits in some embodiments, for example, a single packaged integrated circuit, or may be comprised of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the Control Unit of the electronic device; it connects the respective parts of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (such as the program 12 for realizing user behavior analysis), and invokes data stored in the memory 11 to perform various functions of the electronic device 1 and process data.
The communication interface 13 may comprise a wired interface and/or a wireless interface (e.g. a Wi-Fi interface, a Bluetooth interface, etc.), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices and to enable connection and communication between internal components of the electronic device.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection and communication between the memory 11 and the at least one processor 10, etc.
Fig. 2 shows only an electronic device with certain components. It will be understood by a person skilled in the art that the structure shown in Fig. 2 does not constitute a limitation of the electronic device 1, which may comprise fewer or more components than shown, may combine certain components, or may have a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component. Preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management and power consumption management are implemented through the power management device. The power supply may also include one or more of a direct current or alternating current power supply, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like. The electronic device 1 may further include various sensors, Bluetooth modules, Wi-Fi modules, etc., which are not described herein.
The electronic device 1 may optionally further comprise a user interface, which may be a display or an input unit such as a keyboard, or a standard wired or wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the electronic device 1 and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and that the scope of the patent application is not limited to this configuration.
The program 12 stored in the memory 11 of the electronic device 1 is a combination of instructions that, when executed in the processor 10, may implement:
collecting online user behavior data, and constructing the collected online user behavior data into a user multidimensional behavior track;
constructing and training to obtain a user group identification model, inputting a user multidimensional behavior track into the user group identification model, and identifying to obtain a group category of a user;
constructing a multi-granularity user click intention model;
and according to the identified user group category, sequencing the intention degree of the users to different pages from high to low according to the category, and recommending the pages to the users in sequence.
Specifically, for the implementation of the above instructions by the processor 10, reference may be made to the descriptions of the related steps in the embodiments corresponding to Fig. 1 and Fig. 2, which are not repeated herein.
It should be noted that, the foregoing reference numerals of the embodiments of the present invention are merely for describing the embodiments, and do not represent the advantages and disadvantages of the embodiments. And the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, apparatus, article or method that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (4)

1. A method of multidimensional, multi-granularity user behavior analysis, the method comprising:
S1: collecting online user behavior data, and constructing the collected online user behavior data into a user multidimensional behavior track, wherein the online user behavior data comprises webpage click data and software click data;
s2: constructing and training to obtain a user group identification model, inputting a user multidimensional behavior track into the user group identification model, and identifying to obtain a group category of the user, wherein the user group identification model takes the user multidimensional behavior track as input and takes maximized user group distribution as a training objective function;
constructing and obtaining a user group identification model, which comprises the following steps:
constructing a user group identification model, wherein the input of the constructed user group identification model is a user multidimensional behavior track of a user to be identified, the output result is a group category of the user, and a group category identification formula of the user group identification model is as follows:
wherein the formula gives the probability that the user multidimensional behavior track X belongs to the y-th group category, in terms of the following quantities:
s denotes a track point in the user webpage behavior track, that is, one webpage in that track, together with the weight of the track point s and the number of times the track point s appears in the y-th group category;
k denotes a track point in the user software behavior track, that is, one software item in that track, together with the weight of the track point k and the number of times the track point k appears in the y-th group category;
the total number of group categories;
the weights of the track points s and k are the parameters to be solved by training;
training the constructed user group identification model, comprising the following steps:
training the constructed user group identification model, wherein the training process is as follows:
s21: acquiring online user behavior data of M users and extracting user multidimensional behavior tracks, wherein the acquired user multidimensional behavior tracks are not repeated, and the acquisition flow of the user multidimensional behavior tracks is step S1;
s22: carrying out user group category marking on M user multidimensional behavior tracks;
s23: building a training objective function of a user group identification model:
wherein the training objective function is expressed in terms of the following quantities:
the probability that the y-th group category occurs in the collected M user multidimensional behavior tracks;
r, any track point in the M user multidimensional behavior tracks, together with its track point weight;
the frequency with which the track point r occurs in the M user multidimensional behavior tracks, and the probability that the y-th group category and the track point r occur together;
these probabilities and frequencies are obtained from the collected M user multidimensional behavior tracks, and the weight of each track point is obtained by minimizing the training objective function;
inputting the multidimensional behavior track of the user into a user group identification model, and identifying to obtain the group category of the user, wherein the method comprises the following steps:
inputting the user multidimensional behavior track X into a user group identification model to obtain probabilities that the user multidimensional behavior track X belongs to different group categories, selecting the group category with the highest probability as an output value of the user group identification model, and identifying to obtain the group category of the user
Constructing a multi-granularity user click intention model, comprising:
constructing a multi-granularity user click intention model, wherein the multi-granularity user click intention model takes a user multi-dimensional behavior track set of the same category as input and takes the intention degree of the category user on different pages as output;
the constructed multi-granularity user click intention model comprises an embedding layer, a propagation layer and an output layer, wherein the embedding layer is used for initializing user parameters and webpage parameters of the same category of users, the propagation layer is used for constructing a message propagation system, the user parameters and the webpage parameters of the same category are embedded and propagated, the output layer is used for evaluating the intention degree of the same category of users on different webpages, the webpages are mapped to commodity pages, and the intention degree of the webpages is used as the intention degree of the mapped commodity pages;
Extracting webpage click time sequence matrixes of M users in the step S2, taking the webpage click time sequence matrixes of the users in the same category as a group, constructing each group to obtain a multi-granularity user click intention model, wherein the construction flow of the multi-granularity user click intention model corresponding to the y group is as follows:
S31: the embedding layer initializes the webpage click time sequence matrices of the users in the y-th group as user parameters, and initializes the webpage ID names appearing in the user parameters as webpage parameters, wherein the number of webpage click time sequence matrices of users in the y-th group is [symbol] and the number of webpage ID names is [symbol];
S32: the embedding layer constructs the initial encodings of the user parameters and webpage parameters:
[formula not reproduced in the source text]
wherein:
[symbols] denote the initial encodings of the u-th webpage click time sequence matrix and of the c-th webpage parameter;
S33: the propagation layer recursively propagates messages based on the user parameters, the webpage parameters and the initial encodings from the embedding layer:
[formula not reproduced in the source text]
wherein:
[symbol] denotes the representation of the webpage click time sequence matrix after D rounds of propagation;
[symbol] denotes the encoded representation after D rounds of propagation;
[symbol] denotes an activation function;
S34: the output layer aggregates the propagation results of the propagation layer:
[formula not reproduced in the source text]
wherein:
[symbol] denotes the aggregation result of the user parameter propagation results;
[symbol] denotes the aggregation result of the c-th webpage parameter;
S35: computing the inner product of the aggregation result of the user parameters with the aggregation result of each webpage parameter, and normalizing the inner products over all webpage parameters, the normalized result being the intention degree of the corresponding webpage parameter; mapping each webpage to a commodity page, and taking the intention degree of the webpage as the intention degree of the mapped commodity page;
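Step S35 can be illustrated with a short sketch. The source text does not reproduce the normalization formula, so a softmax is assumed here purely for illustration; the vector layout, the function names `intent_degrees` and `map_to_commodity_pages`, and the one-to-one webpage-to-commodity mapping are likewise assumptions.

```python
import math
from typing import Dict, Sequence


def intent_degrees(user_vec: Sequence[float],
                   page_vecs: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Inner product of the aggregated user representation with each
    aggregated webpage representation, then normalization.

    Softmax is an assumed normalization; the patent's formula is not
    available in the source text.
    """
    if not page_vecs:
        return {}
    scores = {pid: sum(a * b for a, b in zip(user_vec, vec))
              for pid, vec in page_vecs.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {pid: math.exp(s - top) for pid, s in scores.items()}
    total = sum(exps.values())
    return {pid: e / total for pid, e in exps.items()}


def map_to_commodity_pages(intents: Dict[str, float],
                           page_to_commodity: Dict[str, str]) -> Dict[str, float]:
    """Carry each webpage's intention degree over to its commodity page
    (assumes each webpage maps to at most one commodity page)."""
    return {page_to_commodity[pid]: value
            for pid, value in intents.items() if pid in page_to_commodity}


# Hypothetical aggregated representations.
demo = intent_degrees([1.0, 0.0], {"p1": [2.0, 0.0], "p2": [0.0, 3.0]})
```

Any normalization that preserves the ordering of the inner products would yield the same ranking in step S4; only the absolute intention values would differ.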
S3: constructing a multi-granularity user click intention model, wherein the multi-granularity user click intention model takes the set of user multidimensional behavior tracks of the same category as input and outputs the intention degree of users of that category for different pages;
S4: according to the identified user group category, sorting the intention degrees of users of that category for the different pages from high to low, and recommending the pages to the user in that order.
2. The method for analyzing multi-dimensional and multi-granularity user behavior according to claim 1, wherein the step S1 of collecting online user behavior data comprises the steps of:
collecting online user behavior data, wherein the online user behavior data comprises webpage click data and software click data, and the collection flow of the online user behavior data is as follows:
constructing a webpage click statistical table, wherein the webpage click statistical table comprises [symbol] common webpages, and acquiring the click time sequence data of the user on the common webpages in the webpage click statistical table:
[formula not reproduced in the source text]
wherein:
[symbol] denotes the click time sequence data of the user on the i-th common webpage in the webpage click statistical table, one entry value indicating that the user did not click the i-th common webpage in the corresponding time period and the other indicating that the user clicked the i-th common webpage in that time period, and [symbol] denotes the acquisition time range of the online user behavior data;
counting the user's total number of clicks on each common webpage in the webpage click statistical table:
[formula not reproduced in the source text]
wherein:
[symbol] denotes the total number of clicks on the i-th common webpage in the webpage click statistical table within the acquisition time range;
taking the total clicking times and clicking time sequence data of the user of each common webpage in the webpage clicking statistical table as webpage clicking data;
constructing a software click statistical table, wherein the software click statistical table comprises [symbol] common software, and acquiring the click time sequence data of the user on the common software in the software click statistical table:
[formula not reproduced in the source text]
wherein:
[symbol] denotes the click time sequence data of the user on the j-th common software in the software click statistical table, one entry value indicating that the user did not click the j-th common software in the corresponding time period and the other indicating that the user clicked the j-th common software in that time period;
counting the user's total number of clicks on each common software in the software click statistical table:
[formula not reproduced in the source text]
wherein:
[symbol] denotes the total number of clicks on the j-th common software in the software click statistical table within the acquisition time range;
taking the total clicking times and the clicking time sequence data of the user of each common software in the software clicking statistical table as software clicking data;
and constructing the webpage clicking data and the software clicking data as online user behavior data.
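The collection flow of claim 2 can be sketched as below. The sketch assumes a 0/1 encoding of the click time sequence data and assumes that the total is the number of time periods containing a click, since the patent's formulas are not reproduced in the source text; the function name `build_click_series` and the event format are hypothetical. The same routine would serve for the webpage table and the software table.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def build_click_series(
    events: Iterable[Tuple[str, int]],
    tracked_items: Sequence[str],
    num_periods: int,
) -> Tuple[Dict[str, List[int]], Dict[str, int]]:
    """Build per-item click time series and click totals.

    events: (item, period_index) pairs observed in the acquisition range.
    tracked_items: the common webpages (or software) in the statistical table.
    Assumed encoding: 1 = clicked in that period, 0 = not clicked.
    """
    series = {item: [0] * num_periods for item in tracked_items}
    for item, period in events:
        # Ignore items outside the table and periods outside the range.
        if item in series and 0 <= period < num_periods:
            series[item][period] = 1
    # Assumed total: number of periods in which the item was clicked.
    totals = {item: sum(values) for item, values in series.items()}
    return series, totals


# Hypothetical click log over four time periods.
log = [("news", 0), ("news", 2), ("shop", 1), ("other", 1), ("news", 2)]
web_series, web_totals = build_click_series(log, ["news", "shop"], 4)
```

If repeated clicks within one period should count separately, the totals would instead be accumulated per event; the claim text alone does not settle this.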
3. The method for analyzing multi-dimensional and multi-granularity user behavior according to claim 2, wherein in the step S1, the collected online user behavior data is constructed as a multi-dimensional behavior track of the user, and the method comprises the following steps:
constructing the collected online user behavior data into a user multidimensional behavior track, wherein the construction flow of the user multidimensional behavior track is as follows:
S11: calculating the click frequency of the user on each webpage and each software within the acquisition time range:
[formula not reproduced in the source text]
wherein:
[symbol] denotes the click frequency of the user on the i-th common webpage, and [symbol] denotes the click frequency of the user on the j-th common software;
S12: setting minimum click frequency values for webpages and for software respectively, wherein [symbol] denotes the minimum click frequency value for webpages and [symbol] denotes the minimum click frequency value for software;
S13: retaining the webpages whose click frequency is greater than the minimum click frequency value for webpages, and retaining the software whose click frequency is greater than the minimum click frequency value for software;
S14: combining the webpages retained in step S13 in pairs, calculating the click frequency with which the user clicks each webpage combination result within the acquisition time range, and returning to step S13 until no new webpage combination result can be retained;
combining the software retained in step S13 in pairs, calculating the click frequency with which the user clicks each software combination result within the acquisition time range, and returning to step S13 until no new software combination result can be retained;
S15: respectively constructing a webpage click time sequence matrix A and a software click time sequence matrix B of the user:
[formula not reproduced in the source text]
calculating the probability that the user also clicks the (i+1)-th webpage within a time period in which the user clicks any i-th webpage, and taking this probability as the confidence between the i-th webpage and the (i+1)-th webpage;
calculating the probability that the user also clicks the (j+1)-th software within a time period in which the user clicks any j-th software, and taking this probability as the confidence between the j-th software and the (j+1)-th software;
S16: setting a confidence threshold [symbol];
calculating the confidence of any two webpages in each webpage combination result retained in step S14, taking the mean of these confidences as the confidence of the retained webpage combination result, selecting the webpage combination results whose confidence is greater than or equal to the confidence threshold as candidate webpage behavior tracks, and selecting the candidate webpage behavior track containing the largest number of webpages as the user webpage behavior track;
calculating the confidence of any two software in each software combination result retained in step S14, taking the mean of these confidences as the confidence of the retained software combination result, selecting the software combination results whose confidence is greater than or equal to the confidence threshold as candidate software behavior tracks, and selecting the candidate software behavior track containing the largest number of software as the user software behavior track;
S17: filtering repeated webpages from the user webpage behavior track; filtering repeated software from the user software behavior track; and obtaining the user multidimensional behavior track, whose components are the filtered user webpage behavior track and the filtered user software behavior track.
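Steps S11 to S17 resemble level-wise frequent-itemset mining followed by a confidence filter, and can be sketched as follows. This is only an illustration under stated assumptions: click frequency is taken as the fraction of time periods in which every item of a combination was clicked, combination confidence is averaged over all ordered item pairs, and the names `frequent_combinations` and `behavior_track` are hypothetical. The patent's own formulas are not reproduced in the source text.

```python
from itertools import combinations, permutations
from typing import FrozenSet, List, Set


def frequency(itemset: FrozenSet[str], periods: List[Set[str]]) -> float:
    """Assumed click frequency: share of periods containing all items."""
    if not periods:
        return 0.0
    return sum(1 for p in periods if itemset <= p) / len(periods)


def frequent_combinations(periods: List[Set[str]],
                          min_freq: float) -> List[FrozenSet[str]]:
    """S13/S14: keep items above min_freq, then grow combinations
    level by level until no new combination can be retained."""
    items = sorted(set().union(*periods))
    level = [frozenset([i]) for i in items
             if frequency(frozenset([i]), periods) > min_freq]
    retained = list(level)
    while level:
        size = len(level[0]) + 1
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == size}
        level = [c for c in sorted(candidates, key=sorted)
                 if frequency(c, periods) > min_freq]
        retained.extend(level)
    return retained


def confidence(a: str, b: str, periods: List[Set[str]]) -> float:
    """S15: probability that b is clicked in a period where a is clicked."""
    with_a = [p for p in periods if a in p]
    if not with_a:
        return 0.0
    return sum(1 for p in with_a if b in p) / len(with_a)


def behavior_track(periods: List[Set[str]], min_freq: float,
                   min_conf: float) -> List[str]:
    """S16/S17: largest retained combination whose mean pairwise
    confidence reaches min_conf; sets already exclude duplicates."""
    best: List[str] = []
    for combo in frequent_combinations(periods, min_freq):
        if len(combo) < 2:
            continue
        pairs = list(permutations(sorted(combo), 2))
        mean_conf = sum(confidence(a, b, periods) for a, b in pairs) / len(pairs)
        if mean_conf >= min_conf and len(combo) > len(best):
            best = sorted(combo)
    return best


# Hypothetical per-period click sets for one user.
clicks = [{"home", "cart"}, {"home", "cart", "pay"}, {"home"}, {"home", "cart", "pay"}]
track = behavior_track(clicks, min_freq=0.4, min_conf=0.8)
```

Running the same routine once on webpage clicks and once on software clicks would give the two components of the multidimensional behavior track. The strict "greater than" for frequency and "greater than or equal to" for confidence follow the wording of S13 and S16.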
4. The method for analyzing multi-dimensional and multi-granularity user behavior according to claim 1, wherein in the step S4, according to the identified user group category, the intention degree of the user on different pages is ordered from high to low, and pages are sequentially recommended to the user, comprising:
according to the user group category identified in step S2, and based on the intention degrees of users of that category for different pages output by the multi-granularity user click intention model, sorting the intention degrees of users of that category for the different pages from high to low, and recommending the commodity pages to the user in that order.
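The ranking in claim 4 amounts to a descending sort over the intention degrees of the user's group. A minimal sketch, assuming the intention degrees are available as a mapping from commodity page to score; the function name `recommend_pages`, the optional `top_k` cut-off and the alphabetical tie-break are illustrative choices, not part of the claim.

```python
from typing import Dict, List, Optional


def recommend_pages(intents: Dict[str, float],
                    top_k: Optional[int] = None) -> List[str]:
    """Order commodity pages by intention degree, highest first.

    Ties are broken by page identifier so the order is deterministic
    (an assumption of this sketch).
    """
    ranked = sorted(intents.items(), key=lambda kv: (-kv[1], kv[0]))
    pages = [page for page, _ in ranked]
    return pages if top_k is None else pages[:top_k]


# Hypothetical intention degrees for the user's identified group.
group_intents = {"page_a": 0.2, "page_b": 0.5, "page_c": 0.3}
ordered = recommend_pages(group_intents)
```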
CN202310461608.3A 2023-04-26 2023-04-26 Multidimensional and multi-granularity user behavior analysis method Active CN116167829B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310461608.3A CN116167829B (en) 2023-04-26 2023-04-26 Multidimensional and multi-granularity user behavior analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310461608.3A CN116167829B (en) 2023-04-26 2023-04-26 Multidimensional and multi-granularity user behavior analysis method

Publications (2)

Publication Number Publication Date
CN116167829A CN116167829A (en) 2023-05-26
CN116167829B true CN116167829B (en) 2023-08-29

Family

ID=86416810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310461608.3A Active CN116167829B (en) 2023-04-26 2023-04-26 Multidimensional and multi-granularity user behavior analysis method

Country Status (1)

Country Link
CN (1) CN116167829B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102262661A (en) * 2011-07-18 2011-11-30 南京大学 Web page access forecasting method based on k-order hybrid Markov model
CN104462156A (en) * 2013-09-25 2015-03-25 阿里巴巴集团控股有限公司 Feature extraction and individuation recommendation method and system based on user behaviors
CN106383895A (en) * 2016-09-27 2017-02-08 北京金山安全软件有限公司 Information recommendation method and device and terminal equipment
CN110825956A (en) * 2019-09-17 2020-02-21 中国平安人寿保险股份有限公司 Information flow recommendation method and device, computer equipment and storage medium
CN111294620A (en) * 2020-01-22 2020-06-16 北京达佳互联信息技术有限公司 Video recommendation method and device
WO2021139638A1 (en) * 2020-01-06 2021-07-15 阿里巴巴集团控股有限公司 Method and system for processing behavioral data, storage medium, and processor
CN113886204A (en) * 2021-09-29 2022-01-04 平安普惠企业管理有限公司 User behavior data collection method and device, electronic equipment and readable storage medium
CN114266625A (en) * 2021-12-21 2022-04-01 中国平安财产保险股份有限公司 Recommendation method, device and equipment based on new user and storage medium
CN114637917A (en) * 2022-03-28 2022-06-17 中国银行股份有限公司 Information head bar recommendation method and device based on artificial intelligence
CN115098789A (en) * 2022-08-05 2022-09-23 湖南工商大学 Neural network-based multi-dimensional interest fusion recommendation method and device and related equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11132733B2 (en) * 2018-05-25 2021-09-28 Target Brands, Inc. Personalized recommendations for unidentified users based on web browsing context
CN110825966B (en) * 2019-10-31 2022-03-04 广州市百果园信息技术有限公司 Information recommendation method and device, recommendation server and storage medium
CN111931062B (en) * 2020-08-28 2023-11-24 腾讯科技(深圳)有限公司 Training method and related device of information recommendation model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
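Interest point prediction method based on customers' web spatio-temporal behavior trajectories; Chen Donglin (陈冬林); Science & Technology Review (《科技导报》); pp. 74-79 *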

Also Published As

Publication number Publication date
CN116167829A (en) 2023-05-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant