CN107609786B

CN107609786B - Method for constructing user behavior preference change model under online social network

Info

Publication number: CN107609786B
Application number: CN201710883112.XA
Authority: CN
Inventors: 汪材印; 崔琳; 梁楠楠; 谈成访; 潘正高; 彭智超; 戚溪溪
Original assignee: Suzhou University
Current assignee: Suzhou University
Priority date: 2017-09-26
Filing date: 2017-09-26
Publication date: 2021-02-09
Anticipated expiration: 2037-09-26
Also published as: CN107609786A

Abstract

The invention discloses a method for constructing a user behavior preference change model under an online social network. The method comprises the following steps: in an established Hadoop parallel distributed processing environment, after the user's interest topic information is processed by a Map-Reduce job, calculating the drift probability of the user's interest topics using symmetric KL divergence and determining the user's interest change points; mining the user's long-term and short-term interests; mining the weight of each interest; and determining the user behavior preference change model from the long-term interests, the short-term interests and the weight of each interest. The invention studies how user behavior preferences change in an online social network, learns the user's personalized preferences in different temporal and spatial contexts, and, based on the user's current context information and interests, finds the information resources the user is genuinely interested in within the vast online social network environment, thereby meeting the user's personalized information needs and improving the user experience.

Description

Method for constructing user behavior preference change model under online social network

Technical Field

The invention relates to the field of online social networks, in particular to a method for constructing a user behavior preference change model in an online social network.

Background

User behavior preference change modeling (that is, user interest change modeling) summarizes a user's interests and behavior information and, on that basis, derives a user model that can describe how the user's behavior changes. The user model is one of the key technologies of a recommendation system: an efficient recommendation algorithm can be designed only if changes in user interest are accurately captured, which in turn improves the user experience.

At present, although the problem of personalized recommendation under an online social network is studied to a certain extent, the behavior preference of a user under the online social network often changes along with time, while the traditional modeling method mostly adopts a static research method and does not consider the problem of the behavior preference change of the user under the online social network.

In summary, user interest modeling in the prior art mostly adopts a static research method and does not consider that user behavior preferences change under an online social network.

Disclosure of Invention

The embodiment of the invention provides a method for constructing a user behavior preference change model in an online social network, which is used for solving the problem that the change of the user behavior preference is not considered by adopting a static research method in the prior art.

The embodiment of the invention provides a method for constructing a user behavior preference change model under an online social network, which comprises the following steps:

under the established Hadoop parallel distributed processing environment, after interest topic information of a user is processed through a Map-Reduce process, the drift probability of the interest topic of the user is calculated by adopting symmetrical KL divergence, and a user interest change point is determined;

dividing the evolution of the drift trajectory of the user interest subtopic into three types of new interest generation, interest maintenance and interest disappearance according to the relationship between the user interest subtopic and the forward associated subtopic and the backward associated subtopic; when the user behavior preference changes, analyzing the corresponding user interest change points, and mining the long-term interest and the short-term interest of the user;

calculating the absolute intensity and the relative intensity of the user behavior preference in the whole life cycle, constructing the intensity trend of the change track of the user behavior preference, and mining the weight of each interest;

and determining a user behavior preference change model according to the long-term interest, the short-term interest and the weight of each interest of the user.

Preferably, the calculating the drift probability of the user interest topic by using the symmetrical KL divergence and determining the user interest change point includes:

the user interest is expressed by the occurrence probability of feature words in the user context log, so that judging the semantic similarity between user interest topics in different periods corresponds to measuring how close two probability distributions are; let the sliding window contain N time slices, and let interest j of the user in time slice t be recorded as T_t^j;

the original KL divergence is asymmetric, but the semantic similarity between user interest topics in different periods is symmetric, that is, for any user interest sub-topics T_t^j and T_l^m, the similarity of T_t^j to T_l^m equals the similarity of T_l^m to T_t^j; the original KL divergence is therefore improved, and the user interest sub-topic similarity based on the symmetric KL divergence is determined as follows:

wherein p(w) and q(w) respectively represent the probability of occurrence of feature word w in user interest sub-topics T_t^j and T_l^m, and V represents the vocabulary dictionary.
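The similarity formula referred to above is not reproduced in this text. For reference, the standard symmetric form of the KL divergence between two word distributions p and q over a vocabulary V is sketched below; the symbol D_sym and the factor 1/2 are assumptions of this sketch rather than details taken from the source.

```latex
D_{\mathrm{sym}}(p, q) = \frac{1}{2} \sum_{w \in V} \left[ p(w) \log \frac{p(w)}{q(w)} + q(w) \log \frac{q(w)}{p(w)} \right]
```

By construction D_sym(p, q) = D_sym(q, p), which matches the symmetry requirement stated above. A divergence is small when two distributions are close, so it would have to be converted (for example by a decreasing function) before being compared against a "greater than ε" similarity threshold; that conversion is likewise not given in this text.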

Preferably,

the forward associated sub-topic is: in each time slice i (i = t-N, …, t-1) in the sliding window, the user interest sub-topic having the greatest similarity to T_t^j;

the backward associated sub-topic is: in each time slice i (i = t+1, …, t+N) in the sliding window, the user interest sub-topic having the greatest similarity to T_t^j.

Preferably,

the new interest generation comprises: for a user interest sub-topic T_t^j, if there is no forward associated interest sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than a threshold ε, then T_t^j is an emerging interest topic generated in time slice t;

the interest maintenance comprises: for a user interest sub-topic T_t^j, if there is a forward associated interest sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than the threshold ε, and T_t^j is also the backward associated interest sub-topic of T_l^m, then T_t^j inherits T_l^m, and the interest of the user has not changed much;

the interest disappearance comprises: for a user interest sub-topic T_t^j, if there is no backward associated sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than the threshold ε, then T_t^j dies out in time slice t, and the user no longer has this interest.

Preferably,

the absolute intensity is defined as follows: let d_i = {d_i1, …, d_iM} represent a mobile user context log, where M denotes the number of words contained in the user context log d_i and i denotes the user interest topic expressed by d_i; the absolute intensity of user interest topic i at time t takes the following formula:

the relative intensity is defined as follows: the relative intensity of user interest topic i at time t takes the following formula:

wherein t' is any time slice between t-N and t-1, K is the number of user interest topics, and p is any one of the K user interest topics.

Compared with the prior art, the method for constructing a user behavior preference change model under an online social network provided by the embodiment of the invention has the following beneficial effects: in an established Hadoop parallel distributed processing environment, the invention studies how user behavior preferences change and learns the user's personalized preferences at different times; based on the user's current context information and interests, it finds the information resources the user is genuinely interested in within the vast online social network environment, meets the user's personalized information needs and improves the user experience. The research is of significant value for improving the effectiveness of personalized services and for advancing personalized information service technology under online social networks, and thus promotes the further development of online social network applications and services toward intelligence.

Drawings

FIG. 1 is a flowchart of a method for constructing a user behavior preference change model in an online social network according to an embodiment of the present invention;

FIG. 2 is a schematic diagram illustrating the generation of a user interest sub-topic according to an embodiment of the present invention;

FIG. 3 is a schematic diagram illustrating inheritance of a user interest sub-topic provided in an embodiment of the present invention;

FIG. 4 is a schematic diagram of user interest sub-topic extinction provided in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Fig. 1 is a flowchart of a method for constructing a user behavior preference change model in an online social network according to an embodiment of the present invention. As shown in fig. 1, the method includes:

Step S101, in the established Hadoop parallel distributed processing environment, calculating the drift probability of the user interest topic by adopting symmetric KL divergence, and determining the user interest change point.

In a Hadoop cluster environment, the preprocessing of user interests is divided into a Map process and a Reduce process. In the Map process, the input data of each Map task comes from one split in HDFS (the data block size is set to 64 MB). After processing, each Map task outputs key-value pairs, where the key is an interest topic T of the user and the value is the probability of the topic corresponding to that key. Subsequently, in the Shuffle stage, the Map output is sorted and merged, and the resulting key-value information is passed to the Reduce process. The Reduce work is divided into 5 Reduce tasks according to the order of the 26 English letters; the result data sets are merged a second time according to the order of the initial letters of the keys, and the final result of the Map-Reduce job is output so that how the user's behavior preference changes with time and context can be analyzed further.
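The Map, Shuffle and Reduce stages described above can be imitated in plain Python to make the data flow concrete. This is only an in-memory sketch, not Hadoop code: the function names, the rule that non-letter keys go to the last reducer, and the choice to average the probabilities of a repeated topic in the reduce step are all assumptions of this example.

```python
from collections import defaultdict

NUM_REDUCERS = 5  # the text divides the reduce work into 5 tasks


def map_phase(split):
    """Emit (topic, probability) key-value pairs for one input split."""
    for topic, prob in split:
        yield topic, prob


def partition(key, num_reducers=NUM_REDUCERS):
    """Pick a reducer from the key's initial letter (a-z cut into ranges)."""
    first = key[0].lower()
    if not "a" <= first <= "z":
        return num_reducers - 1  # assumption: non-letter keys go last
    return (ord(first) - ord("a")) * num_reducers // 26


def shuffle(pairs, num_reducers=NUM_REDUCERS):
    """Group values by key inside each partition and sort the keys."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    return [dict(sorted(bucket.items())) for bucket in buckets]


def reduce_phase(bucket):
    """Merge the values of each key; averaging is an assumption here."""
    return {key: sum(vals) / len(vals) for key, vals in bucket.items()}


def run_job(splits):
    """Run map, shuffle and reduce, then merge reducer outputs in key order."""
    pairs = [kv for split in splits for kv in map_phase(split)]
    merged = {}
    for bucket in shuffle(pairs):
        merged.update(reduce_phase(bucket))
    return merged
```

Because the partitions cover consecutive letter ranges and each one is sorted, concatenating the reducer outputs yields the topics in alphabetical order of their keys.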

In the online social network, although the behavior preference of the user changes with time and context, the user's interest topics in different periods are correlated, and this correlation is reflected as semantic similarity between the interest topics of different periods. The user interest is therefore expressed by the occurrence probability of feature words, so that judging the semantic similarity between user interest topics in different periods corresponds to measuring how close two probability distributions are. Let the sliding window contain N time slices, and let interest j of the user in time slice t be recorded as T_t^j. The original KL divergence is asymmetric, but the semantic similarity between user interest topics in different periods is symmetric: for any user interest sub-topics T_t^j and T_l^m, the similarity of T_t^j to T_l^m equals the similarity of T_l^m to T_t^j. The original KL divergence is therefore improved, and the user interest sub-topic similarity based on the symmetric KL divergence is preliminarily defined as follows:

where p(w) and q(w) respectively denote the probability of occurrence of feature word w in user interest sub-topics T_t^j and T_l^m, and V denotes the vocabulary dictionary.
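As a minimal sketch of the symmetric comparison described above, the following Python function computes a symmetrized KL divergence between two feature-word probability distributions. The function name, the 1/2 averaging of the two directions and the small smoothing constant `eps` (which avoids taking the logarithm of zero for words missing from one sub-topic) are assumptions of this example, not details taken from the source.

```python
import math


def symmetric_kl(p, q, vocabulary, eps=1e-12):
    """Return 0.5 * [KL(p || q) + KL(q || p)] over the vocabulary.

    p and q map feature words to occurrence probabilities; words absent
    from a distribution are treated as having probability eps (assumption).
    """
    total = 0.0
    for w in vocabulary:
        pw = p.get(w, 0.0) + eps
        qw = q.get(w, 0.0) + eps
        total += pw * math.log(pw / qw) + qw * math.log(qw / pw)
    return 0.5 * total
```

Swapping the two arguments leaves the result unchanged, and the value is zero when the two distributions are identical. How this divergence is turned into a similarity score that can be compared with a threshold is not specified in this text.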

Step S102, dividing the evolution of the drift trajectory of the user interest sub-topic into three types, namely new interest generation, interest maintenance and interest disappearance, according to the relationship between the user interest sub-topic and its forward associated sub-topic and backward associated sub-topic; when the user behavior preference changes, analyzing the corresponding user interest change points, and mining the long-term interest and the short-term interest of the user.

A user interest topic has a certain life cycle and is composed of a group of mutually related user interest sub-topics. Let N be the size of the time sliding window. A user interest sub-topic T_t^j in time slice t is associated with user interest sub-topics in the sliding windows adjacent to time slice t. In each time slice i (i = t-N, …, t-1) in the sliding window, the user interest sub-topic with the greatest similarity to T_t^j is called the forward associated sub-topic of T_t^j. In each time slice i (i = t+1, …, t+N), the sub-topic with the greatest similarity to T_t^j is called the backward associated sub-topic of T_t^j.
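The search for forward and backward associated sub-topics within the sliding window can be sketched as follows. The data layout (a dictionary from time slice index to a list of word-probability dictionaries), the function names and the stand-in `overlap_similarity` measure are hypothetical choices made for this illustration; any similarity function, including one derived from the symmetric KL divergence, could be passed in its place.

```python
def overlap_similarity(p, q):
    """Stand-in similarity (assumption): shared probability mass of p and q."""
    words = set(p) | set(q)
    return sum(min(p.get(w, 0.0), q.get(w, 0.0)) for w in words)


def associated_subtopics(topics_by_slice, t, j, n, sim, forward=True):
    """Find the most similar sub-topic in each slice of the sliding window.

    topics_by_slice: {slice index: [sub-topic as {word: probability}, ...]}
    t, j: time slice and index of the sub-topic being examined
    n: sliding window size N
    forward=True scans slices t-n .. t-1, otherwise t+1 .. t+n.
    Returns {slice index: (index of best sub-topic, similarity)}.
    """
    target = topics_by_slice[t][j]
    slices = range(t - n, t) if forward else range(t + 1, t + n + 1)
    result = {}
    for i in slices:
        best_m, best_score = None, None
        for m, candidate in enumerate(topics_by_slice.get(i, [])):
            score = sim(target, candidate)
            if best_score is None or score > best_score:
                best_m, best_score = m, score
        if best_m is not None:  # slices with no sub-topics are skipped
            result[i] = (best_m, best_score)
    return result
```

Calling the function once with `forward=True` and once with `forward=False` gives the two sets of associations that the evolution types below rely on.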

According to the relationship between the user interest subtopic and the forward and backward related subtopics thereof, the evolution of the drift trajectory of the user interest subtopic is divided into three types of emerging interest generation, interest maintenance and interest disappearance for research, which are respectively introduced as follows:

User interest sub-topic generation (new interest generation)

For a user interest sub-topic T_t^j, if there is no forward associated interest sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than a threshold ε, then T_t^j is an emerging interest topic generated in time slice t, as shown in FIG. 2.

User interest sub-topic inheritance (interest maintenance)

For a user interest sub-topic T_t^j, if there is a forward associated interest sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than the threshold ε, and T_t^j is also the backward associated interest sub-topic of T_l^m, then T_t^j inherits T_l^m, as shown in FIG. 3. This indicates that the interests of the user have not changed much.

User interest sub-topic extinction (interest disappearance)

For a user interest sub-topic T_t^j, if there is no backward associated sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than the threshold ε, then T_t^j dies out in time slice t, as shown in FIG. 4, indicating that the user no longer has this interest.
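The three evolution types can be expressed as a small rule-based classifier. The sketch below assumes the similarities to the forward and backward associated sub-topics have already been computed; the function name, the label strings and the decision to return a set (so that a sub-topic can, for example, be both maintained and about to disappear) are assumptions of this illustration.

```python
def classify_subtopic(forward_sims, is_mutual, backward_sims, epsilon):
    """Label one sub-topic in time slice t by how its interest evolves.

    forward_sims: similarities to its forward associated sub-topics
    is_mutual: parallel list of booleans, True when the sub-topic is also
        the backward associated sub-topic of that forward sub-topic
    backward_sims: similarities to its backward associated sub-topics
    epsilon: similarity threshold
    """
    labels = set()
    # New interest: no forward association exceeds the threshold.
    if not any(s > epsilon for s in forward_sims):
        labels.add("new")
    # Maintenance: a forward association exceeds the threshold and is mutual.
    if any(s > epsilon and mutual
           for s, mutual in zip(forward_sims, is_mutual)):
        labels.add("maintained")
    # Disappearance: no backward association exceeds the threshold.
    if not any(s > epsilon for s in backward_sims):
        labels.add("disappeared")
    return labels
```

Slices labelled "new" or "disappeared" are natural candidates for the user interest change points mentioned in step S102, although the exact rule for selecting change points is not spelled out in this text.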

Step S103, calculating the absolute intensity and the relative intensity of the user behavior preference in the whole life cycle, constructing the intensity trend of the change track of the user behavior preference, and mining the weight of each interest.

And step S104, determining a user behavior preference change model according to the long-term interest, the short-term interest and the weight of each interest of the user.

As time passes, the intensity of a user interest topic changes along with the user's behavior preference. The invention mines the weight of each interest by calculating the absolute intensity and the relative intensity of each interest of the user, so as to judge whether the user's degree of interest has changed.

Let d_i = {d_i1, …, d_iM} represent a mobile user context log, where M denotes the number of words contained in the user context log d_i and i denotes the user interest topic expressed by d_i. The absolute intensity of user interest topic i at time t is calculated as follows:

where, within time slice t', δ(|d_ij|_t', i) = 1 when the word |d_ij|_t' belongs to user interest topic i, and δ(|d_ij|_t', i) = 0 otherwise.

The relative strength of the user interest topic i at the time t is calculated by the following method:

wherein t' is any time slice between t-N and t-1, K is the number of user interest topics, and p is any one of the K user interest topics. By iteratively calculating the absolute intensity and the relative intensity of user interest topic i over the whole topic life cycle, the intensity change of each behavior preference of the user is obtained, and it can be judged whether the degree of interest in each topic has changed.
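The intensity formulas themselves are not reproduced in this text, so the following Python sketch relies on an assumed reading of the surrounding definitions: the absolute intensity of topic i at time t counts, over the time slices t-N to t-1, the words for which the indicator δ equals 1, and the relative intensity divides that count by the sum of the absolute intensities of all K topics. The data layout and function names are hypothetical.

```python
def absolute_intensity(assignments, topic, t, n):
    """Count words assigned to `topic` in time slices t-n .. t-1.

    assignments: {slice index: [topic id of each word in the context log]}
    A word contributes 1 when it belongs to the topic (delta = 1), else 0.
    """
    return sum(
        1
        for tp in range(t - n, t)
        for word_topic in assignments.get(tp, [])
        if word_topic == topic
    )


def relative_intensity(assignments, topic, t, n, num_topics):
    """Share of `topic` in the total absolute intensity of all K topics."""
    total = sum(
        absolute_intensity(assignments, p, t, n) for p in range(num_topics)
    )
    if total == 0:
        return 0.0  # assumption: no words observed in the window
    return absolute_intensity(assignments, topic, t, n) / total
```

Under this reading the relative intensities of all topics sum to 1 whenever the window contains at least one word, which makes them usable directly as interest weights.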

Based on the analysis, a user behavior preference change model can be constructed on the basis of mining the long-term interest and the short-term interest of the user and the weight of each interest.

In conclusion, the invention studies how user behavior preferences change in an online social network and learns the user's personalized preferences in different temporal and spatial contexts. Based on the context information at the current moment and the user's interests, it finds the information resources the user is genuinely interested in within the vast social network environment, thereby meeting the user's personalized information needs and improving the user experience. The research is of significant value for improving the effectiveness of personalized services and for advancing personalized information service technology in online social networks, and thus promotes the further development of online social applications and services toward intelligence.

The above disclosure is only a few specific embodiments of the present invention, and those skilled in the art can make various modifications and variations of the present invention without departing from the spirit and scope of the present invention, and it is intended that the present invention encompass these modifications and variations as well as others within the scope of the appended claims and their equivalents.

Claims

1. A method for constructing a user behavior preference change model under an online social network is characterized by comprising the following steps:

according to the relationship between the user interest subtopic and the forward related subtopic and the backward related subtopic, the evolution of the drift track of the user interest subtopic is divided into three types of new interest generation, interest maintenance and interest disappearance, when the user behavior preference changes, the corresponding user interest change point is analyzed, and the long-term interest and the short-term interest of the user are mined;

the forward associated user interest sub-topic is: in each time slice t' in the sliding window, the user interest sub-topic with the greatest similarity to T_t^j, wherein t' = t-N, …, t-1, and N is an integer;

the backward associated user interest sub-topic is: in each time slice t' in the sliding window, the user interest sub-topic with the greatest similarity to T_t^j;

the new interest generation comprises: for a user interest sub-topic T_t^j, if there is no forward associated interest sub-topic T_l^m such that the similarity between T_t^j and T_l^m, namely the user interest sub-topic similarity based on the symmetric KL divergence, is greater than a threshold ε, then T_t^j is an emerging interest topic generated in time slice t;

the interest maintenance comprises: for a user interest sub-topic T_t^j, if there is a forward associated interest sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than the threshold ε, and T_t^j is also the backward associated interest sub-topic of T_l^m, then T_t^j inherits T_l^m, and the interest of the user has not changed much;

the interest disappearance comprises: for a user interest sub-topic T_t^j, if there is no backward associated sub-topic T_l^m such that the similarity between T_t^j and T_l^m is greater than the threshold ε, then the topic T_t^j dies out in time slice t, and the user no longer has this interest;

the absolute intensity is defined as follows: let d_i = {d_i1, …, d_iM} represent a mobile user context log, where M denotes the number of words contained in the user context log d_i and i denotes the user interest topic expressed by d_i; the absolute intensity of user interest topic i in time slice t adopts the following formula:

the relative intensity is defined as follows: the relative intensity of user interest topic i in time slice t adopts the following formula:

wherein t' is any time slice between t-N and t-1, K is the number of user interest topics, and p is any one of the K user interest topics;

2. The method for constructing a user behavior preference change model under an online social network according to claim 1, wherein the calculating the drift probability of the user interest topic by using the symmetric KL divergence and determining the user interest change point comprises:

for any user interest sub-topics T_t^j and T_l^m, the similarity of T_t^j to T_l^m equals the similarity of T_l^m to T_t^j;

wherein p(w) and q(w) respectively represent the probability of occurrence of feature word w in user interest sub-topics T_t^j and T_l^m, and V represents the vocabulary dictionary.