CN107609786B - Method for constructing user behavior preference change model under online social network - Google Patents
- Publication number
- CN107609786B (application CN201710883112.XA)
- Authority
- CN
- China
- Prior art keywords
- user
- interest
- subtopic
- topic
- user interest
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for constructing a user behavior preference change model in an online social network. The method comprises the following steps: in an established Hadoop parallel distributed processing environment, after the interest topic information of a user is processed through a Map-Reduce process, calculating the drift probability of the user's interest topics using a symmetric KL divergence and determining the user's interest change points; mining the long-term and short-term interests of the user; mining the weight of each interest; and determining the user behavior preference change model from the long-term interests, the short-term interests, and the weight of each interest. The invention studies how user behavior preferences change in an online social network, learns the personalized preferences of the user in different temporal and spatial contexts, and, based on the current context information and the user's interests, finds the information resources that genuinely interest the user within the vast online social network environment, thereby meeting the user's personalized information needs and improving the user experience.
Description
Technical Field
The invention relates to the field of online social networks, in particular to a method for constructing a user behavior preference change model in an online social network.
Background
User behavior preference change modeling (that is, user interest change modeling) refers to summarizing the interests and behavior information of a user and, on that basis, deriving a user model that describes how the user's behavior changes. The user model is one of the key technologies of a recommendation system: an efficient recommendation algorithm can be designed only when changes in user interest are accurately captured, which in turn improves the user experience.
At present, although personalized recommendation in online social networks has been studied to some extent, the behavior preferences of users in online social networks often change over time, whereas traditional modeling methods are mostly static and do not consider these changes.
In summary, user interest modeling in the prior art is mostly static and does not consider that user behavior preferences in online social networks change.
Disclosure of Invention
The embodiment of the invention provides a method for constructing a user behavior preference change model in an online social network, which is used for solving the problem that the change of the user behavior preference is not considered by adopting a static research method in the prior art.
The embodiment of the invention provides a method for constructing a user behavior preference change model under an online social network, which comprises the following steps:
under the established Hadoop parallel distributed processing environment, after interest topic information of a user is processed through a Map-Reduce process, the drift probability of the interest topic of the user is calculated by adopting symmetrical KL divergence, and a user interest change point is determined;
dividing the evolution of the drift trajectory of the user interest subtopic into three types of new interest generation, interest maintenance and interest disappearance according to the relationship between the user interest subtopic and the forward associated subtopic and the backward associated subtopic; when the user behavior preference changes, analyzing the corresponding user interest change points, and mining the long-term interest and the short-term interest of the user;
calculating the absolute intensity and the relative intensity of the user behavior preference in the whole life cycle, constructing the intensity trend of the change track of the user behavior preference, and mining the weight of each interest;
and determining a user behavior preference change model according to the long-term interest, the short-term interest and the weight of each interest of the user.
Preferably, calculating the drift probability of the user interest topic using the symmetric KL divergence and determining the user interest change point includes:
the user interest is expressed by the occurrence probability of feature words in the user scene log, so that judging the semantic similarity between user interest topics in different periods corresponds to measuring how close two probability distributions are; let the sliding window contain N time slices, and let interest j of the user in time slice t be recorded as $T_t^j$. The original KL divergence is asymmetric, but the semantic similarity between user interest topics in different periods is symmetric; that is, for any user interest subtopics $T_t^j$ and $T_l^m$, the similarity of $T_t^j$ to $T_l^m$ equals the similarity of $T_l^m$ to $T_t^j$. The original KL divergence is therefore improved, and the similarity of user interest subtopics based on the symmetric KL divergence is determined as follows:
wherein p(w) and q(w) respectively represent the occurrence probabilities of the feature word w in the user interest subtopics $T_t^j$ and $T_l^m$, and V represents the vocabulary dictionary.
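The similarity expression itself is not reproduced in this text. As a point of reference only, a standard symmetrized KL divergence that is consistent with the definitions of p(w), q(w) and V can be written as below. The symbol $D_{\mathrm{SKL}}$, the factor 1/2, and the generic subtopic names $T_a$ and $T_b$ are assumptions, not the patent's own notation.

```latex
D_{\mathrm{SKL}}(T_a, T_b)
  = \frac{1}{2}\left[
      \sum_{w \in V} p(w)\,\log\frac{p(w)}{q(w)}
    + \sum_{w \in V} q(w)\,\log\frac{q(w)}{p(w)}
    \right]
```

Swapping p and q leaves the expression unchanged, which is the symmetry property the text requires. Smaller values indicate closer distributions, so any mapping from this quantity to a similarity score compared against a threshold is a further modeling choice.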
Preferably,
the forward association subtopic is: in each time slice i (i = t-N, …, t-1) of the sliding window, the user interest subtopic with the greatest similarity to $T_t^j$, which is recorded as the forward association subtopic of $T_t^j$;
the backward association subtopic is: in each time slice i (i = t+1, …, t+N) of the sliding window, the user interest subtopic with the greatest similarity to $T_t^j$, which is recorded as the backward association subtopic of $T_t^j$.
Preferably,
the new interest generation comprises: for a user interest subtopic $T_t^j$, if there is no forward associated interest subtopic $T_l^m$ such that the similarity between $T_t^j$ and $T_l^m$ is greater than a threshold ε, then $T_t^j$ is an emerging interest topic generated in time slice t;
the interest preservation comprises: for a user interest subtopic $T_t^j$, if there is a forward associated interest subtopic $T_l^m$ such that the similarity between $T_t^j$ and $T_l^m$ is greater than the threshold ε, and $T_t^j$ is also the backward associated interest subtopic of $T_l^m$, then $T_t^j$ inherits from $T_l^m$, and the interest of the user has not changed much;
the interest disappearance comprises: for a user interest subtopic $T_t^j$, if no backward associated subtopic $T_l^m$ exists such that the similarity between $T_t^j$ and $T_l^m$ is greater than the threshold ε, then $T_t^j$ dies out in time slice t, and the user no longer has this interest.
Preferably,
the absolute intensity is defined as follows: let $d_i = \{d_{i1}, \dots, d_{iM}\}$ represent a mobile user context log, where M denotes the number of words contained in the user context log $d_i$ and i denotes the user interest topic expressed by $d_i$; the absolute intensity of the user interest topic i at time t is given by the following formula:
the relative intensity is defined as follows: the relative intensity of the user interest topic i at time t is given by the following formula:
wherein t' is any time slice between t-N and t-1, K is the number of user interest topics, and p is any one of the K user interest topics.
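The two intensity formulas are not reproduced in this text. One reading that is consistent with the stated symbols (t' ranging over t-N to t-1, K topics, p any topic) is sketched below. The names $AI_t(i)$ and $RI_t(i)$, the plain summation, and the use of an indicator $\delta$ that equals 1 when a word of the log in time slice t' belongs to topic i and 0 otherwise are all assumptions made for illustration.

```latex
AI_t(i) = \sum_{t'=t-N}^{t-1} \; \sum_{j=1}^{M} \delta\!\left(|d_{ij}|_{t'},\, i\right),
\qquad
RI_t(i) = \frac{AI_t(i)}{\sum_{p=1}^{K} AI_t(p)}
```

Under this reading the absolute intensity counts topic-bearing words inside the window, and the relative intensity normalizes that count across all K topics so that the relative intensities sum to 1.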
Compared with the prior art, the method for constructing a user behavior preference change model in an online social network provided by the embodiment of the invention has the following beneficial effects: in the established Hadoop parallel distributed processing environment, the invention studies the change in user behavior preferences, learns the personalized preferences of the user at different times, and, based on the current context information and the user's interests, finds the information resources that genuinely interest the user within the vast online social network environment, meeting the user's personalized information needs and improving the user experience; the work is of value for improving the effectiveness of personalized services in online social networks and for advancing personalized information service technology, thereby promoting the further development of online social network applications and services toward intelligence.
Drawings
FIG. 1 is a flowchart of a method for constructing a user behavior preference change model in an online social network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating the generation of a user interest sub-topic according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating inheritance of a user interest sub-topic provided in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the extinction of a user interest sub-topic according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a method for constructing a user behavior preference change model in an online social network according to an embodiment of the present invention. As shown in fig. 1, the method includes:
Step S101: in the established Hadoop parallel distributed processing environment, calculate the drift probability of the user interest topics using the symmetric KL divergence and determine the user interest change points.
In a Hadoop cluster environment, the preprocessing of user interest is divided into a Map process and a Reduce process. In the Map process, the input data of each Map task comes from one split block (Split) in HDFS, with the block size set to 64 MB. After processing, each Map task outputs key-value pairs in which the key is an interest topic T of the user and the value is the probability of the topic corresponding to that key. Subsequently, in the Shuffle stage, sorting and merging operations are performed on the output of the Map process, and the shuffled key-value pairs are passed to the Reduce process. The Reduce work is divided into 5 Reduce tasks according to the order of the 26 English letters; the result data sets are merged a second time in order of the initial letter of the key, and the final result of the Map-Reduce operation is output for further analysis of how the user's behavior preferences change with time and context.
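The Map, Shuffle and Reduce stages described above can be imitated in a few lines of plain Python to make the data flow concrete. This is a minimal single-process sketch, not Hadoop code: the function names, the rule that non-letter keys go to the last reducer, and the choice to merge a topic's probabilities by averaging are all assumptions for illustration.

```python
from collections import defaultdict

NUM_REDUCERS = 5  # the text divides the Reduce work into 5 tasks


def map_phase(split):
    """Emit (topic, probability) key-value pairs for one input split."""
    for topic, prob in split:
        yield topic, prob


def partition(topic, num_reducers=NUM_REDUCERS):
    """Pick a reducer from the key's initial letter (a-z cut into 5 ranges)."""
    first = topic[0].lower()
    if not ("a" <= first <= "z"):
        return num_reducers - 1  # assumption: non-letter keys go to the last reducer
    return (ord(first) - ord("a")) * num_reducers // 26


def shuffle(pairs, num_reducers=NUM_REDUCERS):
    """Group values by key inside the bucket of the reducer that owns the key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for topic, prob in pairs:
        buckets[partition(topic, num_reducers)][topic].append(prob)
    return buckets


def reduce_phase(bucket):
    """Merge each topic's values in key order (assumption: merge = average)."""
    return {topic: sum(vals) / len(vals) for topic, vals in sorted(bucket.items())}


def run_job(splits):
    """Run map, shuffle and reduce over all splits and concatenate reducer outputs."""
    pairs = [kv for split in splits for kv in map_phase(split)]
    result = {}
    for bucket in shuffle(pairs):
        result.update(reduce_phase(bucket))
    return result


if __name__ == "__main__":
    splits = [[("music", 0.4), ("travel", 0.6)], [("music", 0.2), ("sports", 0.8)]]
    print(run_job(splits))
```

Because reducers are visited in letter-range order and keys are sorted inside each reducer, the combined output is ordered by key initial, which mirrors the secondary merge by initial letter described in the text.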
In the online social network, although the behavior preferences of a user change with time and context, the user's interest topics in different periods are correlated, and this correlation appears as semantic similarity between the interest topics of different periods. Therefore, user interest is expressed by the occurrence probability of feature words, and judging the semantic similarity between user interest topics in different periods corresponds to measuring how close two probability distributions are. Let the sliding window contain N time slices, and let interest j of the user in time slice t be recorded as $T_t^j$. The original KL divergence is asymmetric, but the semantic similarity between user interest topics in different periods is symmetric; that is, for any user interest subtopics $T_t^j$ and $T_l^m$, the similarity of $T_t^j$ to $T_l^m$ equals the similarity of $T_l^m$ to $T_t^j$. The original KL divergence is therefore improved, and the similarity of user interest subtopics based on the symmetric KL divergence is defined as follows:
wherein p(w) and q(w) respectively represent the occurrence probabilities of the feature word w in the user interest subtopics $T_t^j$ and $T_l^m$.
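A symmetrized KL divergence over word distributions can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the patent's exact formula: the factor 1/2, the small smoothing constant `eps` that avoids division by zero for unseen words, and the `exp(-D)` mapping from divergence to a similarity score are all choices made here for the example.

```python
import math


def symmetric_kl(p, q, vocab, eps=1e-12):
    """Return 0.5 * (KL(p||q) + KL(q||p)) over vocab.

    p and q map feature words to probabilities; eps smoothing is an assumption.
    """
    total = 0.0
    for w in vocab:
        pw = p.get(w, 0.0) + eps
        qw = q.get(w, 0.0) + eps
        total += pw * math.log(pw / qw) + qw * math.log(qw / pw)
    return 0.5 * total


def similarity(p, q, vocab):
    """Map the divergence into (0, 1]; the exp(-D) mapping is an assumption."""
    return math.exp(-symmetric_kl(p, q, vocab))


if __name__ == "__main__":
    vocab = {"guitar", "concert", "trail"}
    earlier = {"guitar": 0.6, "concert": 0.3, "trail": 0.1}
    later = {"guitar": 0.5, "concert": 0.3, "trail": 0.2}
    print(symmetric_kl(earlier, later, vocab), similarity(earlier, later, vocab))
```

With this mapping, identical distributions score 1 and the score decreases as the distributions diverge, so a test of the form "similarity greater than ε" behaves as the text describes.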
Step S102: according to the relationship between a user interest subtopic and its forward and backward associated subtopics, divide the evolution of the drift trajectory of the user interest subtopic into three types (new interest generation, interest maintenance, and interest disappearance); when the user behavior preference changes, analyze the corresponding user interest change points and mine the long-term and short-term interests of the user.
A user interest topic has a certain life cycle and is composed of a group of mutually related user interest subtopics. Let N be the size of the time sliding window. For the user interest subtopic $T_t^j$ in time slice t, associations exist with the user interest subtopics in the sliding-window time slices adjacent to t. In each time slice i (i = t-N, …, t-1) of the sliding window, the user interest subtopic with the greatest similarity to $T_t^j$ is called the forward association subtopic of $T_t^j$; in each time slice i (i = t+1, …, t+N), the subtopic with the greatest similarity to $T_t^j$ is called the backward association subtopic of $T_t^j$.
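The forward and backward association lookup amounts to an arg-max over each neighboring time slice. The sketch below assumes subtopics are stored per time slice in a dictionary and that a similarity function `sim` is supplied by the caller; the function names and the data layout are hypothetical.

```python
def best_match(subtopic, candidates, sim):
    """Return (candidate, score) with the highest similarity, or (None, None)."""
    best, best_score = None, None
    for cand in candidates:
        score = sim(subtopic, cand)
        if best_score is None or score > best_score:
            best, best_score = cand, score
    return best, best_score


def associated_subtopics(subtopics_by_slice, t, subtopic, n, sim, direction):
    """Find the most similar subtopic in each slice of the window around t.

    direction="forward" scans slices t-n .. t-1; "backward" scans t+1 .. t+n.
    Slices with no subtopics are skipped.
    """
    if direction == "forward":
        slices = range(t - n, t)
    elif direction == "backward":
        slices = range(t + 1, t + n + 1)
    else:
        raise ValueError("direction must be 'forward' or 'backward'")
    found = {}
    for i in slices:
        cand, score = best_match(subtopic, subtopics_by_slice.get(i, []), sim)
        if cand is not None:
            found[i] = (cand, score)
    return found
```

Passing the similarity function in as a parameter keeps the lookup independent of how similarity is computed, so a symmetric-KL-based score or any other symmetric measure can be plugged in.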
According to the relationship between the user interest subtopic and the forward and backward related subtopics thereof, the evolution of the drift trajectory of the user interest subtopic is divided into three types of emerging interest generation, interest maintenance and interest disappearance for research, which are respectively introduced as follows:
User interest subtopic generation (new interest generation)
For a user interest subtopic $T_t^j$, if there is no forward associated interest subtopic $T_l^m$ such that the similarity between $T_t^j$ and $T_l^m$ is greater than a threshold ε, then $T_t^j$ is an emerging interest topic generated in time slice t, as shown in FIG. 2.
User interest subtopic inheritance (interest preservation)
For a user interest subtopic $T_t^j$, if there is a forward associated interest subtopic $T_l^m$ such that the similarity between $T_t^j$ and $T_l^m$ is greater than the threshold ε, and $T_t^j$ is also the backward associated interest subtopic of $T_l^m$, then $T_t^j$ inherits from $T_l^m$, as shown in FIG. 3. This indicates that the interests of the user have not changed much.
User interest subtopic disappearance (interest disappearance)
For a user interest subtopic $T_t^j$, if no backward associated subtopic $T_l^m$ exists such that the similarity between $T_t^j$ and $T_l^m$ is greater than the threshold ε, then $T_t^j$ dies out in time slice t, as shown in FIG. 4, indicating that the user no longer has this interest.
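The three evolution types can be expressed as a small classifier over a subtopic's neighbors in the sliding window. The sketch below is one possible reading of the rules: the function names, the per-slice dictionary layout, and the decision to return three independent flags (a subtopic may be both newly generated and dying out) are assumptions, and the similarity function `sim` and threshold `epsilon` are supplied by the caller.

```python
def most_similar(target, candidates, sim):
    """Return the candidate most similar to target, or None if there are none."""
    return max(candidates, key=lambda c: sim(target, c), default=None)


def classify_subtopic(subtopics_by_slice, t, subtopic, n, sim, epsilon):
    """Flag a subtopic in slice t as generated, maintained and/or disappeared."""
    generated, maintained = True, False
    for l in range(t - n, t):
        fwd = most_similar(subtopic, subtopics_by_slice.get(l, []), sim)
        if fwd is None or sim(subtopic, fwd) <= epsilon:
            continue
        generated = False  # a sufficiently similar forward subtopic exists
        # Inheritance also requires that the match is mutual: the current
        # subtopic must be the closest one in slice t to its forward match.
        if most_similar(fwd, subtopics_by_slice.get(t, []), sim) == subtopic:
            maintained = True
    disappeared = True
    for l in range(t + 1, t + n + 1):
        bwd = most_similar(subtopic, subtopics_by_slice.get(l, []), sim)
        if bwd is not None and sim(subtopic, bwd) > epsilon:
            disappeared = False  # the interest continues in a later slice
    return {"generated": generated, "maintained": maintained, "disappeared": disappeared}
```

Note that for the last observed time slices there are no later slices to inspect, so every subtopic there is flagged as disappeared; a practical system would probably treat the end of the data separately.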
Step S103, calculating the absolute intensity and the relative intensity of the user behavior preference in the whole life cycle, constructing the intensity trend of the change track of the user behavior preference, and mining the weight of each interest.
And step S104, determining a user behavior preference change model according to the long-term interest, the short-term interest and the weight of each interest of the user.
Over time, the strength of a user interest topic changes as the user's behavior preferences change. The invention mines the weight of each interest by calculating the absolute intensity and the relative intensity of each interest of the user, in order to judge whether the user's degree of interest has changed.
Let $d_i = \{d_{i1}, \dots, d_{iM}\}$ represent a mobile user context log, where M denotes the number of words contained in the user context log $d_i$ and i denotes the user interest topic expressed by $d_i$. The absolute intensity of the user interest topic i at time t is calculated as follows:
where, within time slice t', $\delta(|d_{ij}|_{t'}, i) = 1$ when the word $|d_{ij}|_{t'}$ belongs to the user interest topic i, and $\delta(|d_{ij}|_{t'}, i) = 0$ otherwise.
The relative intensity of the user interest topic i at time t is calculated as follows:
wherein t' is any time slice between t-N and t-1, K is the number of user interest topics, and p is any one of the K user interest topics. Therefore, the change in intensity of each behavior preference of the user can be obtained by iteratively calculating the absolute intensity and the relative intensity of the user interest topic i over the whole topic life cycle, so as to judge whether the degree of interest in each topic has changed.
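The intensity computation can be illustrated with a short sketch. Since the formulas are not reproduced in this text, the code assumes one plausible reading: the absolute intensity counts the words in time slices t-N to t-1 for which the indicator δ equals 1, and the relative intensity divides that count by the sum over all K topics. The function names and the word-to-topic lookup table are hypothetical.

```python
def absolute_intensity(logs_by_slice, word_topic, topic, t, n):
    """Count words in slices t-n .. t-1 that belong to the given topic.

    logs_by_slice maps a time slice to the list of words logged in it;
    word_topic maps a word to its topic and plays the role of the indicator.
    """
    count = 0
    for t_prime in range(t - n, t):
        for word in logs_by_slice.get(t_prime, []):
            if word_topic.get(word) == topic:  # indicator equals 1
                count += 1
    return count


def relative_intensity(logs_by_slice, word_topic, topic, topics, t, n):
    """Share of the topic's absolute intensity among all topics (0.0 if none)."""
    total = sum(absolute_intensity(logs_by_slice, word_topic, p, t, n) for p in topics)
    if total == 0:
        return 0.0
    return absolute_intensity(logs_by_slice, word_topic, topic, t, n) / total


if __name__ == "__main__":
    logs = {0: ["guitar", "concert", "trail"], 1: ["guitar", "tent"]}
    word_topic = {"guitar": "music", "concert": "music",
                  "trail": "outdoors", "tent": "outdoors"}
    print(absolute_intensity(logs, word_topic, "music", 2, 2))
    print(relative_intensity(logs, word_topic, "music", ["music", "outdoors"], 2, 2))
```

Recomputing both values as t advances gives a per-topic intensity trend, which is the kind of signal the text uses to judge whether the degree of interest has changed.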
Based on the analysis, a user behavior preference change model can be constructed on the basis of mining the long-term interest and the short-term interest of the user and the weight of each interest.
In conclusion, the invention studies how user behavior preferences change in an online social network, learns the personalized preferences of the user in different temporal and spatial contexts, and, based on the context information at the current moment and the user's interests, finds the information resources that genuinely interest the user within the vast social network environment, thereby meeting the user's personalized information needs and improving the user experience; the work is of value for improving the effectiveness of personalized services in online social networks and for advancing personalized information service technology, thereby promoting the further development of online social applications and services toward intelligence.
The above disclosure is only a few specific embodiments of the present invention, and those skilled in the art can make various modifications and variations of the present invention without departing from the spirit and scope of the present invention, and it is intended that the present invention encompass these modifications and variations as well as others within the scope of the appended claims and their equivalents.
Claims (2)
1. A method for constructing a user behavior preference change model under an online social network is characterized by comprising the following steps:
under the established Hadoop parallel distributed processing environment, after interest topic information of a user is processed through a Map-Reduce process, the drift probability of the interest topic of the user is calculated by adopting symmetrical KL divergence, and a user interest change point is determined;
according to the relationship between the user interest subtopic and the forward related subtopic and the backward related subtopic, the evolution of the drift track of the user interest subtopic is divided into three types of new interest generation, interest maintenance and interest disappearance, when the user behavior preference changes, the corresponding user interest change point is analyzed, and the long-term interest and the short-term interest of the user are mined;
the forward associated user interest subtopic is: the user interest subtopic with the greatest similarity to $T_t^j$ in each time slice t' of the sliding window, wherein t' = t-N, …, t-1 and N is an integer;
the backward associated user interest subtopic is: the user interest subtopic with the greatest similarity to $T_t^j$ in each time slice t' of the sliding window, wherein t' = t+1, …, t+N;
the new interest generation comprises: for a user interest subtopic $T_t^j$, if there is no forward associated interest subtopic $T_l^m$ such that the similarity between $T_t^j$ and $T_l^m$ is greater than a threshold ε, that is, the condition on the user interest subtopic similarity based on the symmetric KL divergence is not satisfied, then $T_t^j$ is an emerging interest topic generated in time slice t;
the interest preservation comprises: for a user interest subtopic $T_t^j$, if there is a forward associated interest subtopic $T_l^m$ such that the similarity between $T_t^j$ and $T_l^m$ is greater than the threshold ε, and $T_t^j$ is also the backward associated interest subtopic of $T_l^m$, then $T_t^j$ inherits from $T_l^m$, and the interest of the user has not changed much;
the interest disappearance comprises: for a user interest subtopic $T_t^j$, if no backward associated subtopic $T_l^m$ exists such that the similarity between $T_t^j$ and $T_l^m$ is greater than the threshold ε, then the subtopic $T_t^j$ dies out in time slice t, and the user no longer has this interest;
calculating the absolute intensity and the relative intensity of the user behavior preference in the whole life cycle, constructing the intensity trend of the change track of the user behavior preference, and mining the weight of each interest;
the absolute intensity is defined as follows: let $d_i = \{d_{i1}, \dots, d_{iM}\}$ represent a mobile user context log, where M denotes the number of words contained in the user context log $d_i$ and i denotes the user interest topic expressed by $d_i$; the absolute intensity of the user interest topic i in the time slice t is given by the following formula:
the relative intensity is defined as follows: the relative intensity of the user interest topic i in the time slice t is given by the following formula:
wherein t' is any time slice between t-N and t-1, K is the number of the user interest topics, and p is any one of the user interest topics K;
and determining a user behavior preference change model according to the long-term interest, the short-term interest and the weight of each interest of the user.
2. The method for constructing a user behavior preference change model under an online social network according to claim 1, wherein the calculating the drift probability of the user interest topic by using the symmetrical KL divergence and determining the user interest change point comprises:
the user interest is expressed by the occurrence probability of feature words in the user scene log, so that judging the semantic similarity between user interest topics in different periods corresponds to measuring how close two probability distributions are; let the sliding window contain N time slices, and let interest j of the user in time slice t be recorded as $T_t^j$. The original KL divergence is asymmetric, but the semantic similarity between user interest topics in different periods is symmetric; that is, for any user interest subtopics $T_t^j$ and $T_l^m$, the similarity of $T_t^j$ to $T_l^m$ equals the similarity of $T_l^m$ to $T_t^j$. The original KL divergence is therefore improved, and the similarity of user interest subtopics based on the symmetric KL divergence is determined as follows:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710883112.XA CN107609786B (en) | 2017-09-26 | 2017-09-26 | Method for constructing user behavior preference change model under online social network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107609786A | 2018-01-19 |
CN107609786B | 2021-02-09 |
Family
ID=61058575
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710883112.XA Active CN107609786B (en) | 2017-09-26 | 2017-09-26 | Method for constructing user behavior preference change model under online social network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107609786B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111310033B (en) * | 2020-01-23 | 2023-05-30 | 山西大学 | Recommendation method and recommendation device based on user interest drift |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101082972A (en) * | 2007-05-30 | 2007-12-05 | 华为技术有限公司 | Method and device for forecasting user's interest to commercial product and method for publishing advertisement thereof |
CN106897363A (en) * | 2017-01-11 | 2017-06-27 | 同济大学 | The text for moving tracking based on eye recommends method |
CN107193456A (en) * | 2017-05-08 | 2017-09-22 | 上海交通大学 | Commending system and method based on slidingtype interactive operation |
US10430480B2 (en) * | 2012-08-22 | 2019-10-01 | Bitvore Corp. | Enterprise data processing |
Non-Patent Citations (1)
Title |
---|
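User interest drift algorithm based on a hybrid model; Guo Xinming et al.; CAAI Transactions on Intelligent Systems; 2010-04-30; Vol. 5, No. 2; pp. 181-184 * |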
Also Published As
Publication number | Publication date |
---|---|
CN107609786A (en) | 2018-01-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||