CN108737856B - Social relation perception IPTV user behavior modeling and program recommendation method - Google Patents

Info

Publication number
CN108737856B
Authority
CN
China
Prior art keywords
user
tensor
program
group
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810385063.1A
Other languages
Chinese (zh)
Other versions
CN108737856A (en)
Inventor
尹小燕
米晓倩
徐成
刘浩
王华
王薇
徐丹
汤战勇
陈峰
房鼎益
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nowledge Data Co ltd
Original Assignee
Northwestern University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/251 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/252 Processing of multiple end-users' preferences to derive collaborative data
    • H04N21/258 Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25866 Management of end-user data
    • H04N21/25891 Management of end-user data being end-user preferences
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4668 Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Computer Graphics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a social relationship perception IPTV user behavior modeling and program recommendation method, which mainly comprises the following steps: (1) analyzing the behaviors of IPTV users based on historical viewing logs, and constructing viewing preference models of the users and the groups to which the users belong by combining user group clustering results; (2) by utilizing a dense program category similarity coefficient matrix, avoiding adverse effects brought by data sparsity to recommendation, improving a tensor decomposition model and optimizing a recommendation result; (3) viewing preferences of IPTV users and the groups to which the IPTV users belong are comprehensively considered, and personalized IPTV program recommendation facing the users is realized. Compared with the existing recommendation algorithm, the method provided by the invention has the advantages that the recommendation accuracy, the recall rate and the like are remarkably improved.

Description

Social relation perception IPTV user behavior modeling and program recommendation method
Technical Field
The invention relates to the field of IPTV service, in particular to a social relationship perception IPTV user behavior modeling and program recommendation method.
Background
In recent years, driven by the growth of the internet, IPTV, one of the most typical applications of "convergence of three networks", has acquired a large number of users. IPTV provides personalized interactive services such as live TV, video on demand and time-shifted review. With live TV, the user can watch the video content being broadcast by each television station; compared with traditional broadcast television, IPTV offers richer channel content and high-definition channels to choose from. With video on demand, the user can watch a large library of pre-recorded video resources, generally movies, TV series and variety shows, at any time according to personal preference. Time-shifted review allows the user to watch video content that each television station broadcast during the past several days (typically 3 days).
IPTV provides great flexibility for users to watch tv programs, but users often face the problem of information overload when selecting interesting content from thousands of programs. In order to improve the satisfaction degree of the user, a personalized solution and a recommendation strategy are needed to help the user to quickly locate the interested programs.
Several recommendation algorithms have been proposed to date:
(1) Popular program recommendation: the videos with the highest viewing frequency are recommended to the user. Every user receives the same recommendation list, so the method is not personalized.
(2) Recommendation based on the user's own popular programs: the television programs each user has watched most are computed from that user's viewing history and recommended. This avoids recommending the same items to every user and is more personalized, but it cannot discover new interests for the user.
(3) Recommendation based on singular value decomposition: a typical collaborative filtering recommendation algorithm based on matrix factorization. Such algorithms are often affected by data sparsity.
(4) Context-aware recommendation: the personalization and accuracy of recommendation can be further improved by taking into account the user's dynamically changing interests in different time contexts. However, it destroys the inherent connections between entities in the original three-dimensional space.
(5) Random recommendation: a program category is selected at random and recommended to the user.
(6) Tensor decomposition: each entity in the recommendation system is described by a tensor, and the unknown elements of the initial tensor are filled in by a tensor decomposition algorithm to make predictions.
Each of these algorithms suits particular scenarios and has its own shortcomings. In recent years, tensor decomposition has been widely applied to context-aware recommendation; it can mine the latent semantic relations among users, items and contexts, so that items of interest can be recommended to users. Tensor decomposition, however, requires a large amount of storage space, and decomposing a large tensor is time-consuming.
Moreover, because users in different regions lead different ways of life, their interests in television programs also differ. For example, people living in rural areas and in cities differ in occupation, lifestyle, level of education, outlook and so on. Rural users mainly work in farming and animal husbandry, which is strongly seasonal and leaves relatively flexible time; their viewing times can be irregular, their viewing sessions are relatively long, and their viewing preferences are easily influenced by other people. Urban users have a higher standard of living, a faster pace of life and regular working hours, so their viewing times are regular and their requirements for television content are different. People living in rural and urban areas may therefore have different viewing preferences in different time contexts, and recommending programs by a uniform standard may introduce errors.
Disclosure of Invention
The invention provides a social relationship perception IPTV user behavior modeling and program recommendation method. The method first takes the historical viewing logs of users as implicit feedback of preference, calculates the loyalty and interestingness of each user for the various programs by analyzing the user's viewing behavior, and constructs a viewing preference model of the user. It then clusters the users by combining their address categories and viewing preferences, dividing them into different groups. The tensor decomposition model is improved by embedding a program category similarity coefficient matrix to reduce data sparsity, and the preference values of users and groups are predicted by hierarchical tensor decomposition, i.e. user-layer tensor decomposition and group-layer tensor decomposition. Finally, recommendations are made to each user by considering both the viewing preference characteristics of the user and those of the group to which the user belongs.
In order to realize the task, the invention adopts the following technical scheme:
the social relationship-aware IPTV user behavior modeling and program recommendation method comprises the following steps:
step 1, establishing a preference model of a user watching a program
The preference model comprises loyalty and interestingness of a user to program types in a certain time period, wherein the loyalty is the ratio of the time length of the user watching a program of a certain type in the certain time period to the actual playing time length of the program of the type, and the interestingness is the ratio of the time length of the user watching the program of the certain type in the certain time period to the total time length of the user watching all program types;
determining preference values of the user on the program types in different time periods according to the preference model;
step 2, selecting the category of the user address and the preference value as the clustering characteristics, coding discrete characteristics in the characteristics, and normalizing numerical characteristics;
step 3, clustering the characteristics processed in the step 2 to obtain different user groups after clustering;
step 4, expressing the user-program category-viewing time period relation within each user group, and the group number-program category-viewing time period relation across the user groups, each as a three-dimensional tensor, and performing tensor decomposition and tensor reconstruction on each to obtain the respective optimal approximate tensors;
step 5, recommending programs for the user according to the optimal approximate tensors obtained in step 4.
Further, the process of determining the preference value includes:
assigning different weights to the loyalty and the interestingness respectively, and taking the weighted sum of the loyalty and the interestingness as the preference value.
Further, the weight value given to the loyalty is greater than the weight value given to the interestingness.
Further, when clustering is performed in step 3, the user address category and the preference values of the user for different programs, which are processed in step 2, are used as samples to form a sample set, and then clustering is performed on the sample set; in clustering, the method for selecting the initial clustering center comprises the following steps:
a) randomly selecting a sample from the sample set as a first clustering center;
b) for each sample except the first clustering center in the sample set, respectively calculating the distance between the sample and the first clustering center;
c) selecting a sample with the largest distance from the first clustering center as a new clustering center, namely a second clustering center;
d) for each sample except the two selected clustering centers in the sample set, respectively calculating the sum of the distances between the sample and the two selected clustering centers;
e) selecting the sample with the largest sum of distances to the two selected clustering centers as a new clustering center, namely the third clustering center;
in the same way, the sum of the distances between each remaining sample and all selected clustering centers is calculated, and the sample with the largest sum of distances is selected as a new clustering center, until K clustering centers have been selected.
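The selection rule in steps a) to e) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `select_initial_centers`, the `seed` parameter and the use of Euclidean distance are assumptions, since the passage above does not fix the distance measure.

```python
import math
import random


def select_initial_centers(samples, k, seed=0):
    """Pick k initial cluster centers by the farthest-distance-sum rule.

    samples: list of equal-length numeric tuples (hypothetical feature vectors).
    Euclidean distance is an assumption; the text does not name the metric.
    """
    if not 0 < k <= len(samples):
        raise ValueError("k must be between 1 and the number of samples")
    rng = random.Random(seed)
    # a) choose the first center at random
    chosen = [rng.randrange(len(samples))]
    while len(chosen) < k:
        best_idx, best_sum = None, -1.0
        for idx, sample in enumerate(samples):
            if idx in chosen:
                continue
            # b), d) sum of distances from this sample to every chosen center
            dist_sum = sum(math.dist(sample, samples[c]) for c in chosen)
            # c), e) keep the sample with the largest distance sum
            if dist_sum > best_sum:
                best_idx, best_sum = idx, dist_sum
        chosen.append(best_idx)
    return [samples[i] for i in chosen]


samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
centers = select_initial_centers(samples, k=3)
```

Because every new center maximizes the summed distance to the centers already chosen, the initial centers tend to be spread far apart, which is the apparent purpose of the rule.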
Further, in step 4, the user-program category-viewing time period relation of said user group is expressed as a three-dimensional tensor $A_{user} \in \mathbb{R}^{P \times Q \times V}$, where P is the number of users in the user group, Q is the number of program types, and V is the number of time periods.
Further, after the user-program category-viewing period of the user group is expressed by a tensor in step 4, performing tensor decomposition includes:
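tensor $A_{user}$ is decomposed into the mode products of a core tensor $S_{user}$ with a user factor matrix $U_{user}$, a program category factor matrix $G_{user}$ and a time period factor matrix $T_{user}$, where $S_{user} \in \mathbb{R}^{d_u \times d_g \times d_t}$, $U_{user} \in \mathbb{R}^{P \times d_u}$, $G_{user} \in \mathbb{R}^{Q \times d_g}$, $T_{user} \in \mathbb{R}^{V \times d_t}$, $\mathbb{R}$ denotes the set of real numbers, and $d_u$, $d_g$ and $d_t$ satisfy $0 < d_u < 0.0006 \times P$, $0 < d_g < 0.3 \times Q$, $0 < d_t < 0.5 \times V$.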
Further, in step 4, after the user-program category-viewing period of the user group is expressed by a tensor and decomposed by the tensor, tensor reconstruction is performed to obtain an approximate tensor, and a formula of the approximate tensor is as follows:
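$\hat{A}_{user} = S_{user} \times_U U_{user} \times_G G_{user} \times_T T_{user}$
in the above formula, $\hat{A}_{user}$ is the approximate tensor, $S_{user} \times_U U_{user}$ denotes the U-mode product of tensor $S_{user}$ and matrix $U_{user}$, $\times_G G_{user}$ denotes the G-mode product of a tensor and matrix $G_{user}$, and $\times_T T_{user}$ denotes the T-mode product of a tensor and matrix $T_{user}$.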
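The reconstruction from a core tensor and three factor matrices can be illustrated with a short NumPy sketch. The function name `tucker_reconstruct`, the toy dimensions and the random values are assumptions for illustration; the toy ranks do not follow the bounds stated for du, dg and dt.

```python
import numpy as np


def tucker_reconstruct(S, U, G, T):
    """Approximate tensor as the three mode products of core S with U, G and T.

    S: (du, dg, dt) core tensor; U: (P, du); G: (Q, dg); T: (V, dt).
    Entry (p, q, v) is the sum over a, b, c of S[a,b,c] * U[p,a] * G[q,b] * T[v,c].
    """
    return np.einsum("abc,pa,qb,vc->pqv", S, U, G, T)


# Toy sizes, chosen only so the example runs quickly.
P, Q, V = 5, 4, 3
du, dg, dt = 2, 2, 2
rng = np.random.default_rng(0)
S = rng.random((du, dg, dt))
U = rng.random((P, du))
G = rng.random((Q, dg))
T = rng.random((V, dt))

A_hat = tucker_reconstruct(S, U, G, T)  # shape (P, Q, V)
```

Writing the three mode products as one `einsum` avoids materializing the intermediate tensors; the result is the same as applying the products one mode at a time.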
Further, in step 4, after the user-program category-viewing time period relation of the user group is expressed as a tensor, tensor decomposition yields an approximate tensor; a loss function is then established from the approximate tensor, and the loss function is regularized and a constraint is added to obtain an objective function; the objective function is then solved, the optimal approximate tensor, user factor matrix, program category factor matrix and time period factor matrix are determined from the solution, and the tensor is reconstructed.
Further, the expression of the loss function is:
Figure BDA0001641961880000042
in the above formula, $R_{user}$ is a binary indicator tensor whose entries 0 and 1 mark whether the user's preference information is unknown; $A_{user}$ is the user-program category-viewing time period tensor of the user group, and $\hat{A}_{user}$ is the approximate tensor.
Further, the expression of the objective function is:
Figure BDA0001641961880000051
wherein
Figure BDA0001641961880000052
represents the Frobenius norm,
Figure BDA0001641961880000053
is an added constraint that helps obtain a more accurate program category factor matrix $G_{user}$; $X$ is the program category-program category similarity coefficient matrix, which can be decomposed into the product of matrix $G_{user}$ and matrix $H_{user}$, where $H_{user} \in \mathbb{R}^{d_g \times Q}$ is an auxiliary matrix for factoring the program categories; the parameters $\lambda_1$, $\lambda_2$ and $\lambda_3$ are model tuning parameters, each taking a value from {0.1, 0.2, 0.01, 0.001}.
Further, the solving process of the objective function includes:
from $S_{user} \in \mathbb{R}^{d_u \times d_g \times d_t}$, $U_{user} \in \mathbb{R}^{P \times d_u}$, $G_{user} \in \mathbb{R}^{Q \times d_g}$, $T_{user} \in \mathbb{R}^{V \times d_t}$ and $H_{user} \in \mathbb{R}^{d_g \times Q}$, a set of real-valued $S_{user}^*$, $U_{user}^*$, $G_{user}^*$, $T_{user}^*$ and $H_{user}^*$ is randomly selected to approximate $S_{user}$, $U_{user}$, $G_{user}$, $T_{user}$ and $H_{user}$; partial derivatives of the objective function are then taken with respect to its five components $S_{user}$, $U_{user}$, $G_{user}$, $T_{user}$ and $H_{user}$ to obtain the gradient of each component, and the components are updated according to these gradients; after each update the partial derivatives and gradients are recomputed, until the objective function converges; the $S_{user}^*$, $U_{user}^*$, $G_{user}^*$ and $T_{user}^*$ obtained at convergence are taken as the optimal $S_{user}$, $U_{user}$, $G_{user}$ and $T_{user}$.
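The iterative solution can be sketched with plain gradient descent in NumPy. The exact objective is given only in the figures, so the form used here is an assumption: a masked squared reconstruction error, an L2 penalty on all five components weighted by a hypothetical `lam_reg`, and a similarity constraint on X minus the product of G and H weighted by a hypothetical `lam_sim`. The learning rate, the fixed iteration count (standing in for a convergence test) and all sizes are likewise illustrative.

```python
import numpy as np


def reconstruct(S, U, G, T):
    # Mode products of the core tensor with the three factor matrices.
    return np.einsum("abc,pa,qb,vc->pqv", S, U, G, T)


def objective(A, R, X, S, U, G, T, H, lam_reg, lam_sim):
    # Assumed objective: masked squared error + L2 penalty + similarity constraint.
    err = R * (A - reconstruct(S, U, G, T))
    reg = sum(np.sum(M ** 2) for M in (S, U, G, T, H))
    sim = np.sum((X - G @ H) ** 2)
    return 0.5 * np.sum(err ** 2) + 0.5 * lam_reg * reg + 0.5 * lam_sim * sim


def gradient_step(A, R, X, S, U, G, T, H, lam_reg, lam_sim, lr):
    E = R * (reconstruct(S, U, G, T) - A)  # masked residual (R is binary)
    D = G @ H - X                          # residual of the similarity constraint
    gS = np.einsum("pqv,pa,qb,vc->abc", E, U, G, T) + lam_reg * S
    gU = np.einsum("pqv,abc,qb,vc->pa", E, S, G, T) + lam_reg * U
    gG = np.einsum("pqv,abc,pa,vc->qb", E, S, U, T) + lam_reg * G + lam_sim * D @ H.T
    gT = np.einsum("pqv,abc,pa,qb->vc", E, S, U, G) + lam_reg * T
    gH = lam_sim * G.T @ D + lam_reg * H
    return S - lr * gS, U - lr * gU, G - lr * gG, T - lr * gT, H - lr * gH


# Toy data; sizes and values are illustrative only.
rng = np.random.default_rng(0)
P, Q, V, du, dg, dt = 6, 4, 3, 2, 2, 2
A = rng.random((P, Q, V))                          # preference tensor
R = (rng.random((P, Q, V)) > 0.5).astype(float)    # indicator tensor (1 = observed here)
X = rng.random((Q, Q))                             # category similarity matrix

# Random real-valued starting point for the five components.
S = 0.5 * rng.random((du, dg, dt))
U = 0.5 * rng.random((P, du))
G = 0.5 * rng.random((Q, dg))
T = 0.5 * rng.random((V, dt))
H = 0.5 * rng.random((dg, Q))

lam_reg, lam_sim, lr = 0.01, 0.1, 0.01
losses = [objective(A, R, X, S, U, G, T, H, lam_reg, lam_sim)]
for _ in range(300):
    S, U, G, T, H = gradient_step(A, R, X, S, U, G, T, H, lam_reg, lam_sim, lr)
    losses.append(objective(A, R, X, S, U, G, T, H, lam_reg, lam_sim))
```

In practice the loop would stop once successive objective values differ by less than a tolerance; the fixed 300 iterations keep the sketch short.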
A social relationship aware IPTV user behavior modeling and program recommendation system comprises a preference model building module, a data preprocessing module, a clustering module, a tensor module and a program recommendation module which are connected in sequence and are respectively used for realizing the functions of claims 1 to 4.
Compared with the prior art, the invention has the following technical characteristics:
(1) the method analyzes the influence of regional characteristics on the viewing preference of the user, clusters the users by combining the address category of the user (urban area, county seat, town or village) with the viewing preference model of the user, and constructs the viewing preference model of the user and the group to which the user belongs based on the user group clustering result;
(2) according to the method, extra constraint is embedded in a tensor decomposition model, namely a dense program category similarity coefficient matrix is utilized to avoid adverse effects brought by data sparsity to recommendation, the tensor decomposition model is improved, and a recommendation result is optimized;
(3) the invention adopts hierarchical tensor decomposition, namely tensor decomposition of a user layer and tensor decomposition of a group layer, comprehensively considers the viewing preference of IPTV users and the groups to which the IPTV users belong when recommending the users, and realizes user-oriented personalized IPTV program recommendation.
Drawings
FIG. 1 is a diagram illustrating the relationship between the silhouette coefficient and the K value;
FIG. 2 is a user geographical distribution statistical chart;
FIG. 3 is a flow chart of a method of the present invention;
FIG. 4 is a flow chart of the k-means algorithm;
FIG. 5 is a flow chart of the improved tensor decomposition algorithm;
FIGS. 6-11 are graphs comparing the performance of the method of the present invention with that of a prior art method, in which:
FIG. 6 is a graph of accuracy comparisons for six recommendation systems;
FIG. 7 is a graph comparing recall rates for six recommendation systems;
FIG. 8 is a graph comparing the F-score of six recommendation systems;
FIG. 9 is a chart comparing diversity of six recommendation systems;
FIG. 10 is a comparison graph of coverage for six recommendation systems;
FIG. 11 is a comparison chart of novelty of six recommendation systems.
Detailed Description
IPTV, one of the most important applications of "convergence of three networks", already has a large number of users. Through the IPTV service, the user can watch any television program of interest at any time, which greatly increases the flexibility of program selection. However, users often face the problem of information overload when selecting among thousands of movies, TV series or other programs. Moreover, users in different regions lead different ways of life (production modes, living standards, daily routines and the like), and their interests in television programs differ accordingly. People living in rural areas and in cities differ in occupation, lifestyle, level of education, outlook and so on. Rural users mainly work in farming and animal husbandry, which is strongly seasonal and leaves relatively flexible time, and relatives and neighbors visit one another frequently; their viewing times depend on the timing of farming activities, their viewing sessions are relatively long, and their viewing preferences are easily influenced by other people. Urban users have a higher standard of living, a faster pace of life and regular working hours, so their viewing times are regular. People living in rural and urban areas thus have different needs for viewing content and different preferences.
Therefore, in order to enable a user to obtain more personalized and accurate television program recommendation, the invention provides a program recommendation method integrating regional characteristics and group preferences.
When the social relationship perception IPTV user behavior modeling and program recommendation method recommends television programs, it incorporates the regional characteristics of users and considers both the individual viewing preference of a user and the viewing preferences of all users in the user's group, i.e. it performs hierarchical tensor decomposition: first, tensor modeling and decomposition are performed, according to viewing preference, on the users within each group produced by clustering; second, tensor modeling and decomposition are performed, according to viewing preference, on the K groups produced by clustering; finally, the two results are combined to predict the user's likely preference values, and a program list for each time context is recommended to the user. The method specifically comprises the following steps:
a social relationship perception IPTV user behavior modeling and program recommendation method comprises the following steps:
step 1, establishing a preference model of a user watching a program
The preference model comprises loyalty and interestingness of a user to program types in a certain time period, wherein the loyalty is the ratio of the time length of the user watching a program of a certain type in the certain time period to the actual playing time length of the program of the type, and the interestingness is the ratio of the time length of the user watching the program of the certain type in the certain time period to the total time length of the user watching all program types;
determining preference values of the user on the program types in different time periods according to the preference model;
The historical viewing logs of the IPTV users are taken as implicit feedback of their preferences, and the preference characteristics of each user for each program category, namely loyalty and interestingness, are calculated from the user's viewing duration, the playing duration of the programs and the IPTV service type. For example, in this embodiment the statistical data set has 62 channels, which are divided into 16 program categories: finance, international, sports, military and agricultural, drama, news, children's programs, shopping, martial arts, documentaries, science and education, social and legal, music, art programs, TV series and movies. Each program category includes a plurality of program types; for example, the TV series category includes family, crime, suspense, costume drama, comedy, romance, war, science fiction, martial arts, action, urban, history, idol, military, police, criminal investigation and spy, while the movie category covers 14 program types: crime, thriller, action, comedy, romance, science fiction, adventure, suspense, drama, horror, literary, history, war and cartoon. By calculating the loyalty and interestingness of the user for each program type in a program category in different time periods, the preference characteristics of the user in different time contexts can be obtained. The specific calculation process is as follows:
step 1.1, dividing the user viewing time period
Statistics on the data show that users' viewing preferences change periodically over time and that, owing to work, viewing behavior in the same time period differs between working days and holidays. To better analyze how the viewing data change over time, the 24 hours of a day are divided into 7 time periods, and working days and holidays are considered separately, giving 14 time periods in total (7 for working days plus 7 for holidays); the specific division is shown in Table 1 below:
table 1: user viewing time period division
Starting time End time Description of time periods
06:01 08:00 Early morning (morning)
08:01 12:00 In the morning
12:01 13:00 Noon is a Chinese traditional musical instrument
13:01 18:00 In the afternoon
18:01 19:30 In the evening
19:31 24:00 Night time
00:01 06:00 Early morning
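The mapping from a viewing timestamp to one of the 14 time periods can be sketched as below. The function name `period_index`, the index layout (0 to 6 for working days, 7 to 13 for holidays) and the `is_holiday` flag supplied by the caller are assumptions; the text does not say how holidays are identified.

```python
from datetime import datetime

# (label, start minute, end minute) counted from midnight, following Table 1.
PERIODS = [
    ("early morning", 6 * 60 + 1, 8 * 60),
    ("morning", 8 * 60 + 1, 12 * 60),
    ("noon", 12 * 60 + 1, 13 * 60),
    ("afternoon", 13 * 60 + 1, 18 * 60),
    ("evening", 18 * 60 + 1, 19 * 60 + 30),
    ("night", 19 * 60 + 31, 24 * 60),
    ("before dawn", 1, 6 * 60),
]


def period_index(ts, is_holiday):
    """Return 0-6 for a working day and 7-13 for a holiday."""
    minute = ts.hour * 60 + ts.minute
    if minute == 0:
        # 00:00 is treated as 24:00, the last minute of the night period.
        minute = 24 * 60
    for i, (_, start, end) in enumerate(PERIODS):
        if start <= minute <= end:
            return i + (7 if is_holiday else 0)
    raise ValueError("timestamp outside all periods")


weekday_noon = period_index(datetime(2018, 4, 26, 12, 30), is_holiday=False)
holiday_night = period_index(datetime(2018, 4, 28, 20, 0), is_holiday=True)
```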
Step 1.2, constructing a preference model of the program watched by the user
The preference model is as follows:
Figure BDA0001641961880000081
Figure BDA0001641961880000082
in the above formulas, m and n denote the number of program types and the number of service types, respectively.
$Loyalty_{i,j,d}$ denotes the loyalty of user i to program type j during time period d, i.e. the ratio of the time user i spends watching programs of this type during time period d to the actual playing time of programs of this type,
Figure BDA0001641961880000083
denotes the weighted total duration for which user i watches program type j during time period d, $t_{j,k,d}$ is the time for which user i watches program type j via service type k during time period d, $T_{j,k}$ is the actual playing duration of program type j via service type k,
Figure BDA0001641961880000084
is the total playing duration of program type j across all service types, and $w_k$ is the weight of service type k.
$Interest_{i,j,d}$ denotes the interestingness of user i in program type j during time period d, i.e. the ratio of the total time user i spends watching programs of this type during time period d to the total time user i spends watching all program types during time period d,
Figure BDA0001641961880000091
is the weighted total duration for which user i watches all program types via all service types during time period d.
For example, in IPTV, service types k include Catch-up TV, VOD, and Live TV. According to data statistics, the weights of the three service types meet the following conditions: catch-up TV > VOD > Live TV, so the weights for the three service types can take on values of 2, 1.5 and 1.
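A minimal sketch of the two ratios follows. The exact formulas appear only in the figures, so the weighted sums in both numerator and denominator are an assumption drawn from the symbol definitions; the function names and dictionary layout are hypothetical.

```python
# Service-type weights from the example above: Catch-up TV > VOD > Live TV.
SERVICE_WEIGHTS = {"catch_up": 2.0, "vod": 1.5, "live": 1.0}


def weighted_hours(hours_by_service):
    # Sum of w_k * hours over the service types k.
    return sum(SERVICE_WEIGHTS[k] * h for k, h in hours_by_service.items())


def loyalty(watched, played):
    """watched[k]: hours user i watched type j via service k in period d.
    played[k]: hours type j was actually played via service k.
    Assumes both sums are weighted by w_k."""
    total_played = weighted_hours(played)
    return weighted_hours(watched) / total_played if total_played else 0.0


def interest(watched_by_type, program_type):
    """watched_by_type[j][k]: hours user i watched type j via service k in period d."""
    total = sum(weighted_hours(w) for w in watched_by_type.values())
    return weighted_hours(watched_by_type[program_type]) / total if total else 0.0


# Half an hour watched out of four hours played, both via Catch-up TV.
example_loyalty = loyalty({"catch_up": 0.5}, {"catch_up": 4.0})
# Hypothetical viewing in one period: two program types via two service types.
example_interest = interest(
    {"drama": {"catch_up": 0.5}, "news": {"vod": 1.0}}, "drama"
)
```

With a single service type the weight cancels in the loyalty ratio, so the weighting only matters when a program type is watched through several services.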
Step 1.3, respectively assigning different weights to the loyalty and the interest degree, and taking the sum of the loyalty and the interest degree after assigning the weights as a preference value of the user i to the program type j in the time period d, wherein the formula of the preference value is as follows:
$a_{i,j,d} = Loyalty_{i,j,d} \cdot W_1 + Interest_{i,j,d} \cdot W_2$
in the above formula, $W_1$ and $W_2$ are the weights assigned to the loyalty and the interestingness, respectively. In the preference model, the loyalty of a user to a program type is considered more representative of the user's preference for that program type. Thus, the weight given to the loyalty is greater than the weight given to the interestingness, i.e. $W_1 > W_2$; for example, $W_1$ and $W_2$ can be 0.8 and 0.2.
For example, consider calculating the viewing preference value of user i for program type j in time period d (12:01-13:00). Suppose the service type Catch-up TV shows program type j during 10:00-14:00, and user i watches television via Catch-up TV (whose weight is $w_k = 2$) during 11:30-13:30, of which 12:30-13:30 is spent watching program type j. The loyalty of the user to program type j during time period 12:01-13:00 is then
Figure BDA0001641961880000092
and the interestingness is
Figure BDA0001641961880000093
The viewing preference value of user i for program type j in time period d (12:01-13:00) is therefore $a_{i,j,d} = 0.125 \times 0.8 + 0.25 \times 0.2 = 0.15$.
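The weighted sum in this example can be checked with a few lines of Python. The function name `preference_value` is hypothetical, and the default weights are the example values 0.8 and 0.2 given above.

```python
def preference_value(loyalty, interest, w1=0.8, w2=0.2):
    """Preference value as the weighted sum of loyalty and interestingness."""
    if w1 <= w2:
        # The method requires the loyalty weight to exceed the interestingness weight.
        raise ValueError("w1 must be greater than w2")
    return loyalty * w1 + interest * w2


# Loyalty 0.125 and interestingness 0.25, as in the worked example.
a = preference_value(0.125, 0.25)
```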
Through the above steps, the user's preference values for the different program types within a program category can be obtained for each time period.
Step 2, data preprocessing: selecting the category of the user address and the preference value as clustering characteristics, coding discrete characteristics in the characteristics, and normalizing numerical characteristics;
the invention selects the category of the user address and the preference value of the user to each program type as the clustering characteristic. In these clustering features, the preference value is continuous numerical data, while the category of the user address is discrete variable, so that the discrete variable needs to be converted into a numerical value first, and then the data is normalized.
Step 20, discrete feature processing: discrete features are converted into numerical values by one-hot encoding; in the invention this is applied to the discrete user address categories {urban district, county city, town, village}. In one-hot encoding, a feature with M distinct classes is encoded with one register bit per class, where "1" indicates that the bit is active and "0" that it is not, so the feature is represented by an M-bit status register in which exactly one bit is active at any time.
One-hot encoding has two benefits. First, it solves the problem that clustering algorithms handle discrete features poorly: one-hot encoding maps the values of a discrete feature into Euclidean space, so that each value corresponds to a point in that space and the clustering algorithm can readily compute distances between features. Second, it expands the feature set to some extent: a feature that originally had Y possible values becomes Y binary features after one-hot encoding.
The four categories {urban district, county city, town, village} of the user address in the invention are therefore encoded as {1000, 0100, 0010, 0001}. With one-hot encoding, the user address is represented by four features; for example, if the user's address is a town, the corresponding code is 0010.
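The encoding described above can be sketched as follows. The category order (urban district, county city, town, village) is an assumption chosen so that a town maps to 0010 as in the example; the function name `one_hot` is hypothetical.

```python
# Assumed category order, chosen so that "town" is encoded as 0010.
ADDRESS_CATEGORIES = ["urban district", "county city", "town", "village"]

def one_hot(value, categories):
    """Return a list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if c == value else 0 for c in categories]

code = one_hot("town", ADDRESS_CATEGORIES)
print("".join(str(bit) for bit in code))
```

Each address thus contributes four binary features to the clustering sample.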
Step 21, normalization processing
If the values of different features differ greatly, for example if a user's viewing preference values for two program types are 0.01 and 2.5, convergence is slow; the data must therefore be mapped into the range from 0 to 1 by a fixed rule. The user preference values for the various program types are continuous numerical variables, and the invention normalizes them by Max-Min normalization, as shown in the following formula.
$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}}$$
where X is the preference value $a_{i,j,d}$, $X_{max}$ is the maximum of the user preference values over the different program types, $X_{min}$ is the minimum of the user preference values over the different program types, and $X_{norm}$ is the normalized preference value.
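A minimal sketch of Max-Min normalization, applied to a small list of illustrative preference values (0.01 and 2.5 come from the example above; 0.15 is added for illustration). Returning zeros when all values are equal is an assumption made here to avoid division by zero; the description does not address that case.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed handling of a constant feature; not specified in the description.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

prefs = [0.01, 0.15, 2.5]
normalized = min_max_normalize(prefs)
print(normalized)
```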
Step 3, clustering the characteristics processed in the step 2 to obtain different user groups after clustering;
After preprocessing, the data are clustered in order to analyze the characteristics of each class of users. In this embodiment, the clustering features are the user's geographical feature, i.e., the category of the user's address, and the preference value features, i.e., the user's preference values for 16 program categories (financial, international, sports, military and agricultural, drama, news, animation, shopping, martial arts, documentaries, science and education, social and law, music, art, tv drama and movies).
The scheme uses the K-means algorithm for clustering, mainly because, compared with other clustering algorithms such as hierarchical clustering, K-means converges quickly, has low algorithmic complexity, is easy to interpret and simple to implement, and is suitable for large data volumes. The goal of K-means is to divide the samples into K clusters such that samples within a cluster are tightly grouped and the clusters are well separated from one another. The specific process is as follows:
Step S30, determining the cluster number K
To select the cluster number K, the invention stipulates that users are classified by regional and preference value features into no more than 10 categories (which keeps the analysis and description of each category manageable), so an enumeration method is used. As shown in fig. 1, K is varied from 2 to 10; for each value of K, K-means clustering (steps S31-S34) is repeated several times on the samples to avoid a locally optimal solution, and the average silhouette coefficient for that K is calculated. The K with the largest silhouette coefficient is selected as the final cluster number. The silhouette coefficient evaluates clustering quality by combining cohesion and separation; it lies between -1 and 1, and a larger value indicates better clustering, so it is often used to select the best number of clusters.
For example, as can be seen from fig. 1, the silhouette coefficient is highest at K = 8, reaching 0.843, so the users are divided into 8 groups according to their regional and viewing preference value features for further study.
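The selection rule can be illustrated with a small sketch that computes the mean silhouette coefficient of candidate labelings and keeps the K with the largest value. The toy points and the two candidate labelings are hypothetical and stand in for the K-means results obtained at each K; scoring singleton clusters as 0 is a common convention assumed here, and at least two clusters with distinct points are required.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient over all samples (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight pairs of points far apart.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
candidates = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}  # labelings for K = 2 and K = 3

best_k = max(candidates, key=lambda k: silhouette(points, candidates[k]))
print(best_k)
```

In the method itself the candidates would range over K = 2 to 10 and each labeling would come from repeated K-means runs.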
Step S31, selecting a clustering center
The user address category processed in step 2 and the user's preference values for the different program categories together form one sample, and all samples form the sample set.
If the initial cluster centers are simply chosen at random, the algorithm may converge to a local optimum. Since points that are farther apart are less likely to belong to the same cluster, the invention improves the selection of the initial cluster centers by choosing K points that are as far from one another as possible. The selection process is as follows:
a) randomly selecting a sample from the sample set as a first clustering center;
b) for each sample except the first clustering center in the sample set, respectively calculating the distance between the sample and the first clustering center;
c) selecting a sample with the largest distance from the first clustering center as a new clustering center, namely a second clustering center;
d) for each sample except the two selected clustering centers in the sample set, respectively calculating the sum of the distances between the sample and the two selected clustering centers;
e) selecting a sample with the maximum distance sum of the two clustering centers as a new clustering center, namely a third clustering center;
The same procedure is repeated: for each remaining sample, the sum of its distances to all selected cluster centers is calculated, and the sample with the largest sum is selected as a new cluster center, until K cluster centers have been selected.
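The selection procedure in steps a) to e) can be sketched as follows. The function name `init_centers`, the `first_index` parameter (used only to make the example reproducible) and the toy samples are hypothetical; distances are assumed to be Euclidean.

```python
import math
import random

def init_centers(samples, k, first_index=None, seed=0):
    """Pick k initial centers: a first sample, then repeatedly the sample
    whose summed distance to the already chosen centers is largest."""
    if first_index is None:
        first_index = random.Random(seed).randrange(len(samples))  # step a)
    chosen = [first_index]
    while len(chosen) < k:
        remaining = [i for i in range(len(samples)) if i not in chosen]
        # steps b) to e): sum of distances to every center chosen so far
        farthest = max(
            remaining,
            key=lambda i: sum(math.dist(samples[i], samples[c]) for c in chosen),
        )
        chosen.append(farthest)
    return [samples[i] for i in chosen]

samples = [(0, 0), (1, 0), (10, 0), (5, 8)]
centers = init_centers(samples, k=3, first_index=0)
print(centers)
```

Starting from (0, 0), the sketch picks (10, 0) as the second center and (5, 8) as the third, since (5, 8) has the largest summed distance to the first two.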
Step S32, dividing the cluster
The distance between each sample in the sample set and each cluster center is calculated, and each sample is assigned to the cluster whose center is nearest; that is, the samples that are closer to a given cluster center than to any other cluster center form a cluster together with that center.
Step S33, recalculating the mean of all samples in each cluster as the new cluster center of that cluster;
Step S34, repeating steps S32 and S33 until the sum of squared Euclidean distances from all samples to their cluster centers reaches its minimum, i.e., until the objective function
Figure BDA0001641961880000121
converges, where n is the number of samples, K is the number of cluster centers, $x_i$ denotes the i-th sample, $\mu_k$ denotes the k-th cluster center, and the value of J lies in the range [0.0001, 0.01].
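Steps S32 to S34 can be sketched as a short loop. This is a minimal illustration, not the patented implementation: it assumes that J is the sum of squared Euclidean distances from each sample to its assigned center, and it interprets convergence as the change in J between iterations falling below a tolerance `tol` (the description only gives a range of 0.0001 to 0.01). The function name `kmeans` and the toy data are hypothetical.

```python
import math

def kmeans(samples, centers, tol=1e-4, max_iter=100):
    """Alternate assignment (S32) and center update (S33) until J stabilizes (S34)."""
    prev_j = float("inf")
    for _ in range(max_iter):
        # S32: assign every sample to its nearest center
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(x, centers[k]))
            for x in samples
        ]
        # S33: move each center to the mean of its assigned samples
        new_centers = []
        for k in range(len(centers)):
            members = [x for x, lab in zip(samples, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        centers = new_centers
        # S34: objective J, the sum of squared distances to the assigned centers
        j = sum(math.dist(x, centers[lab]) ** 2 for x, lab in zip(samples, labels))
        if abs(prev_j - j) < tol:
            break
        prev_j = j
    return centers, labels, j

samples = [(0, 0), (0, 2), (10, 0), (10, 2)]
final_centers, final_labels, final_j = kmeans(samples, [(0, 0), (10, 0)])
print(final_centers, final_labels, final_j)
```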
Step 4: the user-program category-watching time period relationship within each user group, and
the group number-program category-watching time period relationship across the user groups, are each expressed as a three-dimensional tensor, and tensor decomposition and tensor reconstruction are performed on each to obtain the respective best approximate tensor.
The samples are classified using the clustering features. Before clustering each user stands alone; after clustering, users with similar features are placed in the same group (cluster). Tensor decomposition is then performed twice: (1) from the user perspective, i.e., viewing preferences are analyzed separately for each user within each of the 8 groups; (2) from the group perspective, i.e., viewing preferences are analyzed for all users of each group taken together.
Step S40, tensor decomposition of the user layer:
For each clustered user group, the three-dimensional relationship among user, program category and watching time period is modeled as a tensor; tensor decomposition and approximate tensor reconstruction then yield the best approximate tensor, from which the missing user preference values are obtained. The specific steps are as follows:
① For each clustered user group, create a three-dimensional tensor $A^{user} \in \mathbb{R}^{P \times Q \times V}$, where P is the number of users in the user group, Q is the number of program types, and V is the number of time periods. The initial elements of the constructed three-dimensional tensor are
Figure BDA0001641961880000131
that is, the preference values calculated in step 1 (with the superscript user added). The user-layer tensor decomposition model is based mainly on Tucker decomposition: the initial tensor $A^{user}$ is decomposed into the mode products of a core tensor $S^{user}$ with a user factor matrix $U^{user}$, a program category factor matrix $G^{user}$ and a time period factor matrix $T^{user}$, where $S^{user} \in \mathbb{R}^{du \times dg \times dt}$, $U^{user} \in \mathbb{R}^{P \times du}$, $G^{user} \in \mathbb{R}^{Q \times dg}$, $T^{user} \in \mathbb{R}^{V \times dt}$, $\mathbb{R}$ denotes the set of real numbers, and du, dg and dt generally satisfy $0 < du < 0.0006 \times P$, $0 < dg < 0.3 \times Q$, $0 < dt < 0.5 \times V$. Through iterative training, the mode product of the core tensor with the factor matrices is made to approximate the initial tensor as closely as possible; the approximate tensor is then reconstructed by applying the tensor decomposition formula in reverse, as in formula 1:
Figure BDA0001641961880000132
The approximate tensor fills in the unknown elements of the initial tensor. In the approximate tensor
Figure BDA0001641961880000133
the value of each element
Figure BDA0001641961880000134
is the predicted viewing preference value of a user in the group, as shown in the following formula.
Figure BDA0001641961880000135
Figure BDA0001641961880000136
In the above formula, $S^{user} \times_U U^{user}$ denotes the U-mode product of the tensor $S^{user}$ and the matrix $U^{user}$, $\times_G G^{user}$ denotes the G-mode product of the tensor and the matrix $G^{user}$, and $\times_T T^{user}$ denotes the T-mode product of the tensor and the matrix $T^{user}$. In formula 9,
Figure BDA0001641961880000137
denote the elements of the core tensor $S^{user}$, the user factor matrix $U^{user}$, the program category factor matrix $G^{user}$ and the time period factor matrix $T^{user}$, respectively.
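The reconstruction can be sketched element by element. The sketch assumes the standard Tucker form, in which the element for user i, program type j and period d is the sum over all core indices (u, g, t) of S[u][g][t] · U[i][u] · G[j][g] · T[d][t]; the formula images themselves are not reproduced in the text, so this form is an assumption consistent with the mode products described above. The function names and toy factors are hypothetical.

```python
def tucker_element(S, U, G, T, i, j, d):
    """One element of the approximate tensor from the core and the factors."""
    return sum(
        S[u][g][t] * U[i][u] * G[j][g] * T[d][t]
        for u in range(len(S))
        for g in range(len(S[0]))
        for t in range(len(S[0][0]))
    )

def tucker_reconstruct(S, U, G, T):
    """Full approximate tensor of shape P x Q x V as nested lists."""
    return [
        [[tucker_element(S, U, G, T, i, j, d) for d in range(len(T))]
         for j in range(len(G))]
        for i in range(len(U))
    ]

# Toy factors: core 1x1x1, P = 2 users, Q = 3 program types, V = 2 periods.
S = [[[2.0]]]
U = [[1.0], [0.5]]
G = [[1.0], [0.5], [0.0]]
T = [[1.0], [0.25]]

A_hat = tucker_reconstruct(S, U, G, T)
print(A_hat[1][1][1])  # 2.0 * 0.5 * 0.5 * 0.25
```

Every position of the reconstructed tensor receives a value, which is how the missing preference values are filled in.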
② The goal in predicting the preference values of the users in each group is to make the difference between the initial tensor $A^{user}$ and the approximate tensor
Figure BDA0001641961880000138
as small as possible. The invention therefore adopts the mean square error as the point-wise loss function $C^{user}$ of the tensor decomposition, given by the formula below, where $R^{user}$ is a binary indicator tensor whose entries 0 and 1 mark whether the user's preference information is unknown: if the element of $A^{user}$
Figure BDA0001641961880000139
is non-zero, the corresponding element of $R^{user}$ is 1; otherwise it is 0.
Figure BDA0001641961880000141
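The role of the indicator tensor can be illustrated with a sketch of a masked squared-error loss: only positions where the initial tensor is non-zero contribute. The exact scaling of the loss in the formula image (for example a factor of 1/2 or a division by the number of observed entries) is not recoverable from the text, so the plain sum used here is an assumption. The function name is hypothetical.

```python
def masked_squared_error(A, A_hat):
    """Sum of squared errors over the observed (non-zero) entries of A."""
    total = 0.0
    for i, plane in enumerate(A):
        for j, row in enumerate(plane):
            for d, a in enumerate(row):
                r = 1 if a != 0 else 0  # element of the binary indicator tensor
                total += r * (a - A_hat[i][j][d]) ** 2
    return total

A = [[[0.5, 0.0]]]        # the second entry is unknown (zero)
A_hat = [[[0.25, 0.9]]]   # the prediction at the unknown position is not penalized
loss = masked_squared_error(A, A_hat)
print(loss)
```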
③ data sparsity processing and tensor decomposition model improvement
Because each user watches only a few categories of TV programs overall, the initial tensor is very sparse. The tensor decomposition model is therefore improved by embedding an additional constraint in the objective function: a dense program category similarity coefficient matrix is added, which forces the low-dimensional representations of two similar program categories to be as close as possible, thereby reducing the influence of data sparsity and improving the recommendation effect.
The invention calculates the similarity between every pair of TV program categories and uses it to construct a program category-program category similarity coefficient matrix $X \in \mathbb{R}^{Q \times Q}$, calculated as follows:
Figure BDA0001641961880000142
$x_{e,f} = \mathrm{sim}(E, F)$
where E denotes the set of program types of category e, F denotes the set of program types of category f, and $x_{e,f}$ denotes an element of the similarity matrix X, i.e., the similarity between program category e and program category f.
Thus, the following objective function $L^{user}$ is defined:
Figure BDA0001641961880000143
where
Figure BDA0001641961880000144
denotes the Frobenius norm;
Figure BDA0001641961880000145
is the added constraint, which helps to obtain a more accurate program category factor matrix $G^{user}$: the program category-program category similarity coefficient matrix X can be decomposed into the product of the matrix $G^{user}$ and a matrix $H^{user}$, where $H^{user} \in \mathbb{R}^{dg \times Q}$ is an auxiliary matrix of the program category factors;
Figure BDA0001641961880000146
is the regularization term added to prevent overfitting during the solution; and the parameters $\lambda_1$, $\lambda_2$ and $\lambda_3$ are model tuning parameters, each taking a value from {0.1, 0.2, 0.01, 0.001}. $U^{user}, G^{user}, T^{user}, S^{user}, H^{user} \geq 0$ is the non-negativity constraint: whenever an updated value is obtained, it is retained if it is greater than 0 and replaced with 0 otherwise. Because users' viewing preference values are all non-negative, the non-negativity constraint makes the model more interpretable; negative values do not take part in the iterative update, which reduces the computational complexity.
④ The objective function is solved by stochastic gradient descent. From $S^{user} \in \mathbb{R}^{du \times dg \times dt}$, $U^{user} \in \mathbb{R}^{P \times du}$, $G^{user} \in \mathbb{R}^{Q \times dg}$, $T^{user} \in \mathbb{R}^{V \times dt}$ and $H^{user} \in \mathbb{R}^{dg \times Q}$, a set of real-valued $S^{user*}$, $U^{user*}$, $G^{user*}$, $T^{user*}$ and $H^{user*}$ is selected at random to approximate $S^{user}$, $U^{user}$, $G^{user}$, $T^{user}$ and $H^{user}$. Partial derivatives of the objective function are then taken with respect to each of the five components $S^{user}$, $U^{user}$, $G^{user}$, $T^{user}$ and $H^{user}$ to obtain the gradient of each component, and the components are updated according to these gradients. The update formulas are given in formulas 2, 3, 4, 5 and 6.
Figure BDA0001641961880000151
Figure BDA0001641961880000152
Figure BDA0001641961880000153
Figure BDA0001641961880000154
Figure BDA0001641961880000155
Figure BDA0001641961880000156
Here μ denotes the iteration number and η denotes the learning rate; a component marked with a prime (') is the updated component and a component without a prime is the component before the update, e.g. $S^{user*\prime}$ denotes the updated $S^{user*}$.
The iterative update continues until the objective function converges:
Figure BDA0001641961880000157
where
Figure BDA0001641961880000158
denote the objective function after the μ-th and (μ+1)-th iterations respectively, and ε is an arbitrarily small positive number. The smaller the value of ε, the closer the approximation is to the initial tensor; its value generally lies in the range [0.0001, 0.01]. The $S^{user*}$, $U^{user*}$, $G^{user*}$ and $T^{user*}$ at convergence of the objective function are taken as the optimal $S^{user}$, $U^{user}$, $G^{user}$ and $T^{user}$.
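The update-and-stop logic can be illustrated on a toy problem. The sketch below is not the update of formulas 2 to 6, which are not reproduced in the text: it uses full-gradient descent on a simple quadratic rather than stochastic gradient descent on the tensor objective, and only shows three ingredients named in the description, namely a gradient step with learning rate η, replacement of negative updated values by 0, and stopping when the change in the objective between consecutive iterations falls below ε. All names and the toy objective are hypothetical.

```python
def projected_gradient_descent(loss, grad, params, eta=0.1, eps=1e-4, max_iter=10_000):
    """Gradient steps with non-negative clipping, stopped when the
    objective changes by less than eps between consecutive iterations."""
    prev = loss(params)
    for _ in range(max_iter):
        g = grad(params)
        # Gradient step, then the non-negativity rule: keep positive values, else 0.
        params = [max(0.0, p - eta * gi) for p, gi in zip(params, g)]
        cur = loss(params)
        if abs(prev - cur) < eps:
            break
        prev = cur
    return params

# Toy objective: squared distance to a target with one negative coordinate.
target = [1.0, -2.0]

def toy_loss(p):
    return sum((pi - ti) ** 2 for pi, ti in zip(p, target))

def toy_grad(p):
    return [2 * (pi - ti) for pi, ti in zip(p, target)]

result = projected_gradient_descent(toy_loss, toy_grad, [0.5, 0.5])
print(result)
```

The first coordinate approaches 1, while the second is held at 0 by the non-negativity rule even though the unconstrained minimum is negative.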
⑤ With the optimal $S^{user}$, $U^{user}$, $G^{user}$ and $T^{user}$ obtained in the above steps, the best approximate tensor is reconstructed using formula 1:
Figure BDA0001641961880000161
In this approximate tensor, the value of each element
Figure BDA0001641961880000162
is the predicted preference value of user i in the group for programs of program type j in time period d.
Tensor models are established for each of the K groups in this way, and tensor decomposition and reconstruction are performed to obtain the predicted viewing preference values of all users at the user layer.
Step S41, tensor decomposition of the group layer: a three-dimensional tensor model is established for the groups to express the three-dimensional relationship among group number, program category and viewing period, and tensor decomposition is then performed to obtain the predicted viewing preference values of each group. The specific steps are as follows:
① Construct an initial group tensor $A^{group} \in \mathbb{R}^{M \times N \times L}$, where M is the number of groups, N is the number of program types, and L is the number of time periods. Each element of the initial group tensor is the average viewing preference of a group, i.e., the group's viewing preference for each program type in each time period, calculated as follows:
Figure BDA0001641961880000163
where
Figure BDA0001641961880000164
denotes the preference value of user group k for program type j in time period d,
Figure BDA0001641961880000165
denotes the preference value of user i in group k for program type j in time period d, and W denotes the number of users in group k. In this scheme the clustering divides the users into 8 groups, so k = 1, 2, ..., 8.
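The construction of the initial group tensor can be sketched as an average over the users of each group. The sketch assumes that every user in a group contributes to the average, including users whose preference value at a position is zero, which is how the description of W reads; the function name and the toy data are hypothetical.

```python
def group_average_tensor(user_tensors):
    """user_tensors[k][i][j][d] is the preference of user i in group k for
    program type j in period d; returns the group tensor indexed [k][j][d]."""
    group = []
    for users in user_tensors:
        w = len(users)                  # number of users in this group
        n_types = len(users[0])
        n_periods = len(users[0][0])
        group.append([
            [sum(u[j][d] for u in users) / w for d in range(n_periods)]
            for j in range(n_types)
        ])
    return group

# Toy data: group 0 has two users, group 1 has one; 1 program type, 2 periods.
user_tensors = [
    [[[0.25, 0.5]], [[0.75, 0.0]]],
    [[[1.0, 0.5]]],
]
A_group = group_average_tensor(user_tensors)
print(A_group)
```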
Based on Tucker decomposition, the initial tensor $A^{group}$ can be approximated as the mode product of a core tensor $S^{group}$ with the factor matrix of each dimension, i.e., the group factor matrix $U^{group}$, the program category factor matrix $G^{group}$ and the time period factor matrix $T^{group}$, where $S^{group} \in \mathbb{R}^{dr \times ds \times dq}$, $U^{group} \in \mathbb{R}^{M \times dr}$, $G^{group} \in \mathbb{R}^{N \times ds}$, $T^{group} \in \mathbb{R}^{L \times dq}$, $\mathbb{R}$ denotes the set of real numbers, and dr, ds and dq satisfy $0 < dr < 0.0006 \times M$, $0 < ds < 0.3 \times N$, $0 < dq < 0.5 \times L$. The approximate group tensor
Figure BDA0001641961880000166
is given by formula 7. In
Figure BDA0001641961880000167
each element
Figure BDA0001641961880000168
i.e., the predicted group viewing preference value, is given by formula 8.
Figure BDA0001641961880000169
Figure BDA0001641961880000171
In formula 7, $S^{group} \times_U U^{group}$ denotes the U-mode product of the tensor $S^{group}$ and the matrix $U^{group}$, $\times_G G^{group}$ denotes the G-mode product of the tensor and the matrix $G^{group}$, and $\times_T T^{group}$ denotes the T-mode product of the tensor and the matrix $T^{group}$. In formula 8,
Figure BDA0001641961880000172
denote the elements of the core tensor $S^{group}$, the group factor matrix $U^{group}$, the program category factor matrix $G^{group}$ and the time period factor matrix $T^{group}$, respectively.
② To make the approximate group tensor
Figure BDA0001641961880000173
approximate the initial group tensor as closely as possible, a loss function $C^{group}$ is first established:
Figure BDA0001641961880000174
On the basis of the loss function, an objective function is established:
Figure BDA0001641961880000175
where
Figure BDA0001641961880000176
denotes the Frobenius norm, $H^{group} \in \mathbb{R}^{ds \times N}$ is an auxiliary matrix of the program category factors, and $\lambda_2$ and $\lambda_3$ each take a value from {0.1, 0.2, 0.01, 0.001}.
The objective function is solved by the stochastic gradient descent method of step ④ in step S40. When the objective function converges, the optimal core tensor $S^{group}$, group factor matrix $U^{group}$, program category factor matrix $G^{group}$ and time period factor matrix $T^{group}$ are obtained, and the best approximate group tensor is reconstructed by formula 7:
Figure BDA0001641961880000177
In the best approximate group tensor
Figure BDA0001641961880000178
each element
Figure BDA0001641961880000179
is the predicted preference value of group k for program type j in time period d. The principle and basic procedure of this step are the same as in step S40.
Step 5, recommending television programs
The viewing preference values of the user layer and the group layer obtained in step 4 are linearly weighted to obtain the final predicted user preference value, which is used to recommend TV programs to the user.
The user's predicted viewing preference for a given program category in a given time period is
Figure BDA00016419618800001710
It is calculated as follows:
Figure BDA00016419618800001711
where
Figure BDA0001641961880000181
is the predicted preference value of user i for program type j in time period d calculated in step S40,
Figure BDA0001641961880000182
is the predicted preference value of group k for program type j in time period d obtained in step S41, with user i belonging to group k; α is the weight of the user-layer predicted preference value, taking a value in [0, 1], and (1 - α) is the weight of the group-layer predicted preference value of the group to which the user belongs.
From the predicted viewing preference values of each user for each program type in each time period, programs can be recommended to a specific user i in a specific viewing period d by ranking the predicted preference values for the pair {i, d}. To recommend Z program types to user i in time period d, the Z program types with the highest predicted preference values are selected.
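Step 5 can be sketched for one user and one time period: the user-layer and group-layer predictions are combined with weight α, and the Z program types with the highest combined value are returned. The program type names, the predicted values and α = 0.5 are hypothetical illustration data; the function names are likewise hypothetical.

```python
def combine(user_pred, group_pred, alpha):
    """Linear weighting: alpha * user-layer value + (1 - alpha) * group-layer value."""
    return {j: alpha * user_pred[j] + (1 - alpha) * group_pred[j] for j in user_pred}

def top_z(scores, z):
    """The z program types with the highest combined predicted preference."""
    return sorted(scores, key=scores.get, reverse=True)[:z]

# Hypothetical predictions for user i in period d, and for the user's group k.
user_pred = {"news": 0.5, "sports": 0.25, "movies": 0.75}
group_pred = {"news": 0.25, "sports": 0.75, "movies": 0.5}

final = combine(user_pred, group_pred, alpha=0.5)
recommended = top_z(final, z=2)
print(recommended)
```

With α = 1 only the user-layer prediction is used, and with α = 0 only the prediction of the user's group.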
Performance analysis of the method of the invention
The recommendation method provided by the invention is evaluated experimentally on a real IPTV user viewing data set, and the recommendation algorithms are compared on each aspect of performance.
Step one: data set statistics
The data set used in this experiment is real viewing data provided by China Unicom (one of the three telecommunications carriers in China). The data comprise the daily TV program content viewing logs and live TV program schedule logs, from October 1, 2016 to January 1, 2017, of more than 481,000 IPTV users in Henan Province, together with the installation addresses of those users. In the experiment, the collected viewing logs are divided into a training data set and a test data set: the training set contains the viewing logs from October 1, 2016 to December 14, 2016, 75 days in total, and the test set contains the viewing logs from December 15, 2016 to January 1, 2017. Table 1 shows the details of the data set, table 2 shows a segment of a TV program content viewing log, table 3 shows a segment of a live TV program schedule log, and table 4 shows a segment of the user installation address data.
Table 1 statistics of data sets
Figure BDA0001641961880000183
TABLE 2 television program content viewing Log segment
Figure BDA0001641961880000191
TABLE 3 live television program schedule Log segment
Figure BDA0001641961880000192
TABLE 4 user installed address data fragment
Figure BDA0001641961880000193
Figure BDA0001641961880000201
In this experiment, users having both an address and preference values were first selected from the data, 320,430 in total. The addresses were then subdivided into four categories, urban district, county city, town and village, as shown in fig. 3: 44,448 users in urban districts, 50,042 in county cities, 53,574 in towns and 172,366 in villages. Rural users clearly outnumber urban users, which is consistent with the way IPTV developed in China, where it was first piloted in the rural areas of Henan Province. Because the format of many user addresses is messy, for example the address "shang feng shidao zhang gou" in Henan Province, whose full form should be "shang feng shidao xiang zhang gou cun" in Henan Province, such addresses must be completed manually and their format standardized.
To study users' viewing preferences in combination with regional features, 10,000 users were randomly selected as samples from the four address categories in proportion; this reduces the amount of data to process, saves system resources, and makes the regularities exhibited by the users more prominent. The sample contains 1,387 users in urban districts, 1,561 in county cities, 1,672 in towns and 5,379 in villages.
Step two: analysis of the results of the experiment
Figure 6 shows the accuracy comparison for the six recommendation systems. Accuracy is the proportion of the program categories in the recommendation list that the user actually watched. Figure 6 shows that the recommendation method of the invention is more accurate than the other five methods, and that as the number of program categories recommended to the user increases, the accuracy first rises and then gradually falls. This is because users actually watch only a few types of programs, so as the number of recommended categories grows the ratio naturally rises first and then falls.
Figure 7 shows the recall of the six recommendation systems for different values of n. Recall is the proportion of the program categories actually watched by the user that appear in the recommendation list. A higher recall indicates that the recommendation system predicts user preferences more accurately. Figure 7 shows that, starting from n = 4, the recommendation method of the invention achieves a higher recall than the other algorithms, and that recall increases with the number of recommended types.
Figure 8 shows the F-score comparison for the six recommendation systems. The F-score evaluates the accuracy and recall of a recommendation algorithm jointly. As Figure 8 shows, the F-score is positively correlated with the number of recommended types, and starting from n = 4 the program recommendation method of the invention shows an improvement of about 10% in accuracy over the other five algorithms.
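The three measures can be sketched for a single user. Accuracy (precision) and recall follow the definitions given above; the description does not state the F-score formula, so the usual harmonic mean of the two (F1) is assumed here. The function name and the toy lists are hypothetical.

```python
def precision_recall_f(recommended, watched):
    """Precision, recall and (assumed) F1 for one recommendation list."""
    hits = len(set(recommended) & set(watched))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(watched) if watched else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Assumed F-score: harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)

recommended = ["news", "sports", "movies", "music"]  # n = 4 recommended categories
watched = ["news", "movies"]                         # categories actually watched

p, r, f = precision_recall_f(recommended, watched)
print(p, r, round(f, 3))
```

Here two of the four recommended categories were watched (precision 0.5) and both watched categories were recommended (recall 1.0).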
Fig. 9, fig. 10 and fig. 11 show the coverage, diversity and novelty comparisons of the six recommendation systems, respectively. Coverage is the ratio of the number of program categories recommended to users to the total number of program categories. Coverage reflects the ability of a recommendation algorithm to mine long-tail programs and is an indicator of great concern to content providers. Diversity is the ability of a recommendation system to recommend program categories that satisfy a user's broad interests. Novelty is the ability to recommend to a user TV program categories that are less popular and that the user has not watched.
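Coverage as defined above can be sketched as the share of all program categories that appear in at least one user's recommendation list. Aggregating over all users' lists is an assumption, since the description does not say whether coverage is computed per user or over the whole system; diversity and novelty are not sketched because their formulas are not given. The names and toy data are hypothetical.

```python
def coverage(recommendation_lists, all_categories):
    """Fraction of all categories that occur in at least one recommendation list."""
    recommended = set().union(*recommendation_lists)
    return len(recommended & set(all_categories)) / len(all_categories)

all_categories = ["news", "sports", "movies", "music"]
lists = [["news", "sports"], ["news", "movies"]]  # lists for two users

print(coverage(lists, all_categories))
```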
The figures show that as the number of recommended program categories increases, the coverage and diversity of the recommendation algorithms increase while the novelty value decreases. A lower novelty value indicates a less popular category, which is more novel to the user. The more novel the recommended categories, the stronger the algorithm's ability to explore long-tail categories, the more program categories it covers, and hence the higher its coverage and category diversity. On coverage, diversity and novelty, the random algorithm (Random) performs best, because it randomly selects K program categories to recommend each time, so every category may be recommended; its randomness is extremely high, but its accuracy is below 0.1. The most popular program recommendation algorithm (Most Popular) performs worst: every user receives the same recommendation list, which makes it a highly non-personalized method. The user-popularity-based program recommendation algorithm (UserMostPopular) always recommends the program categories that the user has watched most in the past; it is a comparatively personalized recommendation, and its coverage, diversity and novelty are better than those of Most Popular. As expected, the results in the figures show that non-personalized recommendation performs worse than personalized recommendation.
The SVD algorithm can discover latent associations between users and program categories through matrix factorization, so its coverage, diversity and novelty are higher than those of UserMostPopular, which recommends only the categories a user watches most. However, like UserMostPopular, SVD does not take the user's viewing period into account and recommends the same programs in every time period. The context-aware algorithm (Context-aware) recommends different program categories in different time periods, but because it recommends the categories the user has historically watched most in each period, its coverage, diversity and novelty are low. The coverage, diversity and novelty of the recommendation method of the invention are second only to random recommendation, and the comparison in the figures demonstrates the advantages of the program recommendation method.

Claims (7)

1. The social relationship-aware IPTV user behavior modeling and program recommendation method is characterized by comprising the following steps of:
step 1, establishing a preference model of a user watching a program
The preference model comprises loyalty and interestingness of a user to program types in a certain time period, wherein the loyalty is the ratio of the time length of the user watching a program of a certain type in the certain time period to the actual playing time length of the program of the type, and the interestingness is the ratio of the time length of the user watching the program of the certain type in the certain time period to the total time length of the user watching all program types;
determining preference values of the user on the program types in different time periods according to the preference model;
step 2, selecting the category of the user address and the preference value as the clustering characteristics, coding discrete characteristics in the characteristics, and normalizing numerical characteristics;
step 3, clustering the characteristics processed in the step 2 to obtain different user groups after clustering;
step 4, the user-program category-watching time period relationship of each user group, and
the group number-program category-watching time period relationship of the user groups, are each expressed as a three-dimensional tensor, and tensor decomposition and tensor reconstruction are performed on each to obtain the respective best approximate tensor;
step 5, recommending programs for the user according to the optimal approximate tensor obtained in the step 4;
in step 4, after a user-program category-watching time period of the user group is expressed by a tensor, an approximate tensor is obtained after tensor decomposition, then a loss function is established through the approximate tensor, regularization is carried out on the loss function, constraint is added, an objective function is obtained, then the objective function is solved, the optimal approximate tensor, a user factor matrix, a program category factor matrix and a time period factor matrix are determined according to a solving result, and then the tensor is reconstructed;
the solving process of the objective function comprises the following steps:
from $S^{user} \in \mathbb{R}^{du \times dg \times dt}$, $U^{user} \in \mathbb{R}^{P \times du}$, $G^{user} \in \mathbb{R}^{Q \times dg}$, $T^{user} \in \mathbb{R}^{V \times dt}$ and $H^{user} \in \mathbb{R}^{dg \times Q}$, randomly selecting a set of real-valued $S^{user*}$, $U^{user*}$, $G^{user*}$, $T^{user*}$ and $H^{user*}$ to approximate $S^{user}$, $U^{user}$, $G^{user}$, $T^{user}$ and $H^{user}$; then taking partial derivatives of the objective function with respect to each of the five components $S^{user}$, $U^{user}$, $G^{user}$, $T^{user}$ and $H^{user}$ to obtain the gradient of each component; then updating $S^{user}$, $U^{user}$, $G^{user}$, $T^{user}$ and $H^{user}$ according to the obtained gradients, and taking partial derivatives and calculating gradients again after each update until the objective function converges; the $S^{user*}$, $U^{user*}$, $G^{user*}$ and $T^{user*}$ obtained at convergence of the objective function are taken as the optimal $S^{user}$, $U^{user}$, $G^{user}$ and $T^{user}$; wherein $S^{user}$ is the core tensor, $U^{user}$ is the user factor matrix, $G^{user}$ is the program category factor matrix and $T^{user}$ is the time period factor matrix.
2. The method of claim 1, wherein the process of determining the preference value comprises:
assigning different weights to the loyalty and the interest degree respectively, and taking the sum of the weighted loyalty and the weighted interest degree as the preference value.
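The weighted sum in claim 2 can be illustrated with a short Python sketch. The function name `preference_value` and the default weights 0.4 and 0.6 are hypothetical; the claim only requires that the two weights differ and does not state their values.

```python
def preference_value(loyalty, interest, w_loyalty=0.4, w_interest=0.6):
    """Weighted sum of loyalty and interest degree (weights are illustrative)."""
    if w_loyalty == w_interest:
        raise ValueError("the claim calls for different weights")
    return w_loyalty * loyalty + w_interest * interest
```

For example, `preference_value(0.8, 0.5)` combines a loyalty of 0.8 and an interest degree of 0.5 into a single score of about 0.62.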
3. The method as claimed in claim 1, wherein in step 4 the user-program category-watching time period of the user group is represented as a three-dimensional tensor $A_{user} \in \mathbb{R}^{P \times Q \times V}$, where P is the number of users in the user group, Q is the number of program categories, and V is the number of time periods.
4. The method as claimed in claim 3, wherein in step 4 the tensor decomposition performed after the user-program category-watching time period of the user group is represented as a tensor comprises:
decomposing the tensor $A_{user}$ into the mode products of a core tensor $S_{user}$ with a user factor matrix $U_{user}$, a program category factor matrix $G_{user}$ and a time period factor matrix $T_{user}$, where $S_{user} \in \mathbb{R}^{du \times dg \times dt}$, $U_{user} \in \mathbb{R}^{P \times du}$, $G_{user} \in \mathbb{R}^{Q \times dg}$, $T_{user} \in \mathbb{R}^{V \times dt}$, $\mathbb{R}$ denotes the set of real numbers, and $du$, $dg$ and $dt$ satisfy $0 < du < 0.0006 \times P$, $0 < dg < 0.3 \times Q$ and $0 < dt < 0.5 \times V$.
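The rank bounds in claim 4 can be checked with a small helper. This sketch assumes du, dg and dt are positive integers (they are tensor dimensions) and uses integer arithmetic to respect the strict inequalities; the function name `max_ranks` is hypothetical.

```python
def max_ranks(P, Q, V):
    """Largest integers du, dg, dt with du < 0.0006*P, dg < 0.3*Q, dt < 0.5*V."""
    du = (6 * P - 1) // 10000   # du < 6P/10000
    dg = (3 * Q - 1) // 10      # dg < 3Q/10
    dt = (V - 1) // 2           # dt < V/2
    if min(du, dg, dt) < 1:
        raise ValueError("no positive integer rank satisfies the bounds")
    return du, dg, dt
```

Under this reading, the bound on du admits a positive integer only when the user group has more than about 1,667 users, since 0.0006 × P must exceed 1.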
5. The method as claimed in claim 1, wherein in step 4, after the user-program category-watching time period of the user group is represented as a tensor and decomposed, tensor reconstruction is performed to obtain an approximate tensor, the approximate tensor being given by:
Figure FDA0002276558170000021
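in the above formula, $S_{user} \times_U U_{user}$ denotes the U-mode product of the tensor $S_{user}$ and the matrix $U_{user}$, $\times_G G_{user}$ denotes the G-mode product of the tensor and the matrix $G_{user}$, and $\times_T T_{user}$ denotes the T-mode product of the tensor and the matrix $T_{user}$.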
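The mode products described above can be sketched in NumPy as follows. This is a minimal illustration, assuming the U-, G- and T-mode products act on the first, second and third axes of the core tensor respectively; the function names `mode_product` and `reconstruct` are hypothetical and do not appear in the patent.

```python
import numpy as np


def mode_product(tensor, matrix, mode):
    # Contract axis `mode` of the tensor with the columns of the matrix.
    # tensordot puts the matrix's row axis first, so move it back to `mode`.
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)


def reconstruct(S, U, G, T):
    # Core tensor (du, dg, dt) -> approximate tensor (P, Q, V).
    A_hat = mode_product(S, U, 0)      # user mode
    A_hat = mode_product(A_hat, G, 1)  # program category mode
    A_hat = mode_product(A_hat, T, 2)  # time period mode
    return A_hat
```

The three products act on different axes, so they can be applied in any order and give the same result.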
6. The method of claim 1, wherein the loss function is expressed as:
Figure FDA0002276558170000031
in the above formula, $R_{user}$ is a binary indicator tensor whose entries, 0 and 1, mark whether the user's preference information is unknown; $A_{user}$ is the user-program category-watching time period tensor of the user group, and
Figure FDA0002276558170000032
is an approximate tensor.
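A masked loss of this kind can be sketched in NumPy as follows. The formula itself is available only as an image, so this is an assumed form: the squared Frobenius norm of the residual restricted to known entries, with a conventional factor of 1/2, and with entries of `R` equal to 1 taken to mark known preference values. The function name `masked_loss` is hypothetical.

```python
import numpy as np


def masked_loss(A, A_hat, R):
    # Only entries where R == 1 (assumed to mean "known") contribute to the loss.
    residual = R * (A - A_hat)
    return 0.5 * np.sum(residual ** 2)
```

Masking keeps unknown entries from being treated as observed zeros, which would otherwise pull the approximate tensor toward zero for programs a user has simply not been exposed to.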
7. The method for modeling social relationship aware IPTV user behavior and recommending programs of claim 1, wherein the expression of said objective function is:
Figure FDA0002276558170000033
wherein
Figure FDA0002276558170000034
represents the Frobenius norm,
Figure FDA0002276558170000035
is the added constraint, which helps obtain a more accurate program category factor matrix $G_{user}$; X is the program category-program category similarity coefficient matrix, which can be decomposed into the product of the matrix $G_{user}$ and the matrix $H_{user}$, where $H_{user} \in \mathbb{R}^{dg \times Q}$ is the auxiliary matrix of the program category factorization; and the parameters $\lambda_1$, $\lambda_2$ and $\lambda_3$ are tuning parameters for different models, with values taken from {0.1, 0.2, 0.01, 0.001}.
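The formula images for the objective function are not reproduced in this text. One form consistent with the symbol definitions above is sketched below; it is an assumption, not the claimed expression. In particular, the symbol $\hat{A}_{user}$ for the approximate tensor, the factors of 1/2, and the assignment of $\lambda_1$, $\lambda_2$ and $\lambda_3$ to particular terms are all illustrative.

```latex
\[
f = \frac{1}{2}\left\| R_{user} \circ \left( A_{user} - \hat{A}_{user} \right) \right\|_F^2
  + \frac{\lambda_1}{2}\left( \| S_{user} \|_F^2 + \| U_{user} \|_F^2
      + \| G_{user} \|_F^2 + \| T_{user} \|_F^2 \right)
  + \frac{\lambda_2}{2}\left\| X - G_{user} H_{user} \right\|_F^2
  + \frac{\lambda_3}{2}\left\| H_{user} \right\|_F^2
\]
```

Here $\circ$ denotes the element-wise product, so the first term is the masked loss, the second is the regularization, and the third is the constraint tying $G_{user}$ to the similarity matrix X.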
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810385063.1A CN108737856B (en) 2018-04-26 2018-04-26 Social relation perception IPTV user behavior modeling and program recommendation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810385063.1A CN108737856B (en) 2018-04-26 2018-04-26 Social relation perception IPTV user behavior modeling and program recommendation method

Publications (2)

Publication Number Publication Date
CN108737856A CN108737856A (en) 2018-11-02
CN108737856B true CN108737856B (en) 2020-03-20

Family

ID=63939896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810385063.1A Expired - Fee Related CN108737856B (en) 2018-04-26 2018-04-26 Social relation perception IPTV user behavior modeling and program recommendation method

Country Status (1)

Country Link
CN (1) CN108737856B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109740063A (en) * 2019-01-17 2019-05-10 北京奇艺世纪科技有限公司 Information recalls, information cluster method, device and equipment
CN109903082B (en) * 2019-01-24 2022-10-28 平安科技(深圳)有限公司 Clustering method based on user portrait, electronic device and storage medium
JP2020184106A (en) * 2019-04-26 2020-11-12 シャープ株式会社 Controller, household appliance, communication device, server, information presentation system, control program and control method
CN110232154B (en) * 2019-05-30 2023-06-09 平安科技(深圳)有限公司 Random forest-based product recommendation method, device and medium
CN110557660B (en) * 2019-09-04 2021-10-01 北京奇艺世纪科技有限公司 Live video processing method and device
CN111177577B (en) * 2019-12-12 2023-03-28 中国科学院深圳先进技术研究院 Group project recommendation method, intelligent terminal and storage device
CN111935513B (en) * 2020-07-14 2022-04-19 广东工业大学 Home user-oriented network television program recommendation method and device
CN113852867B (en) * 2021-05-27 2023-09-08 天翼数字生活科技有限公司 Method and device for recommending programs based on kernel density estimation
CN113254794B (en) * 2021-07-15 2021-10-01 中国传媒大学 Program data recommendation method and system based on modeling
CN113688258A (en) * 2021-08-20 2021-11-23 广东工业大学 Information recommendation method and system based on flexible multidimensional clustering
CN113779395A (en) * 2021-09-10 2021-12-10 粒子文化科技集团(杭州)股份有限公司 Media asset recommendation method, device, system, electronic device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101859313A (en) * 2009-04-08 2010-10-13 索尼公司 Messaging device and method and program thereof
CN103338403A (en) * 2012-09-17 2013-10-02 中国传媒大学 Broadcasting television system and personalized program recommending method in system
CN106454431A (en) * 2016-10-14 2017-02-22 合肥工业大学 Method and system for recommending television programs

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9613118B2 (en) * 2013-03-18 2017-04-04 Spotify Ab Cross media recommendation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
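Research on Mobile Recommender Systems Based on Tensor Theory (基于张量理论的移动推荐系统研究); 朱沿熹 (Zhu Yanxi); China Master's Theses Full-text Database (《中国优秀硕士论文全文数据库》); 20140401; main text, pages 6-24 *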

Also Published As

Publication number Publication date
CN108737856A (en) 2018-11-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211027

Address after: 710065 406, block B, Taiwei intelligent chain center, No. 8, Tangyan South Road, high tech Zone, Xi'an, Shaanxi Province

Patentee after: NOWLEDGE DATA CO.,LTD.

Address before: 710069 No. 229 Taibai North Road, Shaanxi, Xi'an

Patentee before: NORTHWEST University

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200320