KR20170079423A - Dynamic Noise Reduction Method based on Content Rating Distribution for Content Recommendation and Content Recommendation System Using Thereof

Dynamic Noise Reduction Method based on Content Rating Distribution for Content Recommendation and Content Recommendation System Using Thereof

Info

Publication number
KR20170079423A
Authority
KR
South Korea
Prior art keywords
content
rating
user
data
recommendation
Prior art date
Application number
KR1020150189979A
Other languages
Korean (ko)
Inventor
김베드로
이지형
김누리
Original Assignee
성균관대학교산학협력단
Priority date
Filing date
Publication date
Application filed by 성균관대학교산학협력단
Priority to KR1020150189979A
Publication of KR20170079423A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0254 Targeted advertisements based on statistics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0255 Targeted advertisements based on user history
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services

Landscapes

  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention relates to a dynamic noise removal method for content recommendation and a content recommendation system using the method. The method includes the steps of reflecting, as a weight, at least one of a content evaluation time difference and the number of contents evaluated by the user; calculating a rating distribution for each user; and removing noise data by comparing the distribution with a predetermined value.

Description

TECHNICAL FIELD The present invention relates to a dynamic noise removal method for content recommendation and a content recommendation system using the method.

The present invention relates to a media content recommendation method and system, and more particularly, to a technique for filtering noise out of user rating data on media content such as movies.

Recommendation systems recommend services and items to users and are widely used not only by shopping malls such as Amazon and CDNow but also by some shopping malls in Korea. Among these systems, collaborative filtering, a recommendation method based on the opinions of others, is widely used (G. Adomavicius and A. Tuzhilin, "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions," IEEE Transactions on Knowledge and Data Engineering, Vol. 17, pp. 734-749, 2005). However, if distorted data are reflected, such as ratings entered for commercial purposes or intentionally bad reviews, the accuracy of the recommendations may be poor. A typical example is the user rating data on a movie recommendation site. A movie's rating is an important indicator because other users rely on it when choosing a movie and because it can be tied to the movie's box-office success. Therefore, noise data must be filtered out to obtain more reliable and accurate recommendations. Conventional noise removal research, however, may remove ratings without taking the user's tendency into consideration, and so may remove information that is not noise.

It is difficult to say that the data on which a recommendation system is based reflects only the user's preferences; a kind of noise may be included in the data. O'Mahony et al. define the noise of a recommendation system in two categories. The first is natural noise, which arises in the process of collecting or inferring a user's preferences, for example errors that occur when a user enters data. The second is malicious noise, which is caused by deliberately entering information into the data in order to influence the recommendation system. For example, in a recommendation system used for commercial purposes, ratings may be artificially inflated to promote the value of a product. Existing collaborative filtering research has studied how to filter out the data distortion caused by such malicious noise. A representative study noted that evaluators who record malicious ratings give a high proportion of 1-point and 10-point ratings, and therefore classified evaluators whose rating data contain a high proportion of 1-point and 10-point ratings as malicious evaluators (M. P. O'Mahony, N. J. Hurley, G. C. M. Silvestre, "Detecting Noise in Recommender System Databases," Proceedings of the 11th International Conference on Intelligent User Interfaces, pp. 109-115, 2006; "Item Based Recommendation System for Filtering Noise Ratings," Proceedings of the Korea Information Science Society Conference, pp. 291-293, 2013). However, with such existing methods, a considerable number of ratings that are not noise may also be removed.

SUMMARY OF THE INVENTION An object of the present invention is to provide a dynamic noise removal method for content recommendation, and a content recommendation system using the method, that improve the reliability of data by removing noise data, analyze the distribution of ratings while reflecting changes in the degree of influence of each rating, and provide reliable data for various kinds of data analysis.

According to an aspect of the present invention, there is provided a dynamic noise removal method for content recommendation. The method includes the steps of reflecting, as a weight, at least one of a content evaluation time difference and the number of contents evaluated by the user; calculating a rating distribution for each user; and removing noise data by comparing the distribution with a predetermined value.

The method may further include generating a content recommendation collaborative filtering model using the rating data obtained through the step of removing the noise data as training and test data.

The method may further include collecting data regarding at least one of a user, a content, and a rating, and recommending content using the content recommendation collaborative filtering model.

The step of reflecting the weight may reflect the effect of the evaluation time difference on the content rating by using the following equation:

Figure pat00001

Here, [Figure pat00002] denotes the original rating information, [Figure pat00003] is a weight based on the time difference, [Figure pat00004] is the current time point, [Figure pat00005] is the time when the movie was evaluated, and [Figure pat00006] is a proportionality constant.

The step of reflecting the weight may also reflect both the number of contents evaluated by the user and the evaluation time difference on the content rating by using the following equation:

Figure pat00007

Here, [Figure pat00008] denotes the weight according to the number of movies viewed by the user, and [Figure pat00009] reflects the effect of the evaluation time difference.

The step of calculating a rating distribution for each user may include calculating an average and a standard deviation of the weighted ratings, where [Figure pat00010] is expressed by the following equation:

Figure pat00011

([Figure pat00012]: average; [Figure pat00013]: number of movies rated by the user; [Figure pat00014]: rating of the [Figure pat00015]-th movie rated by the user; [Figure pat00016]: standard deviation; [Figure pat00017]: [Figure pat00018]-th [Figure pat00019]).

The step of removing noise data by comparing the distribution with a predetermined value may remove more low-rating data than high-rating data if a user's rating distribution average is greater than the rating distribution average of all users, and may remove more high-rating data than low-rating data if the user's rating distribution average is smaller than the rating distribution average of all users.

The step of recommending the content may include calculating a predicted rating for the content to be recommended and generating the recommended content information.

The step of recommending the content may further include receiving a content recommendation request from the user, determining whether a content recommendation model has been created, and transmitting the generated content recommendation information to the user device.

The recommended content information may include a list sorted according to predicted rating and an outline of the content.

According to another aspect of the present invention, there is provided a content recommendation system using a dynamic noise removal method for content recommendation. The system includes a content rating weight calculation unit that reflects, as a weight, a content evaluation time difference and the number of contents evaluated by a user; a noise removing unit that calculates a rating distribution for each user and removes noise data by comparing the distribution with a predetermined value; and a collaborative filtering model generation unit that generates a content recommendation collaborative filtering model by using the rating data from which noise data has been removed as training and test data.

The system may further include a content recommendation unit for generating recommended content using the content recommendation collaborative filtering model, and a communication interface for receiving data on at least one of a user, a content, and a rating from content sources and transmitting the recommended content to a user device.

The content rating weight calculation unit may reflect the difference between the current time point and the content evaluation time point by using the following equation:

Figure pat00020

Here, [Figure pat00021] denotes the original rating information, [Figure pat00022] is a weight based on the time difference, [Figure pat00023] is the current time point, [Figure pat00024] is the time when the movie was evaluated, and [Figure pat00025] is a proportionality constant. The unit may also reflect both the number of contents evaluated by the user and the evaluation time difference on the content rating by using the following equation:

Figure pat00026

Here, [Figure pat00027] denotes the weight according to the number of movies viewed by the user, and [Figure pat00028] reflects the effect of the evaluation time difference.

The noise removing unit calculates, for each user, an average and a standard deviation of the weighted ratings, where [Figure pat00029] is expressed by the following equation:

Figure pat00030

([Figure pat00031]: average; [Figure pat00032]: number of movies rated by the user; [Figure pat00033]: rating of the [Figure pat00034]-th movie rated by the user; [Figure pat00035]: standard deviation; [Figure pat00036]: [Figure pat00037]-th [Figure pat00038])

The noise removing unit thereby obtains a rating distribution for each user. If a user's rating distribution average is greater than the rating distribution average of all users, more low-rating data than high-rating data is removed; if it is lower than the rating distribution average of all users, more high-rating data than low-rating data is removed.

According to the dynamic noise removal method and the content recommendation system for content recommendation of the present invention, movie data can be filtered by reflecting the attribute information of the users appearing in the movie data, and movies that satisfy the users can be recommended using the filtered data.

FIG. 1 is a diagram showing an example of a flowchart of the dynamic noise removal method for content recommendation of the present invention.
FIG. 2 is a view illustrating an example of a process of obtaining a rating distribution of each user and comparing it with a predetermined value to remove noise.
FIG. 3 is a diagram schematically illustrating a content recommendation service providing environment including a content recommendation system according to an embodiment of the present invention.
FIG. 4 is a content recommendation flowchart in a content recommendation unit according to an embodiment of the present invention.

While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It is to be understood, however, that the invention is not to be limited to the specific embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

The terms including the first, second, etc. may be used to describe various elements, but the elements are not limited to these terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as a second component, and similarly, the second component may also be referred to as a first component. The term " and / or " includes any combination of a plurality of related entry items or any of a plurality of related entry items.

When an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, but other elements may be present in between. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that there are no other elements in between.

The terminology used in this application is used only to describe a specific embodiment and is not intended to limit the invention. The singular expressions include plural expressions unless the context clearly dictates otherwise. In the present application, terms such as "comprises" or "having" specify the presence of a feature, number, step, operation, element, component, or combination thereof described in the specification, and should not be construed to preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in the present application.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout, and redundant descriptions thereof will be omitted.

The present invention relates to a system using a collaborative filtering method for media content recommendation, and in particular, to a method and system for removing noise in data for creating a content recommendation collaborative filtering model.

Collaborative filtering is a method that automatically predicts users' interests from preference information obtained from users. There are two types of collaborative filtering: active filtering and passive filtering. Active filtering is based on the fact that people want to share information about their purchases with other people in a P2P way. Active filtering can generate reliable descriptions and ranks because the evaluations come from people who are interested in the product in question. However, prejudice can enter the evaluation, and there may be a first-evaluator problem and a cold start problem. Passive filtering collects information implicitly and can remove from the analysis certain variables that appear in active filtering. For example, in passive filtering, everyone can automatically access the given data. Another collaborative filtering method is item-based filtering. Item-based collaborative filtering is based on the fact that most people tend to like products that are similar to those they liked in the past and to dislike products that are similar to those they did not like. This method predicts a customer's preference by calculating the similarity between the products for which the customer has already entered preferences and the product to be predicted. To calculate the similarity between two products, it uses the preferences of customers who entered preferences for both products. However, since the similarity between customers is not considered at all, if the calculation relies on evaluations by users whose preferences are not similar to those of a specific customer, the accuracy of the correlation between the products may be degraded, and so may the prediction and recommendation ability of the recommendation system. Thus, in a recommendation system, it is important to precisely predict the values of unevaluated ratings from the rating history.
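
As a rough illustration of the item-based prediction described above, the following is a minimal sketch in Python. It is not taken from the patent: the cosine similarity measure, the similarity-weighted average, and the names `item_similarity` and `predict_rating` are all assumptions chosen for the example.

```python
from math import sqrt


def item_similarity(ratings, item_a, item_b):
    """Cosine similarity of two items over users who rated both (assumed measure)."""
    common = [u for u, r in ratings.items() if item_a in r and item_b in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][item_a] * ratings[u][item_b] for u in common)
    norm_a = sqrt(sum(ratings[u][item_a] ** 2 for u in common))
    norm_b = sqrt(sum(ratings[u][item_b] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings, user, target):
    """Predict the user's rating of `target` as a similarity-weighted
    average of the ratings the user has already given to other items."""
    numerator = denominator = 0.0
    for item, score in ratings[user].items():
        if item == target:
            continue
        sim = item_similarity(ratings, item, target)
        numerator += sim * score
        denominator += abs(sim)
    return numerator / denominator if denominator else None


# ratings: {user: {item: rating}}
ratings = {"u1": {"A": 5, "B": 5}, "u2": {"A": 1, "B": 1}, "u3": {"A": 4}}
prediction = predict_rating(ratings, "u3", "B")
```

Here only users who rated both items contribute to the similarity, which matches the description that the method uses customers who entered preferences for both products.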

In the present invention, noise-removed data is used to generate a content recommendation collaborative filtering model using an item-based collaborative filtering method. In the present specification, movies are used as the media content for the sake of explanation. However, the scope of the present invention is not limited to movies and can be applied to all media content.

Dynamic noise removal method for content recommendation

FIG. 1 is a diagram showing an example of a flowchart of the dynamic noise removal method for content recommendation of the present invention. Referring to FIG. 1, the method includes collecting data on users, content, and ratings (S11), reflecting a weight based on the difference in evaluation time (S12), reflecting a weight based on the number of contents evaluated by the user (S13), calculating [Figure pat00039] of the weighted ratings of each user (S14), removing noise data (S15), composing training data and test data (S16), creating a content recommendation collaborative filtering model (S17), and testing the model and predicting ratings for movies to recommend (S18).

First, data for generating a content recommendation model is collected (S11). For example, user information, movie information, and rating data for a movie are collected. The user information may include the user's sex, age, number of evaluations, and the like. The movie information may include a movie title, a genre, an opening date, a synopsis, and the like. The rating information may include rating points for evaluation fields such as amusement and artistry for the movie.

After the data is collected as described above, a weight is calculated and applied using Equation (1) in order to reflect how the influence of a rating changes with the difference in evaluation time (S12).

Figure pat00040

Here, [Figure pat00041] denotes the original rating information, [Figure pat00042] is a weight based on the time difference, [Figure pat00043] is the current time point, [Figure pat00044] is the time when the movie was evaluated, and [Figure pat00045] is a proportionality constant. [Figure pat00046] and [Figure pat00047] are converted into day numbers using Equation (2) before being substituted.

Figure pat00048

(Unix time: seconds elapsed since January 1, 1970, 0:00:00; converted timestamp: number of days since January 1, 1900; 9/24: offset for GMT+9, expressed as a fraction of a day)
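
The conversion described in the note above can be sketched as follows. The equation image itself is not reproduced in this text, so this is an assumption-laden reading of the note: Unix seconds are divided by 86,400, the number of days from January 1, 1900 to January 1, 1970 is added, and 9/24 of a day is added for GMT+9. The function names are hypothetical, and the exact epoch convention used in the original equation is not known.

```python
from datetime import date

SECONDS_PER_DAY = 86_400
# Days from 1900-01-01 to 1970-01-01 (assumed reading of "days since January 1, 1900").
EPOCH_OFFSET_DAYS = (date(1970, 1, 1) - date(1900, 1, 1)).days
# Nine hours for GMT+9, expressed as a fraction of a day.
GMT9_OFFSET_DAYS = 9 / 24


def unix_to_day_number(unix_seconds):
    """Convert a Unix timestamp (seconds) into a fractional day number."""
    return unix_seconds / SECONDS_PER_DAY + EPOCH_OFFSET_DAYS + GMT9_OFFSET_DAYS


def days_between(now_unix, rated_unix):
    """Difference in days between the current time and the evaluation time."""
    return unix_to_day_number(now_unix) - unix_to_day_number(rated_unix)
```

Because both timestamps receive the same constant offsets, the offsets cancel in `days_between`; only the division by the number of seconds per day affects the time difference.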

Figure pat00049

Equation (3) reflects both the weight based on the number of movies watched by the user and the weight based on the difference in evaluation time. Here, [Figure pat00050] denotes the weight according to the number of movies viewed by the user, and [Figure pat00051] reflects the effect of the difference in evaluation time.

For example, [Figure pat00052] can be assigned by dividing the range between the smallest and the largest number of movies viewed by any user into five sections weighted 0.1 to 0.5. Likewise, [Figure pat00053] expresses the difference between the current time and the oldest evaluation time in days and divides it into five sections weighted 0.1 to 0.5, so that older evaluations receive less weight and more recent evaluations receive more weight. Table 1 is an example of the weighting.

Figure pat00054
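
The five-section weighting described above can be sketched as follows. Table 1 is not reproduced in this text, so the equal-width sections, the function names, and the assumption that a larger number of viewed movies maps to a larger weight are illustrative choices only; the text states the direction explicitly only for the time-based weight.

```python
def section_weight(value, lowest, highest, weights=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Map `value` in [lowest, highest] onto one of five equal-width sections."""
    if highest <= lowest:
        return weights[-1]
    width = (highest - lowest) / len(weights)
    index = int((value - lowest) / width)
    # Clamp so that value == highest falls in the last section.
    index = max(0, min(index, len(weights) - 1))
    return weights[index]


def recency_weight(age_days, oldest_age_days):
    """Older evaluations get smaller weights, recent ones larger weights."""
    return section_weight(oldest_age_days - age_days, 0, oldest_age_days)


# Assumed direction: users who rated more movies receive a larger weight.
count_weight = section_weight(50, 0, 100)
```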

When the weights according to the evaluation time and the number of movies have been reflected, the average and standard deviation of each user's weighted rating data are calculated, and [Figure pat00055] is obtained using Equation (4) to give the rating distribution of each user (S14).

Figure pat00056

[Figure pat00057]: average; [Figure pat00058]: number of movies rated by the user; [Figure pat00059]: rating of the [Figure pat00060]-th movie rated by the user; [Figure pat00061]: standard deviation; [Figure pat00062]: [Figure pat00063]-th [Figure pat00064]
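
The per-user average and standard deviation of step S14 can be sketched as below. Equation (4) is not reproduced in this text, so the use of the population standard deviation (dividing by the number of rated movies) is an assumption, as is the function name `user_distribution`.

```python
from statistics import fmean, pstdev


def user_distribution(weighted_ratings):
    """Return {user: (average, standard deviation)} of weighted ratings.

    weighted_ratings: {user: [weighted rating, ...]}
    pstdev divides by n; whether the original uses n or n - 1 is not known.
    """
    return {
        user: (fmean(values), pstdev(values))
        for user, values in weighted_ratings.items()
        if values
    }


stats = user_distribution({"u1": [1.0, 3.0], "u2": [2.0, 2.0, 2.0]})
```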

When a rating distribution has been obtained for each user, the noise data is removed (S15). FIG. 2 is a view illustrating an example of a process of obtaining a rating distribution of each user and comparing it with a predetermined value to remove noise. Referring to FIG. 2, [Figure pat00065] is compared with a predetermined value: when a user has a low rating distribution, more of the high ratings are removed, and when a user has a high rating distribution, more of the low ratings are removed. In the case of an intermediate distribution, ratings at both ends are removed. (For example, each user's [Figure pat00066] is compared with the distribution of all users to determine the tendency of that user's distribution, and ratings other than those within the range of the average [Figure pat00068] of the [Figure pat00067] values are removed. That is, for users with a high rating distribution, the values falling short of [Figure pat00069] are removed, and for users with a low rating distribution, the values exceeding the average [Figure pat00070] are removed.) Unlike the existing extreme value removal method, this can improve prediction performance because the noise data is removed after judging each user's rating tendency.
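
The tendency-dependent removal in step S15 can be sketched as follows. The thresholds in the original equations are not reproduced in this text, so the cut-offs used here (the user's average plus or minus `k` standard deviations) and the `margin` that separates high, low, and intermediate users are assumptions made purely for illustration.

```python
from statistics import fmean, pstdev


def remove_noise(weighted_ratings, k=1.0, margin=0.25):
    """Drop ratings that contradict each user's rating tendency.

    weighted_ratings: {user: {item: weighted rating}}
    k, margin: hypothetical parameters, not taken from the patent.
    """
    all_values = [v for items in weighted_ratings.values() for v in items.values()]
    global_mean = fmean(all_values)
    cleaned = {}
    for user, items in weighted_ratings.items():
        values = list(items.values())
        mean, std = fmean(values), pstdev(values)
        low_cut, high_cut = mean - k * std, mean + k * std
        if mean > global_mean + margin:
            high_cut = float("inf")    # high-rating user: remove only low ratings
        elif mean < global_mean - margin:
            low_cut = float("-inf")    # low-rating user: remove only high ratings
        # Intermediate users keep both cuts, so both ends are removed.
        cleaned[user] = {i: v for i, v in items.items() if low_cut <= v <= high_cut}
    return cleaned


data = {
    "high": {"a": 5, "b": 5, "c": 5, "d": 1},
    "low": {"a": 1, "b": 1, "c": 1, "d": 5},
    "mid": {"a": 3, "b": 3, "c": 3, "d": 3, "e": 1, "f": 5},
}
cleaned = remove_noise(data)
```

In this toy data the overall average is 3.0, so the lone low rating of the "high" user, the lone high rating of the "low" user, and both extremes of the "mid" user are dropped.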

The rating data from which the noise data has been removed is divided into training data and test data (S16). Then, a content recommendation collaborative filtering model is generated by learning from the training data (S17). Once the model has been generated, it can be tested using the test data to evaluate its performance (S18).
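
A minimal sketch of the split in step S16 is shown below. The text does not state a split ratio or a sampling scheme, so the 80/20 random split, the fixed seed, and the name `split_ratings` are assumptions.

```python
import random


def split_ratings(records, test_fraction=0.2, seed=0):
    """Shuffle rating records and split them into (training, test) lists.

    test_fraction and seed are hypothetical defaults, not from the patent.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


records = [("user", movie_id, 3.0) for movie_id in range(10)]
train, test = split_ratings(records)
```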

When a content recommendation collaborative filtering model is generated, it is possible to recommend a content suitable for the user's tendency (S19).

Hereinafter, a content recommendation system using the dynamic noise removal method will be described.

Content recommendation system

FIG. 3 is a diagram schematically illustrating a content recommendation service providing environment including the content recommendation system 10 according to an embodiment of the present invention. Referring to FIG. 3, the content recommendation system 10 according to an embodiment of the present invention is connected to a content source 20 and a user device 40 via a network.

The content source 20 includes a website or the like where a user can purchase content and record a rating for the content. The user can purchase TV programs, movies, radio programs, and the like from such content sources.

The content recommendation system 10 includes a communication interface 100, a weight calculation unit 200, a noise removing unit 300, a collaborative filtering model generation unit 400, and a content recommendation unit 500. The communication interface 100 receives content-related data from the content sources 20 or receives a content recommendation request from the user device 40. In the case of movie content, the content-related data may include, for example, content information such as genre, release date, and synopsis, as well as existing rating information. The content recommendation request may include, for example, a request for recommendation of new movies or a request for recommendation of movies by genre. The communication interface 100 can also transmit the recommended content information to the user device 40. The weight calculation unit 200 assigns weights to the collected rating data on the basis of the rating time difference and the number of movies evaluated by the user. The noise removing unit 300 removes noise data according to the tendency of each user's distribution of weighted ratings. The collaborative filtering model generation unit 400 generates a content recommendation collaborative filtering model using the noise-removed rating data as training data. The content recommendation unit 500 then calculates a predicted rating for the content using the model and generates the recommended content information accordingly.

FIG. 4 is a diagram showing an example of a content recommendation flowchart in the content recommendation unit of the present invention. Referring to FIG. 4, when a content recommendation request input from the user device 40 is received by the receiving unit (S41), the content recommendation unit 500 determines whether a content recommendation collaborative filtering model has been created (S42). If the model has not been created, it is created by the weight calculation unit 200, the noise removing unit 300, and the collaborative filtering model generation unit 400 in the order shown in FIG. 1 (S43). Once the model exists, the predicted rating for the content is calculated using the model (S44). The recommended content information list is then sorted or categorized according to the calculated predicted ratings (S45). For example, if a user requests a recommendation of the newest movies, a list can be generated that orders the latest movies from the highest predicted rating down. Alternatively, when a user requests movie recommendations by genre, a list can be generated that sorts movies by predicted rating within each genre, such as action, romance, and horror. In addition, brief introductory information about the recommended content (for example, a synopsis of a movie and its running time) can be generated. The communication interface 100 transmits the generated content recommendation information to the user device 40 (S46).
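
The sorting and categorizing step (S45) can be sketched as follows. The data layout (`predicted` as a title-to-score mapping, `catalog` with `genre` and `synopsis` fields) and the function name `build_recommendations` are hypothetical; the patent does not specify a data structure.

```python
def build_recommendations(predicted, catalog, top_n=10, by_genre=False):
    """Sort titles by predicted rating and attach a brief outline of each.

    predicted: {title: predicted rating}
    catalog:   {title: {"genre": ..., "synopsis": ...}}  (assumed layout)
    """
    ranked = sorted(predicted.items(), key=lambda pair: pair[1], reverse=True)
    entries = [
        {
            "title": title,
            "predicted": score,
            "genre": catalog[title]["genre"],
            "synopsis": catalog[title]["synopsis"],
        }
        for title, score in ranked
    ]
    if not by_genre:
        return entries[:top_n]
    grouped = {}
    for entry in entries:  # entries are already in descending score order
        bucket = grouped.setdefault(entry["genre"], [])
        if len(bucket) < top_n:
            bucket.append(entry)
    return grouped


predicted = {"A": 4.5, "B": 3.0, "C": 4.8}
catalog = {
    "A": {"genre": "action", "synopsis": "..."},
    "B": {"genre": "romance", "synopsis": "..."},
    "C": {"genre": "action", "synopsis": "..."},
}
flat = build_recommendations(predicted, catalog, top_n=2)
grouped = build_recommendations(predicted, catalog, by_genre=True)
```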

The user device 40 includes a PC, a tablet, a smart phone, an IPTV, etc. having an interface through which a user can purchase content and input a rating and content recommendation request thereto.

Performance evaluation Experimental Example

An example of the performance of the collaborative filtering model generated using the noise data removal method of the present invention is as follows.

For the experiment, the MovieLens 100k data set was used. This data set consists of 100,000 movie ratings given to 1,682 movies by 943 users (each with at least 20 ratings). Mean Absolute Error (MAE) and coverage were used as performance measures for comparison with a collaborative filtering model using an existing noise data removal method.

Mean Absolute Error (MAE) is a measure of recommendation accuracy, calculated from the difference between the actual preference and the predicted value as shown in Equation (5).

Figure pat00071

Here, N is the total number of evaluation objects, [Figure pat00072] is the predicted score, and [Figure pat00073] is the actual score. That is, the absolute value of the difference between the predicted score and the actual score is taken, and the sum is divided by the total number of evaluation objects. MAE is the average of the absolute errors, each of which has the same weight regardless of its magnitude. The lower the value, the better the performance.
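
The MAE computation described in words above (sum of absolute differences between predicted and actual scores, divided by the number of evaluation objects) can be written as a short sketch. The function name and the input layout (two equal-length sequences) are assumptions for the example.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual scores."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Absolute errors are 1, 0, and 2, so the average is 1.0.
error = mean_absolute_error([4, 3, 5], [5, 3, 3])
```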

Coverage is a technique for evaluating the possibility of recommending various contents according to the user's preference.

Figure pat00074

That is, coverage relates the recommended content set [Figure pat00075] to the set of contents in which genre-specific information is not duplicated [Figure pat00076]. The higher the value, the better the performance.

Table 2 below is a table for evaluating the performance of the content recommendation collaboration filtering model generated using the noise data removal method disclosed herein.

Figure pat00077

Pw1 : Noise data is removed using Technique 1 (time effect * rating), and recommendation performance is measured with the collaborative filtering method.

P MF 1 : All data is split into male and female users, Technique 1 is applied to remove noise data, and the recombined data is then used for collaborative filtering. (The data is split by gender because male and female users may differ in how they tend to assign ratings.)

Pw2 : Noise data is removed using Technique 1 and Technique 2, and recommendation performance is measured with collaborative filtering.

P MF 2 : All data is split into male and female users, and Techniques 1 and 2 are applied to remove noise data.

Pw3 : Noise data is removed using Technique 1, Technique 2, and Technique 3, and recommendation performance is measured with collaborative filtering.

P MF 3 : All data is split into male and female users, and Techniques 1, 2, and 3 are applied to remove noise data.

Figure pat00078
: Proportional constant (the larger the value of the proportional constant, the smaller the influence of old ratings)

For the collaborative filtering model using the existing noise reduction method, the MAE was 0.816 and the coverage (diversity index) was 0.056. The proposed method ( P MF 3 ) achieved an MAE of 0.474 and a coverage of 0.045. Recommendation accuracy is therefore improved over the existing method. The diversity index is lower than with the existing method, but diversity generally decreases as recommendation accuracy rises, because only the content best suited to the user is recommended.
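To put the reported figures in proportion, the relative change of each metric can be computed directly from the four values stated above. The helper name `relative_change` is an assumption introduced for this illustration; only the four input numbers come from the text.

```python
def relative_change(baseline: float, proposed: float) -> float:
    """Signed change of the proposed value relative to the baseline."""
    return (proposed - baseline) / baseline


# Values reported in the text: existing method vs. proposed method (P MF 3).
mae_change = relative_change(0.816, 0.474)
coverage_change = relative_change(0.056, 0.045)

print(f"MAE change: {mae_change:.1%}")            # -41.9%
print(f"Coverage change: {coverage_change:.1%}")  # -19.6%
```

Under this reading, the MAE falls by roughly 42% while coverage falls by roughly 20%.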

10: Content recommendation system
100-1, 100-2: Communication interface
200: Weight calculation unit
300: Noise removing unit
400: Collaborative filtering model generating unit
500: Content recommendation unit
20: Content sources
40: User device

Claims (14)

1. A dynamic noise removal method for content recommendation, the method comprising:
reflecting at least one of a content evaluation time difference and a number of contents evaluated by a user as a weight;
calculating a rating distribution for each user; and
comparing the distribution with a predetermined value to remove noise data.
2. The method according to claim 1, further comprising
generating a content recommendation collaborative filtering model using the rating data obtained through the step of removing the noise data as learning and test data.
3. The method of claim 2, further comprising:
collecting data regarding at least one of a user, content, and a rating; and
recommending content using the content recommendation collaborative filtering model.
4. The method of claim 3,
wherein the step of reflecting as the weight comprises
reflecting the difference between the current time and the content evaluation time in the content rating using the following equation:
Figure pat00079

(here,
Figure pat00080
indicates the original rating information,
Figure pat00081
is a weight based on the time difference,
Figure pat00082
is the current time,
Figure pat00083
is the time when the movie was evaluated, and
Figure pat00084
is a proportional constant).
5. The method of claim 4,
wherein the step of reflecting as the weight comprises
reflecting the number of contents evaluated by the user and the evaluation time difference in the content rating using the following equation:
Figure pat00085

(here,
Figure pat00086
is the weight according to the number of movies viewed by the user, and
Figure pat00087
reflects the effect of the evaluation time difference).
6. The method of claim 5,
wherein the step of calculating a rating distribution for each user comprises
calculating an average and a standard deviation of the ratings in which the weights are reflected,
Figure pat00088
using the following equation:
Figure pat00089

(
Figure pat00090
: average,
Figure pat00091
: number of movies rated by the user,
Figure pat00092
: rating of the
Figure pat00093
-th movie rated by the user,
Figure pat00094
: standard deviation,
Figure pat00095
:
Figure pat00096
-th
Figure pat00097
).
7. The method according to claim 6,
wherein the step of comparing the distribution with a predetermined value to remove noise data comprises:
removing more low-rating data than high-rating data if the average rating of a user is larger than the average rating of all users; and
removing more high-rating data than low-rating data if the average rating of the user is smaller than the average rating of all users.
8. The method of claim 7,
wherein the step of recommending the content comprises:
calculating a predicted rating for the content subject to recommendation; and
generating recommended content information.
9. The method of claim 8,
wherein the step of recommending the content comprises:
receiving a user's content recommendation request;
determining whether a content recommendation model has been generated; and
transmitting the generated recommended content information to the user device.
10. The method of claim 9,
wherein the recommended content information includes an ordered list of predicted ratings and an overview of the content.
11. A content recommendation system using a dynamic noise removal method for content recommendation, the system comprising:
a content rating weight calculation unit that reflects a content evaluation time difference and the number of contents evaluated by a user as a weight;
a noise removing unit that calculates a rating distribution for each user and compares the distribution with a predetermined value to remove noise data; and
a collaborative filtering model generation unit that generates a content recommendation collaborative filtering model using the rating data from which noise data has been removed as learning and test data.
12. The system of claim 11, further comprising:
a content recommendation unit that generates recommended content using the content recommendation collaborative filtering model; and
a communication interface that receives data relating to at least one of a user, content, and a rating from content sources and transmits the recommended content to a user device.
13. The system of claim 12,
wherein the content rating weight calculation unit
reflects the difference between the current time and the content evaluation time in the content rating using the following equation:
Figure pat00098

(here,
Figure pat00099
indicates the original rating information,
Figure pat00100
is a weight based on the time difference,
Figure pat00101
is the current time,
Figure pat00102
is the time when the movie was evaluated, and
Figure pat00103
is a proportional constant),
and reflects the number of contents evaluated by the user and the evaluation time difference in the content rating using the following equation:
Figure pat00104

(here,
Figure pat00105
is the weight according to the number of movies viewed by the user, and
Figure pat00106
reflects the effect of the evaluation time difference).
14. The system of claim 13,
wherein the noise removing unit
calculates, for each user, an average and a standard deviation of the ratings in which the weight is reflected,
Figure pat00107
using the following equation:
Figure pat00108

(
Figure pat00109
: average,
Figure pat00110
: number of movies rated by the user,
Figure pat00111
: rating of the
Figure pat00112
-th movie rated by the user,
Figure pat00113
: standard deviation,
Figure pat00114
:
Figure pat00115
-th
Figure pat00116
)
to obtain a rating distribution for each user,
removes more low-rating data than high-rating data if the average rating of a user is larger than the average rating of all users, and
removes more high-rating data than low-rating data if the average rating of the user is smaller than the average rating of all users.
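
The noise removal steps recited in the claims above (time-based weighting, per-user rating distribution, and asymmetric removal relative to the overall average) can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed equations, which appear only as figures in this text: the exponential decay in `time_weight`, the k-sigma bounds, and the names `remove_noise`, `lam`, `tight`, and `loose` are all hypothetical. The weight based on the number of contents a user has evaluated is omitted.

```python
import math
from statistics import mean, pstdev


def time_weight(t_now: float, t_rated: float, lam: float) -> float:
    # Assumed form: exponential decay, so a larger constant lam
    # reduces the influence of older ratings.
    return math.exp(-lam * (t_now - t_rated))


def remove_noise(ratings, t_now, lam=0.01, tight=1.0, loose=2.0):
    """ratings: iterable of (user, item, rating, t_rated) tuples.

    Returns the (user, item, weighted_rating) tuples that are kept.
    """
    weighted = [(u, i, r * time_weight(t_now, t, lam)) for u, i, r, t in ratings]
    overall_mean = mean([w for _, _, w in weighted])

    by_user = {}
    for u, i, w in weighted:
        by_user.setdefault(u, []).append((i, w))

    kept = []
    for u, rows in by_user.items():
        values = [w for _, w in rows]
        mu, sigma = mean(values), pstdev(values)
        if mu > overall_mean:
            # User rates above the overall average: remove more low ratings.
            low_k, high_k = tight, loose
        else:
            # User rates at or below the overall average: remove more high ratings.
            low_k, high_k = loose, tight
        lower, upper = mu - low_k * sigma, mu + high_k * sigma
        kept.extend((u, i, w) for i, w in rows if lower <= w <= upper)
    return kept


# Hypothetical data; lam=0 disables the time effect so the bounds are easy to follow.
sample = [("A", n, r, 0.0) for n, r in enumerate([5, 5, 5, 5, 1])]
sample += [("B", n, r, 0.0) for n, r in enumerate([1, 1, 1, 1, 5])]
print(len(remove_noise(sample, t_now=0.0, lam=0.0)))
```

In this toy data set, user A's single low rating and user B's single high rating fall outside the tighter bound on their respective sides and are dropped, leaving 8 of the 10 ratings.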

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150189979A KR20170079423A (en) 2015-12-30 2015-12-30 Dynamic Noise Reduction Method based on Content Rating Distribution for Content Recommendation and Content Recommendation System Using Thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020150189979A KR20170079423A (en) 2015-12-30 2015-12-30 Dynamic Noise Reduction Method based on Content Rating Distribution for Content Recommendation and Content Recommendation System Using Thereof

Publications (1)

Publication Number Publication Date
KR20170079423A KR20170079423A (en) 2017-07-10

Family

ID=59356256

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150189979A KR20170079423A (en) 2015-12-30 2015-12-30 Dynamic Noise Reduction Method based on Content Rating Distribution for Content Recommendation and Content Recommendation System Using Thereof

Country Status (1)

Country Link
KR (1) KR20170079423A (en)


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190102905A (en) 2018-02-27 2019-09-04 울산과학기술원 Method for calculating rating of content
WO2020153750A1 (en) * 2019-01-22 2020-07-30 삼성전자 주식회사 Method and device for providing application list by electronic device
KR20200094829A (en) * 2019-01-22 2020-08-10 삼성전자주식회사 Apparatus and method for providing of application list in electronic device
CN112700054A (en) * 2021-01-05 2021-04-23 辽宁工程技术大学 Recommendation algorithm optimization method based on non-independent same distribution
US12020710B2 (en) 2021-03-05 2024-06-25 Samsung Electronics Co., Ltd. Electronic apparatus and controlling method thereof

Similar Documents

Publication Publication Date Title
McLaughlin et al. A collaborative filtering algorithm and evaluation metric that accurately model the user experience
JP3116851B2 (en) Information filtering method and apparatus
US20160364736A1 (en) Method and system for providing business intelligence based on user behavior
KR101471940B1 (en) Apparatus, System, Method and Computer Readable Recording Media Storing the Program for Related Recommendation of TV Program Contents and Web Contents
Cremonesi et al. Hybrid algorithms for recommending new items
Pirasteh et al. Item-based collaborative filtering with attribute correlation: a case study on movie recommendation
US8250012B1 (en) Evaluating recommendations by determining user actions, and performance values pertaining to lists of recommendations
US20120296701A1 (en) System and method for generating recommendations
KR20170079429A (en) A clustering based collaborative filtering method with a consideration of users' features and movie recommendation system using thereof
US20150186947A1 (en) Digital content recommendations based on user comments
KR101725510B1 (en) Method and apparatus for recommendation of social event based on users preference
Shang et al. A randomwalk based model incorporating social information for recommendations
CN105338408B (en) Video recommendation method based on time factor
CN104615741B (en) Cold-start project recommendation method and device based on cloud computing
US11653064B2 (en) Methods and systems for determining disliked content
KR20170079423A (en) Dynamic Noise Reduction Method based on Content Rating Distribution for Content Recommendation and Content Recommendation System Using Thereof
KR101859620B1 (en) Method and system for recommending content based on trust in online social network
Thomas et al. Comparative study of recommender systems
Liu et al. Document recommendations based on knowledge flows: A hybrid of personalized and group‐based approaches
JP2006053616A (en) Server device, web site recommendation method and program
JP2012098950A (en) Similar user extraction method, similar user extraction device and similar user extraction program
Castillejo et al. Social network analysis applied to recommendation systems: Alleviating the cold-user problem
Wang et al. Predicting the incremental benefits of online information search for heterogeneous consumers
KR101496181B1 (en) Methods and apparatuses for a content recommendations using content themes
Bangale et al. Recipe recommendation system using content-based filtering