CN117407591A

CN117407591A - Movie personalized poster recommendation method

Info

Publication number: CN117407591A
Application number: CN202311479593.XA
Authority: CN
Inventors: 王洪君; 闫立鑫; 韩亚; 陈灵; 马荣深; 包晖
Original assignee: Sichuan Changhong Electric Co Ltd
Current assignee: Sichuan Changhong Electric Co Ltd
Priority date: 2023-11-08
Filing date: 2023-11-08
Publication date: 2024-01-16

Abstract

The invention discloses a movie personalized poster recommendation method and relates to the technical field of movie recommendation. A user interest model is used to select the movies a user likes from a movie recommendation list, yielding the movie list for the user. For each movie in that list, if the movie's actor set intersects the user's favorite actor list, a poster featuring the user's favorite actor from the intersection is issued to the user; otherwise the public poster is returned. The method thereby provides accurate movie recommendations with personalized posters and solves the problem that a corresponding personalized poster cannot be issued for a user.

Description

Movie personalized poster recommendation method

Technical Field

The invention relates to the technical field of movie recommendation, in particular to a movie personalized poster recommendation method.

Background

With the development of the Internet, smart TVs have gained more room to grow, and smart TV users' demand for video viewing keeps increasing. Distributing movie resources requires not only attention to content quality but also a ranking method, so that both the movie content and the posters shown to users are as personalized as possible.

Currently, few products on the market offer personalized movie poster recommendation; most rely on public posters produced by operations editors. Public posters are homogeneous in both movie content and featured actors, that is, every user sees the same movie content and the same actor poster, which cannot meet users' growing viewing demands. Existing personalized posters are themselves homogeneous: they meet the basic demand for personalization but lack deep mining of user preferences and do not issue content tailored to each individual user.

Disclosure of Invention

The technical problem solved by the invention is as follows: a movie personalized poster recommendation method is provided, which solves the problem that a corresponding personalized poster cannot be issued for a user.

The technical solution adopted by the invention to solve this problem is a movie personalized poster recommendation method comprising the following steps:

S1, constructing a user interest model, wherein constructing the user interest model comprises obtaining user data, extracting feature data, processing the feature data, constructing the model, verifying the model and deploying the model;

S2, analyzing the user's historical viewing records, counting how often each actor appears, sorting the actors in descending order of frequency, and taking the first N actors as the user's favorite actor list;

S3, acquiring a movie recommendation list and the basic data of each movie in it, predicting with the user interest model the probability that the user likes each movie in the recommendation list, and selecting the M movies with the highest like probability as the movie list for the user;

S4, for each movie in the movie list for the user, if the movie's actor set intersects the user's favorite actor list, issuing to the user a poster featuring the user's favorite actor from the intersection, and otherwise issuing a public poster.

Further, the user data includes a historical favorite movie list, a user VIP value, a movie id, directors, actors, a release region, a movie subject, a movie release time, a movie type, a play count, a label, and a poster score.

Further, extracting feature data includes extracting the feature sets of the movies the user liked within a first preset time, wherein the features include director, actor, release region and movie subject.

Further, the feature processing comprises: calculating the intersection of the features of the movie to be predicted with the feature sets, normalizing it, and calculating the ratio of the normalized intersection to all actors of the movie to be predicted; normalizing the user VIP value, the movie release time and the movie poster score; and digitizing the historical favorite movie list, the movie id and the movie type.

Further, the digitizing includes: representing each movie in the historical favorite movie list by its movie id value; counting the play counts of the top P movies, ranked from high to low, within a second preset time, normalizing the play counts, recording them as sum_count, and associating each movie id with its sum_count; counting the number of titles of each type in the media resource library, normalizing the counts, recording them as type_count, and associating each movie type with its type_count; and defaulting the numeric feature value of any unassociated movie id or movie type to -1.
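The digitizing step can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the min-max normalization scheme and the sample counts are hypothetical; the text itself only says that the play counts and per-type title counts are normalized and that unassociated ids or types default to -1.

```python
def min_max(counts):
    """Scale a dict of raw counts to [0, 1] (the normalization scheme is an assumption)."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {key: (value - lo) / span for key, value in counts.items()}


def build_sum_count(play_counts, p):
    """Keep the top-P movies by play count and normalize their play counts."""
    ranked = sorted(play_counts.items(), key=lambda kv: kv[1], reverse=True)
    return min_max(dict(ranked[:p]))


def digitize(key, table, default=-1):
    """Look up the numeric feature value; unassociated keys default to -1."""
    return table.get(key, default)


# Hypothetical sample data for the second preset time window.
play_counts = {"m1": 900, "m2": 500, "m3": 100, "m4": 50}
type_titles = {"movie": 4000, "variety": 1000, "documentary": 2000}

sum_count = build_sum_count(play_counts, p=3)
type_count = min_max(type_titles)
```

With these tables, `digitize("m4", sum_count)` falls back to -1 because `m4` is outside the top P, which mirrors the default described above.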

Further, the model construction includes constructing a model using a machine learning algorithm GBDT and iterating through gradient descent.

Further, the data set used for model verification is divided into a training set and a test set, wherein the test set is the data of the A days before the current time, and the training set is the data of the B days before those A days.

Further, deploying the model comprises storing the trained model in HDFS as a PMML file, storing the digitized historical favorite movie list, movie id and movie type in HDFS in CSV format, and reading the CSV file and the model file with Spark.

Further, the basic data includes movie release time, director, actors, release region, movie subject, movie type, label, public poster and personalized poster.

The beneficial effects of the invention are as follows: the movie personalized poster recommendation method constructs a user interest model and uses it to select the movies the user likes from a movie recommendation list, yielding the movie list for the user. For each movie in that list, if the movie's actor set intersects the user's favorite actor list, a poster featuring the user's favorite actor from the intersection is issued to the user; otherwise the public poster is returned. Accurate movie recommendations and personalized posters are thereby provided, solving the problem that a corresponding personalized poster cannot be issued for a user.

Drawings

Fig. 1 is a flow chart of a movie personalized poster recommendation method of the invention.

Fig. 2 is a schematic flow chart of loading a user interest model to complete personalized poster recommendation in the movie personalized poster recommendation method.

Detailed Description

The movie personalized poster recommendation method of the invention, shown in Fig. 1, comprises steps S1 to S4 set out above; the details of each step are as follows:

Specifically, in step S1, the user data comprises a historical favorite movie list, a user VIP value, a movie id, directors, actors, a release region, a movie subject, a movie release time, a movie type, a label and a poster score; the label is 0 or 1, where 0 indicates that the user dislikes the movie and 1 indicates that the user likes it; the poster score is obtained from a MongoDB table; the movie types can be divided into six categories, namely TV series, movie, variety show, children's, animation and documentary.

The feature data extraction includes extracting the feature sets of the movies the user liked within a first preset time, wherein the features include director, actor, release region and movie subject, and the first preset time can be the last month or the last two months. For example, with the first preset time set to the last month, the directors, actors, release region and subject of the i-th movie in the historical favorite movie list are denoted historyi_director, historyi_actor, historyi_area and historyi_gene respectively. The director set is the union of all historyi_director and is denoted director_his; the actor set is the union of all historyi_actor and is denoted actor_his; the release region set is the union of all historyi_area and is denoted area_his; the subject set is the union of all historyi_gene and is denoted genes_his.
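As a rough illustration of how the four history sets might be built, the sketch below takes the union of each feature over the liked movies. The record layout (dicts with `director`, `actor`, `area` and `gene` keys), the function name and the sample values are assumptions made for this example, and the records are assumed to have already been filtered to the first preset time.

```python
def build_history_sets(liked_movies):
    """Union each feature over all liked movies in the time window."""
    fields = ("director", "actor", "area", "gene")
    history_sets = {field: set() for field in fields}
    for movie in liked_movies:
        for field in fields:
            # A missing field contributes nothing to the union.
            history_sets[field] |= set(movie.get(field, ()))
    return history_sets


# Hypothetical liked movies from the last month.
liked_movies = [
    {"director": ["D1"], "actor": ["A1", "A2"], "area": ["CN"], "gene": ["comedy"]},
    {"director": ["D2"], "actor": ["A2", "A3"], "area": ["CN"], "gene": ["action"]},
]

history_sets = build_history_sets(liked_movies)
actor_his = history_sets["actor"]        # union of the per-movie actor lists
director_his = history_sets["director"]  # union of the per-movie director lists
```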

The feature processing comprises: calculating the intersection of the features of the movie to be predicted with the feature sets, normalizing it, and calculating the ratio of the normalized intersection to all actors of the movie to be predicted; normalizing the user VIP value, the movie release time and the movie poster score; and digitizing the historical favorite movie list, the movie id and the movie type. For example, if the actors of the movie to be predicted are denoted actor, the actor intersection is denoted actor_inter and the actor ratio is denoted actor_sim, then actor_inter = actor ∩ actor_his and actor_sim = length(actor_inter) / length(actor), where length denotes the value after normalization. The director ratio, the release region ratio and the movie subject ratio are obtained in the same way.

The user VIP value, the movie release time and the movie poster score are normalized to the interval [-1, 1].

The digitizing includes: representing each movie in the historical favorite movie list by its movie id value; counting the play counts of the top P movies, ranked from high to low, within a second preset time, normalizing the play counts, recording them as sum_count, and associating each movie id with its sum_count; counting the number of titles of each type in the media resource library, normalizing the counts, recording them as type_count, and associating each movie type with its type_count; and defaulting the numeric feature value of any unassociated movie id or movie type to -1. The second preset time can be 7 days, 8 days, etc.
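The ratio and normalization steps can be sketched as below. This is a hedged illustration only: the function names are hypothetical, `length` is approximated by a plain element count, the scaling to [-1, 1] is assumed to be min-max scaling, and the VIP range of 0 to 4 is invented for the example.

```python
def overlap_ratio(candidate_values, history_set):
    """Share of the candidate movie's values that also appear in the history set."""
    candidate = set(candidate_values)
    if not candidate:
        return 0.0  # no cast information: treat the overlap as zero (assumption)
    return len(candidate & history_set) / len(candidate)


def scale_to_pm1(value, lo, hi):
    """Min-max scale a value from [lo, hi] to [-1, 1] (scheme is an assumption)."""
    if hi == lo:
        return 0.0
    return 2 * (value - lo) / (hi - lo) - 1


# Hypothetical inputs: the candidate movie's actors and the user's actor history.
actor_sim = overlap_ratio(["A1", "A4"], {"A1", "A2", "A3"})
vip_norm = scale_to_pm1(3, lo=0, hi=4)
```

The same `overlap_ratio` call would apply to directors, release regions and subjects by swapping in the corresponding history set.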

The model construction includes constructing a model using a machine learning algorithm GBDT and iterating through gradient descent.

Specifically, the constructed model is trained with the actor ratio, director ratio, release region ratio, movie subject ratio, normalized user VIP value, normalized movie release time, normalized movie poster score, digitized movie id and digitized movie type as inputs and the label as output.
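To make the idea of gradient-boosted trees concrete, the sketch below fits a very small boosted model in plain Python. It is not the patented model: it assumes depth-1 trees (stumps), log loss, a fixed learning rate and a three-feature toy data set, and all function names are hypothetical. Each round fits a stump to the negative gradient of the loss (label minus predicted probability) and adds it to the running score.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals):
    """Least-squares depth-1 regression tree fitted to the residuals."""
    best = None
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:] if best else None


def fit_gbdt(rows, labels, rounds=50, learning_rate=0.3):
    """Boost stumps on the negative gradient of the log loss."""
    p = min(max(sum(labels) / len(labels), 1e-6), 1 - 1e-6)
    base_score = math.log(p / (1 - p))
    scores = [base_score] * len(rows)
    stumps = []
    for _ in range(rounds):
        # Negative gradient of log loss with respect to the score: y - p.
        residuals = [y - sigmoid(s) for y, s in zip(labels, scores)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break
        j, threshold, left_mean, right_mean = stump
        stumps.append(stump)
        scores = [s + learning_rate * (left_mean if row[j] <= threshold else right_mean)
                  for s, row in zip(scores, rows)]
    return base_score, learning_rate, stumps


def predict_like_probability(model, row):
    base_score, learning_rate, stumps = model
    score = base_score + sum(
        learning_rate * (left_mean if row[j] <= threshold else right_mean)
        for j, threshold, left_mean, right_mean in stumps)
    return sigmoid(score)


# Toy rows: [actor ratio, director ratio, normalized VIP value]; label 1 = liked.
rows = [[0.9, 0.5, 0.2], [0.8, 0.0, -0.5], [0.1, 0.0, 0.3], [0.0, 0.5, -0.1]]
labels = [1, 1, 0, 0]
model = fit_gbdt(rows, labels)
```

A production system would more likely use an established GBDT library with deeper trees; the stump version is used here only because it keeps the boosting loop visible.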

The data set used for model verification is divided into a training set and a test set, wherein the test set is the data of the A days before the current time and the training set is the data of the B days before those A days. For example, the data of the 1st day before the current time is used as the test set, and the data of the 2nd and 3rd days before the current time are used as the training set.
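A time-based split of this kind might look like the sketch below. It assumes that A and B are day counts (A = 1 and B = 2 in the example above), that each record carries a `day` field, and that the function and field names are free choices made for this illustration.

```python
from datetime import date, timedelta


def split_by_day(records, today, a_days=1, b_days=2):
    """Test set: the A days before today; training set: the B days before those."""
    test_start = today - timedelta(days=a_days)
    train_start = test_start - timedelta(days=b_days)
    train = [r for r in records if train_start <= r["day"] < test_start]
    test = [r for r in records if test_start <= r["day"] < today]
    return train, test


# Hypothetical records, one per day for the four days before "today".
today = date(2023, 11, 8)
records = [{"day": today - timedelta(days=d), "label": d % 2} for d in range(1, 5)]

train, test = split_by_day(records, today)
```

Splitting by date rather than at random keeps the test data strictly later than the training data, so the evaluation resembles how the deployed model sees new viewing behavior.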

Deploying the model comprises storing the trained model in HDFS, storing the digitized historical favorite movie list, movie id and movie type in HDFS in CSV format, and reading the CSV file and the model file with Spark.

Specifically, in step S2, N is a positive integer that can be set for the user.
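Building the favorite actor list amounts to a frequency count followed by a top-N cut. The sketch below shows one way to do it; the record layout (a list of dicts with an `actors` key), the function name and the sample history are assumptions for illustration.

```python
from collections import Counter


def favorite_actors(watch_history, n):
    """Return the n most frequent actors, most frequent first."""
    counts = Counter(actor for movie in watch_history for actor in movie["actors"])
    return [actor for actor, _ in counts.most_common(n)]


# Hypothetical viewing history.
watch_history = [
    {"actors": ["A1", "A2"]},
    {"actors": ["A2", "A3"]},
    {"actors": ["A2", "A1"]},
]

top_actors = favorite_actors(watch_history, n=2)
```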

Specifically, in step S3, the movie recommendation list may be any list recommended by the movie platform, such as a movie list, a TV series list, a background broadcast list, a whole-network hot list, a VIP zone, a satellite TV hot list, a public TV list, and the like. The movies the user likes are selected from the movie recommendation list by the user interest model and used as the movie list for the user, which improves the recommendation effect. The basic data include movie release time, director, actors, release region, movie subject, movie type, label, public poster and personalized poster, where a personalized poster is a poster featuring a specific actor. M is a preset value and may be, for example, 10 or 20.

Specifically, in step S4, the user's favorite actor in the intersection is the actor in the intersection who ranks highest in the user's favorite actor list. The movie list for the user and the issued posters are stored in an HBase table, with each content item and its poster joined by an underscore.

The method focuses on the user: it analyzes the user's viewing data and interaction data, extracts features, and fits the feature data with a machine learning model to achieve a personalized ranking of a preselected movie list. At the same time, it analyzes the user's favorite actor list and issues the corresponding favorite-actor posters to the user, thereby achieving personalized movie poster recommendation.

Examples:

After the user interest model is deployed, the process of loading the user interest model to complete personalized poster recommendation is shown in Fig. 2 and specifically comprises the following steps:

First, all movies in the movie recommendation list are acquired, the user interest model is loaded, the probability that the user likes each movie in the recommendation list is obtained, and the M movies with the highest like probability are selected as the movie list for the user.
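The top-M selection can be sketched as a sort on predicted probability. In the example below the scoring function is a stand-in dictionary lookup rather than the loaded model, and the function name, movie ids and scores are all hypothetical.

```python
def select_top_m(movies, like_probability, m):
    """Rank movies by predicted like probability and keep the first m."""
    ranked = sorted(movies, key=like_probability, reverse=True)
    return ranked[:m]


# Stand-in probabilities; in practice these would come from the user interest model.
fake_scores = {"m1": 0.2, "m2": 0.9, "m3": 0.6, "m4": 0.4}

user_movie_list = select_top_m(list(fake_scores), fake_scores.get, m=2)
```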

Second, the user's historical viewing records are acquired, the frequency with which each actor appears is counted, the actors are sorted in descending order of frequency, and the first N actors are taken as the user's favorite actor list.

Finally, for each movie in the movie list for the user, the intersection of the movie's actor set with the user's favorite actor list is calculated. If the intersection is empty, the public poster is issued; if it is not empty, the actor in the intersection who ranks highest in the user's favorite actor list is taken as the user's favorite actor, and a poster featuring that actor is issued.
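The poster selection rule can be sketched as follows. The movie record layout (`actors`, `personalized_posters`, `public_poster`), the function name and the file names are assumptions for this example. The fallback to the public poster when a matching actor has no personalized poster is also an assumption, since the text does not cover that case.

```python
def choose_poster(movie, favorite_actors):
    """Pick the poster of the highest-ranked favorite actor who appears in the movie."""
    cast = set(movie["actors"])
    for actor in favorite_actors:  # the list is ordered most-liked first
        if actor in cast:
            # Assumption: fall back to the public poster if no poster exists for this actor.
            return movie["personalized_posters"].get(actor, movie["public_poster"])
    return movie["public_poster"]  # empty intersection


# Hypothetical movies and favorite actor list.
movie_a = {
    "actors": ["A1", "A3"],
    "personalized_posters": {"A1": "a_A1.jpg", "A3": "a_A3.jpg"},
    "public_poster": "a_public.jpg",
}
movie_b = {
    "actors": ["A7"],
    "personalized_posters": {"A7": "b_A7.jpg"},
    "public_poster": "b_public.jpg",
}
favorites = ["A2", "A1", "A3"]

poster_a = choose_poster(movie_a, favorites)
poster_b = choose_poster(movie_b, favorites)
```

Iterating over the favorite list in rank order, rather than over the cast, is what guarantees that the highest-ranked actor in the intersection wins.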

Claims

1. A movie personalized poster recommendation method, characterized by comprising the following steps:

2. The movie personalized poster recommendation method according to claim 1, wherein said user data comprises a historical favorite movie list, a user VIP value, a movie id, directors, actors, a release region, a movie subject, a movie release time, a movie type, a play count, a label, and a poster score.

3. The movie personalized poster recommendation method according to claim 2, wherein said extracting feature data comprises extracting the feature sets of the movies the user liked within a first preset time, the features including director, actor, release region and movie subject.

4. The movie personalized poster recommendation method according to claim 3, wherein said feature processing comprises: calculating the intersection of the features of the movie to be predicted with the feature sets, normalizing it, and calculating the ratio of the normalized intersection to all actors of the movie to be predicted; normalizing the user VIP value, the movie release time and the movie poster score; and digitizing the historical favorite movie list, the movie id and the movie type.

5. The movie personalized poster recommendation method according to claim 4, wherein said digitizing comprises: representing each movie in the historical favorite movie list by its movie id value; counting the play counts of the top P movies, ranked from high to low, within a second preset time, normalizing the play counts, recording them as sum_count, and associating each movie id with its sum_count; counting the number of titles of each type in the media resource library, normalizing the counts, recording them as type_count, and associating each movie type with its type_count; and defaulting the numeric feature value of any unassociated movie id or movie type to -1.

6. The movie personalized poster recommendation method according to claim 5, wherein said model construction comprises constructing a model using a machine learning algorithm GBDT and iterating through gradient descent.

7. The movie personalized poster recommendation method according to claim 6, wherein the data set used for model verification is divided into a training set and a test set, wherein the test set is the data of the A days before the current time, and the training set is the data of the B days before those A days.

8. The movie personalized poster recommendation method according to claim 7, wherein said deploying the model comprises storing the trained model in HDFS as a PMML file, storing the digitized historical favorite movie list, movie id and movie type in HDFS in CSV format, and reading the CSV file and the model file with Spark.

9. The movie personalized poster recommendation method according to any one of claims 1-8, wherein said basic data comprises movie release time, director, actors, release region, movie subject, movie type, label, public poster, and personalized poster.