CN117648462A

CN117648462A - Video recommendation method and system

Info

Publication number: CN117648462A
Application number: CN202410116961.2A
Authority: CN
Inventors: 熊明辉; 尹继圣; 刘大照; 李航; 王家瑞
Original assignee: Shenzhen Ganzhen Intelligent Co ltd
Current assignee: Shenzhen Ganzhen Intelligent Co ltd
Priority date: 2024-01-29
Filing date: 2024-01-29
Publication date: 2024-03-05
Anticipated expiration: 2044-01-29
Also published as: CN117648462B

Abstract

The application provides a video recommendation method and system. A hot video list is obtained from the user behavior log data of all users under the same viewing authority. Selected user behavior log data are then put into an algorithm model for calculation, giving a corresponding interest close list for each video in the video set. A feature vector is calculated for each video from the video detail data, giving a corresponding content close list for each video. Next, according to the latest user log data of the target user, the corresponding interest close lists and content close lists are called to form a quasi interest video list. Finally, videos are selected from the hot video list and the quasi interest video list to form a hot recommendation list and an interest recommendation list. By integrating a recall algorithm with content-based recommendation and popularity-based recommendation, the accuracy, applicability and timeliness of video recommendation are improved, and a more personalized and comprehensive video recommendation service can be provided to users.

Description

Video recommendation method and system

Technical Field

The invention relates to the field of video recommendation, in particular to a video recommendation method and system.

Background

With the information explosion and the rapid development of multimedia applications, recommendation systems, as a key technology for information filtering and personalized services, have gradually become the core of many applications. Traditional recommendation methods mainly comprise collaborative filtering and content-based recommendation; however, these methods have limitations when facing challenges such as user cold start, data sparsity and recommendation diversity. Traditional popularity-based recommendation tends to narrow the range of information shown, trapping users in information "filter bubbles" and lacking sufficient diversity. In addition, these methods often have difficulty processing large-scale user and video information efficiently, which reduces recommendation performance and fails to meet users' diverse information needs. When a business expands globally, its users are distributed across different countries and regions, different agent tenants are responsible for operation in different countries, and the videos that may be watched differ between the management scopes of different tenants. Current video recommendation systems cannot support different video management scopes for different tenants.

Disclosure of Invention

In view of this, the present application proposes a video recommendation method and system, as shown in FIGS. 1-5. The specific scheme is as follows:

In a first aspect, the application provides a video recommendation method, which comprises the following steps:

obtaining a video set corresponding to each viewing authority in a video database;

obtaining user behavior log data of all users under the same viewing authority in a preset first time period to obtain first log data, and obtaining a hot video list in the first time period according to the first log data;

acquiring user behavior log data of all users under the same viewing authority in a preset second time period to obtain second log data, and selecting, according to the second log data, the video names of a plurality of videos closest to each video in the video set to obtain an interest close list; the interest close list is stored in a cache so that it can be called later;

calculating the feature vector of each video according to the video detail data of all videos in the video set, and obtaining the video names of a plurality of videos closest in content to each video to obtain a content close list; the content close list is stored in a cache so that it can be called later;

obtaining video names in the latest user log data of a target user, and calling the corresponding interest close list and content close list according to the video names to form a quasi interest video list; likewise, the quasi interest video list can be stored in a cache so that it can be called later;

and selecting a plurality of videos from the hot video list to form a hot recommendation list, and selecting a plurality of videos from the quasi interest video list to form an interest recommendation list. In practical application, the hot recommendation list and the interest recommendation list may be generated on a fixed schedule or, depending on the actual situation, without a fixed schedule, and the videos composing the lists may be selected by calling the data stored in the cache.

In some embodiments, the acquiring the hot video list in the first period of time includes:

screening first user behavior log data from the first log data, and distributing corresponding weight scores for each first user behavior log data;

and assigning scores to the videos according to the first user behaviors related to each video, and selecting a plurality of videos with highest scores to form a hot video list of the users under the same viewing authority in a first time period.

In some embodiments, the process of obtaining the interest close list includes:

the user behavior log data includes viewing data and collection data;

obtaining watching data of all users under the same watching authority within a preset second time period, wherein the watching data comprises field data of user names and video names;

and putting the field data of the user name and the video name into a pre-trained recall algorithm model for modeling, to obtain for each video the video names of a plurality of videos closest to it, forming an interest close list.

In some embodiments, the process of obtaining the content close list includes:

acquiring video detail data of all videos in a video set, and putting the video detail data into a word segmentation device to obtain a keyword list;

using the TF-IDF algorithm (term frequency-inverse document frequency, a common weighting technique in information retrieval and data mining, often used to extract keywords from articles): replacing the keywords in the keyword list with a plurality of numbers through a pre-trained feature vector model, and calculating the word frequency of each keyword in the video, wherein the word frequency is the ratio of the number of times a given keyword appears in the video to the total number of words of the video;

obtaining the total number of videos in the video set and the number of videos in the video set containing each keyword, and calculating the inverse document frequency IDF of each keyword, wherein n is the total number of videos in the video set, and m is the number of videos containing the keyword:

multiplying the obtained word frequency by the inverse document frequency to obtain the feature vector of each video, storing the feature vectors in a vector database, and calculating, for each video, the video names of a plurality of videos closest to it in content, to form a content close list.
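The IDF expression itself is not reproduced in the text above. As a hedged reading only, the weighting can be written in its textbook form using the symbols n and m defined above. The symbols c(t, v) (number of times keyword t appears in video v) and W(v) (total number of words of video v) are introduced here for illustration, and the logarithm base and the absence of a smoothing term are assumptions rather than details confirmed by the source.

```latex
\mathrm{TF}(t, v) = \frac{c(t, v)}{W(v)}, \qquad
\mathrm{IDF}(t) = \log\frac{n}{m}, \qquad
\text{TF-IDF}(t, v) = \mathrm{TF}(t, v) \times \mathrm{IDF}(t)
```

Under this form, a keyword that appears in every video (m = n) receives an IDF of zero and therefore does not contribute to the feature vector.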

In some embodiments, the process of composing the quasi interest video list includes:

the latest user behavior log data comprises latest watching records and latest collection records;

according to the video names in the latest watching record of the target user, the interest close list and the content close list corresponding to the target user are called, the called interest close list is put into a first set, and the called content close list is put into a second set;

according to the video names in the latest collection records of the target users, the corresponding interest close list and content close list are called, the interest close list is put into a first set, and the content close list is put into a second set;

and extracting one or more video names from the first set and the second set respectively, and forming a quasi interest video list by the extracted video names.

In some embodiments, the specific steps of forming the popular recommendation list and the interest recommendation list include:

preset proportion M to N, preset constant k;

selecting M x k videos from the hot video list to form a hot recommendation list;

and selecting N x k videos from the quasi interest video list to form an interest recommendation list.

In some embodiments, when the number of videos in the hot video list is less than M x k, and/or the number of videos in the quasi interest video list is less than N x k, videos are drawn from the video set, either according to a preset rule or at random, to make up the shortfall.
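The selection and supplementing steps above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the choice to take candidates in list order, and the use of random supplementing (rather than a preset rule) are all assumptions.

```python
import random


def select_with_backfill(candidates, needed, video_set, rng=None):
    """Take up to `needed` names from `candidates` in order; if there are
    too few, top up with random names from `video_set` not already chosen.
    (Hypothetical helper; random supplementing is one of the two options.)"""
    rng = rng or random.Random()
    chosen = list(dict.fromkeys(candidates))[:needed]  # keep order, drop repeats
    if len(chosen) < needed:
        taken = set(chosen)
        pool = sorted(v for v in video_set if v not in taken)
        rng.shuffle(pool)
        chosen.extend(pool[: needed - len(chosen)])
    return chosen


def build_recommendations(hot_list, quasi_list, video_set, m, n, k, rng=None):
    """Form the hot list (M x k videos) and the interest list (N x k videos)."""
    hot = select_with_backfill(hot_list, m * k, video_set, rng)
    interest = select_with_backfill(quasi_list, n * k, video_set, rng)
    return hot, interest
```

With M:N = 1:2 and k = 3, for example, the sketch returns 3 hot videos and 6 interest videos, supplementing from the video set whenever a source list is too short.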

In some embodiments, the specific steps of forming the popular recommendation list and the interest recommendation list further comprise:

before selecting videos, detecting and removing expired data, invalid data and duplicate data in the hot video list and the quasi interest video list.

In some embodiments, the first time period and the second time period may each be a time range counted backward from the top of each hour, or a time range counted backward from the moment at which the user accesses the interest recommendation list and the hot recommendation list from the front end.

In practical application, the video playing system covers multiple countries and regions, which differ in language and other respects, and users in different regions watch different video content. In one embodiment, each operator, that is, each tenant, manages and operates all users in the region for which it is responsible, and the users belonging to an operator can only watch videos within that operator's operating scope. In other words, each operator corresponds to one viewing authority in the master video library, and the users it manages can only watch the videos within the viewing scope specified by that operator. All subsequent video name lists are likewise generated for the audience under the same viewing authority, managed by the same operator. It should be noted that different viewing authorities in different regions may also cover the same range of videos in the video library; the application does not limit how viewing authorities differ in viewable content.

In one embodiment, the video library may be the IMDb library (Internet Movie Database, a popular online database of films and television programs that contains a large amount of information about movies, television programs, actors, directors, dramas, etc.).

In a second aspect, the application proposes a video recommendation system, which can be used to implement any of the recommendation methods in the above technical scheme, comprising:

the video database is used for storing videos and detail data of each video, and a plurality of video sets are formed in the video database corresponding to the watching rights;

the rights management module is used for managing the viewing authority of the target user and limiting the target user to accessing only the video set corresponding to that viewing authority;

the reading module is used for acquiring user behavior log data;

the training module is used for model training and calculation on the user behavior log data acquired by the reading module;

the caching module is used for storing a hot video list, a quasi-interest video list, an interest close list and a content close list;

and the output module is used for selecting, composing and displaying a popular recommendation list and an interest recommendation list in the cache module.

The application integrates a recall algorithm with recommendation based on content and recommendation based on popularity.

Firstly, the hybrid recall algorithm provides personalized recommendation for the user, takes full account of the user's historical behavior and the preferences of similar users, and improves the accuracy of the recommendation system and user satisfaction.

Secondly, the content-based recommendation algorithm brings the characteristics of the videos themselves into consideration, which further improves the system's understanding of user interests and compensates for the weakness of collaborative filtering in scenarios such as cold start.

Finally, the introduction of hot recommendation lets the system adapt flexibly to real-time changes in user interest and keeps the recommendations novel and diverse.

This integrated design brings several advantages. Firstly, by organically combining the recall algorithm with content recommendation, the system draws on the strengths of both and provides a more comprehensive and accurate recommendation service. Secondly, new users and cold-start videos are handled better by the content-based recommendation algorithm, which improves the applicability of the system. Thirdly, adding hot recommendation makes the system more sensitive to hot videos, captures users' momentary interests in time, and provides valuable initialization data for new users.

The beneficial effects are as follows: the video recommendation method and system organically combine the recall algorithm with the content recommendation algorithm, make full use of user behavior log data and video detail data to obtain the interest close lists and content close lists, and then introduce the hot recommendation list. This effectively addresses the cold start problem for new users, markedly improves the accuracy, applicability and timeliness of the recommendation system, provides users with a more personalized and comprehensive recommendation service, and effectively improves user satisfaction and platform activity.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of specific steps for video recommendation in the present application;

FIG. 2 is a schematic flow chart of generating a popular video recommendation list in the present application;

FIG. 3 is a schematic flow chart of generating a content close list and an interest close list in the present application;

FIG. 4 is a schematic flow chart of generating a popular video recommendation list and an interest recommendation list in the present application;

FIG. 5 is a schematic diagram of each module of the video recommendation system in the present application.

Reference numerals: 1-a video recommendation system; 11-a video database; 12-a rights management module; 13-a reading module; 14-a training module; 15-a cache module; 16-output module.

Detailed Description

Hereinafter, various embodiments of the present disclosure will be more fully described. The present disclosure is capable of various embodiments and its modifications and variations are possible in light of the above teachings. However, it should be understood that: there is no intention to limit the various embodiments disclosed herein to the specific embodiments disclosed herein, but rather the present disclosure is to be interpreted as covering all modifications, equivalents, and/or alternatives falling within the spirit and scope of the various embodiments disclosed herein.

The terminology used in the various embodiments disclosed herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the various embodiments disclosed herein. As used herein, the singular is intended to include the plural as well, unless the context clearly indicates otherwise. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of this disclosure belong. Terms (such as those defined in commonly used dictionaries) will be interpreted as having a meaning that is the same as the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in connection with the various embodiments disclosed herein.

Example 1

Embodiment 1 of the application discloses a video recommendation method and system; schematic diagrams are shown in FIGS. 1-4, and the specific scheme is as follows:

In a first aspect, the application proposes a video recommendation method that can output a hot recommendation list and an interest recommendation list according to the needs of a user. The specific steps, as shown in FIG. 1, include:

S101, obtaining a video set corresponding to each viewing authority in a video database;

S102, obtaining user behavior log data of all users under the same viewing authority in a preset first time period to obtain first log data, and obtaining a hot video list in the first time period according to the first log data;

S103, acquiring user behavior log data of all users under the same viewing authority within a preset second time period to obtain second log data, and selecting, according to the second log data, the video names of a plurality of videos closest to each video in the video set to obtain an interest close list;

S104, calculating the feature vector of each video according to the video detail data of all videos in the video set, and obtaining the video names of a plurality of videos closest in content to each video to obtain a content close list;

S105, obtaining video names in the latest user log data of the target user, and calling the corresponding interest close list and content close list according to the video names to form a quasi interest video list;

S106, selecting, on a schedule, a plurality of videos from the hot video list to form a hot recommendation list, and selecting a plurality of videos from the quasi interest video list to form an interest recommendation list.

In step S102, the process of acquiring the hot video list includes:

In step S103, the acquisition process of the interest close list includes:

the user behavior log data includes viewing data and collection data;

the field data of the user name and the video name are put into a pre-trained recall algorithm model for modeling;

and the video names of a plurality of videos closest to each video are calculated, forming an interest close list.

In step S104, the acquisition process of the content close list includes:

using the TF-IDF algorithm, replacing the keywords in the keyword list with a plurality of numbers through a pre-trained feature vector model, and calculating the word frequency TF of each keyword in the video, wherein the word frequency is the ratio of the number of times a given keyword appears in the video to the total number of words of the video;

In step S105, the specific steps of composing the quasi interest video list include:

In step S106, the specific steps of forming the popular recommendation list and the interest recommendation list include:

preset proportion M to N, preset constant k;

It should be noted that the present application does not limit the output interest recommendation list and the popular recommendation list to be in a certain proportion in all cases, and in practical application, the number of video names in the interest recommendation list and the popular recommendation list may be adjusted to be a self-defined numerical value according to practical needs.

In step S106, the specific steps of forming the popular recommendation list and the interest recommendation list further include: before selecting videos, detecting and removing expired data, invalid data and duplicate data in the hot video list and the quasi interest video list.

In some embodiments, the first time period is a time range counted backward from the top of each hour, or a time range counted backward from the moment at which the user accesses the hot recommendation list; the second time period is a time range counted backward from the moment at which the user accesses the interest recommendation list.

In practical application, the video playing system covers multiple countries and regions, which differ in language and other respects, and users in different regions watch different video content. In one embodiment, each operator manages and operates all users in the region for which it is responsible, and the users managed by an operator can only watch videos within that operator's operating scope. In other words, each operator corresponds to one viewing authority in the master video library, and the users it manages are only permitted to view the videos within the range specified by that operator. All video recommendation lists are then generated for the audience under the same viewing authority, managed by the same operator. It should be noted that different viewing authorities in different regions may cover different ranges of videos in the video library, or may cover the same range; the application does not limit how viewing authorities differ in viewable content.

All recommendation, calculation and selection operations are performed for users under the same viewing authority, so the video recommendation lists under each authority can be differentiated according to actual conditions. Users under different operators in each region thus receive video recommendations better suited to the characteristics of users in that region, giving recommendations adapted to local conditions.
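As a rough illustration of this premise, log records can be partitioned by viewing authority before any list is computed, so that every later calculation only sees one tenant's users. The record layout (a dict with a `user` key) and the `user_permission` mapping are hypothetical and are not specified by the source.

```python
from collections import defaultdict


def group_logs_by_permission(log_records, user_permission):
    """Split log records into one bucket per viewing authority.

    log_records: iterable of dicts with at least a "user" key (assumed layout).
    user_permission: dict mapping user -> viewing authority (assumed input).
    Records from users with no known authority are skipped.
    """
    grouped = defaultdict(list)
    for record in log_records:
        permission = user_permission.get(record["user"])
        if permission is not None:
            grouped[permission].append(record)
    return dict(grouped)
```

Each bucket can then be passed independently to the hot list, interest close list and quasi interest list calculations.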

In one embodiment, the video library may be the IMDb library (Internet Movie Database, a popular online database of films and television programs that contains a large amount of information about movies, television programs, actors, directors, dramas, etc.).

In a complete video recommendation process, in the first step, user behavior log data is collected only for the users under the same viewing authority. The time range of the first time period is defined; in a specific embodiment, the first time period may be the 12 hours preceding the most recent full hour.

User behavior log data of all users under the same viewing authority within these 12 hours are obtained; the 3 kinds of log data are read, cleaned and parsed using Flink (Apache Flink, a stream processing framework that can be used for both real-time and batch data processing).

The user behavior log data cover different user behaviors: complete viewing, recorded in the complete viewing log; viewing, recorded in the viewing record log; and collecting, recorded in the collection record log. A viewing behavior is recorded as a viewing record when the video has been watched for more than three minutes; it is recorded as a complete viewing record when more than 80% of the video's duration has been watched.
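The two thresholds can be expressed as a small classifier. This is only a sketch: the function name and the decision to evaluate the two criteria independently (so that one play event may qualify for both logs, or for the complete viewing log alone on a short video) are assumptions, since the source does not say how the two records interact.

```python
def classify_view(watched_seconds, duration_seconds):
    """Return the record types a single play event qualifies for.

    "view": watched for more than three minutes.
    "complete_view": watched more than 80% of the video's duration.
    The two checks are applied independently (an assumption).
    """
    kinds = []
    if watched_seconds > 180:
        kinds.append("view")
    if duration_seconds > 0 and watched_seconds / duration_seconds > 0.8:
        kinds.append("complete_view")
    return kinds
```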

Corresponding weight scores are assigned to different user behaviors, and in one embodiment, each viewing record weight of the user is 0.1, each collection record weight is 0.5, and each complete viewing record weight is 0.5. Based on this weight score, each video may be assigned a score.

Scores are assigned to the corresponding videos according to the viewing, complete viewing and collecting behaviors of all users within the 12 hours, and the scores are accumulated over the 12 hours. After the calculation is completed, the 100 videos with the highest accumulated scores in the time period are selected, and their video names form the hot video list of all users under the viewing authority for those 12 hours. The data of this dynamic 12-hour hot video list are stored in a Redis cache (Remote Dictionary Server, an open-source, log-type key-value database written in ANSI C that can be memory-based or persistent) so that it can be conveniently called and displayed when a user accesses it.
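The weighted accumulation described above can be sketched as follows, using the weights of this embodiment (0.1 per viewing record, 0.5 per collection record, 0.5 per complete viewing record). The event format, the behavior labels and the alphabetical tie-break are illustrative assumptions.

```python
from collections import Counter

# Weights from the embodiment; the label strings are hypothetical.
WEIGHTS = {"view": 0.1, "collect": 0.5, "complete_view": 0.5}


def hot_video_list(events, top_n=100):
    """Rank videos by accumulated weighted score.

    events: iterable of (video_name, behavior) pairs already restricted to
    one viewing authority and one time window (assumed input format).
    Behaviors without a configured weight are ignored.
    """
    scores = Counter()
    for video, behavior in events:
        weight = WEIGHTS.get(behavior)
        if weight is not None:
            scores[video] += weight
    # Highest score first; ties broken by name so the order is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [video for video, _ in ranked[:top_n]]
```

In a deployment along the lines of the text, the returned list would be written to the cache under a key for the viewing authority; that step is omitted here.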

In this way, the hot video list of all users under the same authority can be refreshed at the top of every hour, which gives new users sufficient reference material and effectively addresses the cold start problem for new users. Alternatively, the first time period may be the 12 hours preceding the moment at which the user accesses the hot recommendation list from the front end, so that the user sees a real-time hot ranking for the most recent 12 hours when accessing the hot video list from the front end.
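The two ways of defining the first time period can be illustrated with standard datetime arithmetic. The function names are hypothetical, and the 12-hour default simply mirrors this embodiment.

```python
from datetime import datetime, timedelta


def window_from_last_full_hour(now, hours=12):
    """Window ending at the most recent full hour and reaching back `hours`."""
    end = now.replace(minute=0, second=0, microsecond=0)
    return end - timedelta(hours=hours), end


def window_from_access_time(access_time, hours=12):
    """Window ending at the moment the user accesses the list."""
    return access_time - timedelta(hours=hours), access_time
```

For an access at 15:42:07, the first function gives the window 03:00:00 to 15:00:00, while the second gives 03:42:07 to 15:42:07.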

Secondly, a calculation is made according to the similarity of user behaviors. The time range of the second time period is defined; in a specific embodiment, the second time period may be the 7 days preceding the most recent full hour, or the 7 days preceding the moment at which the user accesses the interest recommendation list. It should be noted that the time ranges used in this embodiment, such as 7 days and 12 hours, are only examples; the application does not limit the specific values of the first time period and the second time period, and in practice their lengths, starting points and ending points may be chosen according to actual needs.

The user behavior log data include the user name and video name fields, and these fields are extracted from the 7 days of user behavior log data. In one embodiment, the user name and video name fields are required to be of type long so that they can be used for model calculation and training. The user name and video name field data are put into a pre-trained recommendation recall algorithm model. The recall algorithm expresses the similarity between two videos by measuring how many users like both videos at the same time: the more users like both videos, and the lower the overlap between those users, the more similar the two videos are considered to be.

From the model's calculation, the 50 videos closest in user behavior to each video in the video set are obtained, and the video names of these 50 videos form an interest close list, in which the 50 videos are arranged in descending order of behavioral similarity to the original video. Thus, each video in the video set has a corresponding interest close list containing 50 video names. In one embodiment, the data of all the interest close lists obtained may be stored in the Redis cache for later calling.

It should be noted that the user behavior log data include a viewing record log and a collection record log, each of which contains the user name and video name fields. In practical application, the recall algorithm model requires a large amount of data. Therefore, in one embodiment, when the collection record log contains enough data, the recall algorithm model uses the user name and video name fields from the collection record log, and when the collection record log contains too little data, the fields from the viewing record log are used instead.
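The source does not name its recall model. One well-known scoring rule with the stated property (more co-liking users raise the score, while heavy overlap between those users lowers each pair's contribution) is often called Swing; the sketch below uses it purely as a stand-in. The function name, the smoothing constant `alpha` and the input format are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def swing_similar_lists(interactions, alpha=1.0, top_n=50):
    """Build a per-video list of behaviorally closest videos.

    interactions: iterable of (user, video) pairs (assumed input format).
    For every pair of users and every pair of videos both users liked, the
    video pair gains 1 / (alpha + number of videos the two users share).
    """
    items_of = defaultdict(set)
    for user, video in interactions:
        items_of[user].add(video)

    scores = defaultdict(lambda: defaultdict(float))
    # Quadratic in the number of users: acceptable for a sketch only.
    for u, v in combinations(sorted(items_of), 2):
        shared = items_of[u] & items_of[v]
        if len(shared) < 2:
            continue
        weight = 1.0 / (alpha + len(shared))
        for i, j in combinations(sorted(shared), 2):
            scores[i][j] += weight
            scores[j][i] += weight

    return {
        video: [
            name
            for name, _ in sorted(
                neighbours.items(), key=lambda kv: (-kv[1], kv[0])
            )[:top_n]
        ]
        for video, neighbours in scores.items()
    }
```

Each returned list is ordered from most to least similar, matching the descending order described for the interest close list.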

Thirdly, a calculation is made according to the content similarity of the videos. The video detail data of all videos in the video set are obtained; they include fields such as video name, labels, content description and actor names. In one embodiment, the labels, content descriptions and actor names are put into a word segmenter to obtain a keyword list.

The feature vector model is used to replace the keywords with 128 numbers. In practical applications, the feature vector model used is the HashingTF model (Hashing Term Frequency, a feature extraction model commonly used to vectorize text data; it uses a hash function to map the words in a text to a fixed-length feature vector). The word frequency TF of each keyword in the video is then calculated, where word frequency = number of times the keyword appears in the video / total number of words of the video.

Calculating the inverse document frequency IDF of each keyword in the video by using a TF-IDF algorithm, wherein n is the total number of videos in the video set, and m is the number of videos containing the keywords:

The obtained word frequency TF is multiplied by the inverse document frequency IDF to give the feature vector of each video, TF-IDF = TF × IDF, and the feature vectors are stored in a vector database. For each video, the video names of the 50 videos closest to it in content are then calculated; these 50 video names form a content close list, in which the 50 videos are arranged in descending order of similarity to the original video. Thus, each video in the video set has a corresponding content close list containing 50 video names. In one embodiment, the data of all the content close lists obtained may be stored in a Redis cache for later calling.
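The content-based step can be sketched in plain Python as follows. Several points are assumptions and are not taken from the source: `zlib.crc32` stands in for the hash function of a HashingTF-style model, IDF is taken as log(n / m), similarity is measured by cosine, and an in-memory dict replaces the vector database.

```python
import math
import zlib
from collections import Counter

DIM = 128  # length of the hashed feature vector, as in the embodiment


def bucket(keyword):
    """Map a keyword to one of DIM slots (crc32 is a stand-in hash)."""
    return zlib.crc32(keyword.encode("utf-8")) % DIM


def tfidf_vectors(keywords_by_video):
    """keywords_by_video: dict video_name -> list of keywords (assumed input)."""
    n = len(keywords_by_video)
    doc_freq = Counter()  # m: number of videos that hit each slot
    for words in keywords_by_video.values():
        doc_freq.update({bucket(w) for w in words})
    vectors = {}
    for video, words in keywords_by_video.items():
        counts = Counter(bucket(w) for w in words)
        vec = [0.0] * DIM
        for slot, count in counts.items():
            tf = count / len(words)
            vec[slot] = tf * math.log(n / doc_freq[slot])  # assumed IDF form
        vectors[video] = vec
    return vectors


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def content_close_lists(vectors, top_n=50):
    """For each video, the names of the top_n most similar other videos."""
    result = {}
    for video, vec in vectors.items():
        ranked = sorted(
            ((cosine(vec, other), name)
             for name, other in vectors.items() if name != video),
            key=lambda pair: (-pair[0], pair[1]),
        )
        result[video] = [name for _, name in ranked[:top_n]]
    return result
```

The all-pairs comparison is quadratic in the number of videos; a vector database, as mentioned in the text, would normally replace it with an indexed nearest-neighbour search.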

In practical application, the interest close lists and content close lists can be calculated offline on a back-end server, and by running the calculation on a schedule their contents can be refreshed every hour. This scheduled refresh lets the interest close lists and content close lists keep up with users' viewing preferences and current trends.

Fourthly, a list of videos likely to interest the user is predicted. For the target user, the latest user behavior log data are acquired, including the latest viewing record and the latest collection record. The video names in the latest viewing record and in the latest collection record are obtained respectively.

Invoking interest close lists and content close lists corresponding to video names in the latest watching record from the Redis cache, putting one or more obtained interest close lists into a first set, and putting one or more obtained content close lists into a second set;

invoking interest close lists and content close lists corresponding to video names in the latest collection records from the Redis cache, putting one or more obtained interest close lists into a first set, and putting one or more obtained content close lists into a second set.

This yields the first set, which is the total set of interest close lists corresponding to the latest user behaviors of the target user, and the second set, which is the total set of the corresponding content close lists.

In one embodiment, 25 video names are randomly selected from the first set and the second set respectively, and are combined to form a quasi interest video list, and the quasi interest video list is stored in a Redis cache so as to facilitate subsequent calling. It should be noted that, the present application is not limited to a specific number of video names selected in the set, and in practical application, the number of the selected video names may be flexibly adjusted according to the needs of the user and the computing power of the system.
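The fourth step can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name and parameters are hypothetical, two plain dictionaries stand in for the lists held in the Redis cache, and removing duplicates inside each set before sampling is an assumption made so that one draw cannot return the same name twice.

```python
import random


def build_quasi_interest_list(recent_videos, interest_lists, content_lists,
                              per_set=25, rng=None):
    """Combine random picks from the first and second sets.

    recent_videos: video names from the latest viewing and collection records.
    interest_lists / content_lists: video name -> list of close video names
    (stand-ins for the cached interest close and content close lists).
    """
    rng = rng or random.Random()
    first_set, second_set = [], []
    for name in recent_videos:
        first_set.extend(interest_lists.get(name, []))
        second_set.extend(content_lists.get(name, []))
    # Assumption: drop repeats inside each set, preserving first-seen order.
    first_set = list(dict.fromkeys(first_set))
    second_set = list(dict.fromkeys(second_set))
    # Sample at most per_set names from each set; smaller sets are used whole.
    picked = rng.sample(first_set, min(per_set, len(first_set)))
    picked += rng.sample(second_set, min(per_set, len(second_set)))
    return picked
```

A name may still appear in both halves of the returned list; the sketch leaves such repeats to the later clean-up of repeated data.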

In the fifth step, the hot video list obtained in the first step and the quasi interest video list obtained in the fourth step are processed: expired data, invalid data and repeated data in both lists are detected and eliminated.
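One possible form of this clean-up is sketched below. The application does not define how expired or invalid data are recognized, so the criteria here are assumptions for illustration: a hypothetical catalog maps each valid video name to an expiry timestamp (or None if it never expires), a name missing from the catalog counts as invalid, and a name whose expiry time has passed counts as expired.

```python
def clean_video_list(names, catalog, now):
    """Remove repeated, invalid and expired entries, keeping the original order.

    names: candidate video names (hot video list or quasi interest video list).
    catalog: hypothetical mapping of video name -> expiry timestamp or None.
    now: current timestamp in the same unit as the expiry timestamps.
    """
    seen = set()
    cleaned = []
    for name in names:
        if name in seen:
            continue  # repeated data
        if name not in catalog:
            continue  # invalid data (assumed: unknown to the catalog)
        expires_at = catalog[name]
        if expires_at is not None and expires_at <= now:
            continue  # expired data (assumed: expiry time has passed)
        seen.add(name)
        cleaned.append(name)
    return cleaned
```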

In the final step, the interest recommendation list and the hot recommendation list are output for the user. M:N is a preset proportion and k is a preset constant. When a target user accesses the interest recommendation list and the hot recommendation list from the front end, M×k videos are selected from the hot video list to form the hot recommendation list, and N×k videos are selected from the quasi interest video list to form the interest recommendation list.

When the number of videos in the hot video list is less than M×k, and/or the number of videos in the quasi interest video list is less than N×k, supplementary videos can be extracted from the video set according to a preset rule, or extracted from the video set at random.
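The selection and supplementing logic can be sketched as follows. This is a sketch under assumptions rather than the method of the embodiment: the function name is hypothetical, videos are taken from the front of each list because the application does not state how they are chosen, and a shortfall is filled by random extraction from the video set (the "preset rule" alternative is not modelled).

```python
import random


def build_recommendation_lists(hot_videos, quasi_interest, video_set,
                               m, n, k, rng=None):
    """Return (hot recommendation list, interest recommendation list).

    m, n: the two terms of the preset proportion M:N; k: the preset constant.
    """
    rng = rng or random.Random()

    def take(candidates, size):
        # Assumption: take the first `size` entries of the candidate list.
        chosen = list(candidates[:size])
        if len(chosen) < size:
            # Supplement at random from videos not already chosen.
            pool = [v for v in video_set if v not in chosen]
            chosen += rng.sample(pool, min(size - len(chosen), len(pool)))
        return chosen

    return take(hot_videos, m * k), take(quasi_interest, n * k)
```

For example, with a proportion of 2:3 and k = 10, the call requests 20 videos for the hot recommendation list and 30 for the interest recommendation list.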

This embodiment provides a video recommendation method that combines a recall algorithm with a content recommendation algorithm and makes full use of user behavior log data. A content similarity list and an interest similarity list are generated from video content similarity and user behavior similarity respectively, and a quasi interest video list is then extracted from the two kinds of list according to the latest viewing and collection behaviors of the user, so that recommendations are personalized for the user while the richness and diversity of the recommended content are improved. Introducing a hot recommendation list addresses the cold start problem of a new user and improves the accuracy, applicability and real-time performance of the recommendation system. Because the calculation is performed over users under the same viewing authority, the video recommendation lists under each authority can be distinguished according to actual conditions, and users under different operators in each region can obtain video recommendations that better match the characteristics of users in that region. The method thus provides a more personalized and comprehensive recommendation service and improves user satisfaction and platform activity.

Example 2

Embodiment 2 of the present application discloses a video recommendation system, which can implement any one of the recommendation methods in the technical solutions in embodiment 1, and specifically as shown in fig. 5, the video recommendation system 1 includes:

a video database 11 for storing videos and detailed data of each video, wherein a plurality of video sets are formed in the video database 11 corresponding to viewing rights;

the right management module 12 is used for managing the watching right of the target user and limiting the target user to only access the video set corresponding to the watching right;

a reading module 13, configured to obtain user behavior log data;

the training module 14 is used for training and calculating the user behavior log data acquired by the reading module;

the caching module 15 is used for storing a hot video list, a quasi interest video list, an interest close list and a content close list;

and the output module 16 is used for selecting, composing and displaying the hot recommendation list and the interest recommendation list in the cache module.

This embodiment provides a video recommendation system in which video sets are arranged in the video database according to the viewing authority of users, the user behavior log data of users under the same viewing authority is read, and that data is analyzed and trained on to obtain the video lists, so that when a user accesses the system through the front end, the corresponding videos in the cached video lists can be quickly called to form a hot recommendation list and an interest recommendation list. The interest recommendation list generated by the recommendation system takes into account both the historical behaviors of the user and the preferences of similar users, which improves the accuracy of the recommendation system and user satisfaction, while the hot recommendation list addresses the cold start problem of users and improves the richness of the recommended content.

Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will also appreciate that the modules of an apparatus in an implementation scenario may be distributed in that apparatus as described for the implementation scenario, or may, with corresponding changes, be located in one or more apparatuses different from that of the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules. The foregoing serial numbers of the present application are merely for description and do not represent the advantages or disadvantages of the implementation scenarios. The foregoing disclosure covers only a few specific implementations of the present application, but the present application is not limited thereto, and any variations that can be conceived by a person skilled in the art shall fall within the protection scope of the present application.

Claims

1. A video recommendation method, comprising:

acquiring user behavior log data of all users under the same viewing authority within a preset first time period to obtain first log data, and obtaining a hot video list of the first time period according to the first log data;

acquiring user behavior log data of all users under the same viewing authority within a preset second time period to obtain second log data, and selecting video names of a plurality of videos closest to each video in the video set according to the second log data to obtain an interest similarity list;

calculating the feature vector of each video according to the video detail data of all videos in the video set, and obtaining the video names of a plurality of videos closest to the content of each video to obtain a content similarity list;

obtaining a video name in the latest user log data of a target user, and calling a corresponding interest close list and a content close list according to the video name to form a quasi interest video list;

and selecting a plurality of videos from the hot video list to form a hot recommendation list, and selecting a plurality of videos from the quasi interest video list to form an interest recommendation list.

2. The video recommendation method according to claim 1, wherein the acquiring process of the hot video list comprises:

assigning scores to the videos according to the first user behaviors related to each video, and selecting a plurality of videos with the highest scores to form the hot video list of the users under the same viewing authority in the first time period.

3. The video recommendation method according to claim 1, wherein the obtaining process of the interest close list comprises:

the second log data comprises field data of a user name and a video name, and the field data of the user name and the video name in the second log data are obtained;

and the field data of the user name and the video name are put into a pre-trained recall algorithm model to obtain the video names of a plurality of videos which are closest to each video and correspond to each video, so as to form the interest close list.

4. The video recommendation method according to claim 1, wherein the specific step of forming the content similarity list comprises:

acquiring video detail data of all videos in the video set, and putting the video detail data into a word segmentation device to obtain a keyword list;

replacing the keywords in the keyword list with numbers by using a pre-trained feature vector model, and calculating the word frequency of each keyword in the video, wherein the word frequency is the ratio of the total number of times a given keyword appears in the video to the total number of words of the video;

obtaining the total number of videos in a video set and the number of videos containing the keywords in the video set, and calculating the inverse document frequency IDF of each keyword in the videos, wherein n is the total number of videos in the video set, and m is the total number of videos containing the keywords:

multiplying the obtained word frequency with the inverse document frequency to obtain feature vectors of each video, storing the feature vectors in a vector database, and calculating to obtain video names of a plurality of videos which are closest to the content of each video and correspond to each video to form the content similarity list.

5. The video recommendation method according to claim 1, wherein the specific step of composing the quasi interest video list comprises:

6. The video recommendation method according to claim 1, wherein the specific step of forming the hot recommendation list and the interest recommendation list comprises:

preset proportion M to N, preset constant k;

7. The video recommendation method according to claim 6, wherein when the number of videos in the hot video list is less than M×k, and/or the number of videos in the quasi interest video list is less than N×k, supplementary videos are extracted from the video set according to a preset rule or extracted from the video set at random.

8. The video recommendation method according to claim 6, wherein forming the hot recommendation list and the interest recommendation list further comprises:

before selecting the videos, detecting and removing expired data, invalid data and repeated data in the hot video list and the quasi interest video list.

9. The video recommendation method according to claim 1, wherein the first time period and the second time period are time ranges counted back from the time node at which a user accesses the interest recommendation list and the hot recommendation list.

10. A video recommendation system for implementing the recommendation method of any one of claims 1-9, comprising:

the video database is used for storing videos and detail data of each video, and a plurality of video sets are formed in the video database corresponding to the viewing rights;

the right management module is used for managing the viewing right of the target user and limiting the target user to only access the video set corresponding to the viewing right;

the reading module is used for acquiring the user behavior log data;

the training module is used for training and calculating the user behavior log data acquired by the reading module;

the caching module is used for storing the hot video list, the quasi-interest video list, the interest close list and the content close list;

and the output module is used for selecting, composing and displaying the hot recommendation list and the interest recommendation list in the cache module.