CN109800325B - Video recommendation method and device and computer-readable storage medium

Info

Publication number: CN109800325B
Application number: CN201811599231.3A
Authority: CN
Inventors: 蔡锦龙
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2018-12-26
Filing date: 2018-12-26
Publication date: 2021-10-26
Anticipated expiration: 2038-12-26
Also published as: CN109800325A

Abstract

The application relates to a video recommendation method, a video recommendation device, and a computer-readable storage medium. The video recommendation method comprises the following steps: establishing a click rate estimation model based on a neural network algorithm; calculating the similarity between the attribute features of a target user and the attribute features of each of a plurality of historical users to obtain a first historical user whose attribute features are most similar to those of the target user; obtaining the click rate of the first historical user on related videos based on the click rate estimation model; and recommending a plurality of the related videos to the target user based on the click rate of the first historical user on the related videos. Because the method makes full use of the user's attribute features when estimating click rates on related videos, it improves the accuracy of click rate estimation, and therefore of video recommendation, for new users and for users with little behavior history.

Description

Video recommendation method and device and computer-readable storage medium

Technical Field

The application belongs to the field of computer software application, and particularly relates to a video recommendation method and device.

Background

With the continuous progress of science and technology and the popularization of the internet, more and more people transmit information and share their lives through videos, and personalized recommendation over massive numbers of videos is increasingly important. At present, a widely used approach is to estimate targets such as the user's click rate on a video through machine learning.

Personalized video recommendation generally has two stages. The first stage is video triggering: a set of videos that may interest the user is selected from the full video set according to the user's attributes and behaviors. The second stage is video ranking: the videos triggered in the first stage are scored by a neural network model or a strategy, and the videos most likely to interest the user are found and shown to the user. Generally, similar videos are found according to the videos the user has browsed, or similar authors are found according to the authors the user follows, and the similar videos and/or the videos of the similar authors are then recommended to the user. This approach cannot use the attributes of the user to recommend related videos. When a user is new or has little historical behavior, too few personalized videos are triggered, and the accuracy of video recommendation is reduced.

Disclosure of Invention

In order to solve the problem that the accuracy of video recommendation is reduced due to insufficient personalized video triggering for a new user or a user with few historical behaviors in the related technology, the application discloses a video recommendation method and device.

According to a first aspect of embodiments of the present application, there is provided a video recommendation method, including:

establishing a click rate estimation model based on a neural network algorithm;

respectively calculating the similarity between the attribute characteristics of a target user and the attribute characteristics of a plurality of historical users to obtain a first historical user which is most similar to the attribute characteristics of the target user;

obtaining the click rate of the first historical user to the related video based on the click rate estimation model; and

recommending a plurality of videos in the relevant videos to the target user based on the click rate of the first historical user on the relevant videos.

Optionally, the attribute characteristics of the user include: user ID feature, static feature, and dynamic feature.

Optionally, the static characteristics of the user comprise at least one of the following characteristics: age, gender, geographic location, IP address, mobile phone model, mobile phone installed application program list;

the dynamic characteristics of the user include at least one of the following characteristics: a user click history feature, a user like history feature, and a user follow list feature.

Optionally, the video has video features including at least one of the following: a video ID feature and a video author ID feature.

Optionally, the calculating the similarity between the attribute features of the target user and the attribute features of the plurality of historical users respectively to obtain a first historical user closest to the attribute features of the target user includes:

extracting the attribute features of the target user and the attribute features of the plurality of historical users;

respectively calculating the distance between the attribute feature of the target user and the attribute features of the plurality of historical users; and

sorting the distances to obtain the first historical user closest to the attribute features of the target user.

Optionally, the obtaining the click rate of the first historical user on the related video based on the click rate estimation model includes:

extracting the attribute features of the first historical user and the video features of the related video;

performing vector summation on the static features of the first historical user and the video ID features of the related videos to obtain first features;

performing vector summation on the static features of the first historical user and the video author ID features of the related videos to obtain second features;

taking the first feature and the second feature and the user ID feature of the first historical user and the dynamic feature of the first historical user as third features; and

inputting the third feature into the click rate estimation model to obtain the click rate of the first historical user on the related video.

Optionally, the recommending, to the target user, a plurality of videos in the related videos based on the click rate of the first historical user on the related videos includes:

sorting the related videos in reverse (descending) order of the click rate of the first historical user on the related videos; and

recommending to the target user a plurality of the top-ranked related videos according to the reverse-order ranking of the click rates.

Optionally, the establishing a click rate estimation model based on a neural network algorithm includes:

extracting the attribute features of the sample user;

extracting the video characteristics of a sample video and labeling the sample video with a video label;

establishing a click rate estimation target model based on a neural network algorithm;

performing forward learning on the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video; and

performing reverse learning on the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video.

Optionally, the establishing a click rate estimation model based on a neural network algorithm further includes:

performing vector summation on the static features of the sample user and the video ID features of the sample video to obtain fourth features;

performing vector summation on the static features of the sample user and the video author ID features of the sample video to obtain fifth features; and

taking the fourth and fifth features and the user ID feature of the historical user and the dynamic feature of the historical user as sixth features.

Optionally, the performing forward learning on the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video includes:

inputting the sixth feature into the click rate estimation target model;

transforming the sixth feature layer by layer from bottom to top in the click rate estimation target model to obtain a top-level vector of the click rate estimation target model;

and converting the top-level vector into the probability of the click rate.
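The forward learning steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the layer sizes, the random initial weights, the ReLU activation on hidden layers, and the sigmoid used to turn the top-level value into a probability are all assumptions, and `make_layers`, `forward`, and `sixth_feature` are hypothetical names.

```python
import math
import random


def make_layers(sizes, seed=0):
    """Create (weights, biases) for each fully connected layer (hypothetical sizes)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, features):
    """Transform the input layer by layer, then map the top-level value to a probability."""
    activation = features
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activation)) + b
               for row, b in zip(weights, biases)]
        is_top = index == len(layers) - 1
        # Assumption: ReLU on hidden layers; the top layer is left linear.
        activation = pre if is_top else [max(0.0, v) for v in pre]
    top_level = activation[0]
    # Assumption: a sigmoid converts the top-level value into a click probability.
    return 1.0 / (1.0 + math.exp(-top_level))


layers = make_layers([6, 4, 1])
sixth_feature = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2]  # made-up fused input
probability = forward(layers, sixth_feature)
print(probability)
```

Whatever the weights are, the sigmoid keeps the output strictly between 0 and 1, which is what allows it to be compared with a 0/1 video label during reverse learning.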

Optionally, the performing reverse learning on the click-through rate estimation target model based on the attribute features of the sample user and the video features of the sample video includes:

calculating a loss function of the click rate estimation target model according to the probability of the click rate and the video label of the sample video;

minimizing the loss function of the click rate estimation target model by using a stochastic gradient descent method;

solving the gradient of the loss function of the click rate estimation target model;

updating the network parameters of the click rate estimation target model layer by layer from top to bottom; and

updating the network parameters corresponding to the sixth feature.
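A compact sketch of the reverse learning steps is given below. To keep it short it assumes a single sigmoid layer rather than the full N-layer model, and it assumes the loss is the log loss (binary cross-entropy), which the text does not name. The function names, learning rate, and sample values are hypothetical. With a sigmoid output and log loss, the gradient of the loss with respect to the pre-activation is simply `p - label`.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(p, label):
    """Assumed loss: binary cross-entropy between probability p and a 0/1 label."""
    eps = 1e-12
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))


def sgd_step(weights, bias, feature, label, lr=0.1):
    """One stochastic gradient descent step on a single (feature, label) sample."""
    z = sum(w * x for w, x in zip(weights, feature)) + bias
    p = sigmoid(z)
    loss = log_loss(p, label)
    delta = p - label  # dLoss/dz for sigmoid + log loss
    # Compute both gradients before changing anything.
    grad_w = [delta * x for x in feature]
    grad_x = [delta * w for w in weights]
    # Top layer first (weights and bias) ...
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]
    new_bias = bias - lr * delta
    # ... then the parameters behind the input feature itself.
    new_feature = [x - lr * g for x, g in zip(feature, grad_x)]
    return new_weights, new_bias, new_feature, loss


weights, bias = [0.1, -0.2, 0.05], 0.0
feature, label = [0.5, 0.3, -0.4], 1  # one positive sample (made-up values)
losses = []
for _ in range(50):
    weights, bias, feature, loss = sgd_step(weights, bias, feature, label)
    losses.append(loss)
print(losses[0], losses[-1])
```

Updating the input vector as well as the layer weights mirrors the last step of the list, in which the parameters corresponding to the sixth feature are themselves learned.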

Optionally, the labeling the sample video with a video tag includes:

if the sample user clicks the sample video displayed on the operation page, marking the sample video as a positive sample;

and if the sample user does not click the sample video displayed by the operation page, marking the sample video as a negative sample.

According to a second aspect of the embodiments of the present application, there is provided a video recommendation apparatus, including:

the model establishing unit is used for establishing a click rate estimation model based on a neural network algorithm;

the nearest neighbor retrieval unit is used for respectively calculating the similarity between the attribute characteristics of a target user and the attribute characteristics of a plurality of historical users to obtain a first historical user which is closest to the attribute characteristics of the target user;

the click rate estimation unit is used for obtaining the click rate of the first historical user to the related video based on the click rate estimation model; and

the video recommending unit is used for recommending a plurality of videos in the related videos to the target user based on the click rate of the first historical user on the related videos.

Optionally, the nearest neighbor retrieving unit includes:

a first feature extraction unit configured to extract the attribute features of the target user and the attribute features of the plurality of history users;

a distance calculation unit configured to calculate distances between the attribute feature of the target user and the attribute features of the plurality of history users, respectively; and

the first sorting unit is used for sorting the distances to obtain a first historical user which is closest to the attribute features of the target user.

Optionally, the click rate predicting unit includes:

a second feature extraction unit, configured to extract the attribute features of the first historical user and the video features of the related video;

the first feature fusion unit is used for carrying out vector summation on the static features of the first historical user and the video ID features of the related videos to obtain first features;

and the estimation unit is used for inputting the third characteristic into the click rate estimation model to obtain the click rate of the first historical user on the related video.

Optionally, the video recommending unit includes:

the second sorting unit is used for sorting the click rates of the related videos in a reverse order according to the click rates of the first historical user on the related videos;

and the recommending unit is used for recommending a plurality of videos with the front sequence in the related videos to the target user according to the click rate reverse sequence of the related videos.

Optionally, the model building unit includes:

a third feature extraction unit, configured to extract the attribute features of the sample user;

the target model establishing unit is used for establishing a click rate estimation target model based on a neural network algorithm;

a forward learning unit, configured to perform forward learning on the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video; and

the reverse learning unit is used for performing reverse learning on the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video.

Optionally, the model building unit further includes:

the second feature fusion unit is used for carrying out vector summation on the static features of the sample user and the video ID features of the sample video to obtain fourth features;

inputting the sixth feature into the click rate pre-estimated target model;

and converting the top-level vector into the probability of the click rate.

and updating the network parameters corresponding to the sixth characteristic.

Optionally, the labeling the sample video with a video tag includes:

According to a third aspect of the embodiments of the present invention, there is provided a video recommendation apparatus including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform any one of the video recommendation methods described above.

According to a fourth aspect of the embodiments of the present invention, there is provided a computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the computer instructions, when executed, implement the above-mentioned video recommendation method.

The technical scheme provided by the embodiment of the application can have the following beneficial effects:

The similarity between the attribute features of the target user and the attribute features of each historical user is calculated to obtain a first historical user whose attribute features are closest to those of the target user. The N neural network layers of the click rate estimation model transform the attribute features and user behavior features of the first historical user and the video features of the related videos layer by layer, finally yielding the click rate of the first historical user on the related videos. According to these click rates, the related videos on which the first historical user has higher click rates are recommended to the target user. Because the user's attribute features are fully used when estimating click rates on related videos, the accuracy of click rate estimation, and therefore of video recommendation, is improved for new users and for users with little behavior history.

The distances between the attribute features of the target user and the attribute features of the plurality of historical users are calculated respectively. The distances are sorted to obtain the first historical user whose attribute features are closest to those of the target user, which improves the accuracy of finding that first historical user and further improves the accuracy of video recommendation.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.

Drawings

Fig. 1 is a flow diagram illustrating a video recommendation method according to an example embodiment.

Fig. 2 is a flow diagram illustrating a video recommendation method according to an example embodiment.

Fig. 3 is a flow diagram illustrating a video recommendation method according to an example embodiment.

Fig. 4 is a flow diagram illustrating a video recommendation method according to an example embodiment.

Fig. 5 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment.

Fig. 6 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment.

Fig. 7 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment.

Fig. 8 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment.

Fig. 9 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment.

Fig. 10 is a block diagram illustrating an apparatus for performing a video recommendation method according to an example embodiment.

Fig. 11 is a block diagram illustrating an apparatus for performing a video recommendation method according to an example embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.

Fig. 1 is a flowchart of a video recommendation method according to an exemplary embodiment, which specifically includes the following steps:

in step S101, a click rate estimation model is established based on a neural network algorithm.

In this step, a click rate estimation model is established based on a neural network algorithm. The click rate estimation model is a fully connected neural network. For example, the click rate estimation model has N neural network layers, and every node of layer N-1 is connected to all nodes of layer N.
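To make "fully connected" concrete: if one layer has a nodes and the next has b nodes, there are a × b connections between them, one weight per pair of nodes. The short sketch below counts the connections for a set of layer sizes; the sizes and the function name are hypothetical and are not taken from the application.

```python
def dense_connection_count(layer_sizes):
    """Number of node-to-node connections between each pair of adjacent layers."""
    return [n_prev * n_next for n_prev, n_next in zip(layer_sizes, layer_sizes[1:])]


layer_sizes = [8, 4, 1]  # hypothetical: 8 inputs, 4 hidden nodes, 1 output node
connections = dense_connection_count(layer_sizes)
print(connections)  # [32, 4]
```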

In step S102, the similarity between the attribute feature of the target user and the attribute features of the plurality of historical users is calculated, respectively, to obtain a first historical user that is closest to the attribute feature of the target user.

In this step, the similarity between the attribute features of the target user and the attribute features of each historical user is calculated, and a first historical user closest to the attribute features of the target user is obtained. For example, the Euclidean distance between the attribute features of the target user and the attribute features of each historical user is calculated, and a nearest neighbor search is performed using the Euclidean distance, thereby obtaining the first historical user whose attribute features are closest to those of the target user.

In step S103, based on the click rate estimation model, the click rate of the first historical user on the related video is obtained.

In this step, the attribute features and user behavior features of the first historical user and the video features of the related video are input into the click rate estimation model established in step S101. The N neural network layers of the click rate estimation model transform the attribute features, the user behavior features, and the video features of the related video layer by layer, finally yielding the click rate of the first historical user on the related video.

In step S104, recommending a plurality of videos in the relevant videos to the target user based on the click rate of the first historical user on the relevant videos.

In this step, according to the click rate of the first historical user on the related video in step S103, the videos with higher click rate of the first historical user on the related video are recommended to the target user.

According to the embodiment of the application, the similarity between the attribute features of the target user and the attribute features of each historical user is calculated to obtain a first historical user whose attribute features are closest to those of the target user. The N neural network layers of the click rate estimation model transform the attribute features and user behavior features of the first historical user and the video features of the related videos layer by layer, finally yielding the click rate of the first historical user on the related videos. According to these click rates, the related videos on which the first historical user has higher click rates are recommended to the target user. Because the user's attribute features are fully used when estimating click rates on related videos, the accuracy of click rate estimation, and therefore of video recommendation, is improved for new users and for users with little behavior history.
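Steps S102 to S104 can be tied together in a short end-to-end sketch. It is illustrative only: the trained click rate estimation model of step S101 is replaced by a lookup table (`stub_ctr`), the feature vectors and click rates are made up, and every name here is hypothetical.

```python
import math


def nearest_historical_user(target, history):
    """S102: ID of the historical user with the smallest Euclidean distance."""
    return min(history, key=lambda uid: math.dist(target, history[uid]))


def recommend(target, history, videos, predict_ctr, top_k=2):
    first_user = nearest_historical_user(target, history)
    # S103: estimate the first historical user's click rate on each related video.
    scored = [(predict_ctr(first_user, video), video) for video in videos]
    # S104: recommend the videos with the highest estimated click rates.
    scored.sort(reverse=True)
    return first_user, [video for _, video in scored[:top_k]]


# Made-up attribute vectors (e.g. age, gender code) and click rates.
history = {"u1": [25.0, 1.0], "u2": [40.0, 0.0]}
ctr_table = {
    ("u1", "v1"): 0.10, ("u1", "v2"): 0.35, ("u1", "v3"): 0.20,
    ("u2", "v1"): 0.50, ("u2", "v2"): 0.05, ("u2", "v3"): 0.15,
}


def stub_ctr(user_id, video_id):
    """Stand-in for the click rate estimation model (an assumption)."""
    return ctr_table[(user_id, video_id)]


first_user, picks = recommend([27.0, 1.0], history, ["v1", "v2", "v3"], stub_ctr)
print(first_user, picks)  # u1 ['v2', 'v3']
```

The target user never needs any click history of their own: only their attribute vector is used, and the predictions come from the closest historical user.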

Fig. 2 is a flowchart of a video recommendation method according to an exemplary embodiment. Specifically, it shows the process in step S102 of Fig. 1 of calculating the similarity between the attribute features of a target user and the attribute features of a plurality of historical users to obtain a first historical user closest to the attribute features of the target user. The process includes the following steps:

in step S201, the attribute features of the target user and the attribute features of the plurality of history users are extracted.

The features of a user include attribute features and behavior features. The attribute features include a user ID feature, static features, and dynamic features. The static features of the user include at least one of the following: age, gender, geographic location, IP address, mobile phone model, and list of applications installed on the mobile phone. The dynamic features of the user include at least one of the following: a user click history feature, a user like history feature, and a user follow list feature.

In this step, the attribute features of the target user and the attribute features of the plurality of historical users are extracted.

In step S202, distances between the attribute features of the target user and the attribute features of the plurality of history users are calculated, respectively.

Several common distance measures for characterizing similarity are the Euclidean distance, Manhattan distance, Mahalanobis distance, cosine similarity, and Hamming distance. The Euclidean distance is the "ordinary" (i.e., straight-line) distance between two points in Euclidean space. The Manhattan distance is the sum of the lengths of the projections onto the coordinate axes of the line segment between two points in a fixed rectangular coordinate system of Euclidean space. The Mahalanobis distance is a distance that takes the covariance of the data into account. Cosine similarity measures the similarity between two vectors by the cosine of the angle between them. In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding characters differ.
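For illustration, four of these measures can be written in a few lines of plain Python. The Mahalanobis distance is left out because it needs a covariance matrix estimated from data. The function names and sample vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


a, b = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
print(euclidean(a, b))                # 5.0
print(manhattan(a, b))                # 7.0
print(hamming("karolin", "kathrin"))  # 3
print(cosine_similarity(a, b))
```

Note that cosine similarity grows as vectors become more alike, while the three distances shrink, so a ranking based on it has to be sorted in the opposite direction.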

In this step, for example, the target user is $X$ and the $m$ historical users are $Y^1, Y^2, Y^3, \ldots, Y^m$. The Euclidean distances between the attribute features of the target user $X$ and the attribute features of the $m$ historical users are calculated respectively. In Euclidean space, the Euclidean distance between the attribute features $X = (x_1, x_2, \ldots, x_n)$ of the target user and the attribute features $Y^i = (y^i_1, y^i_2, \ldots, y^i_n)$ of the $i$-th historical user is

$$d(X, Y^i) = \sqrt{\sum_{k=1}^{n} \left(x_k - y^i_k\right)^2}$$

where $x_1, x_2, \ldots, x_n$ are the attribute features of the target user $X$, and $y^i_1, y^i_2, \ldots, y^i_n$ are the attribute features of the $i$-th historical user $Y^i$.

In step S203, the distances are ranked to obtain a first historical user that is closest to the attribute feature of the target user.

In this step, the distances obtained in step S202 between the attribute features of the target user and the attribute features of the plurality of historical users, for example the Euclidean distances, are sorted, and the historical user corresponding to the smallest distance is obtained. The smaller the distance, the closer the attribute features of the target user and the historical user. The historical user corresponding to the minimum distance is the first historical user.
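A small sketch of steps S202 and S203 follows: compute the Euclidean distance from the target user to each historical user, sort in ascending order, and take the first entry. The attribute vectors are made-up numeric encodings, and `rank_by_distance` is a hypothetical helper.

```python
import math


def rank_by_distance(target, history):
    """Return (distance, user_id) pairs sorted from nearest to farthest."""
    distances = [(math.dist(target, feats), uid) for uid, feats in history.items()]
    distances.sort()
    return distances


target_x = [0.3, 0.7, 0.1]
history = {
    "Y1": [0.9, 0.1, 0.5],
    "Y2": [0.35, 0.6, 0.1],
    "Y3": [0.0, 0.0, 0.0],
}
ranked = rank_by_distance(target_x, history)
first_historical_user = ranked[0][1]
print(first_historical_user)  # Y2
```

A full sort costs more than a single pass with `min`, but it also yields the second and third nearest users, which is convenient if more than one neighbor is ever needed.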

According to the embodiment of the application, the distances between the attribute features of the target user and the attribute features of the plurality of historical users are calculated respectively. The distances are sorted to obtain the first historical user whose attribute features are closest to those of the target user, which improves the accuracy of finding that first historical user and further improves the accuracy of video recommendation.

Fig. 3 is a flowchart of a video recommendation method according to an exemplary embodiment. Specifically, it shows the process in steps S103 to S104 of Fig. 1 of obtaining the click rate of the first historical user on the related videos based on the click rate estimation model and recommending a plurality of the related videos to the target user based on that click rate. The process includes the following steps:

in step S301, the attribute features of the first historical user and the video features of the related video are extracted.

The features of the first historical user include attribute features and behavior features. The attribute features include a user ID feature, static features, and dynamic features. The static features of the first historical user include at least one of the following: age, gender, geographic location, IP address, mobile phone model, and list of applications installed on the mobile phone. The dynamic features of the first historical user include at least one of the following: a user click history feature, a user like history feature, and a user follow list feature. The video features of the related video include at least one of the following: a video ID feature and a video author ID feature.

In this step, the attribute features of the first historical user and the video features of the related video are extracted.

In step S302, vector summation is performed on the static features of the first historical user and the video ID features of the relevant video, so as to obtain first features.

In this step, the static features of the first historical user and the video ID features of each relevant video are vector-summed to obtain the first features of the first historical user with respect to each relevant video.

In step S303, vector summation is performed on the static feature of the first historical user and the video author ID feature of the related video, so as to obtain a second feature.

In this step, the static features of the first historical user and the video author ID features of each relevant video are vector-summed to obtain the second features of the first historical user with respect to each relevant video.

In step S304, the first feature and the second feature, and the user ID feature of the first historical user and the dynamic feature of the first historical user are taken as third features.

In this step, the first feature and the second feature obtained in steps S302 and S303, and the user ID feature of the first historical user and the dynamic feature of the first historical user are taken as the third feature of the first historical user with respect to each relevant video.
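Steps S302 to S304 can be sketched as below. Two assumptions are made that the text does not state: every feature is already a numeric vector, with the static feature and both ID features sharing one dimension so that element-wise summation is defined, and "taken as the third feature" means concatenating the four parts. All names and values are hypothetical.

```python
def vector_sum(a, b):
    """Element-wise sum of two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vector summation needs equal dimensions")
    return [x + y for x, y in zip(a, b)]


def build_third_feature(static, user_id_vec, dynamic, video_id_vec, author_id_vec):
    first = vector_sum(static, video_id_vec)    # S302: static + video ID
    second = vector_sum(static, author_id_vec)  # S303: static + video author ID
    # S304 (assumption): combine the parts by concatenation.
    return first + second + user_id_vec + dynamic


static = [0.1, 0.2]
video_id_vec = [0.5, -0.1]
author_id_vec = [0.0, 0.3]
user_id_vec = [0.7, 0.7]
dynamic = [1.0, 0.0, 1.0]
third = build_third_feature(static, user_id_vec, dynamic, video_id_vec, author_id_vec)
print(len(third))  # 9
```

Because the static feature is added into both the video ID part and the author ID part, the model input changes with the user's attributes even when the user has no click history.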

In step S305, the third feature is input into the click-through rate estimation model, so as to obtain the click-through rate of the first historical user on the related video.

In this step, the third feature of the first historical user obtained in step S304 about each relevant video is input into the click-through rate estimation model. And obtaining the click rate of the first historical user to each related video through the layer-by-layer transformation of the click rate estimation model.

In step S306, according to the click rate of the first historical user on the related video, the click rates of the related videos are sorted in a reverse order.

In this step, the click rate of each related video by the first historical user obtained in step S305 is sorted in reverse order, and the recommendation order of the related video corresponding to the reverse order of the click rate is obtained.

In step S307, recommending a plurality of videos in the related videos, which are in the top order, to the target user according to the click rate reverse order of the related videos.

In this step, according to the recommendation order of the related videos corresponding to the reverse order ranking of the click rate obtained in step S306, one or more videos with the top order in the related videos are recommended to the target user.
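Steps S306 and S307 amount to a descending sort followed by taking the top entries. A minimal sketch, with made-up click rates and a hypothetical `top_videos` helper:

```python
def top_videos(click_rates, k):
    """Sort videos by click rate, highest first, and keep the first k."""
    ranked = sorted(click_rates.items(), key=lambda item: item[1], reverse=True)
    return [video_id for video_id, _ in ranked[:k]]


click_rates = {"video_a": 0.12, "video_b": 0.41, "video_c": 0.27, "video_d": 0.05}
recommended = top_videos(click_rates, k=2)
print(recommended)  # ['video_b', 'video_c']
```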

According to the embodiment of the application, the static features of the first historical user and the video ID features of each relevant video are subjected to vector summation to obtain the first features of the first historical user about each relevant video. And performing vector summation on the static characteristics of the first historical user and the video author ID characteristics of each related video to obtain second characteristics of the first historical user about each related video. And taking the first characteristic and the second characteristic as well as the user ID characteristic of the first historical user and the dynamic characteristic of the first historical user as a third characteristic of the first historical user about each related video. And obtaining the click rate of the first historical user to each related video through layer-by-layer transformation of the click rate estimation model based on the third characteristics of the first historical user about each related video. And performing reverse ordering on the click rate of each related video by the first historical user to obtain the recommendation sequence of the related video corresponding to the reverse ordering of the click rate. Recommending one or more videos which are in the related videos and are in the top order to the target user. The attribute characteristics of the first historical user and the video characteristics of the related videos are fused, so that the accuracy of predicting the click rate of the first historical user on the related videos is improved, and the accuracy of video recommendation is further improved.

Fig. 4 is a flowchart of a video recommendation method according to an exemplary embodiment, and in particular shows the process of establishing the click rate estimation model based on a neural network algorithm in step S101 of fig. 1. The process comprises the following steps:

In step S401, the attribute features of the sample user and the video features of the sample video are extracted, and the sample video is labeled with a video label.

The features of the sample user include attribute features and behavior features of the user. The attribute features include a user ID feature, a static feature, and a dynamic feature. The static features of the sample user include at least one of: age, gender, geographic location, IP address, mobile phone model, and list of applications installed on the mobile phone. The dynamic features of the sample user include at least one of: a user click history feature, a user like history feature, and a user follow list feature. The video features of the sample video include at least one of: a video ID feature and a video author ID feature.

In this step, the attribute features of the sample user and the video features of the sample video are extracted, and the sample video is labeled with a video label as follows: if the sample user clicks on the sample video presented on the operation page, the sample video is marked as a positive sample; if the sample user does not click on the sample video presented on the operation page, the sample video is marked as a negative sample. In one embodiment, the video label of a positive sample is 1 and the video label of a negative sample is 0.
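By way of non-limiting illustration, the labeling rule of this step may be sketched as follows. The function name `label_impressions` and the sample video IDs are hypothetical; the sketch assumes that the set of videos presented to the sample user and the set of videos clicked are already known.

```python
from typing import Iterable, List, Set, Tuple


def label_impressions(shown: Iterable[str], clicked: Set[str]) -> List[Tuple[str, int]]:
    """Label each presented sample video: 1 if clicked (positive), 0 otherwise (negative)."""
    return [(video_id, 1 if video_id in clicked else 0) for video_id in shown]


# Hypothetical example: three videos were presented and only "v2" was clicked.
labels = label_impressions(["v1", "v2", "v3"], {"v2"})
```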

In step S402, vector summation is performed on the static features of the sample user and the video ID features of the sample video, so as to obtain a fourth feature.

In this step, the static features of the sample user and the video ID features of each sample video are vector-summed to obtain the fourth feature of the sample user with respect to each sample video.

In step S403, vector summation is performed on the static features of the sample user and the video author ID features of the sample video, so as to obtain a fifth feature.

In this step, the static features of the sample user and the video author ID features of each sample video are vector-summed to obtain the fifth feature of the sample user with respect to each sample video.

In step S404, the fourth feature and the fifth feature, together with the user ID feature of the sample user and the dynamic feature of the sample user, are taken as a sixth feature.

In this step, the fourth feature and the fifth feature obtained in steps S402 and S403, as well as the user ID feature of the sample user and the dynamic feature of the sample user, are used as the sixth feature of the sample user with respect to each sample video.
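By way of non-limiting illustration, the feature fusion of steps S402 to S404 may be sketched as follows. The sketch assumes that each feature has already been mapped to a numeric vector and that the static, video ID, and video author ID vectors share one dimension. The application only states that the features are "taken as" the sixth feature, so combining them by concatenation is an assumption, as are all names and sample values.

```python
from typing import List, Sequence

Vector = List[float]


def vector_sum(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Element-wise sum of two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return [x + y for x, y in zip(a, b)]


def fuse_features(
    static: Sequence[float],
    video_id: Sequence[float],
    author_id: Sequence[float],
    user_id: Sequence[float],
    dynamic: Sequence[float],
) -> Vector:
    fourth = vector_sum(static, video_id)   # static + video ID (step S402)
    fifth = vector_sum(static, author_id)   # static + video author ID (step S403)
    # Assumption: the sixth feature is formed by concatenation (step S404).
    return fourth + fifth + list(user_id) + list(dynamic)


# Hypothetical low-dimensional vectors for one sample user and one sample video.
sixth = fuse_features(
    static=[1.0, 2.0],
    video_id=[0.5, 0.5],
    author_id=[-1.0, 0.0],
    user_id=[0.25],
    dynamic=[3.0, 4.0],
)
```

The same fusion applies to the first, second, and third features of the first historical user at recommendation time.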

In step S405, a click rate estimation target model is established based on a neural network algorithm.

In this step, a click rate estimation target model is established based on a neural network algorithm. The click rate estimation target model is a fully connected neural network. For example, the click rate estimation target model has N neural network layers, and any node of layer N-1 is connected to all nodes of layer N.
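By way of non-limiting illustration, one fully connected layer, in which every node of the preceding layer feeds every node of the next layer, may be sketched as follows. The application does not name an activation function; the ReLU used here, the function names, and the weights are assumptions made only for illustration.

```python
from typing import List, Sequence


def dense_layer(
    inputs: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Fully connected layer: out[j] = biases[j] + sum_i weights[j][i] * inputs[i]."""
    return [
        bias + sum(w * x for w, x in zip(row, inputs))
        for row, bias in zip(weights, biases)
    ]


def relu(values: Sequence[float]) -> List[float]:
    # Assumed activation; the application does not specify one.
    return [max(0.0, v) for v in values]


# Hypothetical layer with 2 input nodes and 3 output nodes.
hidden = relu(
    dense_layer(
        [1.0, 2.0],
        weights=[[1.0, -1.0], [0.5, 0.5], [0.0, 1.0]],
        biases=[0.0, 0.0, -3.0],
    )
)
```

Stacking N such layers gives the layer-by-layer transformation referred to in the description.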

In step S406, forward learning is performed on the click through rate estimation target model based on the attribute features of the sample user and the video features of the sample video.

In this step, the sixth feature of the sample user with respect to each sample video obtained in step S404 is input into the click rate estimation target model established in step S405. The sixth feature is transformed layer by layer, from bottom to top, in the click rate estimation target model to obtain the top-level vector of the click rate estimation target model of the sample user with respect to each sample video. The top-level vector is then converted into the click rate probability of the sample user with respect to each sample video.

For example, the top-level vector of the click rate estimation target model of the sample user with respect to each sample video is converted into a click rate probability by the sigmoid function:

$$\sigma(a^{i}) = \frac{1}{1 + e^{-a^{i}}}$$

where $a^{i}$ is the top-level vector of the click rate estimation target model of the sample user with respect to the $i$-th sample video, and $\sigma(a^{i})$ is the probability corresponding to the top-level vector $a^{i}$, with a value range of $(0, 1)$.
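By way of non-limiting illustration, the sigmoid conversion described above may be sketched as follows. The sketch treats the top-level output as a single scalar, which is an assumption, and branches on the sign of the input only to avoid floating-point overflow for large negative inputs.

```python
import math


def sigmoid(a: float) -> float:
    """Map a real-valued top-level output to a probability."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    # Equivalent form that avoids computing exp of a large positive number.
    e = math.exp(a)
    return e / (1.0 + e)


# A top-level output of zero corresponds to a click rate probability of one half.
p_zero = sigmoid(0.0)
```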

In step S407, reverse learning is performed on the click through rate estimation target model based on the attribute features of the sample user and the video features of the sample video.

In this step, a loss function of the click rate estimation target model of the sample user with respect to each sample video is calculated according to the click rate probability of the sample user with respect to each sample video obtained in step S406 and the video label of the sample video. The loss function is minimized by a stochastic gradient descent method: the gradient of the loss function of the click rate estimation target model of the sample user with respect to each sample video is computed, the network parameters of the click rate estimation target model are updated layer by layer from top to bottom using this gradient, and the network parameters corresponding to the sixth feature of the sample user with respect to each sample video are updated using the same gradient.

For example, the formula of the Loss function (Log Loss) of the click through rate prediction target model of the sample user with respect to each sample video is:

$$l = -y^{i}\log p^{i} - (1 - y^{i})\log(1 - p^{i}) \tag{3}$$

where $p^{i} = \sigma(a^{i})$ is the click rate probability of the sample user with respect to the $i$-th sample video, $\sigma$ is the sigmoid function, and $y^{i} \in \{0, 1\}$ is the video label of the $i$-th sample video of the sample user.
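By way of non-limiting illustration, the log loss and one stochastic gradient descent update may be sketched as follows. To stay short, the sketch uses a single linear layer followed by the sigmoid rather than the N-layer network of the application; the learning rate `lr`, the clipping constant `eps`, and all names and values are assumptions. For this loss with a sigmoid output, the derivative with respect to the top-level output is p - y.

```python
import math
from typing import List, Sequence


def sigmoid(a: float) -> float:
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def log_loss(y: int, p: float, eps: float = 1e-12) -> float:
    """l = -y*log(p) - (1-y)*log(1-p), with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -y * math.log(p) - (1 - y) * math.log(1.0 - p)


def sgd_step(weights: Sequence[float], features: Sequence[float], y: int, lr: float) -> List[float]:
    """One update on one sample for a single linear layer plus sigmoid."""
    a = sum(w * x for w, x in zip(weights, features))
    p = sigmoid(a)
    # dl/da = p - y, so dl/dw_k = (p - y) * x_k.
    return [w - lr * (p - y) * x for w, x in zip(weights, features)]


# Hypothetical positive sample (label 1) with a two-dimensional feature.
features = [1.0, 2.0]
before = [0.0, 0.0]
after = sgd_step(before, features, y=1, lr=0.1)
loss_before = log_loss(1, sigmoid(sum(w * x for w, x in zip(before, features))))
loss_after = log_loss(1, sigmoid(sum(w * x for w, x in zip(after, features))))
```

In a multi-layer network the same gradient is propagated from the top layer downward, which corresponds to the layer-by-layer update described for step S407.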

According to the embodiment of the application, the static features of the sample user and the video ID features of each sample video are vector-summed to obtain the fourth feature of the sample user with respect to each sample video. The static features of the sample user and the video author ID features of each sample video are vector-summed to obtain the fifth feature of the sample user with respect to each sample video. The fourth feature and the fifth feature, together with the user ID feature and the dynamic feature of the sample user, are taken as the sixth feature of the sample user with respect to each sample video. The click rate estimation target model is trained with the sixth feature of the sample user with respect to each sample video, and the network parameters of the click rate estimation target model and the network parameters corresponding to the sixth feature are adjusted and optimized, so as to establish the click rate estimation model. This improves the accuracy of the established click rate estimation model and thereby further improves the accuracy of video recommendation.

Fig. 5 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment. As shown in fig. 5, the apparatus 50 includes: a model establishing unit 501, a nearest neighbor searching unit 502, a click rate estimating unit 503 and a video recommending unit 504.

The model establishing unit 501 is configured to establish a click rate estimation model based on a neural network algorithm.

The unit establishes a click rate estimation model based on a neural network algorithm. The click-through rate estimation model is a fully connected neural network. For example, the click-through rate estimation model has N neural network layers. Any node of the N-1 layer is connected with all nodes of the N layer.

A nearest neighbor search unit 502, configured to calculate similarities between the attribute features of the target user and the attribute features of a plurality of historical users, respectively, to obtain a first historical user whose attribute features are closest to those of the target user.

The unit respectively calculates the similarity between the attribute characteristics of the target user and the attribute characteristics of each historical user to obtain a first historical user which is closest to the attribute characteristics of the target user. For example, the euclidean distance between the attribute feature of the target user and the attribute feature of each history user is calculated, and the nearest neighbor search is performed using the euclidean distance, thereby obtaining the first history user that is closest to the attribute feature of the target user.

The click rate estimation unit 503 is configured to obtain the click rate of the first historical user on the related video based on the click rate estimation model.

The unit inputs the attribute features and user behavior features of the first historical user and the video features of the related video into the click rate estimation model established by the model establishing unit 501. The N neural network layers of the click rate estimation model transform the attribute features, the user behavior features, and the video features of the related video layer by layer to finally obtain the click rate of the first historical user on the related video.

A video recommending unit 504, configured to recommend multiple videos in the relevant videos to the target user based on the click rate of the first historical user on the relevant videos.

According to the click rates of the first historical user on the related videos obtained by the click rate estimation unit 503, the unit recommends to the target user a plurality of the related videos for which the click rate of the first historical user is higher.
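By way of non-limiting illustration, the cooperation of units 502, 503, and 504 may be sketched as follows. The trained model of unit 501 is replaced here by a hypothetical stand-in `stub_ctr` (a sigmoid of a dot product), and user behavior features are omitted for brevity; these simplifications, and all names and values, are assumptions and do not reflect the actual model of the application.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def recommend(
    target: Vector,
    history: Dict[str, Vector],
    videos: Dict[str, Vector],
    estimate_ctr: Callable[[Vector, Vector], float],
    k: int,
) -> List[str]:
    # Unit 502: find the historical user at the smallest Euclidean distance.
    nearest = min(history, key=lambda uid: math.dist(target, history[uid]))
    # Unit 503: estimate that user's click rate on every related video.
    scores = {vid: estimate_ctr(history[nearest], feat) for vid, feat in videos.items()}
    # Unit 504: recommend the k videos with the highest estimated click rate.
    return sorted(scores, key=lambda vid: scores[vid], reverse=True)[:k]


def stub_ctr(user: Vector, video: Vector) -> float:
    # Hypothetical stand-in for the trained click rate estimation model.
    return 1.0 / (1.0 + math.exp(-sum(u * v for u, v in zip(user, video))))


picked = recommend(
    target=[1.0, 0.0],
    history={"a": [0.9, 0.1], "b": [-1.0, 0.0]},
    videos={"v1": [1.0, 0.0], "v2": [-1.0, 0.0], "v3": [0.5, 0.5]},
    estimate_ctr=stub_ctr,
    k=2,
)
```

Passing the estimator as a parameter keeps the retrieval and ranking logic independent of how the click rate estimation model is implemented.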

Fig. 6 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment, and in particular is a schematic diagram of the nearest neighbor search unit 502 in fig. 5. As shown in fig. 6, the apparatus 60 includes: a first feature extraction unit 601, a distance calculation unit 602, and a first sorting unit 603.

A first feature extracting unit 601, configured to extract the attribute features of the target user and the attribute features of the plurality of historical users.

The first feature extraction unit 601 extracts the attribute feature of the target user and the attribute features of a plurality of history users.

A distance calculating unit 602, configured to calculate distances between the attribute feature of the target user and the attribute features of the plurality of historical users, respectively.

For example, the target user is $X$, and the $m$ historical users are $Y^{1}, Y^{2}, Y^{3}, \ldots, Y^{m}$. The distance calculating unit 602 calculates the Euclidean distances between the attribute feature of the target user $X$ and the attribute features of the $m$ historical users, respectively.

In Euclidean space, the Euclidean distance between the attribute feature $X = (x_{1}, x_{2}, x_{3}, \ldots, x_{n})$ of the target user and the attribute feature $Y^{i} = (y^{i}_{1}, y^{i}_{2}, y^{i}_{3}, \ldots, y^{i}_{n})$ of the $i$-th historical user is

$$d(X, Y^{i}) = \sqrt{\sum_{k=1}^{n} \left(x_{k} - y^{i}_{k}\right)^{2}}$$

where $x_{1}, x_{2}, x_{3}, \ldots, x_{n}$ is the attribute feature of the target user $X$, and $y^{i}_{1}, y^{i}_{2}, y^{i}_{3}, \ldots, y^{i}_{n}$ is the attribute feature of the $i$-th historical user $Y^{i}$.

A first sorting unit 603, configured to sort the distances to obtain a first historical user closest to the attribute feature of the target user.

The first sorting unit 603 sorts the distances, for example the Euclidean distances, between the attribute feature of the target user and the attribute features of the plurality of historical users obtained by the distance calculating unit 602, to obtain the historical user corresponding to the smallest distance. The smaller the distance, the more similar the attribute features of the target user and the historical user. The historical user corresponding to the minimum distance is the first historical user.
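By way of non-limiting illustration, the distance calculation and the selection of the first historical user may be sketched as follows. The sketch assumes that attribute features are already numeric vectors of equal dimension and performs a simple linear scan; the names and sample values are hypothetical.

```python
import math
from typing import Dict, Sequence


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Square root of the sum of squared component differences."""
    if len(x) != len(y):
        raise ValueError("attribute features must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest_historical_user(target: Sequence[float], history: Dict[str, Sequence[float]]) -> str:
    """Return the ID of the historical user at the smallest distance from the target."""
    return min(history, key=lambda uid: euclidean_distance(target, history[uid]))


# Hypothetical attribute features of three historical users.
history = {"u1": [0.0, 0.0], "u2": [3.0, 4.0], "u3": [1.0, 1.0]}
first_historical_user = nearest_historical_user([1.0, 2.0], history)
```

Taking the minimum directly gives the same user as sorting all distances and choosing the first entry.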

Fig. 7 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment. Specifically, the click rate estimation unit 503 in fig. 5 is illustrated. As shown in fig. 7, the apparatus 70 includes: a second feature extraction unit 701, a first feature fusion unit 702, and an estimation unit 703.

A second feature extraction unit 701, configured to extract the attribute features of the first historical user and the video features of the related video.

The features of the first historical user include attribute features and behavior features of the user. The attribute features include a user ID feature, a static feature, and a dynamic feature. The static features of the first historical user include at least one of: age, gender, geographic location, IP address, mobile phone model, and list of applications installed on the mobile phone. The dynamic features of the first historical user include at least one of: a user click history feature, a user like history feature, and a user follow list feature. The video features of the related video include at least one of: a video ID feature, a video author ID feature, and a video tag feature.

The second feature extraction unit 701 extracts the attribute features of the first historical user and the video features of the related video.

A first feature fusion unit 702, configured to perform vector summation on the static features of the first historical user and the video ID features of the related video to obtain a first feature; perform vector summation on the static features of the first historical user and the video author ID features of the related video to obtain a second feature; and take the first feature and the second feature, together with the user ID feature of the first historical user and the dynamic feature of the first historical user, as a third feature.

The first feature fusion unit 702 performs vector summation on the static features of the first historical user and the video ID features of each relevant video to obtain the first features of the first historical user about each relevant video. And performing vector summation on the static characteristics of the first historical user and the video author ID characteristics of each related video to obtain second characteristics of the first historical user about each related video. And taking the first characteristic and the second characteristic as well as the user ID characteristic of the first historical user and the dynamic characteristic of the first historical user as a third characteristic of the first historical user about each related video.

The estimation unit 703 is configured to input the third feature into the click rate estimation model, so as to obtain the click rate of the first historical user on the related video.

The estimation unit 703 inputs the third feature of the first historical user with respect to each related video, obtained by the first feature fusion unit 702, into the click rate estimation model. The click rate of the first historical user on each related video is obtained through layer-by-layer transformation in the click rate estimation model.

Fig. 8 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment, and in particular is a schematic diagram of the video recommending unit 504 in fig. 5. As shown in fig. 8, the apparatus 80 includes: a second sorting unit 801 and a recommending unit 802.

A second sorting unit 801, configured to sort the related videos in descending order of the click rate of the first historical user on the related videos.

The second sorting unit 801 performs reverse sorting on the click rate of each related video by the first historical user obtained in the estimating unit 703 to obtain a recommendation sequence of the related video corresponding to the reverse sorting of the click rate.

A recommending unit 802, configured to recommend, to the target user, multiple videos in the related videos that are in the top order according to the click rate reverse order of the related videos.

The recommending unit 802 recommends one or more videos with the top order among the related videos to the target user according to the recommendation order of the related videos corresponding to the reverse order of the click rate obtained in the second sorting unit 801.

Fig. 9 is a schematic diagram illustrating a video recommendation apparatus according to an example embodiment, and in particular is a schematic diagram of the model establishing unit 501 in fig. 5. As shown in fig. 9, the apparatus 90 includes: a third feature extraction unit 901, a second feature fusion unit 902, a target model establishing unit 903, a forward learning unit 904, and a backward learning unit 905.

A third feature extraction unit 901, configured to extract the attribute features of the sample user, extract the video features of the sample video, and label a video label for the sample video.

The third feature extraction unit 901 extracts the attribute features of the sample user and the video features of the sample video, and labels the sample video with a video label as follows: if the sample user clicks on the sample video presented on the operation page, the sample video is marked as a positive sample; if the sample user does not click on the sample video presented on the operation page, the sample video is marked as a negative sample. In one embodiment, the video label of a positive sample is 1 and the video label of a negative sample is 0.

A second feature fusion unit 902, configured to perform vector summation on the static features of the sample user and the video ID features of the sample video to obtain a fourth feature; perform vector summation on the static features of the sample user and the video author ID features of the sample video to obtain a fifth feature; and take the fourth feature and the fifth feature, together with the user ID feature of the sample user and the dynamic feature of the sample user, as a sixth feature.

The second feature fusion unit 902 performs vector summation on the static features of the sample user and the video ID features of each sample video to obtain the fourth feature of the sample user with respect to each sample video. It performs vector summation on the static features of the sample user and the video author ID features of each sample video to obtain the fifth feature of the sample user with respect to each sample video. It then takes the fourth feature and the fifth feature, together with the user ID feature of the sample user and the dynamic feature of the sample user, as the sixth feature of the sample user with respect to each sample video.

The target model establishing unit 903 is configured to establish a click rate estimation target model based on a neural network algorithm.

The target model establishing unit 903 establishes a click rate estimation target model based on a neural network algorithm. The click rate estimation target model is a fully connected neural network. For example, the click-through rate estimation target model has N neural network layers. Any node of the N-1 layer is connected with all nodes of the N layer.

A forward learning unit 904, configured to forward learn the click through rate estimation target model based on the attribute features of the sample user and the video features of the sample video.

The forward learning unit 904 inputs the sixth feature of the sample user with respect to each sample video, obtained by the second feature fusion unit 902, into the click rate estimation target model established by the target model establishing unit 903. The sixth feature is transformed layer by layer, from bottom to top, in the click rate estimation target model to obtain the top-level vector of the click rate estimation target model of the sample user with respect to each sample video. The top-level vector is then converted into the click rate probability of the sample user with respect to each sample video.

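For example, the top-level vector of the sample user with respect to each sample video is converted into a click rate probability by the sigmoid function:

$$\sigma(a^{i}) = \frac{1}{1 + e^{-a^{i}}}$$

where $a^{i}$ is the top-level vector of the click rate estimation target model of the sample user with respect to the $i$-th sample video, and $\sigma(a^{i})$ is the corresponding probability, with a value range of $(0, 1)$.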

A backward learning unit 905, configured to perform reverse learning on the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video.

The backward learning unit 905 calculates a loss function of the click rate estimation target model of the sample user with respect to each sample video according to the click rate probability of the sample user with respect to each sample video obtained by the forward learning unit 904 and the video label of the sample video. The loss function is minimized by a stochastic gradient descent method: the gradient of the loss function of the click rate estimation target model of the sample user with respect to each sample video is computed, the network parameters of the click rate estimation target model are updated layer by layer from top to bottom using this gradient, and the network parameters corresponding to the sixth feature of the sample user with respect to each sample video are updated using the same gradient.

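For example, the loss function (Log Loss) of the click rate estimation target model of the sample user with respect to each sample video is:

$$l = -y^{i}\log p^{i} - (1 - y^{i})\log(1 - p^{i}) \tag{3}$$

where $p^{i} = \sigma(a^{i})$ is the click rate probability of the sample user with respect to the $i$-th sample video, $\sigma$ is the sigmoid function, and $y^{i} \in \{0, 1\}$ is the video label of the $i$-th sample video of the sample user.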

Fig. 10 is a block diagram illustrating an apparatus 1200 that performs a video recommendation method according to an example embodiment. For example, the apparatus 1200 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.

Referring to fig. 10, apparatus 1200 may include one or more of the following components: processing component 1202, memory 1204, power component 1206, multimedia component 1208, audio component 1210, input/output (I/O) interface 1212, sensor component 1214, and communications component 1216.

The processing component 1202 generally controls overall operation of the apparatus 1200, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 1202 may include one or more processors 1220 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 1202 can include one or more modules that facilitate interaction between the processing component 1202 and other components. For example, the processing component 1202 can include a multimedia module to facilitate interaction between the multimedia component 1208 and the processing component 1202.

The memory 1204 is configured to store various types of data to support operation at the device 1200. Examples of such data include instructions for any application or method operating on the device 1200, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1204 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.

A power supply component 1206 provides power to the various components of the device 1200. Power components 1206 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for apparatus 1200.

The multimedia components 1208 include a screen that provides an output interface between the device 1200 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1208 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 1200 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

Audio component 1210 is configured to output and/or input audio signals. For example, audio component 1210 includes a Microphone (MIC) configured to receive external audio signals when apparatus 1200 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 1204 or transmitted via the communication component 1216. In some embodiments, audio assembly 1210 further includes a speaker for outputting audio signals.

The I/O interface 1212 provides an interface between the processing component 1202 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor assembly 1214 includes one or more sensors for providing various aspects of state assessment for the apparatus 1200. For example, the sensor assembly 1214 may detect an open/closed state of the apparatus 1200 and the relative positioning of components, such as the display and keypad of the apparatus 1200. The sensor assembly 1214 may also detect a change in the position of the apparatus 1200 or a component of the apparatus 1200, the presence or absence of user contact with the apparatus 1200, an orientation or acceleration/deceleration of the apparatus 1200, and a change in the temperature of the apparatus 1200. The sensor assembly 1214 may include a proximity sensor configured to detect the presence of a nearby object in the absence of any physical contact. The sensor assembly 1214 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1214 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communications component 1216 is configured to facilitate communications between the apparatus 1200 and other devices in a wired or wireless manner. The apparatus 1200 may access a wireless network based on a communication standard, such as WiFi, an operator network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 1216 receives the broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 1216 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the apparatus 1200 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.

In an exemplary embodiment, a non-transitory computer readable storage medium comprising instructions, such as memory 1204 comprising instructions, executable by processor 1220 of apparatus 1200 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

Fig. 11 is a block diagram illustrating an apparatus 1300 that performs a video recommendation method according to an example embodiment. For example, the apparatus 1300 may be provided as a server. Referring to fig. 11, apparatus 1300 includes a processing component 1322, which further includes one or more processors, and memory resources, represented by memory 1332, for storing instructions, such as application programs, that may be executed by processing component 1322. The application programs stored in memory 1332 may include one or more modules that each correspond to a set of instructions. Further, processing component 1322 is configured to execute the instructions to perform the above-described video recommendation method.

The apparatus 1300 may also include a power component 1326 configured to perform power management for the apparatus 1300, a wired or wireless network interface 1350 configured to connect the apparatus 1300 to a network, and an input-output (I/O) interface 1358. The apparatus 1300 may operate based on an operating system stored in the memory 1332, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.

Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.

It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims

1. A method for video recommendation, comprising:

establishing a click rate estimation model based on a neural network algorithm;

obtaining the click rate of the first historical user on the related video based on the click rate estimation model, wherein obtaining the click rate of the first historical user on the related video based on the click rate estimation model comprises: taking the attribute characteristics of the first historical user and the video characteristics of the related videos as the input of the click rate estimation model, and obtaining the click rate of the first historical user on the related videos by the click rate estimation model; and

2. The video recommendation method according to claim 1, wherein the attribute characteristics of the user comprise: a user ID feature, a static feature, and a dynamic feature.

3. The video recommendation method according to claim 2, wherein the static feature of the user comprises at least one of: age, gender, geographic location, IP address, mobile phone model, and a list of applications installed on the mobile phone;

4. The video recommendation method according to claim 3, wherein the video has video features including at least one of the following features: a video ID feature and a video author ID feature.

5. The video recommendation method according to claim 4, wherein said calculating similarity between the attribute features of the target user and the attribute features of the plurality of historical users respectively to obtain the first historical user that is closest to the attribute features of the target user comprises:

6. The video recommendation method according to claim 5, wherein said obtaining the click rate of the first historical user on the related video based on the click rate estimation model comprises:

7. The video recommendation method according to claim 6, wherein recommending a plurality of videos of the related videos to the target user based on the click rate of the first historical user on the related videos comprises:

8. The video recommendation method according to claim 7, wherein the establishing a click rate estimation model based on a neural network algorithm comprises:

extracting the attribute features of the sample user;

9. The video recommendation method according to claim 8, wherein the establishing a click rate estimation model based on a neural network algorithm further comprises:

10. The video recommendation method according to claim 9, wherein said forward learning of the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video comprises:

inputting the sixth feature into the click rate estimation target model;

and converting the top-level vector into the probability of the click rate.
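As a minimal sketch of the forward-learning steps recited above, the following example passes an input feature vector through one hidden layer to obtain a top-level vector and then converts that vector into a click probability. The single ReLU hidden layer, the linear projection, the sigmoid conversion, and all names and weights are assumptions chosen for illustration; the claim does not specify the network structure or the conversion function.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function mapping a real value into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense_relu(
    x: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """One fully connected layer with ReLU activation (assumed layer type)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def forward(
    feature: Sequence[float],
    hidden_w: Sequence[Sequence[float]],
    hidden_b: Sequence[float],
    out_w: Sequence[float],
    out_b: float,
) -> float:
    """Feed the input feature forward and return the estimated click probability."""
    # Top-level vector: the output of the last hidden layer.
    top_level = dense_relu(feature, hidden_w, hidden_b)
    # Project the top-level vector to a scalar and squash it into a probability.
    logit = sum(w * t for w, t in zip(out_w, top_level)) + out_b
    return sigmoid(logit)
```

With these hypothetical weights, a logit of zero yields a probability of 0.5, and larger logits yield probabilities closer to 1.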

11. The video recommendation method according to claim 10, wherein said backward learning of the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video comprises:

and updating the network parameters corresponding to the sixth feature.
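The parameter update recited above may be illustrated, under stated assumptions, by a single gradient-descent step. The sketch below assumes a logistic output, a log loss, and plain stochastic gradient descent with a hypothetical learning rate `lr`; none of these choices is fixed by the claim, and the weights here stand in for whichever network parameters correspond to the input feature.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights: Sequence[float], bias: float, feature: Sequence[float]) -> float:
    """Estimated click probability for one input feature vector."""
    return sigmoid(sum(w * x for w, x in zip(weights, feature)) + bias)


def log_loss(p: float, label: int) -> float:
    """Log loss between the predicted probability and the click label (0 or 1)."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def sgd_step(
    weights: Sequence[float],
    bias: float,
    feature: Sequence[float],
    label: int,
    lr: float,
) -> Tuple[List[float], float]:
    """One backward pass: move the parameters against the log-loss gradient."""
    error = predict(weights, bias, feature) - label  # d(loss)/d(logit)
    new_weights = [w - lr * error * x for w, x in zip(weights, feature)]
    new_bias = bias - lr * error
    return new_weights, new_bias
```

Applying `sgd_step` to a clicked sample (label 1) raises the predicted probability for that sample, so its log loss decreases after the update.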

12. The video recommendation method according to claim 11, wherein said labeling the sample video with a video label comprises:

13. A video recommendation apparatus, comprising:

a click rate estimation unit configured to obtain the click rate of the first historical user on the related video based on the click rate estimation model, wherein obtaining the click rate of the first historical user on the related video based on the click rate estimation model comprises: taking the attribute characteristics of the first historical user and the video characteristics of the related videos as the input of the click rate estimation model, and obtaining the click rate of the first historical user on the related videos by the click rate estimation model; and

14. The video recommendation device according to claim 13, wherein the attribute characteristics of the user comprise: a user ID feature, a static feature, and a dynamic feature.

15. The video recommendation device according to claim 14, wherein the static feature of the user comprises at least one of: age, gender, geographic location, IP address, mobile phone model, and a list of applications installed on the mobile phone;

16. The video recommendation device of claim 15, wherein the video has video features, said video features comprising at least one of: a video ID feature and a video author ID feature.

17. The video recommendation device according to claim 16, wherein said nearest neighbor retrieval unit comprises:

18. The video recommendation device according to claim 17, wherein the click rate estimation unit comprises:

19. The video recommendation device according to claim 18, wherein said video recommendation unit comprises:

20. The video recommendation device of claim 19, wherein said model building unit comprises:

21. The video recommendation device of claim 20, wherein said model building unit further comprises:

22. The video recommendation device according to claim 21, wherein said forward learning of the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video comprises:

inputting the sixth feature into the click rate estimation target model;

and converting the top-level vector into the probability of the click rate.

23. The video recommendation device according to claim 22, wherein said backward learning of the click rate estimation target model based on the attribute features of the sample user and the video features of the sample video comprises:

and updating the network parameters corresponding to the sixth feature.

24. The video recommendation device of claim 23, wherein said labeling said sample video with a video tag comprises:

25. A video recommendation apparatus, comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform the video recommendation method of any one of claims 1 to 12.

26. A computer-readable storage medium storing computer instructions which, when executed, implement the video recommendation method of any one of claims 1 to 12.