WO2021135588A1 - Recommendation method, model generation method, apparatus, medium, and device


Info

Publication number
WO2021135588A1
WO2021135588A1 (PCT/CN2020/124793, CN2020124793W)
Authority
WO
WIPO (PCT)
Prior art keywords
content
information
user behavior
rate
model
Prior art date
Application number
PCT/CN2020/124793
Other languages
English (en)
French (fr)
Inventor
杨晚鹏
谭怒涛
Original Assignee
百果园技术(新加坡)有限公司
Priority date
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司
Publication of WO2021135588A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9535 - Search customisation based on user profiles and personalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/25 - Fusion techniques
    • G06F18/253 - Fusion techniques of extracted features

Definitions

  • This application relates to the field of data technology, in particular to a content recommendation method and device, a behavior prediction model generation method and device, a content recommendation model generation method and device, a storage medium, and computer equipment.
  • However, the current content push method has at least the following problems. The traditional content ranking algorithm obtains user feedback behavior data offline to derive sample labels, extracts features from the feature log stored on the online server, and combines the two to obtain training samples; during model training, it then tries to fit a model that matches the user's preferences.
  • Online, the user's preference for content items is predicted based on the model and, according to that preference, several optimal content items are selected to form a push list that is pushed to the user.
  • However, users' feedback behaviors are diverse: setting the labels and corresponding weights of training samples according to user feedback behavior involves serious subjective limitations, behavioral habits differ greatly between users, and user initiative limits the model's ability to distinguish differences in the same user's preferences for different items.
  • the embodiments of the present application are proposed to provide a content recommendation method and a corresponding content recommendation device that overcome the above problems or at least partially solve the above problems.
  • an embodiment of the present application discloses a content recommendation method, and the method includes:
  • At least two items of target recommended content are determined.
  • the embodiment of the present application also provides a method for generating a behavior prediction model, including:
  • inputting the training vector information into the initial behavior prediction model for iteration, and calculating multiple loss functions of the initial behavior prediction model after each iteration;
  • the multiple loss functions include loss functions based on different items of historical user behavior information;
  • the historical user behavior information includes at least two of: click rate, like rate, completion rate, attention rate, sharing rate, comment rate, favorite rate, and browsing duration.
  • the embodiment of the present application also provides a method for generating a content recommendation model, including:
  • acquiring sample data and an initial content recommendation model, the sample data including historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of the recommended content;
  • inputting the training vector information into the initial content recommendation model for iteration, and calculating multiple loss functions of the initial content recommendation model after each iteration;
  • the historical user behavior information includes at least two of: click rate, like rate, completion rate, attention rate, sharing rate, comment rate, favorite rate, and browsing duration.
  • An embodiment of the present application also provides a content recommendation device, which includes:
  • an information acquisition module configured to acquire content feature information corresponding to the original recommended content, and user behavior information;
  • an estimated value generating module configured to generate user behavior estimated values according to the user behavior information;
  • a recommendation value generating module configured to obtain the content recommendation value of each item of the original recommended content according to the user behavior estimated values, the user behavior information, and the content feature information;
  • a recommended content determining module configured to determine at least two items of target recommended content according to the content recommendation values.
  • the embodiment of the present application also provides a device for generating a behavior prediction model, including:
  • an information and model acquisition module configured to acquire historical user behavior information and an initial behavior prediction model;
  • an information vectorization module configured to vectorize the historical user behavior information to generate training vector information;
  • a model iteration module configured to input the training vector information into the initial behavior prediction model for iteration and calculate multiple loss functions of the initial behavior prediction model after each iteration, where the multiple loss functions include loss functions based on different items of historical user behavior information;
  • a model generation module configured to stop the iteration and generate the target behavior prediction model when the multiple loss functions of the iterated initial behavior prediction model are all minimized;
  • the historical user behavior information includes at least two of: click rate, like rate, completion rate, attention rate, sharing rate, comment rate, favorite rate, and browsing duration.
  • An embodiment of the present application also provides an apparatus for generating a content recommendation model, including:
  • a data and model acquisition module configured to acquire sample data and an initial content recommendation model,
  • wherein the sample data includes historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content;
  • an information vectorization module configured to vectorize the historical user behavior information, the historical user behavior estimated values, and the content feature information to generate training vector information;
  • a model iteration module configured to input the training vector information into the initial content recommendation model for iteration and calculate multiple loss functions of the initial content recommendation model after each iteration;
  • a model generation module configured to stop the iteration and generate the target content recommendation model when the multiple loss functions of the iterated initial content recommendation model are all minimized;
  • the historical user behavior information includes at least two of: click rate, like rate, completion rate, attention rate, sharing rate, comment rate, favorite rate, and browsing duration.
  • An embodiment of the present application also provides a storage medium on which a computer program is stored; the computer program is adapted to be loaded by a processor to execute one or more of the above-mentioned methods.
  • the embodiment of the present application also provides a computer device, which includes:
  • one or more processors and a memory;
  • one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the above-described methods.
  • In the embodiments of the present application, a user behavior estimated value is first generated according to the user behavior information; then, according to the user behavior estimated value, the user behavior information, and the content feature information, the content recommendation value of each item of original recommended content is obtained; at least two items of target recommended content are then determined based on the content recommendation values and displayed. In this way, the user's preference for the recommended content can be predicted more accurately, the best recommended content can be selected in turn and shown to users, and the accuracy of content recommended to users is improved.
  • In addition, the subjectivity of manually setting the labels and weights of training samples during model training, and the interference of bias from differing user behavior habits with model training, are resolved, further improving the accuracy of content recommended to users.
  • FIG. 1 is a flowchart of steps of an embodiment of a content recommendation method of the present application
  • FIG. 2 is an example diagram 1 in an embodiment of a content recommendation method of the present application
  • FIG. 3 is an example diagram 2 in an embodiment of a content recommendation method of the present application.
  • FIG. 4 is a flowchart of steps of an embodiment of a method for generating a behavior prediction model of the present application
  • FIG. 5 is a flowchart of steps of an embodiment of a method for generating a content recommendation model according to the present application
  • FIG. 6 is a structural block diagram of an embodiment of a content recommendation device of the present application;
  • FIG. 7 is a structural block diagram of an embodiment of a behavior prediction model generating device of the present application;
  • FIG. 8 is a structural block diagram of an embodiment of a content recommendation model generating device of the present application.
  • Referring to FIG. 1, a flowchart of the steps of an embodiment of a content recommendation method of the present application is shown; the method may specifically include the following steps:
  • Step 101 Obtain content feature information corresponding to the original recommended content and user behavior information
  • In the embodiment of the present application, the original recommended content may be news, commodities, advertisements, articles, music, short videos, and other content.
  • The corresponding content feature information can include content feature information describing the original recommended content's own characteristics and content feature information of non-self characteristics associated with the original recommended content.
  • The content feature information of its own characteristics can be content attributes, content types, content upload time, content uploader, and so on; non-self-characteristic content feature information can be the content's click-through rate, like rate, reading rate, collection rate, attention rate, and so on.
  • For example, when the original recommended content is a short video, the content feature information of its own characteristics can include the uploader of the short video, the type of the short video (funny, animation, TV series, movie, food, etc.), the upload time of the short video, the upload address of the short video, and so on;
  • the non-self-characteristic content feature information can include the click rate of the short video, the like rate of the short video, the collection rate of the short video, the attention rate of the short video uploader, the number of online viewers, the number of historical viewers, and so on.
  • User behavior information can be behavior information related to user behavior characteristics, such as click-through rate, like rate, broadcast completion rate, follow rate, share rate, comment rate, and other user feedback behaviors for different content.
  • the user can obtain the corresponding content through the application in the terminal.
  • The server can obtain the original recommended content to be recommended in the background in real time and obtain the user's user behavior information, so that, based on the content feature information of the original recommended content and the user behavior information, content matching the user's preference can be accurately recommended in real time.
  • The terminal may include mobile devices, which may specifically include mobile phones, PDAs (Personal Digital Assistants), laptop computers, handheld computers, smart wearable devices (such as smart bracelets, smart glasses, smart headbands, etc.), and so on; it may also include fixed equipment, which may specifically include vehicle-mounted terminals, smart home devices, etc. These terminals may support operating systems such as Windows, Android, iOS, and Windows Phone, which are not limited in the embodiments of the present application.
  • the application program may include a news application program, a music application program, a short video application program, a reading application program, etc., which are not limited in the embodiment of the present application.
  • Step 102 Generate an estimated value of user behavior according to the user behavior information
  • Since the feedback behaviors of users are diverse and different users have different feedback behaviors, multiple user behavior estimated values can be generated for the user according to the user behavior information.
  • The user's feedback behaviors correspond one-to-one with the user behavior estimated values.
  • the user behavior information may be vectorized first to generate a behavior feature vector, and then the behavior feature vector is input into a preset target behavior prediction model to generate The estimated value of user behavior.
  • The target behavior prediction model can be a multi-target deep neural network model, which can include a preset number of hidden layers, a fully connected layer connected to the last hidden layer, and multiple output nodes connected to the fully connected layer, where the fully connected layer is used to split the output result of the last hidden layer and input the split results to the multiple output nodes respectively, and each output node outputs a corresponding user behavior estimated value.
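  • As an illustration of this architecture, the following is a minimal pure-Python sketch (layer sizes, weight initialization, and the set of behavior names are assumptions for the example, not taken from this application): the shared hidden layers feed a fully connected layer whose output is split so that each output node yields one user behavior estimated value.

```python
import math
import random

random.seed(0)

BEHAVIORS = ["click_rate", "like_rate", "completion_rate", "follow_rate"]

def init_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def linear(x, weights, biases):
    # One dense layer for a single sample: y = Wx + b
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(v):
    return [max(0.0, a) for a in v]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Two shared hidden layers (sizes are illustrative).
w1, b1 = init_matrix(8, 6), [0.0] * 8
w2, b2 = init_matrix(8, 8), [0.0] * 8
# Fully connected layer; its output is split into one slice per output node.
w_fc, b_fc = init_matrix(len(BEHAVIORS), 8), [0.0] * len(BEHAVIORS)

def predict(behavior_vector):
    h = relu(linear(behavior_vector, w1, b1))
    h = relu(linear(h, w2, b2))
    logits = linear(h, w_fc, b_fc)  # one logit per behavior output node
    return {name: sigmoid(z) for name, z in zip(BEHAVIORS, logits)}

# A vectorized user behavior feature vector (hypothetical values).
estimates = predict([0.31, 0.55, 0.62, 0.12, 0.08, 0.05])
```

  • Each key of `estimates` corresponds to one feedback behavior, matching the one-to-one relationship between feedback behaviors and user behavior estimated values described above.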
  • Specifically, the trained target behavior prediction model can be obtained; the user behavior information is then vectorized to generate the corresponding behavior feature vector, and the behavior feature vector is input into the target behavior prediction model to obtain the user behavior estimated values, where each feedback behavior of the user corresponds to one user behavior estimated value.
  • For example, when the user behavior information includes the click-through rate, like rate, completion rate, attention rate, sharing rate, and comment rate for different content, the user behavior estimated values may include the corresponding estimated click-through rate, estimated like rate, estimated completion rate, estimated attention rate, estimated sharing rate, and estimated comment rate for different content.
  • In the embodiment of the present application, a multi-objective deep neural network model is used to associate multiple items of user behavior information, which improves the accuracy of the multiple output user behavior estimated values and helps recommend more accurate content to users subsequently.
  • the behavior prediction model can be generated through the following steps:
  • Acquire historical user behavior information and an initial behavior prediction model; vectorize the historical user behavior information to generate first training vector information; input the first training vector information into the initial behavior prediction model for iteration, and calculate multiple loss functions of the initial behavior prediction model after each iteration, where the multiple loss functions include loss functions based on different items of historical user behavior information; when the multiple loss functions of the iterated initial behavior prediction model are all minimized, stop the iteration and generate the target behavior prediction model.
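  • The iteration-and-stop procedure above can be sketched as follows (a toy illustration in which one scalar logit per behavior stands in for the shared network; the labels and learning rate are hypothetical):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_bce(p, labels):
    # Mean binary cross-entropy of one predicted rate against its labels.
    eps = 1e-7
    p = min(max(p, eps), 1.0 - eps)
    return sum(-(y * math.log(p) + (1 - y) * math.log(1 - p))
               for y in labels) / len(labels)

# One label sequence per historical user behavior (hypothetical data).
labels = {
    "click_rate":      [1, 0, 1, 1, 0],
    "like_rate":       [0, 0, 1, 0, 0],
    "completion_rate": [1, 1, 1, 0, 1],
}
logits = {name: 0.0 for name in labels}  # stand-in trainable parameters
lr = 0.5

prev = None
for step in range(500):
    losses = {}
    for name, ys in labels.items():
        p = sigmoid(logits[name])
        losses[name] = mean_bce(p, ys)
        grad = sum(p - y for y in ys) / len(ys)  # d(mean BCE)/d(logit)
        logits[name] -= lr * grad
    # Stop once every loss function has stopped decreasing noticeably.
    if prev is not None and all(prev[n] - losses[n] < 1e-6 for n in losses):
        break
    prev = losses
```

  • Training stops once none of the per-behavior loss functions decreases noticeably between iterations, a simple stand-in for the "all loss functions minimized" condition.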
  • historical user behavior information may include click-through rate, like rate, completion rate, attention rate, sharing rate, comment rate, and other historical user feedback behaviors for different content.
  • Specifically, the historical user behavior information can be vectorized to generate the first training vector information, and the first training vector information is then input into the initial behavior prediction model for model training to obtain the target behavior prediction model.
  • multiple loss functions of the initial behavior prediction model are used as the supervision and guidance of the initial behavior prediction model.
  • multiple loss functions include loss functions based on different historical user behavior information.
  • For example, when the historical user behavior information includes click-through rate, like rate, completion rate, attention rate, sharing rate, comment rate, etc., the multiple loss functions include loss functions corresponding to the estimated click-through rate, estimated like rate, estimated completion rate, estimated attention rate, estimated sharing rate, estimated comment rate, and so on.
  • Step 103 Obtain the content recommendation value of each of the original recommended content according to the user behavior estimated value, the user behavior information, and the content feature information;
  • Specifically, the user behavior estimated values, the user behavior information, and the content feature information of the original recommended content can be used to obtain the content recommendation value of each item of original recommended content; by integrating the user behavior estimated values, the user's preference for the recommended content can be predicted more accurately, so that high-quality content is recommended to the user.
  • Optionally, the user behavior estimated values, the user behavior information, and the content feature information may be vectorized to generate a content recommendation feature vector; the content recommendation feature vector is then input into a preset target content recommendation model to generate the content recommendation value of each item of the original recommended content.
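  • A minimal sketch of forming the content recommendation feature vector (plain concatenation is assumed here as the vector-fusion step; this application does not prescribe a specific scheme, and all numbers are hypothetical):

```python
def fuse(user_behavior_estimates, user_behavior_info, content_features):
    # Early fusion: concatenate the three vectorized inputs into a single
    # content recommendation feature vector for the recommendation model.
    return (list(user_behavior_estimates)
            + list(user_behavior_info)
            + list(content_features))

vector = fuse([0.12, 0.40], [0.30, 0.10, 0.25], [1.0, 0.0, 0.5])
# len(vector) == 8
```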
  • The target content recommendation model can be a pairwise LTR (Learning to Rank) model, which can include a preset number of hidden layers, a fully connected layer connected to the last hidden layer, a Rank Cost layer connected to the fully connected layer, and multiple output nodes connected to the Rank Cost layer; the Rank Cost layer is used to convert the output result of the fully connected layer and input the converted results to the multiple output nodes, and each output node outputs a corresponding content recommendation value.
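  • The pairwise comparison underlying such a Rank Cost layer can be illustrated with a RankNet-style logistic cost, one common pairwise LTR cost; it is assumed here for illustration, since this application does not specify the exact form:

```python
import math

def rank_cost(score_preferred, score_other):
    # Pairwise cost: small when the preferred item already scores higher,
    # large when the pair is ordered the wrong way round.
    return math.log(1.0 + math.exp(-(score_preferred - score_other)))

correctly_ordered = rank_cost(2.0, 0.5)   # preferred item scores higher
inverted = rank_cost(0.5, 2.0)            # preferred item scores lower
assert correctly_ordered < inverted
```

  • Minimizing such a cost over many item pairs pushes the model to assign higher recommendation values to content the user prefers.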
  • Specifically, a trained target content recommendation model can be obtained; the user behavior estimated values, user behavior information, and content feature information of the original recommended content are then vectorized to generate a content recommendation feature vector, which is input into the target content recommendation model to obtain content recommendation values for the different items of original recommended content.
  • For example, if the original recommended content includes content one, content two, content three, content four, and content five, content recommendation value A for content one, content recommendation value B for content two, content recommendation value C for content three, content recommendation value D for content four, and content recommendation value E for content five can be obtained. In this way, content recommendation values corresponding to different items of original recommended content are obtained, and suitable content can then be recommended to users according to the different content recommendation values.
  • In the embodiment of the present application, by fusing the user behavior estimated values, the user behavior information, and the content feature information of the original recommended content, the user behavior estimated values are added to the non-linear target content recommendation model, and the user's preference for different content items is judged. In this way, recommendations can vary from person to person and from video to video, appropriate content can be recommended to the user more accurately, and the product's user experience and user stickiness are improved.
  • the target content recommendation model can be generated through the following steps:
  • Acquire sample data and an initial content recommendation model, where the sample data includes historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content; vectorize the historical user behavior information, the historical user behavior estimated values, and the content feature information to generate second training vector information; input the second training vector information into the initial content recommendation model for iteration, and calculate multiple loss functions of the initial content recommendation model after each iteration; when the multiple loss functions of the iterated initial content recommendation model are all minimized, stop the iteration and generate the target content recommendation model.
  • the training sample data may include historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content.
  • the sample data can be vectorized to obtain the second training vector information, and then the second training vector information is input into the initial content recommendation model for model training, thereby obtaining the target content recommendation model.
  • multiple loss functions of the initial content recommendation model are used as the supervision and guidance of the initial content recommendation model.
  • multiple loss functions can be based on the content recommendation value of different recommended content.
  • For example, when the recommended content includes content one, content two, content three, content four, content five, etc., the multiple loss functions can include loss functions corresponding to the content recommendation values of content one, content two, content three, content four, and content five.
  • Step 104 Determine at least two target recommended content according to the content recommendation value.
  • At least two items of target recommended content can be determined from the original recommended content according to the content recommendation values and displayed to the user.
  • Specifically, after acquiring the user's user behavior information and the content feature information of the original recommended content, the server obtains the content recommendation value of each item of original recommended content, sorts the original recommended content accordingly, and generates a corresponding content recommendation list from the sorted content.
  • At least two of the highest-ranked items of original recommended content can then be extracted as the target recommended content and displayed to the user through the client, so that target recommended content matching the user's preference is filtered out of multiple items of original recommended content and shown to the user, which improves the pertinence of content recommendation and guarantees the product's user experience and user stickiness.
  • For example, when the original recommended content is short videos and the client is a short video application, when the user starts the short video application in the terminal, the server can obtain the target user's user behavior information according to the user's ID and obtain the short video content to be recommended.
  • The user behavior information can then be vectorized and input into the behavior prediction model to obtain the user behavior estimated values matching the user's feedback behavior; the obtained user behavior estimated values, user behavior information, and content feature information of the short video content are then vectorized and input into the short video recommendation model to generate recommendation values corresponding to the short videos, as shown in Table 1:
  • Short video ID     Short video recommendation value
    Short video 1      75
    Short video 2      86
    Short video 3      62
    Short video 4      80
    Short video 5      90
    Short video 6      98
    Short video 7      88
    Short video 8      56
    Short video 9      93
    Short video 10     74
    ...                ...
  • the short videos can be sorted according to the content recommendation value from high to low to generate a short video recommendation list, as shown in Table 2:
  • Short video ID     Short video recommendation value
    Short video 6      98
    Short video 9      93
    Short video 5      90
    Short video 7      88
    Short video 2      86
    Short video 4      80
    Short video 1      75
    Short video 10     74
    Short video 3      62
    Short video 8      56
    ...                ...
  • Then, the server can select the highest-ranked short videos from the short video recommendation list according to the information of the terminal or the client, and display the recommended short videos to the user through the client.
  • Because the screen information of different terminals differs, the number of short video items that can be displayed also differs: the larger the terminal screen, the more short video items can be displayed. Therefore, the server can select an appropriate number of short videos from the short video recommendation list according to the terminal's screen information and recommend them to the user.
  • For example, when the client shows 4 short videos to the user and the user opens and refreshes the client, the server can select the 4 top-ranked short videos from the above short video recommendation list to recommend to the user, such as short video 6, short video 9, short video 5, and short video 7.
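  • The sorting and top-N selection in this example can be sketched with the recommendation values from Table 1, assuming a client that displays 4 items:

```python
# Recommendation values from Table 1.
values = {
    "short_video_1": 75, "short_video_2": 86, "short_video_3": 62,
    "short_video_4": 80, "short_video_5": 90, "short_video_6": 98,
    "short_video_7": 88, "short_video_8": 56, "short_video_9": 93,
    "short_video_10": 74,
}

# Sort from high to low to build the recommendation list (Table 2).
recommendation_list = sorted(values, key=values.get, reverse=True)

def select_for_screen(ranked, slots):
    # Pick as many top-ranked items as the terminal screen can display.
    return ranked[:slots]

shown = select_for_screen(recommendation_list, 4)
# shown == ["short_video_6", "short_video_9", "short_video_5", "short_video_7"]
```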
  • Subsequently, the server can recommend short video content to the user in real time according to the short video recommendation list, and update the short video content displayed on the client in real time.
  • For example, the content displayed on the client can be updated to short video 5, short video 7, short video 2, and short video 4; if the user continues the touch operation, it can be further updated to short video 2, short video 4, short video 1, and short video 10.
  • In this way, while the user is using the short video client, the server can update the short video recommendation list in the background in real time and refresh the client in real time, so as to predict the user's preference for the recommended content more accurately, select the best recommended content in turn, and present it to the user, improving the accuracy of the content recommended to the user.
  • For another example, when the original recommended content is articles and the client is a reading application, when the user starts the reading application in the terminal, the server can obtain the user's user behavior information according to the user's ID and obtain the article content to be recommended. The user behavior information can then be vectorized and input into the behavior prediction model to obtain the user behavior estimated values matching the user's feedback behavior.
  • The obtained user behavior estimated values, user behavior information, and content feature information of the article content are then vectorized and input into the article recommendation model to generate article recommendation values corresponding to the article content, as shown in Table 3:
  • the articles can be sorted according to the content recommendation value from high to low to generate an article recommendation list, as shown in Table 4:
  • Then, the server can select the top-ranked articles from the article recommendation list according to the information of the terminal or the client, and display the recommended articles to the user through the client.
  • Because the screen information of different terminals differs, the number of article items that can be displayed also differs: the larger the terminal screen, the more article items can be displayed. Therefore, the server can select an appropriate number of articles from the article recommendation list according to the terminal's screen information and recommend them to the user.
  • For example, when the client shows 4 articles to the user and the user opens and refreshes the client, the server can select the 4 top-ranked articles from the above article recommendation list to recommend to the user, such as article 6, article 9, article 5, and article 7.
  • Subsequently, the server can recommend article content to the user in real time according to the article recommendation list, and update the article content displayed on the client in real time; at this time, when the user slides up the user interface, the content displayed on the client can be updated to article 5, article 7, article 2, and article 4.
  • In this way, the server can update the article recommendation list in the background in real time and refresh the client in real time, so as to predict the user's preference for the recommended content more accurately, select the best recommended content in turn, and present it to the user, improving the accuracy of the content recommended to the user.
  • The embodiments of this application include but are not limited to the above examples. It should be understood that, under the guidance of the embodiments of this application, those skilled in the art can recommend content items to users based on different recommended content, different terminals, and different clients, which is not limited in the embodiments of this application.
  • In the embodiments of the present application, a user behavior estimated value is first generated according to the user behavior information; then, according to the user behavior estimated value, the user behavior information, and the content feature information, the content recommendation value of each item of original recommended content is obtained; at least two items of target recommended content are then determined based on the content recommendation values and displayed. In this way, the user's preference for the recommended content can be predicted more accurately, the best recommended content can be selected in turn and shown to users, and the accuracy of content recommended to users is improved.
  • Referring to FIG. 4, there is shown a flowchart of the steps of an embodiment of a method for generating a behavior prediction model of the present application; the method may specifically include the following steps:
  • Step 401 Obtain historical user behavior information and an initial behavior prediction model
  • historical user behavior information may include click-through rate, like rate, completion rate, attention rate, sharing rate, comment rate, and other historical user feedback behaviors for different content.
  • the initial behavior prediction model may be a multi-objective deep neural network model, which may include a preset number of hidden layers, a fully connected layer connected to the last hidden layer, and multiple output nodes connected to the fully connected layer; the fully connected layer is used to split the output results of the last hidden layer and input the split output results to the multiple output nodes respectively, where each output node can output the estimated user behavior value corresponding to one of the user's feedback behaviors.
  • Step 402 vectorize the historical user behavior information to generate training vector information
  • vectorization processing can be performed to generate training vector information to input the initial behavior prediction model for model training.
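A minimal sketch of the vectorization step described above, under the assumption that each behavior rate is taken as one dimension of a fixed-order feature vector; the field names and ordering are illustrative assumptions, not the patent's exact encoding:

```python
# Assumed field order; missing behaviors default to 0.0 so every
# training vector has the same fixed shape.
FIELDS = ["click_rate", "like_rate", "completion_rate",
          "follow_rate", "share_rate", "comment_rate"]

def vectorize(behavior_info):
    """Turn a dict of historical behavior rates into a training vector."""
    return [float(behavior_info.get(f, 0.0)) for f in FIELDS]

vec = vectorize({"click_rate": 0.42, "like_rate": 0.11, "share_rate": 0.03})
# vec == [0.42, 0.11, 0.0, 0.0, 0.03, 0.0] — a 6-dimensional training vector
```

The fixed field order matters: the model's input layer assumes each dimension always carries the same behavior.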
  • Step 403 In the initial behavior prediction model, input the training vector information for iteration, and calculate multiple loss functions of the initial behavior prediction model after each iteration; the multiple loss functions include loss functions based on different historical user behavior information;
  • the multiple loss functions include loss functions based on different historical user behavior information; for example, if the historical user behavior information includes click-through rate, like rate, completion rate, follow rate, share rate, comment rate, etc., the multiple loss functions include the loss functions corresponding to the estimated click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and so on.
  • the training feature vector can be mapped layer by layer through the activation function of the preset number of hidden layer neurons in the initial behavior prediction model, and the output generated by the last hidden layer can be transmitted to the fully connected layer ; Through the fully connected layer, the output result and multiple loss functions corresponding to the output result are used to perform error calculation to generate multiple gradient values.
  • in one example, the initial behavior prediction model may include two hidden layers and a fully connected layer; when the historical user behavior information includes six behaviors such as click-through rate, like rate, completion rate, follow rate, share rate, and comment rate, the initial behavior prediction model may include six output nodes corresponding one-to-one to the historical user behavior information.
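The structure above can be sketched as follows. This is a hedged NumPy illustration, not the patent's implementation: layer widths, the sigmoid activation, and the random initialization are all assumptions; only the topology (two shared hidden layers, a fully connected layer whose output is split across six output nodes, one per feedback behavior) comes from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

BEHAVIORS = ["click", "like", "completion", "follow", "share", "comment"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(n_in, n_out):
    # small random weights, zero biases (illustrative initialization)
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

# two shared hidden layers + one fully connected layer feeding 6 heads
W1, b1 = init_layer(6, 16)
W2, b2 = init_layer(16, 16)
W3, b3 = init_layer(16, 6)   # output split into 6 nodes, one per behavior

def predict(behavior_vector):
    h1 = sigmoid(behavior_vector @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    out = sigmoid(h2 @ W3 + b3)        # one estimated value per output node
    return dict(zip(BEHAVIORS, out))   # split the result across the nodes
```

Because the hidden layers are shared by all six heads, training any one behavior's loss updates representations used by the others, which is the association between feedback behaviors the text describes.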
  • Step 404 When multiple loss functions of the initial behavior prediction model after the iteration are all minimized, stop the iteration and generate the target behavior prediction model;
  • the iteration of the model can be stopped to generate the target behavior prediction model.
  • it can be judged through each output node whether the multiple gradient values meet a preset threshold condition; if not, the parameters of the activation function of each neuron are updated according to the multiple gradient values, and the initial behavior prediction model continues to iterate; if so, the target behavior prediction model is generated.
  • the parameter update of the activation function may be based on a gradient descent strategy to update the parameters in the target gradient direction.
  • a learning rate can be preset to control the update step length of the parameters in each iteration, so as to finally obtain the target behavior prediction model.
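The two sentences above can be condensed into a toy sketch: gradient descent along the gradient direction, with a preset learning rate controlling the step length of each iteration. The quadratic loss is an illustrative stand-in for the model's loss functions, not the patent's objective:

```python
def gradient_step(theta, grad, learning_rate=0.1):
    # the learning rate controls the update step length per iteration
    return theta - learning_rate * grad

theta = 5.0
for _ in range(100):                # iterate until the loss is minimized
    grad = 2.0 * (theta - 3.0)      # gradient of the toy loss (theta - 3)^2
    theta = gradient_step(theta, grad)
# theta has converged close to the minimizer 3.0
```

A smaller learning rate gives smaller, safer steps but needs more iterations; a larger one converges faster but can overshoot the minimum.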
  • the historical user behavior information is vectorized to generate training vector information.
  • the training vector information is input for iteration, and multiple loss functions of the initial behavior prediction model are calculated after each iteration; the multiple loss functions include loss functions based on different historical user behavior information.
  • the user's feedback behaviors for different content are used as the input of the model, and the different feedback behaviors are associated through the model, which achieves a better generalization effect, improves the learning rate of the shared layers, and reduces over-fitting, thereby improving the accuracy of content recommendation.
  • referring to FIG. 5, there is shown a flowchart of the steps of an embodiment of a method for generating a content recommendation model of the present application, which may specifically include the following steps:
  • Step 501 Obtain sample data and an initial content recommendation model, where the sample data includes historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content;
  • the training sample data may include historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content.
  • the historical user behavior information may include the click rate, the like rate, the completion rate, the following rate, the sharing rate, the comment rate, and other user's historical feedback behaviors for different content.
  • the initial content recommendation model may include a preset number of hidden layers, a fully connected layer connected to the last hidden layer, a Rank Cost layer connected to the fully connected layer, and multiple output nodes connected to the Rank Cost layer.
  • the Rank Cost layer is used to convert the output results of the fully connected layer and input the converted output results to the multiple output nodes, where each output node can output the content recommendation value corresponding to the recommended content.
  • Step 502 Vectorize the historical user behavior information, the estimated value of the historical user behavior, and the content feature information to generate training vector information;
  • the three can be vectorized and spliced to generate training vector information to input the initial content recommendation model for model training.
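The splicing step above can be sketched as follows: the behavior information, the estimated values, and the content feature information are vectorized separately and concatenated into one training vector. The field counts and values are illustrative assumptions:

```python
def splice(behavior_vec, estimate_vec, content_vec):
    """Concatenate the three vectorized inputs into one training vector."""
    return list(behavior_vec) + list(estimate_vec) + list(content_vec)

training_vector = splice(
    [0.4, 0.1],        # vectorized historical user behavior information
    [0.35, 0.12],      # vectorized historical user behavior estimated values
    [1.0, 0.0, 0.5],   # vectorized content feature information
)
# → [0.4, 0.1, 0.35, 0.12, 1.0, 0.0, 0.5]
```

Keeping the three segments in a fixed order means the model can learn which input dimensions correspond to estimated values versus raw behavior or content features.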
  • Step 503 In the initial content recommendation model, input the training vector information for iteration, and calculate multiple loss functions of the initial content recommendation model after each iteration;
  • the multiple loss functions include loss functions based on different recommended content; for example, if the recommended content includes content one, content two, content three, content four, content five, etc., the multiple loss functions can include the loss functions corresponding to the content recommendation values of content one, content two, content three, content four, content five, and so on.
  • the training feature vector can be mapped layer by layer through the activation function of each neuron in the preset number of hidden layers and the fully connected layer of the initial content recommendation model, and the output result generated by the fully connected layer can be transmitted to the Rank Cost layer; through the Rank Cost layer, the output result and the multiple loss functions corresponding to the output result are used to perform error calculation to generate multiple gradient values.
  • the initial content recommendation model may include two hidden layers and a fully connected layer.
  • when there are six contents to be recommended, the initial content recommendation model may include six output nodes corresponding one-to-one to the contents to be recommended.
  • Step 504 When multiple loss functions of the initial content recommendation model after the iteration are all minimized, stop the iteration and generate the target content recommendation model;
  • the iteration of the model can be stopped, thereby generating a target content recommendation model.
  • it can be determined through each output node whether the multiple gradient values meet a preset threshold condition; if not, the parameters of the activation function of each neuron are updated according to the multiple gradient values, and the initial content recommendation model continues to iterate; if so, the target content recommendation model is generated.
  • the parameter update of the activation function may be based on a gradient descent strategy to update the parameters in the target gradient direction.
  • a learning rate can be preset to control the update step length of the parameters in each iteration, so as to finally obtain the target content recommendation model.
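One plausible reading of the Rank Cost computation, hedged: the document elsewhere describes a pairwise learning-to-rank setup, so the sketch below uses a logistic pairwise cost on the score difference of an ordered pair (RankNet-style). This is an illustrative interpretation, not the patent's exact formula:

```python
import math

def rank_cost(score_preferred, score_other):
    """Pairwise cost: low when the preferred content scores higher."""
    diff = score_preferred - score_other
    return math.log(1.0 + math.exp(-diff))

# a correctly ordered pair costs less than an inverted one
low = rank_cost(0.9, 0.2)    # preferred content scored higher
high = rank_cost(0.2, 0.9)   # preferred content scored lower
```

Minimizing this cost over pairs drawn from the same user's feedback within a time window pushes the model to rank the preferred item above the other, without needing hand-set absolute labels.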
  • the sample data includes historical user behavior information, historical user behavior estimates corresponding to historical user behavior information, content feature information of recommended content, and historical user behavior Information, historical user behavior estimates, and content feature information are vectorized to generate training vector information.
  • in the initial content recommendation model, the training vector information is input for iteration, and multiple loss functions of the initial content recommendation model are calculated after each iteration; when the multiple loss functions of the iterated initial content recommendation model are all minimized, the iteration is stopped and the target content recommendation model is generated, so that the user behavior estimated values output by the behavior prediction model serve as the input of the content recommendation model.
  • referring to FIG. 6, there is shown a structural block diagram of an embodiment of a content recommendation apparatus of the present application, which may specifically include the following modules:
  • the information acquisition module 601 is configured to acquire content feature information corresponding to the original recommended content and user behavior information
  • the estimated value generating module 602 is configured to generate an estimated value of user behavior according to the user behavior information
  • the recommendation value generation module 603 is configured to obtain the content recommendation value of each of the original recommended content according to the user behavior estimated value, the user behavior information, and the content feature information;
  • the recommended content determining module 604 is configured to determine at least two target recommended content according to the content recommendation value.
  • the estimated value generating module 602 includes:
  • the behavior vector generation sub-module is used to vectorize the user behavior information to generate a behavior feature vector
  • the estimated value generation sub-module is used to input the behavior feature vector into a preset target behavior estimation model to generate the user behavior estimated value.
  • the recommended value generating module 603 includes:
  • the content recommendation vector generation sub-module is configured to perform vectorization processing on the user behavior estimated value, the user behavior information, and the content feature information to generate a content recommendation feature vector;
  • the recommendation value generation sub-module is used to input the content recommendation vector into a preset target content recommendation model to generate the content recommendation value of each of the original recommended content.
  • the recommended content determination module 604 includes:
  • the recommended content sorting sub-module is used to sort each of the original recommended content according to the order of content recommendation value from high to low;
  • the recommendation list generation sub-module is used to generate a content recommendation list using the sorted original recommendation content
  • the recommended content extraction submodule is configured to extract at least two original recommended content ranked first from the content recommendation list as target recommended content.
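The three sub-modules above chain together as: sort the original recommended contents by recommendation value from high to low, build the recommendation list, and extract the top-ranked items as target recommended content. A minimal sketch (item names and scores are illustrative):

```python
def top_k_recommendations(scored_contents, k=2):
    """Sort (name, score) pairs descending by score and take the top k."""
    ranked = sorted(scored_contents, key=lambda c: c[1], reverse=True)
    return [name for name, _ in ranked[:k]]

targets = top_k_recommendations(
    [("video_1", 75), ("video_6", 98), ("video_9", 93), ("video_3", 62)],
    k=2,
)
# → ["video_6", "video_9"]
```

In practice `k` would be chosen per terminal (larger screens display more items), matching the examples given later in the description.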
  • the target behavior prediction model is generated by the following modules:
  • Information and model acquisition module used to acquire historical user behavior information and initial behavior prediction model
  • the first information vectorization module is used to vectorize the historical user behavior information to generate first training vector information
  • the first model iteration module is used to input the first training vector information in the initial behavior prediction model to iterate, and calculate multiple loss functions of the initial behavior prediction model after each iteration;
  • the multiple loss functions include loss functions based on different historical user behavior information;
  • the first model generation module is used to stop the iteration and generate the target behavior prediction model when multiple loss functions of the initial behavior prediction model after the iteration are all minimized.
  • the target content recommendation model is generated by the following modules:
  • the data and model acquisition module is used to acquire sample data and an initial content recommendation model.
  • the sample data includes historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content;
  • the second information vectorization module is configured to vectorize the historical user behavior information, the historical user behavior estimated value, and the content feature information to generate second training vector information;
  • the second model iteration module is configured to input the second training vector information in the initial content recommendation model for iteration, and calculate multiple loss functions of the initial content recommendation model after each iteration;
  • the second model generation module is used to stop the iteration and generate the target content recommendation model when multiple loss functions of the initial content recommendation model after the iteration are all minimized.
  • the user behavior information includes at least two of click-through rate, like rate, completion rate, follow rate, share rate, comment rate, favorite rate, browsing duration, and the like.
  • referring to FIG. 7, there is shown a structural block diagram of an embodiment of an apparatus for generating a behavior prediction model according to the present application, which may specifically include the following modules:
  • Information and model acquisition module 701 for acquiring historical user behavior information and initial behavior prediction model
  • the information vectorization module 702 is used to vectorize the historical user behavior information to generate training vector information
  • the model iteration module 703 is configured to input the training vector information in the initial behavior prediction model to iterate, and calculate multiple loss functions of the initial behavior prediction model after each iteration; the multiple loss functions Including loss functions based on different historical user behavior information;
  • the model generation module 704 is configured to stop the iteration and generate the target behavior prediction model when multiple loss functions of the initial behavior prediction model after the iteration are all minimized;
  • the historical user behavior information includes at least two types of click rate, like rate, completion rate, attention rate, sharing rate, comment rate, favorite rate, and browsing duration.
  • the initial behavior prediction model includes a preset number of hidden layers, a fully connected layer connected to the last hidden layer, and multiple output nodes connected to the fully connected layer; the fully connected layer is used to split the output results of the last hidden layer and input the split output results to the multiple output nodes respectively.
  • the model iteration module 703 includes:
  • the vector mapping sub-module is used to map the training feature vector layer by layer through the activation function of each neuron in the predetermined number of hidden layers, and transmit the output result generated by the last hidden layer to the Fully connected layer;
  • the gradient value generation sub-module is configured to use the output result and multiple loss functions corresponding to the output result through the fully connected layer to perform error calculation to generate multiple gradient values.
  • the model generation module 704 is specifically configured to: judge, through each output node, whether the multiple gradient values meet a preset threshold condition; if not, update the parameters of the activation function of each neuron according to the multiple gradient values and continue iterating the initial behavior prediction model; if so, generate the target behavior prediction model.
  • referring to FIG. 8, there is shown a structural block diagram of an embodiment of an apparatus for generating a content recommendation model of the present application, which may specifically include the following modules:
  • the data and model acquisition module 801 is used to acquire sample data and an initial content recommendation model.
  • the sample data includes historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content ;
  • the information vectorization module 802 is configured to vectorize the historical user behavior information, the historical user behavior estimated value, and the content feature information to generate training vector information;
  • the model iteration module 803 is configured to input the training vector information in the initial content recommendation model for iteration, and calculate multiple loss functions of the initial content recommendation model after each iteration;
  • the model generation module 804 is configured to stop the iteration and generate the target content recommendation model when multiple loss functions of the initial content recommendation model after the iteration are all minimized;
  • the historical user behavior information includes at least two types of click rate, like rate, completion rate, attention rate, sharing rate, comment rate, favorite rate, and browsing duration.
  • the initial content recommendation model includes a preset number of hidden layers, a fully connected layer connected to the last hidden layer, a Rank Cost layer connected to the fully connected layer, and multiple output nodes connected to the Rank Cost layer; the Rank Cost layer is used to convert the output results of the fully connected layer and input the converted output results to the multiple output nodes respectively.
  • the model iteration module 803 includes:
  • the vector mapping sub-module is used to map the training feature vector layer by layer through the activation function of each neuron in the preset number of hidden layers and the fully connected layer, and to transmit the output result generated by the fully connected layer to the Rank Cost layer;
  • the gradient value generation sub-module is configured to use, through the Rank Cost layer, the output result and the multiple loss functions corresponding to the output result to perform error calculation to generate multiple gradient values.
  • the model generation module 804 is specifically configured to: judge, through each output node, whether the multiple gradient values meet a preset threshold condition; if not, update the parameters of the activation function of each neuron according to the multiple gradient values and continue iterating the initial content recommendation model; if so, generate the target content recommendation model.
  • the description is relatively simple, and for related parts, please refer to the part of the description of the method embodiment.
  • the embodiment of the present application also provides a storage medium
  • a computer program is stored thereon; the computer program is adapted to be loaded by a processor and execute one or more of the above-mentioned methods.
  • the embodiment of the present application also provides a computer device, which includes:
  • one or more processors;
  • a memory;
  • one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the above methods.
  • any reference signs placed between parentheses should not be construed as limiting the claims.
  • the word “comprising” does not exclude the presence of elements or steps not listed in the claims.
  • the word “a” or “an” preceding an element does not exclude the presence of multiple such elements.
  • the application can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer; in a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware.
  • the use of the words first, second, third, etc. does not indicate any order; these words may be interpreted as names.

Abstract

A content recommendation method, model generation methods, apparatuses, a storage medium, and a computer device, wherein the recommendation method includes: obtaining content feature information corresponding to original recommended content, and user behavior information (101); first generating user behavior estimated values according to the user behavior information (102); then obtaining the content recommendation value of each original recommended content according to the user behavior estimated values, the user behavior information, and the content feature information (103); and then determining at least two target recommended contents according to the content recommendation values (104) and displaying the at least two target recommended contents. The user's preference for recommended content is thereby predicted more accurately, the best recommended content is selected in turn and shown to the user, and the accuracy of the content recommended to the user is improved.

Description

Recommendation method, model generation method, apparatus, medium, and device
This application claims priority to Chinese patent application No. 201911418934.6, filed with the Chinese Patent Office on December 31, 2019 and entitled "Recommendation method, model generation method, apparatus, medium and device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of data technology, and in particular to a content recommendation method and apparatus, a behavior prediction model generation method and apparatus, a content recommendation model generation method and apparatus, a storage medium, and a computer device.
Background
With the rapid development of the Internet, the explosive growth of information makes it increasingly difficult for users to obtain the effective content they are interested in. Personalized recommendation systems have clearly become an indispensable basic technology in the Internet field, playing an increasingly important role in products such as news, short video, and music.
In the course of implementing this application, the inventors found that current content push methods have at least the following problems: traditional content ranking algorithms obtain user feedback behavior data offline to derive sample labels, extract features from the feature logs stored on online servers, combine the two to obtain training samples, and then, during model training, try to fit a model that matches the user's preferences. During model application, the model is used online to predict the user's preference for content items, and according to the level of preference, several optimal content items are selected to form a push list that is pushed to the user.
In most product recommendation scenarios, users' feedback behaviors are diverse. Setting the labels and corresponding weights of training samples according to users' feedback behaviors suffers from serious subjective limitations, behavior habits differ considerably between users, and the user's subjective initiative limits the model's ability to distinguish the same user's different preferences for different items.
Summary
In view of the above problems, the embodiments of this application are proposed to provide a content recommendation method and a corresponding content recommendation apparatus that overcome, or at least partially solve, the above problems.
To solve the above problems, an embodiment of this application discloses a content recommendation method, the method including:
obtaining content feature information corresponding to original recommended content, and user behavior information;
generating user behavior estimated values according to the user behavior information;
obtaining the content recommendation value of each original recommended content according to the user behavior estimated values, the user behavior information, and the content feature information;
determining at least two target recommended contents according to the content recommendation values.
An embodiment of this application further provides a behavior prediction model generation method, including:
obtaining historical user behavior information and an initial behavior prediction model;
vectorizing the historical user behavior information to generate training vector information;
in the initial behavior prediction model, inputting the training vector information for iteration, and calculating multiple loss functions of the initial behavior prediction model after each iteration, the multiple loss functions including loss functions based on different historical user behavior information;
when the multiple loss functions of the iterated initial behavior prediction model are all minimized, stopping the iteration and generating a target behavior prediction model;
wherein the historical user behavior information includes at least two of click-through rate, like rate, completion rate, follow rate, share rate, comment rate, favorite rate, browsing duration, and the like.
An embodiment of this application further provides a content recommendation model generation method, including:
obtaining sample data and an initial content recommendation model, the sample data including historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content;
vectorizing the historical user behavior information, the historical user behavior estimated values, and the content feature information to generate training vector information;
in the initial content recommendation model, inputting the training vector information for iteration, and calculating multiple loss functions of the initial content recommendation model after each iteration;
when the multiple loss functions of the iterated initial content recommendation model are all minimized, stopping the iteration and generating a target content recommendation model;
wherein the historical user behavior information includes at least two of click-through rate, like rate, completion rate, follow rate, share rate, comment rate, favorite rate, browsing duration, and the like.
An embodiment of this application further provides a content recommendation apparatus, the apparatus including:
an information acquisition module, configured to obtain content feature information corresponding to original recommended content, and user behavior information;
an estimated value generation module, configured to generate user behavior estimated values according to the user behavior information;
a recommendation value generation module, configured to obtain the content recommendation value of each original recommended content according to the user behavior estimated values, the user behavior information, and the content feature information;
a recommended content determination module, configured to determine at least two target recommended contents according to the content recommendation values.
An embodiment of this application further provides a behavior prediction model generation apparatus, including:
an information and model acquisition module, configured to obtain historical user behavior information and an initial behavior prediction model;
an information vectorization module, configured to vectorize the historical user behavior information to generate training vector information;
a model iteration module, configured to input the training vector information in the initial behavior prediction model for iteration and calculate multiple loss functions of the initial behavior prediction model after each iteration, the multiple loss functions including loss functions based on different historical user behavior information;
a model generation module, configured to stop the iteration and generate a target behavior prediction model when the multiple loss functions of the iterated initial behavior prediction model are all minimized;
wherein the historical user behavior information includes at least two of click-through rate, like rate, completion rate, follow rate, share rate, comment rate, favorite rate, browsing duration, and the like.
An embodiment of this application further provides a content recommendation model generation apparatus, including:
a data and model acquisition module, configured to obtain sample data and an initial content recommendation model, the sample data including historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content;
an information vectorization module, configured to vectorize the historical user behavior information, the historical user behavior estimated values, and the content feature information to generate training vector information;
a model iteration module, configured to input the training vector information in the initial content recommendation model for iteration and calculate multiple loss functions of the initial content recommendation model after each iteration;
a model generation module, configured to stop the iteration and generate a target content recommendation model when the multiple loss functions of the iterated initial content recommendation model are all minimized;
wherein the historical user behavior information includes at least two of click-through rate, like rate, completion rate, follow rate, share rate, comment rate, favorite rate, browsing duration, and the like.
An embodiment of this application further provides a storage medium on which a computer program is stored; the computer program is adapted to be loaded by a processor to execute one or more of the above methods.
An embodiment of this application further provides a computer device, including:
one or more processors;
a memory;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to execute the above methods.
The embodiments of this application include the following advantages:
In the embodiments of this application, content feature information corresponding to original recommended content and user behavior information are obtained; user behavior estimated values are first generated according to the user behavior information; the content recommendation value of each original recommended content is then obtained according to the user behavior estimated values, the user behavior information, and the content feature information; at least two target recommended contents are then determined according to the content recommendation values and displayed. The user's preference for recommended content is thereby predicted more accurately, the best recommended content is selected in turn and shown to the user, and the accuracy of the content recommended to the user is improved.
In addition, by using the user behavior estimated values output by the behavior prediction model as the input of the content recommendation model, the subjectivity of manually setting the labels and weights of training samples during model training, and the interference with model training caused by deviations between users' behavior habits, are resolved, further improving the accuracy of the content recommended to the user.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application or the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of this application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a flowchart of the steps of an embodiment of a content recommendation method of this application;
FIG. 2 is a first example diagram in an embodiment of a content recommendation method of this application;
FIG. 3 is a second example diagram in an embodiment of a content recommendation method of this application;
FIG. 4 is a flowchart of the steps of an embodiment of a behavior prediction model generation method of this application;
FIG. 5 is a flowchart of the steps of an embodiment of a content recommendation model generation method of this application;
FIG. 6 is a structural block diagram of an embodiment of a content recommendation apparatus of this application;
FIG. 7 is a structural block diagram of an embodiment of a behavior prediction model generation apparatus of this application;
FIG. 8 is a structural block diagram of an embodiment of a content recommendation model generation apparatus of this application.
Detailed Description of Embodiments
To make the above objects, features, and advantages of this application clearer and easier to understand, this application is further described in detail below with reference to the drawings and specific embodiments.
Referring to FIG. 1, a flowchart of the steps of an embodiment of a content recommendation method of this application is shown, which may specifically include the following steps:
Step 101: obtain content feature information corresponding to original recommended content, and user behavior information;
As an example, with the rapid development of the Internet field, the explosive growth of information makes it increasingly difficult for users to obtain the effective content they are interested in; personalized recommendation systems have therefore become an indispensable basic technology in the Internet field. How to predict users' preferences and recommend corresponding content to users accordingly is thus increasingly important.
In the embodiments of this application, the original recommended content may be news, commodities, advertisements, articles, music, short videos, and other content. The corresponding content feature information may include content feature information intrinsic to the original recommended content itself and non-intrinsic content feature information associated with the original recommended content. The intrinsic content feature information may be the content attribute, content type, content upload time, content uploader, and so on; the non-intrinsic content feature information may be the content's click-through rate, open rate, like rate, reading rate, favorite rate, follow rate, and so on. For example, when the original recommended content is a short video, the intrinsic content feature information may include the uploader of the short video, the type of the short video (comedy, animation, TV series, movie, food, etc.), the upload time of the short video, the upload address of the short video, and so on, while the non-intrinsic content feature information may include the short video's click-through rate, like rate, favorite rate, the follow rate of the short video's uploader, the number of online viewers, the number of historical viewers, and so on.
The user behavior information may be behavior information related to the user's behavioral characteristics, such as click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and other feedback behaviors of the user for different content.
In a specific implementation, the user may obtain corresponding content through an application program in a terminal. When the user opens the terminal to obtain content, the server may obtain the original recommended content to be recommended in the background in real time and obtain the user's user behavior information, so as to perform accurate content recommendation in real time, according to the user's preference, based on the content feature information of the original recommended content and the user behavior information.
The terminal may include mobile devices, which may specifically include mobile phones, PDAs (Personal Digital Assistants), laptop computers, palmtop computers, smart wearable devices (such as smart bracelets, smart glasses, smart headbands, etc.), and so on, and may also include fixed devices, which may specifically include in-vehicle terminals, smart home devices, and so on. These terminals may support operating systems such as Windows, Android, iOS, and Windows Phone, which is not limited in the embodiments of this application. The application program may include news applications, music applications, short-video applications, reading applications, and so on, which is likewise not limited in the embodiments of this application.
Step 102: generate user behavior estimated values according to the user behavior information;
In the embodiments of this application, since users' feedback behaviors are diverse and differ from user to user, multiple user behavior estimated values may be generated for the user according to the user behavior information, where user behaviors correspond one-to-one to user behavior estimated values.
In an optional embodiment, the user behavior information may first be vectorized to generate a behavior feature vector, and the behavior feature vector may then be input into a preset target behavior prediction model to generate the user behavior estimated values.
In a specific implementation, the target behavior prediction model may be a multi-objective deep neural network model, which may include a preset number of hidden layers, a fully connected layer connected to the last hidden layer, and multiple output nodes connected to the fully connected layer, where the fully connected layer is used to split the output results of the last hidden layer and input the split output results to the multiple output nodes respectively, and each output node outputs a corresponding user behavior estimated value.
Specifically, a trained behavior prediction model may be obtained; the user behavior information is then vectorized to generate a corresponding behavior feature vector, and the behavior feature vector may then be input into the target behavior prediction model to obtain the user behavior estimated values, where each feedback behavior of the user corresponds to one user behavior estimated value. For example, when the user behavior information includes the click-through rate, like rate, completion rate, follow rate, share rate, and comment rate for different content, the user behavior estimated values may include the estimated click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and so on for different content, so that the user's behavior estimated values for different content can be obtained.
In the behavior evaluation process, using the multi-objective deep neural network model to associate the multiple pieces of user behavior information improves the accuracy of the multiple output user behavior estimated values, which helps provide the user with more accurate content later.
In an optional embodiment of this application, the behavior prediction model may be generated through the following steps:
obtain historical user behavior information and an initial behavior prediction model; vectorize the historical user behavior information to generate first training vector information; in the initial behavior prediction model, input the first training vector information for iteration, and calculate multiple loss functions of the initial behavior prediction model after each iteration, the multiple loss functions including loss functions based on different historical user behavior information; when the multiple loss functions of the iterated initial behavior prediction model are all minimized, stop the iteration and generate the target behavior prediction model.
In a specific implementation, the historical user behavior information may include the click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and other historical feedback behaviors of the user for different content. After the historical user behavior information is obtained, it may be vectorized to generate the first training vector information, which is then input into the initial behavior prediction model for model training, thereby obtaining the target behavior prediction model.
During training, the multiple loss functions of the initial behavior prediction model serve as its supervision and guidance. The multiple loss functions include loss functions based on different historical user behavior information; for example, if the historical user behavior information includes click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and so on, the multiple loss functions include the loss functions corresponding to the estimated click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and so on.
In a specific implementation, the stop condition of model iteration may be set as: the multiple loss functions of the initial behavior prediction model are minimized. When the multiple loss functions of the initial behavior prediction model are all minimized, the iteration of the initial behavior prediction model is stopped, and the corresponding target behavior prediction model is generated.
Step 103: obtain the content recommendation value of each original recommended content according to the user behavior estimated values, the user behavior information, and the content feature information;
In the embodiments of this application, after the user behavior estimated values matching the user behavior information are obtained through the behavior prediction model, the content recommendation value of each original recommended content may be obtained according to the user behavior estimated values, the user behavior information, and the content feature information of the original recommended content. By fusing the user behavior estimated values, the user's preference for the content to be recommended can be predicted more accurately, so that high-quality content can be recommended to the user.
In an optional embodiment, the user behavior estimated values, the user behavior information, and the content feature information may be vectorized to generate a content recommendation feature vector, and the content recommendation feature vector may be input into a preset target content recommendation model to generate the content recommendation value of each original recommended content.
In a specific implementation, the target content recommendation model may be a pairwise-based LTR (Learning to Rank) model, which may include a preset number of hidden layers, a fully connected layer connected to the last hidden layer, a Rank Cost layer connected to the fully connected layer, and multiple output nodes connected to the Rank Cost layer, where the Rank Cost layer is used to convert the output results of the fully connected layer and input the converted output results to the multiple output nodes respectively, and each output node outputs a corresponding content recommendation value.
Specifically, a trained content recommendation model may be obtained; the user behavior estimated values, the user behavior information, and the content feature information of the original recommended content are then vectorized to generate a content recommendation feature vector, which is input into the target content recommendation model to obtain the content recommendation values for the different original recommended contents. For example, if the original recommended content includes content one, content two, content three, content four, content five, and so on, content recommendation value A of content one, content recommendation value B of content two, content recommendation value C of content three, content recommendation value D of content four, content recommendation value E of content five, and so on, can be obtained; the content recommendation values corresponding to the different original recommended contents are thereby obtained, and suitable content can then be recommended to the user according to the different content recommendation values.
In the content recommendation process, the user's degree of preference for different content items is judged according to the same user's different feedback behaviors, within a certain time period, toward two historically displayed content items. Specifically, by fusing the user behavior estimated values, the user behavior information, and the content feature information of the original recommended content, and adding the user behavior estimated values into the non-linear target content recommendation model, the recommended content can vary by person and by video, providing the user with suitable content more accurately and improving the product's user experience and user stickiness.
In an optional embodiment of this application, the target content recommendation model may be generated through the following steps:
obtain sample data and an initial content recommendation model, the sample data including historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, and content feature information of recommended content; vectorize the historical user behavior information, the historical user behavior estimated values, and the content feature information to generate second training vector information; in the initial content recommendation model, input the second training vector information for iteration, and calculate multiple loss functions of the initial content recommendation model after each iteration; when the multiple loss functions of the iterated initial content recommendation model are all minimized, stop the iteration and generate the target content recommendation model.
In a specific implementation, the training sample data may include historical user behavior information, historical user behavior estimated values corresponding to the historical user behavior information, content feature information of recommended content, and the like. After the sample data is obtained, it may be vectorized to obtain the second training vector information, which is then input into the initial content recommendation model for model training, thereby obtaining the target content recommendation model.
During training, the multiple loss functions of the initial content recommendation model serve as its supervision and guidance. The multiple loss functions may be based on the content recommendation values of different recommended contents; for example, if the recommended content includes content one, content two, content three, content four, content five, and so on, the multiple loss functions may include the loss functions corresponding to the content recommendation values of content one, content two, content three, content four, content five, and so on. In a specific implementation, the stop condition of model iteration may be set as: the multiple loss functions of the initial content recommendation model are minimized. When the multiple loss functions of the initial content recommendation model are all minimized, the iteration of the initial content recommendation model is stopped, and the corresponding target content recommendation model is generated.
Step 104: determine at least two target recommended contents according to the content recommendation values.
In the embodiments of this application, after the content recommendation values corresponding to the different original recommended contents are obtained, at least two target recommended contents may be determined from the original recommended content according to the content recommendation values and displayed to the user.
In a specific implementation, after obtaining the user's user behavior information and the content feature information of the original recommended content, the server obtains the content recommendation values of the original recommended content. It may then sort the original recommended contents in descending order of content recommendation value, generate a corresponding content recommendation list from the sorted original recommended contents, extract at least two top-ranked original recommended contents from the content recommendation list as the target recommended content, and display the target recommended content to the user through the client. In this way, target recommended content matching the user's preferences can be screened out of the multiple original recommended contents and shown to the user, improving the pertinence of content recommendation and ensuring the product's user experience and user stickiness.
In one example of the embodiments of this application, when the original recommended content is short videos and the client is a short-video application, then when the user starts the short-video application in the terminal, the server may obtain the user behavior information for the user according to the user's ID and obtain the short-video content to be recommended. The user behavior information may then be vectorized and input into the behavior prediction model to obtain user behavior estimated values matching the user's feedback behaviors; the obtained user behavior estimated values, the user behavior information, and the content feature information of the short-video content may then be vectorized and input into the short-video recommendation model to generate short-video recommendation values corresponding to the short-video content, as shown in Table 1:
Short video ID   Short video recommendation value
Short video ①   75
Short video ②   86
Short video ③   62
Short video ④   80
Short video ⑤   90
Short video ⑥   98
Short video ⑦   88
Short video ⑧   56
Short video ⑨   93
Short video ⑩   74
N
Table 1
The short videos may then be sorted in descending order of content recommendation value to generate a short-video recommendation list, as shown in Table 2:
Short video ID   Short video recommendation value
Short video ⑥   98
Short video ⑨   93
Short video ⑤   90
Short video ⑦   88
Short video ②   86
Short video ④   80
Short video ①   75
Short video ⑩   74
Short video ③   62
Short video ⑧   56
N
Table 2
Then, according to the information of the terminal or the information of the client, the server may select the top-ranked short videos from the short-video recommendation list and display the recommended short videos to the user through the client. Specifically, different terminals have different screen information, so the number of short-video items displayed by a terminal differs; the larger the terminal screen, the more short-video items can be displayed. The server may therefore select an appropriate number of short videos from the short-video recommendation list according to the terminal's screen information and recommend them to the user.
As shown in FIG. 2, which is the first example diagram in the embodiments of this application, when the client displays 4 short videos to the user, then when the user opens the client and refreshes, the server may select the 4 top-ranked short videos from the above short-video recommendation list and recommend them to the user, such as short video ⑥, short video ⑨, short video ⑤, and short video ⑦. When the user performs a touch operation on the terminal, such as sliding the user interface upward, the server may recommend short-video content to the user in real time according to the short-video recommendation list and thereby update the short-video content displayed by the client in real time; at this point, when the user slides the user interface up, the content displayed by the client may be updated to short video ⑤, short video ⑦, short video ②, and short video ④, and when the user continues the touch operation, it may be further updated to short video ②, short video ④, short video ①, and short video ⑩. Thus, while the user is using the short-video client, the server can update the short-video recommendation list in the background in real time and update the client in real time, thereby predicting the user's preference for recommended content more accurately, selecting the best recommended content in turn, and showing it to the user, which improves the accuracy of the content recommended to the user.
In another example of the embodiments of this application, when the original recommended content is articles and the client is a reading application, then when the user starts the reading application in the terminal, the server may obtain the user behavior information for the user according to the user's ID and obtain the article content to be recommended. The user behavior information may then be vectorized and input into the behavior prediction model to obtain user behavior estimated values matching the user's feedback behaviors; the obtained user behavior estimated values, the user behavior information, and the content feature information of the article content may then be vectorized and input into the article recommendation model to generate article recommendation values corresponding to the article content, as shown in Table 3:
Article ID   Article recommendation value
Article ①   75
Article ②   86
Article ③   62
Article ④   80
Article ⑤   90
Article ⑥   98
Article ⑦   88
Article ⑧   56
Article ⑨   93
Article ⑩   74
N
Table 3
The articles may then be sorted in descending order of content recommendation value to generate an article recommendation list, as shown in Table 4:
Article ID   Article recommendation value
Article ⑥   98
Article ⑨   93
Article ⑤   90
Article ⑦   88
Article ②   86
Article ④   80
Article ①   75
Article ⑩   74
Article ③   62
Article ⑧   56
N
Table 4
Then, according to the information of the terminal or the information of the client, the server may select the top-ranked articles from the article recommendation list and display the recommended articles to the user through the client. Specifically, different terminals have different screen information, so the number of article items displayed by a terminal differs; the larger the terminal screen, the more article items can be displayed. The server may therefore select an appropriate number of articles from the article recommendation list according to the terminal's screen information and recommend them to the user.
As shown in FIG. 3, which is the second example diagram in the embodiments of this application, when the client displays 4 articles to the user, then when the user opens the client and refreshes, the server may select the 4 top-ranked articles from the above article recommendation list and recommend them to the user, such as article ⑥, article ⑨, article ⑤, and article ⑦. When the user performs a touch operation on the terminal, such as sliding the user interface upward, the server may recommend article content to the user in real time according to the article recommendation list and thereby update the article content displayed by the client in real time; at this point, when the user slides the user interface up, the content displayed by the client may be updated to article ⑤, article ⑦, article ②, and article ④, and when the user continues the touch operation, it may be further updated to article ②, article ④, article ①, and article ⑩. Thus, while the user is using the reading client, the server can update the article recommendation list in the background in real time and update the client in real time, thereby predicting the user's preference for recommended content more accurately, selecting the best recommended content in turn, and showing it to the user, which improves the accuracy of the content recommended to the user.
It should be noted that the embodiments of this application include but are not limited to the above examples. It can be understood that, under the guidance of the ideas of the embodiments of this application, those skilled in the art may recommend content items to users according to different recommended content, different terminals, different clients, and so on, which is not limited in the embodiments of this application.
In the embodiments of this application, content feature information corresponding to original recommended content and user behavior information are obtained; user behavior estimated values are first generated according to the user behavior information; the content recommendation value of each original recommended content is then obtained according to the user behavior estimated values, the user behavior information, and the content feature information; at least two target recommended contents are then determined according to the content recommendation values and displayed. The user's preference for recommended content is thereby predicted more accurately, the best recommended content is selected in turn and shown to the user, and the accuracy of the content recommended to the user is improved.
Referring to FIG. 4, a flowchart of the steps of an embodiment of a behavior prediction model generation method of this application is shown, which may specifically include the following steps:
Step 401: obtain historical user behavior information and an initial behavior prediction model;
In a specific implementation, the historical user behavior information may include the click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and other historical feedback behaviors of the user for different content. The initial behavior prediction model may be a multi-objective deep neural network model and may include a preset number of hidden layers, a fully connected layer connected to the last hidden layer, and multiple output nodes connected to the fully connected layer; the fully connected layer is used to split the output results of the last hidden layer and input the split output results to the multiple output nodes respectively, where each output node can output the user behavior estimated value corresponding to one of the user's feedback behaviors.
Step 402: vectorize the historical user behavior information to generate training vector information;
In a specific implementation, after the historical user behavior information is obtained, it may be vectorized to generate the training vector information, which is input into the initial behavior prediction model for model training.
Step 403: in the initial behavior prediction model, input the training vector information for iteration, and calculate multiple loss functions of the initial behavior prediction model after each iteration, the multiple loss functions including loss functions based on different historical user behavior information;
In a specific implementation, the training vector information may be input in the initial behavior prediction model for iteration, and the multiple loss functions of the initial behavior prediction model may be calculated after each iteration, where the multiple loss functions include loss functions based on different historical user behavior information; for example, if the historical user behavior information includes click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and so on, the multiple loss functions include the loss functions corresponding to the estimated click-through rate, like rate, completion rate, follow rate, share rate, comment rate, and so on.
Specifically, the training feature vector may be mapped layer by layer through the activation function of each neuron in the preset number of hidden layers of the initial behavior prediction model, and the output result generated by the last hidden layer may be transmitted to the fully connected layer; through the fully connected layer, the output result and the multiple loss functions corresponding to the output result are used to perform error calculation to generate multiple gradient values.
In one example of the embodiments of this application, the initial behavior prediction model may include two hidden layers and a fully connected layer. When the historical user behavior information includes six behaviors, namely click-through rate, like rate, completion rate, follow rate, share rate, and comment rate, the initial behavior prediction model may include six output nodes corresponding one-to-one to the historical user behavior information.
It should be noted that the embodiments of this application include but are not limited to the above example; under the guidance of the ideas of the embodiments of this application, those skilled in the art may set the number of hidden layers and the number of output nodes of the behavior prediction model according to actual conditions, which is not limited in the embodiments of this application.
Step 404: when the multiple loss functions of the iterated initial behavior prediction model are all minimized, stop the iteration and generate the target behavior prediction model;
In a specific implementation, when the multiple loss functions of the iterated initial behavior prediction model are all minimized, the iteration of the model may be stopped, thereby generating the target behavior prediction model.
Specifically, it may be judged through each output node whether the multiple gradient values meet a preset threshold condition; if not, the parameters of the activation function of each neuron are updated according to the multiple gradient values, and the initial behavior prediction model continues to iterate; if so, the target behavior prediction model is generated.
The parameter update of the activation function may be based on a gradient descent strategy, updating the parameters in the target gradient direction. In a specific implementation, a learning rate may be preset to control the update step length of the parameters in each round of iteration, so as to finally obtain the target behavior prediction model.
In the embodiments of this application, historical user behavior information and an initial behavior prediction model are obtained; the historical user behavior information is vectorized to generate training vector information; in the initial behavior prediction model, the training vector information is input for iteration, and multiple loss functions of the initial behavior prediction model are calculated after each iteration, the multiple loss functions including loss functions based on different historical user behavior information; when the multiple loss functions of the iterated initial behavior prediction model are all minimized, the iteration is stopped and the target behavior prediction model is generated. During training, the user's feedback behaviors for different content are used as the input of the model, and the different feedback behaviors are associated through the model, which achieves a better generalization effect, improves the learning rate of the shared layers, and reduces over-fitting, thereby improving the accuracy of content recommendation.
参照图5,示出了本申请的一种内容推荐模型的生成方法实施例的步骤流程图,具体可以包括如下步骤:
步骤501,获取样本数据以及初始内容推荐模型,所述样本数据包括历史用户行为信息,与所述历史用户行为信息对应的历史用户行为预估值,推荐内容的内容特征信息;
在具体实现中,训练样本数据可以包括历史用户行为信息,与历史用户行为信息对应的历史用户行为预估值,以及推荐内容的内容特征信息等。其中,历史用户行为信息可以包括点击率、点赞率、播完率、关注率、分享率、评论率等用户针对不同内容的历史反馈行为。
其中,初始内容推荐模型可以包括预设数目的隐藏层、与最后一层隐藏层连接的全连接层、与全连接层连接的Rank Cost层以及与Rank Cost层连接的多个输出节点,其中,Rank Cost层用于将全连接层的输出结果进行转换,并将转换后的输出结果分别输入到多个输出节点,其中,输出节点可以输出与推荐内容对应的内容推荐值。
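Rank Cost层的一种常见实现思路可以如下示意(此处假设采用RankNet式的成对转换,即将两个内容的打分之差经sigmoid转换为相对排序概率;具体转换方式以实际模型为准):

```python
import math

# 示意:成对 Rank Cost 的计算
# 将全连接层对内容 i、j 的打分 s_i、s_j 转换为"i 应排在 j 之前"的概率,并计算成对损失

def pair_prob(s_i: float, s_j: float) -> float:
    """P(i 排在 j 前) = sigmoid(s_i - s_j)"""
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))

def rank_cost(s_i: float, s_j: float, label: float) -> float:
    """成对交叉熵损失,label = 1 表示 i 应排在 j 前。"""
    p = pair_prob(s_i, s_j)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

p = pair_prob(2.0, 0.5)
print(round(p, 4))  # 0.8176,打分高的内容更可能排在前面
```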
步骤502,对所述历史用户行为信息、所述历史用户行为预估值以及所述内容特征信息进行向量化,生成训练向量信息;
在具体实现中,当得到历史用户行为信息、历史用户行为预估值以及内容特征信息后,可以分别对三者进行向量化,并进行拼接,生成训练向量信息,以输入初始内容推荐模型进行模型训练。
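三类信息分别向量化后的拼接可以如下示意(拼接顺序为本文假设,只要训练与预测阶段保持一致即可):

```python
# 示意:将三类向量化结果拼接,生成训练向量信息

def concat_features(behavior_vec, est_vec, content_vec):
    """按固定顺序拼接:历史用户行为信息 + 历史用户行为预估值 + 内容特征信息。"""
    return list(behavior_vec) + list(est_vec) + list(content_vec)

x = concat_features([0.1, 0.2], [0.6], [0.3, 0.9, 0.5])
print(len(x))  # 6
```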
步骤503,在所述初始内容推荐模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数;
在具体实现中,可以在初始内容推荐模型中,输入训练向量信息进行迭代,并计算每次迭代后初始内容推荐模型对应的多个损失函数,其中,多个损失函数包括基于不同推荐内容的损失函数,如推荐内容包括内容一、内容二、内容三、内容四以及内容五等,则多个损失函数可以包括内容一的内容推荐值、内容二的内容推荐值、内容三的内容推荐值、内容四的内容推荐值以及内容五的内容推荐值等对应的损失函数。
具体的,可以通过初始内容推荐模型中预设数目的隐藏层、以及全连接层每一神经元的激活函数,对训练向量信息逐层进行映射,并将全连接层生成的输出结果传输至Rank Cost层,通过Rank Cost层采用输出结果,和与输出结果对应的多个损失函数,进行误差计算,生成多个梯度值。
在本申请实施例的一种示例中,初始内容推荐模型可以包括2层隐藏层以及全连接层,当待推荐内容为6个时,则初始内容推荐模型可以包括6个输出节点,与待推荐内容一一对应。
需要说明的是,本申请实施例包括但不限于上述示例,本领域技术人员在本申请实施例的思想指导下,可以根据实际情况设置内容推荐模型的隐藏层数目,以及输出节点的数目,本申请实施例对此不作限制。
步骤504,当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型;
在具体实现中,当迭代后的初始内容推荐模型的多个损失函数均最小化,可以停止模型的迭代,从而生成目标内容推荐模型。
具体的,可以通过各个输出节点判断多个梯度值是否满足预设阈值条件;若否,则根据多个梯度值更新每一神经元的激活函数的参数,继续迭代初始内容推荐模型;若是,则生成目标内容推荐模型。
其中,对激活函数的参数更新,可以是基于梯度下降策略,以目标梯度方向对参数进行更新。在具体实现中,可以预设一学习率,控制每一轮迭代中参数的更新步长,从而最终得到目标内容推荐模型。
在本申请实施例中,通过获取样本数据以及初始内容推荐模型,样本数据包括历史用户行为信息,与历史用户行为信息对应的历史用户行为预估值,推荐内容的内容特征信息,对历史用户行为信息、历史用户行为预估值以及内容特征信息进行向量化,生成训练向量信息,在初始内容推荐模型中,输入训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数,当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型。通过将行为预估模型输出的用户行为预估值作为内容推荐模型的输入,解决了模型训练过程中,人为设定训练样本的标签与权重的主观性,以及用户行为习惯之间的偏差对模型训练的干扰,进一步提高了向用户推荐内容的精准度。
需要说明的是,对于方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请实施例并不受所描述的动作顺序的限制,因为依据本申请实施例,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作并不一定是本申请实施例所必须的。
参照图6,示出了本申请的一种内容的推荐装置实施例的结构框图,具体可以包括如下模块:
信息获取模块601,用于获取与原始推荐内容对应的内容特征信息,以及,用户行为信息;
预估值生成模块602,用于根据所述用户行为信息,生成用户行为预估值;
推荐值生成模块603,用于根据所述用户行为预估值、所述用户行为信息以及所述内容特征信息,得到各个所述原始推荐内容的内容推荐值;
推荐内容确定模块604,用于根据所述内容推荐值,确定至少两个目标推荐内容。
在本申请实施例的一种可选实施例中,所述预估值生成模块602包括:
行为向量生成子模块,用于对所述用户行为信息进行向量化处理,生成行为特征向量;
预估值生成子模块,用于将所述行为特征向量输入预设的目标行为预估模型,生成所述用户行为预估值。
在本申请实施例的一种可选实施例中,所述推荐值生成模块603包括:
内容推荐向量生成子模块,用于对所述用户行为预估值、所述用户行为信息以及所述内容特征信息进行向量化处理,生成内容推荐特征向量;
推荐值生成子模块,用于将所述内容推荐特征向量输入预设的目标内容推荐模型,生成各个所述原始推荐内容的内容推荐值。
在本申请实施例的一种可选实施例中,所述推荐内容确定模块604包括:
推荐内容排序子模块,用于按照内容推荐值从高到低的顺序,对各个所述原始推荐内容进行排序;
推荐列表生成子模块,用于采用排序后的原始推荐内容,生成内容推荐列表;
推荐内容提取子模块,用于从所述内容推荐列表中,提取排序在前的至少两个原始推荐内容,作为目标推荐内容。
在本申请实施例的一种可选实施例中,所述目标行为预估模型通过如下模块生成:
信息与模型获取模块,用于获取历史用户行为信息以及初始行为预估模型;
第一信息向量化模块,用于对所述历史用户行为信息进行向量化,生成第一训练向量信息;
第一模型迭代模块,用于在所述初始行为预估模型中,输入所述第一训练向量信息进行迭代,并计算每次迭代后的初始行为预估模型的多个损失函数;所述多个损失函数包括基于不同历史用户行为信息的损失函数;
第一模型生成模块,用于当迭代之后的初始行为预估模型的多个损失函数均最小化时,停止迭代,生成目标行为预估模型。
在本申请实施例的一种可选实施例中,所述目标内容推荐模型通过如下模块生成:
数据与模型获取模块,用于获取样本数据以及初始内容推荐模型,所述样本数据包括历史用户行为信息,与所述历史用户行为信息对应的历史用户行为预估值,推荐内容的内容特征信息;
第二信息向量化模块,用于对所述历史用户行为信息、所述历史用户行为预估值以及所述内容特征信息进行向量化,生成第二训练向量信息;
第二模型迭代模块,用于在所述初始内容推荐模型中,输入所述第二训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数;
第二模型生成模块,用于当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型。
在本申请实施例的一种可选实施例中,所述用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
参照图7,示出了本申请的一种行为预估模型的生成装置实施例的结构框图,具体可以包括如下模块:
信息与模型获取模块701,用于获取历史用户行为信息以及初始行为预估模型;
信息向量化模块702,用于对所述历史用户行为信息进行向量化,生成训练向量信息;
模型迭代模块703,用于在所述初始行为预估模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始行为预估模型的多个损失函数;所述多个损失函数包括基于不同历史用户行为信息的损失函数;
模型生成模块704,用于当迭代之后的初始行为预估模型的多个损失函数均最小化时,停止迭代,生成目标行为预估模型;
其中,所述历史用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
在本申请实施例的一种可选实施例中,所述初始行为预估模型包括预设数目的隐藏层、与最后一层隐藏层连接的全连接层以及与所述全连接层连接的多个输出节点;所述全连接层用于将所述最后一层隐藏层的输出结果拆分,并将拆分的输出结果分别输入到所述多个输出节点。
在本申请实施例的一种可选实施例中,所述模型迭代模块703包括:
向量映射子模块,用于通过所述预设数目的隐藏层每一神经元的激活函数,对所述训练向量信息逐层进行映射,并将最后一层隐藏层生成的输出结果传输至所述全连接层;
梯度值生成子模块,用于通过所述全连接层采用所述输出结果,和与所述输出结果对应的多个损失函数,进行误差计算,生成多个梯度值。
在本申请实施例的一种可选实施例中,所述模型生成模块704具体用于:
通过所述输出节点判断所述多个梯度值是否满足预设阈值条件;
若否,则根据所述多个梯度值更新所述每一神经元的激活函数的参数,继续迭代所述初始行为预估模型;
若是,则生成所述目标行为预估模型。
参照图8,示出了本申请的一种内容推荐模型的生成装置实施例的结构框图,具体可以包括如下模块:
数据与模型获取模块801,用于获取样本数据以及初始内容推荐模型,所述样本数据包括历史用户行为信息,与所述历史用户行为信息对应的历史用户行为预估值,推荐内容的内容特征信息;
信息向量化模块802,用于对所述历史用户行为信息、所述历史用户行为预估值以及所述内容特征信息进行向量化,生成训练向量信息;
模型迭代模块803,用于在所述初始内容推荐模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数;
模型生成模块804,用于当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型;
其中,所述历史用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
在本申请实施例的一种可选实施例中,所述初始内容推荐模型包括预设数目的隐藏层、与最后一层隐藏层连接的全连接层、与所述全连接层连接的Rank Cost层以及与所述Rank Cost层连接的多个输出节点;所述Rank Cost层用于将所述全连接层的输出结果进行转换,并将转换后的输出结果分别输入到所述多个输出节点。
在本申请实施例的一种可选实施例中,所述模型迭代模块803包括:
向量映射子模块,用于通过所述预设数目的隐藏层、以及所述全连接层每一神经元的激活函数,对所述训练向量信息逐层进行映射,并将所述全连接层生成的输出结果传输至所述Rank Cost层;
梯度值生成子模块,用于通过所述Rank Cost层采用所述输出结果,和与所述输出结果对应的多个损失函数,进行误差计算,生成多个梯度值。
在本申请实施例的一种可选实施例中,所述模型生成模块804具体用于:
通过所述输出节点判断所述多个梯度值是否满足预设阈值条件;
若否,则根据所述多个梯度值更新所述每一神经元的激活函数的参数,继续迭代所述初始内容推荐模型;
若是,则生成目标内容推荐模型。
对于装置实施例而言,由于其与方法实施例基本相似,所以描述的比较简单,相关之处参见方法实施例的部分说明即可。
本申请实施例还提供了一种存储介质,
其上存储有计算机程序;所述计算机程序适于由处理器加载并执行上述的一个或多个方法。
本申请实施例还提供了一种计算机设备,其包括:
一个或多个处理器;
存储器;
一个或多个应用程序,其中所述一个或多个应用程序被存储在所述存储器中并被配置为由所述一个或多个处理器执行,所述一个或多个应用程序配置用于执行根据上述的方法。
本说明书中的各个实施例均采用递进的方式描述,每个实施例重点说明的都是与其他实施例的不同之处,各个实施例之间相同相似的部分互相参见即可。
本文中所称的“一个实施例”、“实施例”或者“一个或者多个实施例”意味着,结合实施例描述的特定特征、结构或者特性包括在本申请的至少一个实施例中。此外,请注意,这里“在一个实施例中”的词语例子不一定全指同一个实施例。

在此处所提供的说明书中,说明了大量具体细节。然而,能够理解,本申请的实施例可以在没有这些具体细节的情况下被实践。在一些实例中,并未详细示出公知的方法、结构和技术,以便不模糊对本说明书的理解。

在权利要求中,不应将位于括号之间的任何参考符号构造成对权利要求的限制。单词“包含”不排除存在未列在权利要求中的元件或步骤。位于元件之前的单词“一”或“一个”不排除存在多个这样的元件。本申请可以借助于包括有若干不同元件的硬件以及借助于适当编程的计算机来实现。在列举了若干装置的单元权利要求中,这些装置中的若干个可以是通过同一个硬件项来具体体现。单词第一、第二、以及第三等的使用不表示任何顺序,可将这些单词解释为名称。

最后应说明的是:以上实施例仅用以说明本申请的技术方案,而非对其限制;尽管参照前述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的范围。

Claims (20)

  1. 一种内容的推荐方法,其特征在于,所述方法包括:
    获取与原始推荐内容对应的内容特征信息,以及,用户行为信息;
    根据所述用户行为信息,生成用户行为预估值;
    根据所述用户行为预估值、所述用户行为信息以及所述内容特征信息,得到各个所述原始推荐内容的内容推荐值;
    根据所述内容推荐值,确定至少两个目标推荐内容。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述用户行为信息,生成用户行为预估值,包括:
    对所述用户行为信息进行向量化处理,生成行为特征向量;
    将所述行为特征向量输入预设的目标行为预估模型,生成所述用户行为预估值。
  3. 根据权利要求1或2所述的方法,其特征在于,所述根据所述用户行为预估值、所述用户行为信息以及所述内容特征信息,得到各个所述原始推荐内容的内容推荐值,包括:
    对所述用户行为预估值、所述用户行为信息以及所述内容特征信息进行向量化处理,生成内容推荐特征向量;
    将所述内容推荐特征向量输入预设的目标内容推荐模型,生成各个所述原始推荐内容的内容推荐值。
  4. 根据权利要求1所述的方法,其特征在于,所述根据所述内容推荐值,确定至少两个目标推荐内容,包括:
    按照内容推荐值从高到低的顺序,对各个所述原始推荐内容进行排序;
    采用排序后的原始推荐内容,生成内容推荐列表;
    从所述内容推荐列表中,提取排序在前的至少两个原始推荐内容,作为目标推荐内容。
  5. 根据权利要求1所述的方法,其特征在于,所述目标行为预估模型通过如下方式生成:
    获取历史用户行为信息以及初始行为预估模型;
    对所述历史用户行为信息进行向量化,生成第一训练向量信息;
    在所述初始行为预估模型中,输入所述第一训练向量信息进行迭代,并计算每次迭代后的初始行为预估模型的多个损失函数;所述多个损失函数包括基于不同历史用户行为信息的损失函数;
    当迭代之后的初始行为预估模型的多个损失函数均最小化时,停止迭代,生成目标行为预估模型。
  6. 根据权利要求1所述的方法,其特征在于,所述目标内容推荐模型通过如下方式生成:
    获取样本数据以及初始内容推荐模型,所述样本数据包括历史用户行为信息,与所述历史用户行为信息对应的历史用户行为预估值,推荐内容的内容特征信息;
    对所述历史用户行为信息、所述历史用户行为预估值以及所述内容特征信息进行向量化,生成第二训练向量信息;
    在所述初始内容推荐模型中,输入所述第二训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数;
    当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型。
  7. 根据权利要求5或6所述的方法,其特征在于,所述用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
  8. 一种行为预估模型的生成方法,其特征在于,包括:
    获取历史用户行为信息以及初始行为预估模型;
    对所述历史用户行为信息进行向量化,生成训练向量信息;
    在所述初始行为预估模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始行为预估模型的多个损失函数;所述多个损失函数包括基于不同历史用户行为信息的损失函数;
    当迭代之后的初始行为预估模型的多个损失函数均最小化时,停止迭代,生成目标行为预估模型;
    其中,所述历史用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
  9. 根据权利要求8所述的方法,其特征在于,所述初始行为预估模型包括预设数目的隐藏层、与最后一层隐藏层连接的全连接层以及与所述全连接层连接的多个输出节点;所述全连接层用于将所述最后一层隐藏层的输出结果拆分,并将拆分的输出结果分别输入到所述多个输出节点。
  10. 根据权利要求9所述的方法,其特征在于,所述在所述初始行为预估模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始行为预估模型的多个损失函数,包括:
    通过所述预设数目的隐藏层每一神经元的激活函数,对所述训练向量信息逐层进行映射,并将最后一层隐藏层生成的输出结果传输至所述全连接层;
    通过所述全连接层采用所述输出结果,和与所述输出结果对应的多个损失函数,进行误差计算,生成多个梯度值。
  11. 根据权利要求10所述的方法,其特征在于,所述当迭代之后的初始行为预估模型的多个损失函数均最小化时,停止迭代,生成目标行为预估模型,包括:
    通过所述输出节点判断所述多个梯度值是否满足预设阈值条件;
    若否,则根据所述多个梯度值更新所述每一神经元的激活函数的参数,继续迭代所述初始行为预估模型;
    若是,则生成所述目标行为预估模型。
  12. 一种内容推荐模型的生成方法,其特征在于,包括:
    获取样本数据以及初始内容推荐模型,所述样本数据包括历史用户行为信息,与所述历史用户行为信息对应的历史用户行为预估值,推荐内容的内容特征信息;
    对所述历史用户行为信息、所述历史用户行为预估值以及所述内容特征信息进行向量化,生成训练向量信息;
    在所述初始内容推荐模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数;
    当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型;
    其中,所述历史用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
  13. 根据权利要求12所述的方法,其特征在于,所述初始内容推荐模型包括预设数目的隐藏层、与最后一层隐藏层连接的全连接层、与所述全连接层连接的Rank Cost层以及与所述Rank Cost层连接的多个输出节点;所述Rank Cost层用于将所述全连接层的输出结果进行转换,并将转换后的输出结果分别输入到所述多个输出节点。
  14. 根据权利要求13所述的方法,其特征在于,所述在所述初始内容推荐模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数,包括:
    通过所述预设数目的隐藏层、以及所述全连接层每一神经元的激活函数,对所述训练向量信息逐层进行映射,并将所述全连接层生成的输出结果传输至所述Rank Cost层;
    通过所述Rank Cost层采用所述输出结果,和与所述输出结果对应的多个损失函数,进行误差计算,生成多个梯度值。
  15. 根据权利要求14所述的方法,其特征在于,当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型,包括:
    通过所述输出节点判断所述多个梯度值是否满足预设阈值条件;
    若否,则根据所述多个梯度值更新所述每一神经元的激活函数的参数,继续迭代所述初始内容推荐模型;
    若是,则生成目标内容推荐模型。
  16. 一种内容的推荐装置,其特征在于,所述装置包括:
    信息获取模块,用于获取与原始推荐内容对应的内容特征信息,以及,用户行为信息;
    预估值生成模块,用于根据所述用户行为信息,生成用户行为预估值;
    推荐值生成模块,用于根据所述用户行为预估值、所述用户行为信息以及所述内容特征信息,得到各个所述原始推荐内容的内容推荐值;
    推荐内容确定模块,用于根据所述内容推荐值,确定至少两个目标推荐内容。
  17. 一种行为预估模型的生成装置,其特征在于,包括:
    信息与模型获取模块,用于获取历史用户行为信息以及初始行为预估模型;
    信息向量化模块,用于对所述历史用户行为信息进行向量化,生成训练向量信息;
    模型迭代模块,用于在所述初始行为预估模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始行为预估模型的多个损失函数;所述多个损失函数包括基于不同历史用户行为信息的损失函数;
    模型生成模块,用于当迭代之后的初始行为预估模型的多个损失函数均最小化时,停止迭代,生成目标行为预估模型;
    其中,所述历史用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
  18. 一种内容推荐模型的生成装置,其特征在于,包括:
    数据与模型获取模块,用于获取样本数据以及初始内容推荐模型,所述样本数据包括历史用户行为信息,与所述历史用户行为信息对应的历史用户行为预估值,推荐内容的内容特征信息;
    信息向量化模块,用于对所述历史用户行为信息、所述历史用户行为预估值以及所述内容特征信息进行向量化,生成训练向量信息;
    模型迭代模块,用于在所述初始内容推荐模型中,输入所述训练向量信息进行迭代,并计算每次迭代后的初始内容推荐模型的多个损失函数;
    模型生成模块,用于当迭代之后的初始内容推荐模型的多个损失函数均最小化时,停止迭代,生成目标内容推荐模型;
    其中,所述历史用户行为信息包括点击率、点赞率、播完率、关注率、分享率、评论率、收藏率、浏览时长等中至少两种。
  19. 一种存储介质,其特征在于,
    其上存储有计算机程序;所述计算机程序适于由处理器加载并执行上述权利要求1-7、8-11或12-15中任一项所述的方法。
  20. 一种计算机设备,其特征在于,其包括:
    一个或多个处理器;
    存储器;
    一个或多个应用程序,其中所述一个或多个应用程序被存储在所述存储器中并被配置为由所述一个或多个处理器执行,所述一个或多个应用程序配置用于执行根据权利要求1-7或8-11或12-15所述的方法。
PCT/CN2020/124793 2019-12-31 2020-10-29 推荐方法、模型生成方法、装置、介质以及设备 WO2021135588A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911418934.6 2019-12-31
CN201911418934.6A CN111209476B (zh) 2019-12-31 2019-12-31 推荐方法、模型生成方法、装置、介质及设备

Publications (1)

Publication Number Publication Date
WO2021135588A1 true WO2021135588A1 (zh) 2021-07-08

Family

ID=70788323

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124793 WO2021135588A1 (zh) 2019-12-31 2020-10-29 推荐方法、模型生成方法、装置、介质以及设备

Country Status (2)

Country Link
CN (1) CN111209476B (zh)
WO (1) WO2021135588A1 (zh)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642740A (zh) * 2021-08-12 2021-11-12 百度在线网络技术(北京)有限公司 模型训练方法及装置、电子设备和介质
CN113705782A (zh) * 2021-08-18 2021-11-26 上海明略人工智能(集团)有限公司 一种用于媒体数据推荐的模型训练方法及装置
CN113779386A (zh) * 2021-08-24 2021-12-10 北京达佳互联信息技术有限公司 模型训练方法和信息推荐方法
CN113836291A (zh) * 2021-09-29 2021-12-24 北京百度网讯科技有限公司 数据处理方法、装置、设备和存储介质
CN114257842A (zh) * 2021-12-20 2022-03-29 中国平安财产保险股份有限公司 一种点赞数据处理系统、方法、装置及存储介质
CN114430503A (zh) * 2022-01-25 2022-05-03 上海影宴数码科技有限公司 一种基于短视频大数据叠加推荐方法
CN114707041A (zh) * 2022-04-11 2022-07-05 中国电信股份有限公司 消息推荐方法、装置、计算机可读介质及电子设备

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209476B (zh) * 2019-12-31 2023-09-01 广州市百果园信息技术有限公司 推荐方法、模型生成方法、装置、介质及设备
CN113836390B (zh) * 2020-06-24 2023-10-27 北京达佳互联信息技术有限公司 资源推荐方法、装置、计算机设备及存储介质
CN112785390B (zh) * 2021-02-02 2024-02-09 微民保险代理有限公司 推荐处理方法、装置、终端设备以及存储介质
CN113343024B (zh) * 2021-08-04 2021-12-07 北京达佳互联信息技术有限公司 对象推荐方法、装置、电子设备及存储介质
CN113611389A (zh) * 2021-08-11 2021-11-05 东南数字经济发展研究院 一种基于梯度策略决策算法的个性化运动推荐方法
CN113821731A (zh) * 2021-11-23 2021-12-21 湖北亿咖通科技有限公司 信息推送方法、设备和介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408731A (zh) * 2018-12-27 2019-03-01 网易(杭州)网络有限公司 一种多目标推荐方法、多目标推荐模型生成方法以及装置
CN109684543A (zh) * 2018-12-14 2019-04-26 北京百度网讯科技有限公司 用户行为预测和信息投放方法、装置、服务器和存储介质
CN110442790A (zh) * 2019-08-07 2019-11-12 腾讯科技(深圳)有限公司 推荐多媒体数据的方法、装置、服务器和存储介质
CN110569427A (zh) * 2019-08-07 2019-12-13 智者四海(北京)技术有限公司 一种多目标排序模型训练、用户行为预测方法及装置
CN111209476A (zh) * 2019-12-31 2020-05-29 广州市百果园信息技术有限公司 推荐方法、模型生成方法、装置、介质及设备

Also Published As

Publication number Publication date
CN111209476B (zh) 2023-09-01
CN111209476A (zh) 2020-05-29
