CN113596528B - Training method and device of video push model, server and storage medium - Google Patents


Info

Publication number
CN113596528B
CN113596528B (application number CN202010366374.0A)
Authority
CN
China
Prior art keywords
video
information
account
video information
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010366374.0A
Other languages
Chinese (zh)
Other versions
CN113596528A (en)
Inventor
王琳
叶璨
黄俊逸
胥凯
闫阳辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202010366374.0A priority Critical patent/CN113596528B/en
Publication of CN113596528A publication Critical patent/CN113596528A/en
Application granted granted Critical
Publication of CN113596528B publication Critical patent/CN113596528B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/26208: Content or additional data distribution scheduling, the scheduling operation being performed under constraints
    • H04N 21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/251: Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N 21/26258: Content or additional data distribution scheduling for generating a list of items to be played back in a given order, e.g. playlist
    • H04N 21/4666: Learning process for intelligent management characterized by learning algorithms using neural networks, e.g. processing the feedback provided by the user
    • H04N 21/4667: Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • H04N 21/4668: Learning process for intelligent management for recommending content, e.g. movies
    • H04N 21/4826: End-user interface for program selection using recommendation lists, e.g. of programs or channels sorted according to their score
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The disclosure relates to a training method, a device, a server and a storage medium of a video push model, wherein the method comprises the following steps: acquiring account information of a sample account and actual operation information of the sample account on pushed video information; inputting the account information and the video information into a video operation prediction model to obtain prediction operation information of the sample account on the video information; training the video operation prediction model according to the prediction operation information and the actual operation information; according to the trained video operation prediction model, obtaining prediction operation information of a plurality of sample accounts on target video information, and using the prediction operation information as training sample data of a video push model to be trained; and training the video push model to be trained according to the training sample data. By adopting the method, the training efficiency of the video push model can be improved.

Description

Training method and device of video push model, server and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and an apparatus for training a video push model, a server, and a storage medium.
Background
With the development of computer technology, applications for browsing videos have emerged in large numbers, and more and more accounts choose to browse videos through these applications. To achieve accurate pushing of videos, the videos pushed to an account are typically determined by training a model for pushing videos.
In the related art, a model for pushing videos is generally trained by acquiring a large amount of online video operation sample data and repeatedly training the model until it converges. However, the process of acquiring a large amount of online video operation sample data is complicated, so the training time of the model is long and the training efficiency of the model is low.
Disclosure of Invention
The present disclosure provides a training method, an apparatus, a server and a storage medium for a video push model, so as to at least solve the problem of low model training efficiency in the related art. The technical solution of the present disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a training method for a video push model, including:
acquiring account information of a sample account and actual operation information of the sample account on pushed video information;
inputting the account information and the video information into a video operation prediction model to obtain prediction operation information of the sample account on the video information;
training the video operation prediction model according to the prediction operation information and the actual operation information;
according to the trained video operation prediction model, obtaining prediction operation information of a plurality of sample accounts on target video information, and using the prediction operation information as training sample data of a video push model to be trained;
and training the video push model to be trained according to the training sample data.
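The five steps above form a two-stage pipeline, which can be sketched as follows. This is a toy illustration only: the `Simulator` class and its running-mean update are stand-ins of our own, not the networks described in this disclosure. Stage one fits the video operation prediction model on real logs; stage two uses it as a sample simulator for the push model.

```python
class Simulator:
    """Toy stand-in for the video operation prediction model: it keeps a
    running estimate of the operation score per (account, video) pair."""
    def __init__(self):
        self.table = {}

    def predict(self, account, video):
        # Prediction operation information for this account/video pair.
        return self.table.get((account, video), 0.5)

    def update(self, account, video, actual):
        # Move the estimate halfway toward the actual operation information.
        prev = self.predict(account, video)
        self.table[(account, video)] = 0.5 * (prev + actual)

def build_training_samples(simulator, accounts, target_videos):
    """Stage two: the trained simulator generates training sample data for
    the push model, replacing online log collection."""
    return {(a, v): simulator.predict(a, v)
            for a in accounts for v in target_videos}

# Stage one: train the simulator on logged (account, video, actual) triples.
sim = Simulator()
for account, video, actual in [("u1", "A", 1.0), ("u1", "A", 1.0)]:
    sim.update(account, video, actual)

samples = build_training_samples(sim, ["u1", "u2"], ["A", "B"])
```

The point of the sketch is the data flow: once the simulator is fitted, `build_training_samples` can produce arbitrarily many (account, target video, predicted operation) samples offline.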
In an exemplary embodiment, the inputting the account information and the video information into a video operation prediction model to obtain the prediction operation information of the sample account on the video information includes:
extracting, from the video information, first video information on which the sample account has performed operations in sequence before a preset moment, and second video information on which the sample account performs an operation at the preset moment;
acquiring an account information feature code of the account information and a first video information feature code of the first video information;
inputting the account information feature code and the first video information feature code into an account state coding network in the video operation prediction model to obtain an account state code of the sample account at the preset moment;
inputting a second video information feature code of the second video information and the account status code into an operation prediction network in the video operation prediction model to obtain the prediction operation information of the sample account on the second video information at the preset moment.
In an exemplary embodiment, the inputting the account information feature code and the first video information feature code into an account status coding network in the video operation prediction model to obtain an account status code of the sample account at the preset time includes:
inputting the first video information feature code into a first network in the account state coding network to obtain the video state code at the preset moment;
and inputting the account information feature code and the video state code into a second network in the account state coding network to obtain the account state code of the sample account at the preset moment.
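As one concrete, purely illustrative reading of the two-network structure above, the sketch below models the first network as a recurrent fold over the first-video-information feature codes and the second network as a combination with the account information feature code. The real networks are learned; these update rules are assumptions.

```python
def video_state_code(first_video_codes, dim):
    """'First network' (illustrative): fold the sequence of operated-video
    feature codes into a single video state code at the preset moment."""
    state = [0.0] * dim
    for code in first_video_codes:
        # Exponential moving average over the operation sequence.
        state = [0.5 * s + 0.5 * x for s, x in zip(state, code)]
    return state

def account_state_code(account_code, first_video_codes):
    """'Second network' (illustrative): combine the account information
    feature code with the video state code."""
    state = video_state_code(first_video_codes, len(account_code))
    return [a + s for a, s in zip(account_code, state)]

# Account feature code plus two operated-video feature codes in sequence.
code = account_state_code([1.0, 1.0], [[2.0, 2.0], [4.0, 4.0]])
```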
In an exemplary embodiment, the inputting the second video information feature coding and the account status coding into an operation prediction network in the video operation prediction model to obtain the prediction operation information of the sample account on the second video information includes:
inputting the second video information feature code and the account state code into an operation prediction network in the video operation prediction model to obtain a plurality of operation behavior probabilities of the sample account on the second video information;
and according to preset weights corresponding to the operation behavior probabilities, weighting the operation behavior probabilities to obtain a target operation probability of the sample account on the second video information, wherein the target operation probability is correspondingly used as the prediction operation information of the sample account on the second video information at the preset moment.
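The weighting step can be written out directly. The behavior names and weight values below are illustrative assumptions; only the weighted-sum form is taken from the text above.

```python
# Assumed operation behaviors and preset weights (hypothetical values).
PRESET_WEIGHTS = {"click": 0.2, "like": 0.3, "follow": 0.3, "finish": 0.2}

def target_operation_probability(behavior_probs):
    """Weighted sum of the predicted operation behavior probabilities,
    used as the prediction operation information at the preset moment."""
    return sum(PRESET_WEIGHTS[b] * p for b, p in behavior_probs.items())

score = target_operation_probability(
    {"click": 0.9, "like": 0.5, "follow": 0.1, "finish": 0.7})
```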
In an exemplary embodiment, the prediction operation information of the sample account on the target video information includes the prediction operation information of the sample account on the target video information at each preset time;
the training the video push model to be trained according to the training sample data comprises the following steps:
acquiring target video information characteristic codes of the target video information;
inputting the account information feature codes and the target video information feature codes into an account state coding network in the video pushing model to be trained to obtain target account state codes of the sample accounts at all the preset moments;
inputting the target video information feature code and the target account state code into an operation prediction network in the video push model to be trained to obtain target prediction operation information of the sample account on the target video information at each preset moment;
inputting the target prediction operation information into a preset video push evaluation model to obtain an operation feedback value of the sample account on the target video information at each preset moment;
and repeatedly training the video push model to be trained and the preset video push evaluation model according to the target account state code, the prediction operation information of the sample account on the target video information at each preset moment, the target prediction operation information and the operation feedback value until the video push model to be trained and the preset video push evaluation model both meet the convergence condition.
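The joint training loop above resembles an actor-critic setup: the push model proposes operations, the evaluation model returns operation feedback values, and both are updated until a convergence condition holds. A heavily simplified, hypothetical sketch, with a single scalar parameter and a mean-seeking update standing in for the two networks:

```python
def train_until_converged(simulated_scores, lr=0.1, tol=1e-4, max_iters=500):
    """Repeat updates until the parameter change falls below `tol`
    (the 'convergence condition'). Returns the converged parameter."""
    theta = 0.0
    for _ in range(max_iters):
        # Stand-in for the operation feedback value: how far the push
        # model's output is from the simulator's predicted operations.
        grad = sum(theta - s for s in simulated_scores) / len(simulated_scores)
        new_theta = theta - lr * grad
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta

theta = train_until_converged([0.2, 0.4, 0.6])
```

In this toy version the parameter settles at the mean of the simulated scores; in the disclosure, the analogous stopping rule is that both the push model and the evaluation model satisfy their convergence conditions.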
According to a second aspect of the embodiments of the present disclosure, there is provided a video push method, including:
acquiring account information of an account to be pushed;
inputting the account information of the account to be pushed into the trained video pushing model to obtain the pushed video information of the account to be pushed; the trained video push model is obtained according to the training method of the video push model;
and pushing the pushed video information to the account to be pushed.
In an exemplary embodiment, pushing the pushed video information to the account to be pushed includes:
arranging the pushed video information according to the order in which the trained video push model outputs the pushed video information, to obtain the arranged pushed video information;
and pushing the arranged pushed video information to the account to be pushed.
According to a third aspect of the embodiments of the present disclosure, there is provided a training apparatus for a video push model, including:
the information acquisition unit is configured to acquire account information of a sample account and actual operation information of the sample account on pushed video information;
an information prediction unit configured to perform input of the account information and the video information into a video operation prediction model, so as to obtain prediction operation information of the sample account on the video information;
a prediction model training unit configured to perform training of the video operation prediction model according to the prediction operation information and the actual operation information;
the sample data acquisition unit is configured to obtain, according to the trained video operation prediction model, prediction operation information of a plurality of sample accounts on target video information, and to use the prediction operation information as training sample data of a video push model to be trained;
and the push model training unit is configured to train the video push model to be trained according to the training sample data.
In an exemplary embodiment, the information prediction unit is further configured to extract, from the video information, first video information on which the sample account has performed operations in sequence before a preset moment, and second video information on which the sample account performs an operation at the preset moment; acquire an account information feature code of the account information and a first video information feature code of the first video information; input the account information feature code and the first video information feature code into an account state coding network in the video operation prediction model to obtain an account state code of the sample account at the preset moment; and input a second video information feature code of the second video information and the account state code into an operation prediction network in the video operation prediction model to obtain the prediction operation information of the sample account on the second video information at the preset moment.
In an exemplary embodiment, the information prediction unit is further configured to input the first video information feature code into a first network in the account state coding network to obtain the video state code at the preset moment; and to input the account information feature code and the video state code into a second network in the account state coding network to obtain the account state code of the sample account at the preset moment.
In an exemplary embodiment, the information prediction unit is further configured to input the second video information feature code and the account state code into the operation prediction network in the video operation prediction model to obtain a plurality of operation behavior probabilities of the sample account on the second video information; and, according to preset weights corresponding to the operation behavior probabilities, weight the operation behavior probabilities to obtain a target operation probability of the sample account on the second video information, where the target operation probability is used as the prediction operation information of the sample account on the second video information at the preset moment.
In an exemplary embodiment, the prediction operation information of the sample account on the target video information includes the prediction operation information of the sample account on the target video information at each preset time;
the push model training unit is further configured to perform target video information feature coding for acquiring the target video information; inputting the account information feature codes and the target video information feature codes into an account state coding network in the video pushing model to be trained to obtain target account state codes of the sample accounts at all the preset moments; inputting the target video information feature code and the target account state code into an operation prediction network in the video push model to be trained to obtain target prediction operation information of the sample account on the target video information at each preset moment; inputting the target prediction operation information into a preset video push evaluation model to obtain an operation feedback value of the sample account on the target video information at each preset moment; repeatedly training the to-be-trained video push model and the preset video push evaluation model according to the target account state code, the prediction operation information of the sample account on the target video information at each preset moment, the target prediction operation information and the operation feedback value until the to-be-trained video push model and the preset video push evaluation model both meet the convergence condition.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a video push apparatus including:
the account information acquisition unit is configured to acquire account information of an account to be pushed;
the video information acquisition unit is configured to input the account information of the account to be pushed into the trained video push model to obtain the pushed video information of the account to be pushed; the trained video push model is obtained according to the above training method of the video push model;
a video information pushing unit configured to perform pushing of the pushed video information to the account to be pushed.
In an exemplary embodiment, the video information pushing unit is further configured to perform arranging the pushed video information according to an order in which the trained video pushing model outputs the pushed video information, so as to obtain arranged pushed video information; and pushing the arranged pushed video information to the account to be pushed.
According to a fifth aspect of embodiments of the present disclosure, there is provided a server including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement a method of training a video push model as described in any embodiment of the first aspect and a method of video push as described in any embodiment of the second aspect.
According to a sixth aspect of embodiments of the present disclosure, there is provided a storage medium comprising: the instructions in the storage medium, when executed by the processor of the server, enable the server to perform the method of training a video push model as described in any of the embodiments of the first aspect and the method of video pushing as described in any of the embodiments of the second aspect.
According to a seventh aspect of embodiments of the present disclosure, there is provided a computer program product, the program product comprising a computer program, the computer program being stored in a readable storage medium, from which the at least one processor of the apparatus reads and executes the computer program, so that the apparatus performs the training method of the video push model described in any of the embodiments of the first aspect and the video push method described in any of the embodiments of the second aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
The account information of a sample account and the actual operation information of the sample account on pushed video information are acquired, and the account information and the video information are input into a video operation prediction model to obtain prediction operation information of the sample account on the video information; the video operation prediction model is then trained according to the prediction operation information and the actual operation information; finally, according to the trained video operation prediction model, prediction operation information of a plurality of sample accounts on target video information is obtained and used as sample data for training the video push model to be trained. In this way, the video push model is trained on the prediction operation information, for the target video information, output by the trained video operation prediction model: the trained video operation prediction model serves as a training sample simulator that can rapidly generate training sample data for the video push model, so that a large amount of online video operation sample data does not need to be acquired. This simplifies the training process of the video push model and improves its training efficiency.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
Fig. 1 is a diagram illustrating an application environment of a training method of a video push model according to an exemplary embodiment.
FIG. 2 is a flowchart illustrating a method of training a video push model, according to an example embodiment.
Fig. 3 is a flowchart illustrating steps for obtaining prediction operation information of a sample account for video information according to an exemplary embodiment.
FIG. 4 is a diagram illustrating training of a video operation prediction model according to an exemplary embodiment.
FIG. 5 is a flowchart illustrating the training steps of a video push model according to an exemplary embodiment.
Fig. 6 is a flowchart illustrating a video push method according to an exemplary embodiment.
FIG. 7 is a block diagram illustrating a training apparatus for a video push model in accordance with an exemplary embodiment.
Fig. 8 is a block diagram illustrating a video push device according to an example embodiment.
Fig. 9 is an internal block diagram of a server according to an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the disclosure, as detailed in the appended claims.
The training method of the video push model provided by the present disclosure can be applied to the application environment shown in fig. 1. Referring to fig. 1, the application environment diagram includes a server 110, and the server 110 may be implemented by an independent server or a server cluster composed of a plurality of servers. In fig. 1, the server 110 is taken as an independent server for illustration, and referring to fig. 1, the server 110 obtains account information of a sample account and actual operation information of the sample account on pushed video information; inputting the account information and the video information into a video operation prediction model to obtain prediction operation information of the sample account on the video information; training a video operation prediction model according to the prediction operation information and the actual operation information; according to the trained video operation prediction model, obtaining prediction operation information of a plurality of sample accounts on target video information, and using the prediction operation information as training sample data of a video push model to be trained; and training the video push model to be trained according to the training sample data to obtain a trained video push model, and outputting video information corresponding to the account to be pushed through the trained video push model.
Fig. 2 is a flowchart illustrating a training method of a video push model according to an exemplary embodiment, where as shown in fig. 2, the training method of the video push model is used in the server 110 shown in fig. 1, and includes the following steps:
in step S210, account information of the sample account and actual operation information of the sample account on the pushed video information are acquired.
The account refers to a registered account of a video application in a terminal, such as a registered account of a short video application or a registered account of a video browsing application. The sample account is an authorized account to be processed and analyzed. The account information refers to information identifying the account, such as the age of the account, the gender of the account, the city where the account is located, the type of terminal used by the account, the network connection mode of the terminal used by the account, and the video operation behavior information of the account; the video operation behavior information of the account may be clicking, praising (liking), attention (following), long viewing (in particular, whether the video is played to completion), and the like.
The pushed video information may be short video information, micro-movie information, drama information, and the like, and has corresponding video characteristics, such as a video category, a score category in a video, video operation behavior information, and the like.
The actual operation information of the sample account on the pushed video information refers to the operations, such as clicking, praising, attention, and long viewing, that the sample account performs on the pushed video information. It may also refer to the actual operation information of the sample account on the pushed video information at each preset time: specifically, which operation the sample account performs on a certain video at the first preset time, which operation at the second preset time, and so on through the last preset time. For example, among the video information pushed to the sample account, the sample account clicks video A at the first preset time, praises video B at the second preset time, and pays attention to video N at the last preset time.
Specifically, the server acquires account information corresponding to an authorized account on the network based on a big data technology, and the account information is used as account information of the sample account; acquiring a video operation log of a sample account, and sampling video information pushed to the sample account and actual operation information of the sample account on the pushed video information from the video operation log of the sample account; therefore, the video operation prediction model can be trained subsequently according to the account information of the sample account and the actual operation information of the sample account on the pushed video information, so that the trained video operation prediction model can be obtained.
In step S220, the account information and the video information are input into the video operation prediction model, so as to obtain the prediction operation information of the sample account on the video information.
In step S230, a video operation prediction model is trained based on the prediction operation information and the actual operation information.
The video operation prediction model is a supervised learning network used for predicting the operation information of an account on video information. It mainly comprises two parts: the first part is an account state coding network, which encodes the state of the sample account to obtain an account state code; the second part is an operation prediction network, which calculates the prediction operation information of the sample account on the video information in the current account state, such as the click probability, like probability, attention probability, and long viewing probability.
Specifically, the server inputs the account information of the sample account and the video information pushed to the sample account into a video operation prediction model to obtain the prediction operation information of the sample account on the video information; obtaining a difference value between the prediction operation information of the sample account on the video information and the corresponding actual operation information, and determining a loss value of a video operation prediction model according to the difference value; for example, the difference value is used as a loss value of the video operation prediction model, or a loss value of the video operation prediction model is obtained through calculation by combining a cross entropy loss function according to the difference value; secondly, the server reversely trains the video operation prediction model according to the loss value until the training times of the video operation prediction model reach preset training times or until the network parameters of the video operation prediction model reach convergence; and if the training times of the video operation prediction model reach the preset training times or the network parameters of the video operation prediction model reach convergence, taking the current video operation prediction model as the trained video operation prediction model. Therefore, the method is beneficial to obtaining the prediction operation information of a plurality of sample accounts on the target video information through the trained video operation prediction model subsequently, and the online collection of a large amount of video operation sample data is not needed, so that the training process of the video push model is simplified, and the training efficiency of the video push model is improved.
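The train-until-convergence loop described above can be sketched as follows. This is a minimal illustration only: a single logistic unit stands in for the real video operation prediction model, and all feature dimensions, learning rate, and thresholds are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the video operation prediction model: one logistic unit mapping
# spliced (account, video) features to a click probability. Shapes are illustrative.
X = rng.normal(size=(64, 5))                          # spliced feature codes
true_w = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = (X @ true_w > 0).astype(float)                    # actual operation info (0/1)

w = np.zeros(5)
max_epochs, lr = 500, 0.5                             # preset training times
for epoch in range(max_epochs):
    p = 1.0 / (1.0 + np.exp(-X @ w))                  # prediction operation info
    grad = X.T @ (p - y) / len(y)                     # gradient of cross-entropy loss
    w -= lr * grad                                    # reverse (backward) training step
    if np.linalg.norm(grad) < 1e-3:                   # network parameters converged
        break

p_final = 1.0 / (1.0 + np.exp(-X @ w))
accuracy = float(np.mean((p_final > 0.5) == y))
```

The loop stops on either condition named in the text: the preset number of training iterations is exhausted, or the parameters stop changing (here approximated by a small gradient norm).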
In step S240, according to the trained video operation prediction model, obtaining prediction operation information of the target video information by multiple sample accounts, which is used as training sample data of the video push model to be trained. The video push model to be trained is a model capable of pushing video information to an account, the network structure of the model is basically consistent with that of a video operation prediction model, and the model also has an account state coding network and an operation prediction network.
Specifically, the server collects video information on the network or video information in the candidate set as target video information; inputting the account information of the sample account and the target video information into the trained video operation prediction model to obtain the prediction operation information of the sample account on the target video information; by analogy, the prediction operation information of the plurality of sample accounts on the target video information can be obtained, and the prediction operation information of the plurality of sample accounts on the target video information is used as training sample data of the video push model to be trained. Therefore, the method is beneficial to rapidly obtaining the prediction operation information of the target video information by the multiple sample accounts through the trained video operation prediction model, and further improves the training efficiency of the video push model.
Further, the server can also input the 0th video information (an all-zero vector) and the account information of the sample account into the account state coding network in the video push model to be trained to obtain the account state code at the first moment; input the account state code at the first moment and the video information in the candidate set into the operation prediction network in the video push model to be trained to obtain the probability that each piece of video information in the candidate set is selected by the sample account at the first moment; and take the video information with the maximum probability as the first target video information pushed to the sample account. The server then inputs the 0th video information, the first target video information, and the account information into the account state coding network to obtain the account state code at the second moment; inputs the account state code at the second moment and the video information in the candidate set into the operation prediction network to obtain the probability that each piece of video information in the candidate set is selected by the sample account at the second moment; and takes the video information with the maximum probability as the second target video information pushed to the sample account. By analogy, the target video information re-pushed to the sample account by the video push model to be trained can be obtained, and with reference to this method, the target video information re-pushed to a plurality of sample accounts can be obtained. Next, the account information of the sample account and the target video information re-pushed to the sample account are input into the trained video operation prediction model to obtain the prediction operation information of the sample account on the target video information; by analogy, the prediction operation information of the plurality of sample accounts on the re-pushed target video information can be obtained and used as training sample data of the video push model to be trained.
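The greedy, step-by-step selection described above can be sketched as follows. The two helper functions are hypothetical stand-ins for the account state coding network and the operation prediction network; their arithmetic is invented purely to make the control flow concrete.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

def encode_state(prev_state, video_vec, account_vec):
    # Hypothetical stand-in for the account state coding network (LSTM + Dense).
    return np.tanh(prev_state + 0.5 * video_vec + 0.1 * account_vec)

def selection_prob(state, video_vec):
    # Hypothetical stand-in for the operation prediction network.
    return 1.0 / (1.0 + np.exp(-float(state @ video_vec)))

account = rng.normal(size=dim)
candidates = {name: rng.normal(size=dim) for name in ("A", "B", "C")}

# Step 0: the 0th video information is an all-zero vector.
state = encode_state(np.zeros(dim), np.zeros(dim), account)
pushed = []
for _ in range(2):
    probs = {n: selection_prob(state, v) for n, v in candidates.items()}
    best = max(probs, key=probs.get)          # video with the maximum probability
    pushed.append(best)
    state = encode_state(state, candidates[best], account)  # feed selection back
```

Each selected video is fed back into the state encoder, so the probability at step t depends on everything pushed at steps 0 through t-1.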
In step S250, the video push model to be trained is trained according to the training sample data.
Specifically, the server repeatedly trains the video push model to be trained according to training sample data to obtain a predicted loss value of the video push model to be trained on the training sample data; reversely training a video push model to be trained according to the predicted loss value until the video push model meets a convergence condition; and if the video pushing model meets the convergence condition, for example, the training times of the video pushing model reach the preset training times or the network parameters of the video pushing model reach convergence, taking the video pushing model as the trained video pushing model.
Further, after the trained video push model is obtained, the server may determine, in the manner described above for determining the target video information to be re-pushed to the sample account, the first video information to be pushed to the account to be pushed, then the second video information to be pushed to the account to be pushed, and so on, until the final video information to be pushed to the account to be pushed is determined. In this way, the influence of the first K-1 pieces of video information is comprehensively considered when determining the Kth piece of video information, which improves the accuracy of the determined video information and further improves the video push accuracy.
In the training method of the video push model, the prediction operation information of the sample account on the video information is obtained by acquiring the account information of the sample account and the actual operation information of the sample account on the pushed video information and inputting the account information and the video information into the video operation prediction model; then, training a video operation prediction model according to the prediction operation information and the actual operation information; finally, according to the trained video operation prediction model, obtaining the prediction operation information of a plurality of sample accounts on the target video information, using the prediction operation information as sample data of the video push model to be trained, and further training the video push model to be trained; the aim of training the video push model according to the prediction operation information of the target video information by a plurality of sample accounts output by the trained video operation prediction model is fulfilled; the trained video operation prediction model is used as a training sample simulator, training sample data of the video push model can be generated quickly, and a large amount of online video operation sample data does not need to be acquired, so that the training process of the video push model is simplified, and the training efficiency of the video push model is improved.
In an exemplary embodiment, as shown in fig. 3, in step S220, the account information and the video information are input into the video operation prediction model to obtain the prediction operation information of the sample account on the video information, which may be specifically implemented by the following steps:
in step S310, first video information in which the sample account has been operated sequentially before a preset time and second video information in which the sample account has been operated at the preset time are extracted from the video information.
The preset time refers to a time corresponding to video information operated by the sample account, for example, 16:15. The first video information is video information operated by the sample account before the preset time, and may be one or more pieces of video information; for example, the sample account clicks video information A, video information B, and video information C in sequence before the preset time. The second video information is video information operated by the sample account at the preset time; for example, the sample account clicks video information D at the preset time. It should be noted that the first video information carries the actual operation information, such as clicking and praising, of the sample account on the corresponding video information at each time before the preset time, and the second video information carries the actual operation information, such as clicking and praising, of the sample account on the second video information at the preset time.
Specifically, the server obtains operation sequence information of the sample account on the pushed video information according to actual operation information of the sample account on the pushed video information; according to the operation sequence information, video information of sample accounts which are sequentially operated before a preset moment is extracted from the pushed video information and is used as first video information; and extracting the video information of which the sample account is operated at the preset moment from the pushed video information as second video information.
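The extraction step can be sketched directly from an operation sequence. The log entries below are invented examples; only the before-versus-at-time split mirrors the text.

```python
# Operation sequence of a sample account on pushed video information, recovered
# from the video operation log: (time, video_id, operation). Entries are assumed.
operations = [
    (1, "A", "click"),
    (2, "B", "praise"),
    (3, "C", "click"),
    (4, "D", "follow"),
]
preset_time = 4

# First video information: operated in sequence before the preset time.
first_video_info = [op for op in operations if op[0] < preset_time]
# Second video information: operated at the preset time.
second_video_info = [op for op in operations if op[0] == preset_time]
```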
In step S320, the account information feature code of the account information and the first video information feature code of the first video information are obtained.
The account information feature coding refers to a low-dimensional feature vector which is subjected to compression coding and used for representing low-level semantics of account information, and the first video information feature coding is also a low-dimensional feature vector which is subjected to compression coding and used for representing low-level semantics of the first video information.
It should be noted that the first video information feature codes of the first video information include video information feature codes corresponding to video information that is operated by a sample account in sequence before a preset time; for example, the sample account sequentially clicks the video information a, the video information B, and the video information C before the preset time, and the first video information feature coding includes video information feature coding a, video information feature coding B, and video information feature coding C.
Specifically, the server acquires a preset feature coding instruction, respectively extracts feature information in the account information and feature information in the first video information according to the preset feature coding instruction, and codes the feature information in the account information and the feature information in the first video information to obtain an account information feature code of the account information and a first video information feature code of the first video information.
Further, the server can input the account information and the first video information into a pre-trained feature coding model, and output the account information feature coding of the account information and the first video information feature coding of the first video information through the feature coding model; the pre-trained feature coding model is a neural network model, such as a convolutional neural network model, capable of performing feature extraction and feature coding on account information and video information to obtain account information feature coding of the account information and video information feature coding of the video information.
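One common way to realize such a feature coding step is with embedding tables, where each categorical field is compressed into a low-dimensional vector. The field names, vocabulary sizes, and lookups below are assumptions for illustration, not the patent's actual coding model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical embedding tables standing in for the feature coding model.
dim = 4
vocab_sizes = {"age_bucket": 8, "city": 100, "video_category": 50}
tables = {k: rng.normal(size=(n, dim)) for k, n in vocab_sizes.items()}

account_fields = {"age_bucket": 3, "city": 42}   # assumed field values
video_fields = {"video_category": 7}

# Concatenating per-field embeddings yields the low-dimensional feature codes.
account_feature_code = np.concatenate([tables[k][i] for k, i in account_fields.items()])
video_feature_code = np.concatenate([tables[k][i] for k, i in video_fields.items()])
```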
In step S330, the account information feature code and the first video information feature code are input to an account status coding network in the video operation prediction model, so as to obtain an account status code of the sample account at a preset time.
The account status code also refers to a low-dimensional feature vector of low-level semantics for representing the account status after compression coding. And the account state code of the sample account at the preset moment is used for representing the operation state of the sample account on the video information at the preset moment.
Specifically, the server inputs the account information characteristic code and the first video information characteristic code into an account state coding network in a video operation prediction model, and the first video information characteristic code is coded through the account state coding network to obtain a target characteristic code corresponding to the first video information characteristic code; splicing the account information feature code and the target feature code to obtain a spliced feature code; and carrying out full connection processing on the spliced feature codes to obtain the fully connected feature codes which are used as account state codes of the sample accounts at preset moments.
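The encode-splice-fully-connect flow above can be sketched as follows. The weights are random stand-ins for learned parameters, and the mean-plus-projection encoder is a simplification of the LSTM described later; dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d_vid, d_acc, d_state = 3, 2, 4

# Hypothetical weights; real networks would be learned, and the video encoder
# would be an LSTM rather than this single mean+projection step.
W_enc = rng.normal(size=(d_vid, d_state))
W_fc = rng.normal(size=(d_state + d_acc, d_state))

account_info_code = rng.normal(size=d_acc)
first_video_codes = rng.normal(size=(3, d_vid))   # A, B, C operated in sequence

# Encode the first video information feature codes into a target feature code.
target_code = np.tanh(first_video_codes.mean(axis=0) @ W_enc)
# Splice (concatenate) with the account information feature code, then fully connect.
spliced = np.concatenate([account_info_code, target_code])
account_state_code = np.tanh(spliced @ W_fc)      # account state code at preset time
```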
In step S340, the second video information feature code and the account status code of the second video information are input to the operation prediction network in the video operation prediction model, so as to obtain the predicted operation information of the sample account on the second video information at the preset time.
And the second video information feature coding is also a low-dimensional feature vector which is subjected to compression coding and used for representing low-level semantics of the second video information. It should be noted that the determination manner of the second video information characteristic code is consistent with the determination manner of the first video information characteristic code, and details are not repeated herein.
The prediction operation information of the sample account on the second video information at the preset time refers to an operation probability of the sample account on the second video information at the preset time, such as a click probability, a like probability, an attention probability, and the like of the sample account on the video information a at the preset time.
Specifically, the server acquires a second video information feature code of the second video information, inputs the second video information feature code of the second video information and an account state code of a sample account at a preset moment into an operation prediction network in a video operation prediction model, and performs splicing processing on the account state code and the second video information feature code through the operation prediction network to obtain a spliced feature code; and performing full-connection processing on the spliced feature codes to obtain the prediction operation probability of the sample account on the second video information at the preset time, and using the prediction operation probability as the prediction operation information of the sample account on the second video information at the preset time.
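The splice-and-fully-connect step of the operation prediction network can be sketched as follows, with random weights standing in for learned parameters and assumed dimensions.

```python
import numpy as np

rng = np.random.default_rng(3)
d_state, d_vid = 4, 3

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

account_state_code = rng.normal(size=d_state)     # state at the preset time
second_video_code = rng.normal(size=d_vid)

# Hypothetical fully connected layer producing four operation probabilities
# (click, praise, attention, long viewing); weights would be learned in practice.
W = rng.normal(size=(d_state + d_vid, 4))
spliced = np.concatenate([account_state_code, second_video_code])
operation_probs = sigmoid(spliced @ W)            # prediction operation information
```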
It should be noted that, with reference to this method, prediction operation information of the video information operated by the sample account at a plurality of preset times at the preset time can be obtained; for example, the sample account clicks the video information a at a first preset time, clicks the video information B at a second preset time, and clicks the video information C at a third preset time; then, referring to the above method, the prediction operation information of the sample account on the a video information at the first preset time, the prediction operation information on the B video information at the second preset time, and the prediction operation information on the C video information at the third preset time may be obtained.
Further, the server can obtain a loss value of the video operation prediction model according to actual operation information and prediction operation information of the sample account on the second video information at a preset time; reversely training a video operation prediction model according to the loss value until the video operation prediction model meets a preset convergence condition; and if the video operation prediction model meets the preset convergence condition, taking the current video operation prediction model as the trained video operation prediction model. For example, the server obtains a prediction loss value of the video operation prediction model at a preset moment based on a cross entropy loss function and by combining actual operation information and prediction operation information of the sample account on the second video information at the preset moment; adding the prediction loss values of the video operation prediction model at a plurality of preset moments to obtain the loss value of the video operation prediction model; and determining a network parameter updating gradient of the video operation prediction model according to the loss value, and updating the network parameter of the video operation prediction model according to the network parameter updating gradient until the video operation prediction model meets a preset convergence condition, for example, until the training times of the video operation prediction model reach the preset training times or the network parameter of the video operation prediction model reaches convergence.
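The per-time cross-entropy losses and their sum can be computed as follows; the three (actual, predicted) pairs are invented values for illustration.

```python
import math

def ce(y, p):
    # Binary cross-entropy between actual operation y (0/1) and predicted probability p.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# (actual operation, predicted probability) at each preset time; values assumed.
per_time = [(1, 0.7), (1, 0.9), (0, 0.2)]

# Loss of the video operation prediction model: sum of per-time prediction losses.
total_loss = sum(ce(y, p) for y, p in per_time)
```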
According to the technical scheme provided by the embodiment of the disclosure, the video operation prediction model is repeatedly trained, so that the prediction operation information output by the video operation prediction model is close to the real operation information of the sample account, the accuracy of the prediction operation information output by the video operation prediction model is further improved, the video operation prediction model which is trained subsequently can quickly generate training sample data of the video push model, and the video push model can be quickly trained.
In an exemplary embodiment, in step S330, the account information feature code and the first video information feature code are input to an account status coding network in the video operation prediction model, so as to obtain an account status code of the sample account at a preset time, which specifically includes: inputting the first video information characteristic code into a first network in the account state coding network to obtain a video state code at a preset moment; and inputting the account information characteristic code and the video state code into a second network in the account state coding network to obtain the account state code of the sample account at the preset moment.
The video state coding also refers to a low-dimensional feature vector used for representing low-level semantics of a video state after compression coding.
Referring to fig. 4, the first network in the account status coding network is an LSTM (Long Short-Term Memory) network for determining the video status code at each preset time. The video status code at each preset time is determined by the video information selected or operated by the sample account at the preset times before it; for example, the video status code at the Kth time is determined by the video information selected by the sample account at the 0th through (K-1)th times.
The second network in the account status coding network is a Dense network, that is, a fully connected network, configured to determine the account status code of the sample account at each preset time, shown as S1, S2, and so on in fig. 4.
For example, referring to fig. 4, assume that the preset time is the third preset time, preceded by the 0th, first, and second preset times; the first video information feature code then comprises the video information feature codes of the video information selected at the 0th, first, and second preset times. The server inputs the video information feature code (an all-zero vector) of the video information selected by the sample account at the 0th preset time (the 0th video information) into the first network (the LSTM network) in the account state coding network, and encodes it through the first network to obtain the video state code at the first preset time. It then inputs the video state code at the first preset time and the video information feature code of the video information selected by the sample account at the first preset time (the 1st video information) into the first network, and encodes them through the first network to obtain the video state code at the second preset time; likewise, it inputs the video state code at the second preset time and the video information feature code of the video information selected by the sample account at the second preset time (the 2nd video information) into the first network, and encodes them to obtain the video state code at the preset time. Finally, the user information feature code and the video state code at the preset time are spliced to obtain a spliced feature code, the spliced feature code is input into the second network in the account state coding network, and full connection processing is performed on it through the second network to obtain the account state code of the sample account at the preset time (such as S3).
It should be noted that, in the process of obtaining the account status code of the sample account at the preset time, the user information feature code and the video status code at the first preset time are spliced to obtain a spliced first feature code, the spliced first feature code is input to a second network in the account status code network, and the spliced first feature code is fully connected through the second network to obtain the account status code of the sample account at the first preset time (for example, S1); and splicing the user information characteristic code and the video state code at the second preset moment to obtain a spliced second characteristic code, inputting the spliced second characteristic code into a second network, and performing full connection processing on the spliced second characteristic code through the second network to obtain an account state code of the sample account at the second preset moment (such as S2), and so on to obtain account state codes of the sample account at each preset moment.
According to the technical scheme provided by the embodiment of the disclosure, the account state code of the sample account at the preset time is determined, so that the prediction operation information of the sample account on the second video information at the preset time can be determined according to the second video information feature code and the account state code of the sample account at the preset time.
In an exemplary embodiment, in step S340, inputting the second video information feature coding and the account status coding into an operation prediction network in a video operation prediction model, to obtain the prediction operation information of the sample account on the second video information at a preset time, specifically including: inputting the second video information characteristic code and the account state code into an operation prediction network in a video operation prediction model to obtain a plurality of operation behavior probabilities of the sample account on the second video information; and according to the preset weight corresponding to the operation behavior probabilities, weighting the operation behavior probabilities to obtain a target operation probability of the sample account on the second video information, and correspondingly using the target operation probability as the prediction operation information of the sample account on the second video information at the preset moment.
The plurality of operation behavior probabilities of the second video information refer to click probability, praise probability, attention probability, long viewing probability and the like; the preset weight corresponding to the operation behavior probability is preset, and may be adjusted according to an actual scene, which is not limited herein.
Specifically, the server splices a second video information feature code of second video information operated by the sample account at a preset time and an account state code of the sample account at the preset time to obtain a spliced feature code; inputting the spliced feature codes into an operation prediction network in the video operation prediction model, and performing full connection processing on the spliced feature codes through the operation prediction network to obtain the click probability, the praise probability, the attention probability and the long viewing probability of a sample account on second video information at a preset moment; respectively obtaining preset weights corresponding to the click probability, the praise probability, the attention probability and the long viewing probability, weighting the click probability, the praise probability, the attention probability and the long viewing probability according to the preset weights corresponding to the click probability, the praise probability, the attention probability and the long viewing probability to obtain the target operation probability of the sample account on the second video information at the preset moment, and correspondingly using the target operation probability as the prediction operation information of the sample account on the second video information at the preset moment.
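The weighting step reduces to a weighted sum over the four behavior probabilities. The probabilities and preset weights below are invented example values.

```python
# Predicted operation behavior probabilities of the sample account on the second
# video information at the preset time, and preset weights; values assumed.
probs = {"click": 0.8, "praise": 0.3, "attention": 0.1, "longview": 0.6}
weights = {"click": 0.4, "praise": 0.2, "attention": 0.1, "longview": 0.3}

# Weighted sum gives the target operation probability, used as the prediction
# operation information at the preset time.
target_operation_prob = sum(weights[k] * probs[k] for k in probs)
```

Here the result is 0.4*0.8 + 0.2*0.3 + 0.1*0.1 + 0.3*0.6 = 0.57; in practice the weights would be tuned per scene, as the text notes.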
According to the technical scheme provided by the embodiment of the disclosure, the prediction operation information of the sample account on the second video information at the preset time is obtained, so that the loss value of the video operation prediction model can be obtained according to the actual operation information and the prediction operation information of the sample account on the second video information at the preset time, and the video operation prediction model can be trained repeatedly according to the loss value, so that the trained video operation prediction model can be obtained.
In an exemplary embodiment, referring to fig. 4, the video operation prediction model may be trained by:
(1) The server collects data on the network to obtain sample data (u, V_i, Y_i^click, Y_i^like, Y_i^follow, Y_i^longview), 1 ≤ i ≤ N, where u represents the account features corresponding to the account information of the sample account and V_i represents the video features of the i-th video information. Y_i^click indicates the click situation of the sample account on the i-th video information: if the sample account has clicked the i-th video information, Y_i^click = 1; if the sample account has not clicked the i-th video information, Y_i^click = 0. Analogously, Y_i^like indicates the praise situation of the sample account for the i-th video information, Y_i^follow indicates the attention situation of the sample account for the i-th video information, and Y_i^longview indicates the long-viewing situation of the sample account for the i-th video information.
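A sample record of this form can be held in a simple structure; the field names below are an illustrative rendering of the notation above, not names used by the disclosure:

```python
from collections import namedtuple

# One training sample: user features u, video features V_i, and the four
# binary behavior labels Y_i^click, Y_i^like, Y_i^follow, Y_i^longview.
Sample = namedtuple("Sample", ["u", "v", "click", "like", "follow", "longview"])

# Illustrative values only: a user who clicked and long-viewed video i.
s = Sample(u=[0.1, 0.9], v=[0.3, 0.7, 0.2], click=1, like=0, follow=0, longview=1)
```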
(2) When predicting the result of selecting video information at step t, the video features of the video information selected at steps 0 to t−1 are input into an LSTM network to obtain the video state code at step t, where the 0-th video feature is an all-zero vector. The video state code at step t and the user features u are then spliced together and passed through a fully-connected network (e.g., a Dense layer) to obtain the user state code S_t at step t. Finally, the user state code S_t is spliced with each video feature in the candidate set, and through a fully-connected network the click probability Pctr_θ, praise probability Pltr_θ, attention probability Pwtr_θ, and long-viewing probability Plvtr_θ of user u for each video V_i at the current moment are obtained.
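The per-step flow above can be sketched as follows, assuming a plain tanh recurrent cell in place of the LSTM and a single logistic head in place of the four probability heads; all dimensions and weights are illustrative:

```python
import math
import random

random.seed(0)

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

D = 4  # feature dimension (illustrative)
# Random stand-in weights; a real model would learn these.
W_rec = [[random.uniform(-0.5, 0.5) for _ in range(2 * D)] for _ in range(D)]
W_state = [[random.uniform(-0.5, 0.5) for _ in range(2 * D)] for _ in range(D)]
w_ctr = [random.uniform(-0.5, 0.5) for _ in range(2 * D)]  # one head shown

def video_state(selected):
    """Roll the already-selected videos into a state; step 0 is all zeros."""
    h = [0.0] * D
    for v in selected:
        h = [math.tanh(dot(row, h + v)) for row in W_rec]  # recurrent step
    return h

def user_state(u, selected):
    """Splice video state with user features u, then a dense layer -> S_t."""
    h = video_state(selected)
    return [math.tanh(dot(row, h + u)) for row in W_state]

def click_prob(u, selected, candidate):
    """Splice S_t with one candidate's features, score with a logistic head."""
    s_t = user_state(u, selected)
    z = dot(w_ctr, s_t + candidate)
    return 1.0 / (1.0 + math.exp(-z))

u = [0.2, -0.1, 0.4, 0.0]
cand = [0.5, 0.1, -0.3, 0.2]
p = click_prob(u, [[0.1] * D], cand)  # probability for one candidate at step 1
```

In the disclosure, four such heads (click, praise, attention, long-viewing) share the same spliced input.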
(3) Defining an optimization objective of a video operation prediction model:
min_θ − Σ_{i=1}^{N} [ Y_i^click·log Pctr_θ(u, V_i) + (1 − Y_i^click)·log(1 − Pctr_θ(u, V_i)) + Y_i^like·log Pltr_θ(u, V_i) + (1 − Y_i^like)·log(1 − Pltr_θ(u, V_i)) + Y_i^follow·log Pwtr_θ(u, V_i) + (1 − Y_i^follow)·log(1 − Pwtr_θ(u, V_i)) + Y_i^longview·log Plvtr_θ(u, V_i) + (1 − Y_i^longview)·log(1 − Plvtr_θ(u, V_i)) ]
(4) The network parameter θ of the video operation prediction model is updated on the above optimization objective using a stochastic gradient descent algorithm until the objective reaches its minimum, thereby obtaining the trained video operation prediction model.
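Step (4)'s stochastic-gradient minimization can be illustrated on a single logistic head with made-up data; the learning rate, epoch count, and toy dataset are arbitrary choices for the sketch:

```python
import math
import random

random.seed(1)

def predict(theta, x):
    """Logistic prediction for one behavior head."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

# Toy data: the label is 1 exactly when the first feature is positive.
data = [([x1, random.uniform(-1, 1)], 1 if x1 > 0 else 0)
        for x1 in (random.uniform(-1, 1) for _ in range(200))]

def nll(theta):
    """Average negative log-likelihood (the cross-entropy objective)."""
    eps = 1e-12
    return -sum(y * math.log(predict(theta, x) + eps)
                + (1 - y) * math.log(1 - predict(theta, x) + eps)
                for x, y in data) / len(data)

theta = [0.0, 0.0]
lr = 0.5
before = nll(theta)
for _ in range(20):                    # epochs of stochastic gradient descent
    random.shuffle(data)
    for x, y in data:
        p = predict(theta, x)          # gradient of the NLL is (p - y) * x
        theta = [t - lr * (p - y) * xi for t, xi in zip(theta, x)]
after = nll(theta)
```

The objective after training is lower than at the all-zero initialization, mirroring the "update until the objective reaches its minimum" loop of step (4).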
In an exemplary embodiment, as shown in fig. 5, in step S250, training a video push model to be trained according to training sample data may specifically be implemented by the following steps:
in step S510, a target video information feature code corresponding to the target video information is obtained.
The specific implementation of obtaining the target video information feature code corresponding to the target video information refers to the specific implementation of obtaining the first video information feature code corresponding to the first video information, and is not described herein again. It should be noted that the training sample data refers to prediction operation information of a sample account on target video information; the prediction operation information of the sample account on the target video information specifically comprises the prediction operation information of the sample account on the target video information at each preset moment.
In step S520, the account information feature codes and the target video information feature codes are input into an account status coding network in the video push model to be trained, so as to obtain target account status codes of the sample accounts at each preset time.
The target account state code also refers to a low-dimensional feature vector which is subjected to compression coding and used for representing the low-level semantics of the account state. And the target account state code of the sample account at each preset moment is used for representing the operation state of the sample account on the target video information at each preset moment.
The specific implementation of step S520 refers to the specific implementation of step S330, and is not described herein again.
In step S530, the target video information feature code and the target account status code are input to the operation prediction network in the video push model to be trained, so as to obtain the target prediction operation information of the sample account on the target video information at each preset time.
The specific implementation of step S530 refers to the specific implementation of step S340, which is not described herein again.
In step S540, the target prediction operation information is input into a preset video push evaluation model, and an operation feedback value of the sample account on the target video information at each preset time is obtained.
The preset video push evaluation model is an evaluation model capable of outputting expected rewards obtained by a current video push strategy in a current account state, and the operation feedback values of the sample accounts on the target video information at each preset moment are expected rewards of the sample accounts at each preset moment.
In step S550, the video push model to be trained and the preset video push evaluation model are repeatedly trained according to the target account status code, the prediction operation information of the sample account on the target video information at each preset time, the target prediction operation information, and the operation feedback value until both the video push model to be trained and the preset video push evaluation model satisfy the convergence condition.
Specifically, based on the target account state code of the sample account at each preset time and the prediction operation information, target prediction operation information, and operation feedback value of the sample account on the target video information at each preset time, the server calculates, using a loss function, a loss value of the video push model to be trained and a loss value of the preset video push evaluation model; updates the network parameters of the video push model to be trained according to the loss value of the video push model to be trained; and updates the network parameters of the preset video push evaluation model according to the loss value of the preset video push evaluation model. This process is repeated until the network parameters of the video push model converge and the network parameters of the preset video push evaluation model converge, at which point the training ends.
According to the technical scheme provided by the embodiment of the disclosure, the video push model to be trained is repeatedly trained, so that the expected reward obtained by the video information output by the video push model can be maximized, and the push accuracy of the video information is further improved.
In an exemplary embodiment, the present disclosure may further train the video push model by using an Actor-Critic algorithm, which specifically includes the following contents:
(1) Based on the on-line request information (u, V_cand), a video list V′ is re-recommended using the video push model to be trained, where u represents the user information, V_cand represents the features of all candidate videos of this request, and V′ represents the video sequence recommended anew by the video push model. Since the feedback corresponding to the video information in the re-recommended video list V′ is not available on line, the trained video operation prediction model is called upon for prediction. Because the structure of the video push model is similar to the network structure of the video operation prediction model, the process of generating the recommended video list V′ likewise first determines the 1st video information, then determines the 2nd video information, and so on; therefore, in the prediction stage of the video push model, the prediction result of the video operation prediction model on the video information at each step can be added to the model features of the video push model, thereby enhancing the prediction capability of the model's reinforcement learning. After the prediction is finished, the collected reinforcement learning standard data has the format (S_t, V′_t, r_t, S_{t+1}, T), where S_t represents the current state, determined by the user information and the selected video information; V′_t represents the video features corresponding to the video information determined by the current video push model; r_t is the feedback predicted by the video operation prediction model, r_t = a·Pctr_t + b·Pltr_t + c·Plvtr_t + d·Pwtr_t, where a, b, c, and d are manually configured hyper-parameters; and T is the termination condition, here corresponding to whether the video is the last one.
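Assembling one trajectory of (S_t, V′_t, r_t, S_{t+1}, T) tuples with the weighted feedback r_t can be sketched as follows; the predicted probabilities and the hyper-parameters a, b, c, d are made-up values, and states are reduced to opaque labels:

```python
# Illustrative per-step predictions from a trained operation prediction
# model for a re-recommended list of three videos: (Pctr, Pltr, Plvtr, Pwtr).
predicted = [(0.5, 0.2, 0.4, 0.1),
             (0.3, 0.1, 0.6, 0.2),
             (0.7, 0.3, 0.2, 0.1)]
a, b, c, d = 1.0, 0.5, 0.8, 0.3       # manually configured hyper-parameters

transitions = []
state = "S0"                           # states stand in as opaque labels here
for t, (pctr, pltr, plvtr, pwtr) in enumerate(predicted):
    r_t = a * pctr + b * pltr + c * plvtr + d * pwtr   # weighted feedback
    next_state = f"S{t + 1}"
    terminal = (t == len(predicted) - 1)               # T: is this the last video?
    transitions.append((state, f"V'{t}", r_t, next_state, terminal))
    state = next_state
```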
(2) Updating network parameters of a video push model by using an Actor-Critic algorithm according to the sample data generated in the step (1):
(a) Updating policy network parameters:
θ ← θ + α · (r + γ·V_w(s′) − V_w(s)) · ∇_θ log π_θ(s, a)
(b) Updating the evaluation network parameters:
w ← w + α · (r + γ·V_w(s′) − V_w(s)) · ∇_w V_w(s)
(c) Here α is the learning rate, γ is the discount factor, s′ is the next state, s is the current state, V_w(s) is the expected reward obtainable by the current strategy in the current state, and π_θ(s, a) is the probability of selecting video a in state s under the current strategy.
(3) And (3) repeating the processes from the step (1) to the step (2) for a plurality of times until the network parameters of the video pushing model reach convergence.
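The two update rules in step (2) follow the standard one-step Actor-Critic scheme; a minimal tabular sketch with made-up transitions (not the networks of the disclosure) is:

```python
import math

n_actions = 2
alpha, gamma = 0.1, 0.9                      # learning rate and discount factor
V = {0: 0.0, 1: 0.0, 2: 0.0}                 # critic: expected reward per state
prefs = {0: [0.0, 0.0], 1: [0.0, 0.0], 2: [0.0, 0.0]}  # actor preferences

def policy(s):
    """Softmax over action preferences: pi_theta(s, .)."""
    m = max(prefs[s])
    exps = [math.exp(p - m) for p in prefs[s]]
    z = sum(exps)
    return [e / z for e in exps]

def update(s, a, r, s_next, terminal):
    """One-step TD actor-critic update for a single transition."""
    target = r if terminal else r + gamma * V[s_next]
    delta = target - V[s]                    # TD error
    V[s] += alpha * delta                    # critic (evaluation) update
    pi = policy(s)
    for act in range(n_actions):             # actor update: grad of log-softmax
        grad = (1.0 if act == a else 0.0) - pi[act]
        prefs[s][act] += alpha * delta * grad

for _ in range(200):
    update(0, 1, 1.0, 1, False)              # choosing "video 1" in state 0 pays off
    update(0, 0, 0.0, 1, False)              # choosing "video 0" does not
    update(1, 0, 0.0, 2, True)               # terminal transition

p0 = policy(0)
```

After repeated updates, the policy in state 0 comes to prefer the rewarded action and the critic's value estimate for state 0 turns positive, which is the behavior the two update formulas above are driving toward.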
According to the technical scheme provided by the embodiment of the disclosure, the video push model can be quickly trained, so that the training efficiency of the video push model is improved.
Fig. 6 is a flowchart illustrating a video push method according to an exemplary embodiment, where, as shown in fig. 6, the video push method is used in the server 110 shown in fig. 1, and includes the following steps:
in step S610, account information of the account to be pushed is acquired.
Specifically, the server obtains account information of a current login account of the terminal, and the account information is used as account information of an account to be pushed.
In step S620, inputting the account information of the account to be pushed into the trained video push model to obtain the push video information of the account to be pushed; and obtaining the trained video push model according to the training method of the video push model.
Specifically, the server inputs the video information feature code of the 0-th pushed video information (an all-zero vector) and the account information feature code of the account information of the account to be pushed into the trained video push model; encodes the video information feature code of the 0-th pushed video information through the trained video push model to obtain the video state code at the first moment; splices the account information feature code of the account to be pushed with the video state code at the first moment to obtain a spliced feature code; and performs full-connection processing on the spliced feature code to obtain the account state code at the first moment. The account state code at the first moment is then spliced with the video information feature code of each piece of video information in the candidate set, and full-connection processing is performed on each spliced feature code to obtain the probability that the account to be pushed selects each piece of video information in the candidate set at the first moment; the video information with the maximum probability is taken as the first pushed video information pushed to the account to be pushed. Next, the video state code at the first moment, the video information feature code of the first pushed video information, and the account information feature code of the account to be pushed are input into the trained video push model; the video state code at the first moment and the video information feature code of the first pushed video information are encoded through the trained video push model to obtain the video state code at the second moment; the account information feature code of the account to be pushed is spliced with the video state code at the second moment, and full-connection processing is performed on the spliced feature code to obtain the account state code at the second moment. The account state code at the second moment is spliced with the video information feature code of each piece of video information in the candidate set, and full-connection processing is performed on each spliced feature code to obtain the probability that the account to be pushed selects each piece of video information in the candidate set at the second moment; the video information with the maximum probability is taken as the second pushed video information pushed to the account to be pushed. By analogy, a plurality of pieces of pushed video information pushed to the account to be pushed can be obtained.
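The step-by-step selection loop — score every candidate against the current state, take the one with the maximum probability, fold it back into the state — can be sketched as below; the scoring and state-update rules are drastically simplified stand-ins for the splice-and-fully-connect operations described above:

```python
# Greedy sequential selection: at each step pick the candidate with the
# highest score, then fold the chosen video back into the running state.
def push_list(user, candidates, k):
    state = [0.0] * len(user)          # step-0 video state: all-zero vector
    remaining = list(candidates)
    chosen = []
    for _ in range(min(k, len(remaining))):
        def score(v):
            # Stand-in for splicing state with user features and the
            # candidate's features, then a fully-connected probability head.
            ctx = [s + u for s, u in zip(state, user)]
            return sum(c * vi for c, vi in zip(ctx, v))
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)          # each video is pushed at most once
        # Stand-in state update: running mean of chosen video features.
        state = [(s + vi) / 2.0 for s, vi in zip(state, best)]
    return chosen

user = [1.0, 0.0, 0.5]
cands = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1], [0.6, 0.2, 0.7]]
out = push_list(user, cands, 2)         # two pushed videos, in push order
```

Because each chosen video feeds into the state for the next step, the K-th selection depends on the first K−1 selections, which is the property the disclosure emphasizes.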
In step S630, the push video information is pushed to the account to be pushed.
Specifically, the server acquires a terminal identifier corresponding to the account to be pushed, pushes the pushed video information to a terminal corresponding to the terminal identifier according to a preset frequency, and displays the pushed video information through a terminal interface, so that the interest requirement of the account to be pushed, which is currently logged in by the terminal, is met, and accurate pushing of the video information is realized.
According to the video pushing method, through a trained video pushing model, first pushed video information pushed to an account to be pushed is determined, then second pushed video information pushed to the account to be pushed is determined, and by analogy, a plurality of pieces of pushed video information pushed to the account to be pushed can be determined; therefore, the influence of the first K-1 pieces of pushed video information is comprehensively considered when the Kth piece of pushed video information is determined, the accuracy of the determined pushed video information is improved, and the video pushing accuracy is further improved.
In an exemplary embodiment, in step S630, pushing the push video information to the account to be pushed specifically includes: arranging the push video information according to the sequence of outputting the push video information by the trained video push model to obtain the arranged push video information; and pushing the arranged pushed video information to an account to be pushed.
For example, the server outputs the push video information a, then outputs the push video information B, and finally outputs the push video information C, and then pushes the push video information a, the push video information B, and the push video information C to the account to be pushed according to the arrangement order of the push video information a, the push video information B, and the push video information C.
According to the technical scheme provided by the embodiment of the disclosure, the arranged pushed video information is pushed to the account to be pushed, so that the relation between the pushed video information and the video information can be considered comprehensively, the accurate pushing of the video information is realized, and the accuracy of the video pushing is further improved; meanwhile, the click rate of the video information is improved.
It should be understood that although the various steps in the flowcharts of fig. 2-3 and 5-6 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least a part of the steps in fig. 2-3 and 5-6 may include a plurality of sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different moments, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
FIG. 7 is a block diagram illustrating a training apparatus for a video push model in accordance with an exemplary embodiment. Referring to fig. 7, the apparatus includes an information acquisition unit 710, an information prediction unit 720, a prediction model training unit 730, a sample data acquisition unit 740, and a push model training unit 750.
And an information acquisition unit 710 configured to perform acquisition of account information of the sample account and actual operation information of the sample account on the pushed video information.
And an information prediction unit 720, configured to perform inputting the account information and the video information into the video operation prediction model, resulting in prediction operation information of the sample account on the video information.
And a prediction model training unit 730 configured to perform training of the video operation prediction model according to the prediction operation information and the actual operation information.
The sample data obtaining unit 740 is configured to obtain, according to the trained video operation prediction model, the prediction operation information of the plurality of sample accounts on the target video information, as the training sample data of the video push model to be trained.
A push model training unit 750 configured to perform training of the video push model to be trained according to the training sample data.
In an exemplary embodiment, the information prediction unit 720 is further configured to extract, from the video information, first video information in which the sample account has been operated sequentially before a preset time and second video information in which the sample account has been operated at the preset time; acquiring account information characteristic codes of account information and first video information characteristic codes of first video information; inputting the account information characteristic code and the first video information characteristic code into an account state coding network in a video operation prediction model to obtain an account state code of a sample account at a preset moment; and inputting the second video information characteristic code and the account state code of the second video information into an operation prediction network in the video operation prediction model to obtain the prediction operation information of the sample account on the second video information at the preset moment.
In an exemplary embodiment, the information prediction unit 720 is further configured to perform inputting a first video information feature code into a first network of the account status coding networks, resulting in a video status code at a preset time; and inputting the account information characteristic code and the video state code into a second network in the account state coding network to obtain the account state code of the sample account at the preset moment.
In an exemplary embodiment, the information prediction unit 720 is further configured to perform an operation prediction network that inputs the second video information feature coding and the account status coding into the video operation prediction model, and obtain a plurality of operation behavior probabilities of the sample account on the second video information; and according to the preset weight corresponding to the operation behavior probabilities, performing weighting processing on the operation behavior probabilities to obtain the target operation probability of the sample account on the second video information, and correspondingly using the target operation probability as the predicted operation information of the sample account on the second video information at the preset moment.
In an exemplary embodiment, the prediction operation information of the sample account on the target video information includes the prediction operation information of the sample account on the target video information at each preset moment; the push model training unit 750 is further configured to obtain the target video information feature code of the target video information; input the account information feature code and the target video information feature code into the account state coding network in the video push model to be trained to obtain the target account state code of the sample account at each preset moment; input the target video information feature code and the target account state code into the operation prediction network in the video push model to be trained to obtain the target prediction operation information of the sample account on the target video information at each preset moment; input the target prediction operation information into the preset video push evaluation model to obtain the operation feedback value of the sample account on the target video information at each preset moment; and repeatedly train the video push model to be trained and the preset video push evaluation model according to the target account state code, the prediction operation information of the sample account on the target video information at each preset moment, the target prediction operation information, and the operation feedback value until both the video push model to be trained and the preset video push evaluation model satisfy the convergence condition.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
Fig. 8 is a block diagram illustrating a video push device according to an example embodiment. Referring to fig. 8, the apparatus includes an account information acquiring unit 810, a video information acquiring unit 820, and a video information pushing unit 830.
An account information obtaining unit 810 configured to perform obtaining account information of an account to be pushed.
A video information obtaining unit 820 configured to perform inputting the account information of the account to be pushed into the trained video pushing model, so as to obtain the pushed video information of the account to be pushed; and obtaining the trained video push model according to the training method of the video push model.
And a video information pushing unit 830 configured to perform pushing of the pushed video information to the account to be pushed.
In an exemplary embodiment, the video information pushing unit 830 is further configured to perform arranging the pushed video information according to an order of outputting the pushed video information according to the trained video pushing model, so as to obtain arranged pushed video information; and pushing the arranged pushed video information to an account to be pushed.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 9 is a block diagram illustrating an apparatus 900 for performing the above-described video push model training method or video push method according to an exemplary embodiment. For example, device 900 may be a server. Referring to fig. 9, device 900 includes a processing component 920 that further includes one or more processors and memory resources, represented by memory 922, for storing instructions, such as applications, that are executable by processing component 920. The application programs stored in memory 922 may include one or more modules that each correspond to a set of instructions. Further, the processing component 920 is configured to execute instructions to perform the training method of the video push model or the video push method described above.
The device 900 may also include a power component 924 configured to perform power management of the device 900, a wired or wireless network interface 926 configured to connect the device 900 to a network, and an input/output (I/O) interface 928. The device 900 may operate based on an operating system stored in the memory 922, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a storage medium comprising instructions, such as the memory 922 comprising instructions, executable by a processor of the device 900 to perform the above-described method is also provided. The storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, there is also provided a computer program product, which includes a computer program stored in a readable storage medium, from which at least one processor of a device reads and executes the computer program, so that the device performs the training method of a video push model or the video push method described in any embodiment of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (14)

1. A training method of a video push model is characterized by comprising the following steps:
acquiring account information of a sample account and actual operation information of the sample account on pushed video information;
inputting the account information and the video information into a video operation prediction model to obtain prediction operation information of the sample account on the video information;
training the video operation prediction model according to the prediction operation information and the actual operation information;
according to the trained video operation prediction model, obtaining prediction operation information of a plurality of sample accounts on target video information, and using the prediction operation information as training sample data of a video push model to be trained; the prediction operation information of the sample account on the target video information comprises the prediction operation information of the sample account on the target video information at each preset moment;
acquiring target video information characteristic codes of the target video information;
inputting the account information feature code of the account information and the target video information feature code into an account state coding network in the video push model to be trained to obtain a target account state code of the sample account at each preset moment;
inputting the target video information feature codes and the target account state codes into an operation prediction network in the video pushing model to be trained to obtain target prediction operation information of the sample accounts on the target video information at each preset moment;
inputting the target prediction operation information into a preset video push evaluation model to obtain an operation feedback value of the sample account on the target video information at each preset moment;
and repeatedly training the video push model to be trained and the preset video push evaluation model according to the target account state code, the prediction operation information of the sample account on the target video information at each preset moment, the target prediction operation information and the operation feedback value until the video push model to be trained and the preset video push evaluation model both meet the convergence condition.
2. The method for training the video push model according to claim 1, wherein the inputting the account information and the video information into a video operation prediction model to obtain the prediction operation information of the sample account on the video information comprises:
extracting first video information of the sample account which is operated in sequence before a preset time and second video information of the sample account which is operated at the preset time from the video information;
acquiring account information characteristic codes of the account information and first video information characteristic codes of the first video information;
inputting the account information feature code and the first video information feature code into an account state coding network in the video operation prediction model to obtain an account state code of the sample account at the preset moment;
and inputting a second video information characteristic code and the account state code of the second video information into an operation prediction network in the video operation prediction model to obtain the prediction operation information of the sample account on the second video information at the preset moment.
3. The method for training a video push model according to claim 2, wherein the inputting the account information feature code and the first video information feature code into an account status coding network in the video operation prediction model to obtain an account status code of the sample account at the preset time includes:
inputting the first video information feature code into a first network in the account state coding network to obtain the video state code at the preset moment;
and inputting the account information characteristic code and the video state code into a second network in the account state code network to obtain the account state code of the sample account at the preset moment.
4. The method for training a video push model according to claim 2, wherein the inputting the second video information feature coding and the account status coding into an operation prediction network in the video operation prediction model to obtain the prediction operation information of the sample account on the second video information at the preset time includes:
inputting the second video information feature code and the account state code into an operation prediction network in the video operation prediction model to obtain a plurality of operation behavior probabilities of the sample account on the second video information;
and according to preset weights corresponding to the operation behavior probabilities, weighting the operation behavior probabilities to obtain a target operation probability of the sample account on the second video information, wherein the target operation probability is correspondingly used as the prediction operation information of the sample account on the second video information at the preset moment.
5. A video push method, comprising:
acquiring account information of an account to be pushed;
inputting the account information of the account to be pushed into the trained video pushing model to obtain the pushed video information of the account to be pushed; the trained video push model is obtained according to the training method of the video push model of any one of claims 1 to 4;
and pushing the pushed video information to the account to be pushed.
6. The video push method according to claim 5, wherein the pushing the pushed video information to the account to be pushed comprises:
arranging the pushed video information according to the order in which the trained video push model outputs the pushed video information, to obtain arranged pushed video information;
and pushing the arranged pushed video information to the account to be pushed.
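Claim 6's ordering rule can be sketched as follows. One common reading, assumed here, is that the model emits scored candidate videos and the push order follows the scores; the `push_videos` helper, the score-based ordering, and the toy model are illustrative assumptions.

```python
def push_videos(account_info, model, top_k=3):
    """Sketch: score candidates with the trained push model and push
    them in the model's output order (here, descending score)."""
    scored = model(account_info)  # assumed shape: [(video_id, score), ...]
    ordered = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [video_id for video_id, _ in ordered[:top_k]]

# toy stand-in for the trained video push model
toy_model = lambda acc: [("v1", 0.2), ("v2", 0.9), ("v3", 0.5), ("v4", 0.7)]
print(push_videos({"account_id": "u42"}, toy_model))  # ['v2', 'v4', 'v3']
```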
7. An apparatus for training a video push model, comprising:
an information acquisition unit configured to acquire account information of a sample account and actual operation information of the sample account on pushed video information;
an information prediction unit configured to input the account information and the video information into a video operation prediction model to obtain prediction operation information of the sample account on the video information;
a prediction model training unit configured to train the video operation prediction model according to the prediction operation information and the actual operation information;
a sample data acquisition unit configured to obtain, using the trained video operation prediction model, prediction operation information of a plurality of sample accounts on target video information as training sample data of a video push model to be trained, wherein the prediction operation information of a sample account on the target video information comprises the prediction operation information of the sample account on the target video information at each preset time;
a push model training unit configured to: obtain a target video information feature code of the target video information; input the account information feature code of the account information and the target video information feature code into an account state coding network in the video push model to be trained to obtain a target account state code of the sample account at each preset time; input the target video information feature code and the target account state code into an operation prediction network in the video push model to be trained to obtain target prediction operation information of the sample account on the target video information at each preset time; input the target prediction operation information into a preset video push evaluation model to obtain an operation feedback value of the sample account on the target video information at each preset time; and repeatedly train the video push model to be trained and the preset video push evaluation model according to the target account state code, the prediction operation information of the sample account on the target video information at each preset time, the target prediction operation information, and the operation feedback value, until both the video push model to be trained and the preset video push evaluation model meet a convergence condition.
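The alternating training described in the push model training unit resembles an actor-critic loop: an evaluation model (critic) learns an operation feedback value for each account state, and the push model (actor) is updated against that feedback until both converge. The linear stand-in models, the synthetic reward, and the update rules below are illustrative assumptions only, not the patented training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 5
actor_w = np.zeros(dim)   # stand-in for the video push model to be trained
critic_w = np.zeros(dim)  # stand-in for the preset video push evaluation model
lr = 0.05

def actor_action(state):
    # target prediction operation information, collapsed to one score
    return np.tanh(state @ actor_w)

def critic_value(state):
    # operation feedback value for a target account state code
    return state @ critic_w

for step in range(300):
    state = rng.standard_normal(dim)  # target account state code
    reward = float(state[0] > 0)      # synthetic feedback signal (assumed)
    # critic step: regress the feedback value toward the observed reward
    err = critic_value(state) - reward
    critic_w -= lr * err * state
    # actor step: move the action along the critic's advantage signal
    adv = reward - critic_value(state)
    actor_w += lr * adv * (1 - actor_action(state) ** 2) * state

print(np.all(np.isfinite(actor_w)), np.all(np.isfinite(critic_w)))
```

In this reading, "repeatedly train … until both meet a convergence condition" corresponds to running such alternating critic/actor updates until the parameter changes (or losses) fall below a threshold.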
8. The apparatus for training a video push model according to claim 7, wherein the information prediction unit is further configured to: extract, from the video information, first video information operated by the sample account in sequence before a preset time and second video information operated by the sample account at the preset time; acquire an account information feature code of the account information and a first video information feature code of the first video information; input the account information feature code and the first video information feature code into an account state coding network in the video operation prediction model to obtain an account state code of the sample account at the preset time; and input a second video information feature code of the second video information and the account state code into an operation prediction network in the video operation prediction model to obtain the prediction operation information of the sample account on the second video information at the preset time.
9. The apparatus for training a video push model according to claim 8, wherein the information prediction unit is further configured to: input the first video information feature code into a first network in the account state coding network to obtain a video state code at the preset time; and input the account information feature code and the video state code into a second network in the account state coding network to obtain the account state code of the sample account at the preset time.
10. The apparatus for training a video push model according to claim 8, wherein the information prediction unit is further configured to: input the second video information feature code and the account state code into the operation prediction network in the video operation prediction model to obtain a plurality of operation behavior probabilities of the sample account on the second video information; and weight the operation behavior probabilities according to preset weights corresponding to the operation behavior probabilities to obtain a target operation probability of the sample account on the second video information, the target operation probability serving as the prediction operation information of the sample account on the second video information at the preset time.
11. A video push apparatus, comprising:
an account information acquisition unit configured to acquire account information of an account to be pushed;
a video information acquisition unit configured to input the account information of the account to be pushed into a trained video push model to obtain pushed video information of the account to be pushed, wherein the trained video push model is obtained according to the method for training a video push model of any one of claims 1 to 4;
a video information pushing unit configured to perform pushing of the pushed video information to the account to be pushed.
12. The video push apparatus according to claim 11, wherein the video information pushing unit is further configured to: arrange the pushed video information according to the order in which the trained video push model outputs the pushed video information, to obtain arranged pushed video information; and push the arranged pushed video information to the account to be pushed.
13. A server, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the method of any one of claims 1 to 6.
14. A storage medium, wherein instructions in the storage medium, when executed by a processor of a server, enable the server to perform the method of any one of claims 1 to 6.
CN202010366374.0A 2020-04-30 2020-04-30 Training method and device of video push model, server and storage medium Active CN113596528B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010366374.0A CN113596528B (en) 2020-04-30 2020-04-30 Training method and device of video push model, server and storage medium

Publications (2)

Publication Number Publication Date
CN113596528A CN113596528A (en) 2021-11-02
CN113596528B true CN113596528B (en) 2022-10-04

Family

ID=78237493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010366374.0A Active CN113596528B (en) 2020-04-30 2020-04-30 Training method and device of video push model, server and storage medium

Country Status (1)

Country Link
CN (1) CN113596528B (en)

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107515909B (en) * 2017-08-11 2020-05-19 深圳市云网拜特科技有限公司 Video recommendation method and system
CN107911491B (en) * 2017-12-27 2019-09-27 Oppo广东移动通信有限公司 Information recommendation method, device and storage medium, server and mobile terminal
CN108763314B (en) * 2018-04-26 2021-01-19 深圳市腾讯计算机系统有限公司 Interest recommendation method, device, server and storage medium
CN109902849B (en) * 2018-06-20 2021-11-30 华为技术有限公司 User behavior prediction method and device, and behavior prediction model training method and device
CN109460512B (en) * 2018-10-25 2022-04-22 腾讯科技(北京)有限公司 Recommendation information processing method, device, equipment and storage medium
CN109598331A (en) * 2018-12-04 2019-04-09 北京芯盾时代科技有限公司 A kind of fraud identification model training method, fraud recognition methods and device
CN109858625A (en) * 2019-02-01 2019-06-07 北京奇艺世纪科技有限公司 Model training method and equipment, prediction technique and equipment, data processing equipment, medium
CN110427617B (en) * 2019-07-22 2020-09-08 阿里巴巴集团控股有限公司 Push information generation method and device
CN110688528B (en) * 2019-09-26 2023-04-07 抖音视界有限公司 Method, apparatus, electronic device, and medium for generating classification information of video
CN110704599B (en) * 2019-09-30 2022-05-17 支付宝(杭州)信息技术有限公司 Method and device for generating samples for prediction model and method and device for training prediction model

Similar Documents

Publication Publication Date Title
CN109104620B (en) Short video recommendation method and device and readable medium
CN108269254B (en) Image quality evaluation method and device
CN110766142A (en) Model generation method and device
CN109145828B (en) Method and apparatus for generating video category detection model
CN109214374B (en) Video classification method, device, server and computer-readable storage medium
CN111858973A (en) Multimedia event information detection method, device, server and storage medium
CN116229530A (en) Image processing method, device, storage medium and electronic equipment
CN111738766B (en) Data processing method and device for multimedia information and server
CN112182281B (en) Audio recommendation method, device and storage medium
CN115130232A (en) Method, device, apparatus, storage medium, and program product for predicting life of part
CN111161238A (en) Image quality evaluation method and device, electronic device, and storage medium
CN114528474A (en) Method and device for determining recommended object, electronic equipment and storage medium
CN116578925B (en) Behavior prediction method, device and storage medium based on feature images
CN113596528B (en) Training method and device of video push model, server and storage medium
CN113836388A (en) Information recommendation method and device, server and storage medium
CN113204699A (en) Information recommendation method and device, electronic equipment and storage medium
CN116028715A (en) Content recommendation method and device, storage medium and electronic equipment
CN115017362A (en) Data processing method, electronic device and storage medium
CN110502715B (en) Click probability prediction method and device
CN115878839A (en) Video recommendation method and device, computer equipment and computer program product
CN113297417A (en) Video pushing method and device, electronic equipment and storage medium
CN115496175A (en) Newly-built edge node access evaluation method and device, terminal equipment and product
CN113469204A (en) Data processing method, device, equipment and computer storage medium
CN112000888B (en) Information pushing method, device, server and storage medium
CN112925972B (en) Information pushing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant