CA3153598A1 - Method of and device for predicting video playback integrity - Google Patents

Method of and device for predicting video playback integrity


Publication number
CA3153598A1
CA3153598A1 CA3153598A CA3153598A CA3153598A1 CA 3153598 A1 CA3153598 A1 CA 3153598A1 CA 3153598 A CA3153598 A CA 3153598A CA 3153598 A CA3153598 A CA 3153598A CA 3153598 A1 CA3153598 A1 CA 3153598A1
Authority
CA
Canada
Prior art keywords
user
video playback
video
data
playback
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3153598A
Other languages
French (fr)
Inventor
Liangwu XU
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74 Browsing; Visualisation therefor
    • G06F16/73 Querying
    • G06F16/732 Query formulation
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content

Abstract

A video playback completeness prediction method and apparatus, relating to the technical field of big data and deep learning. The method comprises: inputting data to be tested of a user's video playback feature vector (101); performing calculation by a preset video playback completeness prediction model (102); and outputting the video playback completeness value of said data (103), wherein the preset video playback completeness prediction model is obtained by means of training according to user's video playback training data, the user's video playback feature vector comprising at least a user feature vector and a video feature vector. According to the method, a playback completeness improvement strategy is introduced to predict user's video playback completeness, user's interest data closer to the reality is obtained in terms of viewing duration as an important information stream, and thus, the accuracy of identification of user's interest is improved, so as to improve the real relevance of recommendation, thereby greatly increasing user's viewing duration and degree of satisfaction.

Description

METHOD OF AND DEVICE FOR PREDICTING VIDEO PLAYBACK INTEGRITY
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the fields of big data and deep learning technologies, and more particularly to a method of and a device for predicting video playback integrity.
Description of Related Art
[0002] A video recommendation system studies user interests and preferences across large numbers of users and large quantities of videos by means of big data analysis and artificial intelligence technology. The system recommends high-quality videos of interest to target users, solves the problem of information overload, provides customized services, and increases both the time users stay and their satisfaction. A video recommendation system usually includes two phases, recalling and sorting: the recalling phase selects candidate sets from massive numbers of videos, and the sorting phase performs a more precise and unified calculation on the candidate sets selected in the recalling phase, so as to screen out the few high-quality videos that are most interesting to users.
[0003] Currently, the number of users registered on some video playback platforms reaches the hundreds of millions, with daily UV (Unique Visitor) access exceeding ten million, and the number of plays per day is even higher on mobile terminals. So that users can find content of interest to them among massive numbers of videos, a recommendation system collects data of plural dimensions (including basic information of users, playback histories of users, video attributes and environment attributes, etc.) to associate users with potentially preferred videos. Less information is usable for the recommendation of short videos: only such information as titles and video categories is available, and the sorting model most frequently used at present employs CTR (Click-Through Rate) prediction. A click-based model may encourage clickbait, which does not lengthen the time users stay and adversely affects their watching time duration and satisfaction. Watching time duration, however, is an important optimization target in the information flow, so there is an urgent need to introduce playback integrity optimization into the short video sorting model, so as to enhance the real relevance of recommendations and to increase the watching time duration and satisfaction of users.
Date Recue/Date Received 2022-03-07
SUMMARY OF THE INVENTION
[0004] In order to address problems pending in the state of the art, embodiments of the present invention provide a method of and a device for predicting video playback integrity. By introducing a playback integrity improving policy to predict user video playback integrity, and by acquiring user interest data that is closer to reality with respect to watching time duration, an important aspect of the information flow, the present invention enhances the precision of recognizing user interests, hence enhances the real relevance of recommendations, and achieves a relatively great increase in the watching time duration and satisfaction of users.
[0005] The technical solutions are as follows:
[0006] According to one aspect, there is provided a method of predicting video playback integrity, and the method comprises:
[0007] inputting to-be-tested data of a user video playback feature vector;
[0008] calculating through a preset video playback integrity prediction model; and
[0009] outputting a video playback integrity value of the to-be-tested data; wherein
[0010] the preset video playback integrity prediction model is obtained through training by user video playback training data, and the user video playback feature vector at least includes a user feature vector and a video feature vector.
[0011] Moreover, the method further comprises:
[0012] collecting user video playback information data;
[0013] screening the user video playback information data, and obtaining a screening result; and
[0014] performing a feature extraction on the screening result, and generating to-be-tested data of the user video playback feature vector.
[0015] Further, the step of collecting user video playback information data includes: obtaining the user video playback information data containing user information, user playback historical information, video information and user client side information; and/or
[0016] the step of screening the user video playback information data, and obtaining a screening result includes: screening the user video playback information data by employing a multi-channel recalling mode including user collaboration, user searching, a topic model, popular recommendation, a user portrait and a video tag, and obtaining a screening result; and/or
[0017] the step of performing a feature extraction on the screening result, and generating to-be-tested data of the user video playback feature vector includes: performing word segmentation on a video title and a video classification tag in the screening result by employing a word vector obtained by training a preset massive corpus through a word2vec model and IDF weight training, generating a video word vector, thereafter performing word vector calculation according to the user playback historical information in conjunction with time decay, and generating a user word vector.
[0018] Further, the preset video playback integrity prediction model comprises a DNN (deep neural network) with three hidden layers.

[0019] Further, the preset video playback integrity prediction model is obtained through training by inputting the user video playback training data, wherein the user video playback training data is an independent variable, while a user watching history video playback integrity value is a dependent variable, and the user video playback training data is a feature vector combined by a historical user vector and a historical video vector created according to the user playback historical information.
[0020] Moreover, the method further comprises:
[0021] sorting video playback integrity values of the to-be-tested data in a decreasing order, obtaining topN video sorting results, and recommending the video sorting results to a corresponding user according to priority level, wherein N is an integer greater than 1.
[0022] According to another aspect, there is provided a device for predicting video playback integrity, the device comprises a model calculating module, and the model calculating module is employed for:
[0023] inputting to-be-tested data of a user video playback feature vector, calculating through a preset video playback integrity prediction model, and outputting a video playback integrity value of the to-be-tested data, wherein the preset video playback integrity prediction model is obtained through training by user video playback training data, and the user video playback feature vector at least includes a user feature vector and a video feature vector.
[0024] Moreover, the device further comprises a data collecting module, a data screening module, and a vector generating module, of which the data collecting module collects user video playback information data, the data screening module screens the user video playback information data, and obtains a screening result, and the vector generating module performs a feature extraction on the screening result, and generates to-be-tested data of the user video playback feature vector.

[0025] Further, the data collecting module obtains the user video playback information data containing user information, user playback historical information, video information and user client side information; and/or
[0026] the data screening module screens the user video playback information data by employing a multi-channel recalling mode including user collaboration, user searching, a topic model, popular recommendation, a user portrait and a video tag, and obtains a screening result; and/or
[0027] the vector generating module performs a feature extraction on the screening result, and generates to-be-tested data of the user video playback feature vector, including: performing word segmentation on a video title and a video classification tag in the screening result by employing a word vector obtained by training a preset massive corpus through a word2vec model and IDF weight training, generating a video word vector, thereafter performing word vector calculation according to the user playback historical information in conjunction with time decay, and generating a user word vector.
[0028] Moreover, the device further comprises a data recommending module for sorting video playback integrity values of the to-be-tested data in a decreasing order, obtaining topN video sorting results, and recommending the video sorting results to a corresponding user according to priority level, wherein N is an integer greater than 1.
[0029] The technical solutions provided by the embodiments of the present invention bring about the following advantageous effects.
[0030] 1. The traditional CTR prediction method is modified by introducing a video playback integrity indicator. Video playback integrities of different users are predicted through a well-trained preset video playback integrity prediction model; from the prediction results, user interest data that is closer to reality is obtained with respect to watching time duration, an important aspect of the information flow. The precision of recognizing user interests is thereby enhanced, the real relevance of recommendations is hence enhanced, and a relatively great increase is achieved in the watching time duration and satisfaction of users.
[0031] 2. Through vectorized representation of the user portrait, interest transfer of the user is reflected in combination with time decay of user behaviors, and hotspot videos and inadvertently clicked videos are filtered out in the process of building the user portrait, whereby interference with the actual interest of the user is avoided, and the user portrait is made more precise.
[0032] 3. By collecting such relevant data as user behavior data, video quality and video information, vectorized representations are made of user features and video attributes, together with proportions of videos played back at various time periods, proportions of various categories and other environment information. Different features and different data sources are merged in the short video recommendation sorting model through deep learning modeling, and the potential playback integrities of videos not yet watched by users are predicted; an excellent effect is achieved, and the average watching time duration of users is increased.
[0033] 4. Features such as user features, video features, contextual features and client side classification are created, and deep learning modeling is employed. The playback integrity prediction model is applied to a group of 10% randomly selected users through an A/B test, and such indicators as CTR, daily average playback volume and user average playback integrity are compared in the final report. In the end, user average playback integrity and daily average playback volume increase to a greater extent, with a slight decrease in the CTR.
[0034] 5. A TF-IDF algorithm is employed in terms of video recommendation, and key information of videos is effectively emphasized through IDF values.

[0035] 6. The real relevance of recommendations is enhanced through prediction of short video playback integrities, with the aim of increasing the time users stay.
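The A/B test mentioned in item 4 above assigns the prediction model to 10% of users. A minimal sketch of one way to make such an assignment follows; the source only states that the users are randomly selected, so the hash-based bucketing, the function name `in_treatment` and the `salt` value are all assumptions made for illustration.

```python
import hashlib

def in_treatment(user_id, fraction=0.10, salt="integrity-model-v1"):
    """Deterministically place about `fraction` of users in the treatment group.

    Hashing the user id (an assumed scheme, not taken from the source) keeps
    each user in the same group across sessions.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return bucket < fraction

# Share of 10,000 synthetic user ids that land in the treatment group.
share = sum(in_treatment(f"user{i}") for i in range(10000)) / 10000
```

Indicators such as CTR and average playback integrity would then be compared between the two groups; that comparison is not sketched here.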
BRIEF DESCRIPTION OF THE DRAWINGS
[0036] To describe the technical solutions in the embodiments of the present invention more clearly, the drawings required to illustrate the embodiments are briefly introduced below. Evidently, the drawings introduced below are merely directed to some embodiments of the present invention, and persons ordinarily skilled in the art may further acquire other drawings on the basis of these drawings without creative effort.
[0037] Fig. 1 is a flowchart illustrating the method of predicting video playback integrity provided by an embodiment of the present invention;
[0038] Fig. 2 is a flowchart illustrating the method of predicting video playback integrity provided by another embodiment of the present invention;
[0039] Fig. 3 is a view schematically illustrating a preferred mode of execution of feature engineering construction in Step 203;
[0040] Fig. 4 is a view schematically illustrating a preferred mode of execution of a preset video playback integrity prediction model provided by an embodiment of the present invention;
[0041] Fig. 5 is a view schematically illustrating the structure of a device for predicting video playback integrity provided by an embodiment of the present invention; and
[0042] Fig. 6 is a view schematically illustrating the structure of a device for predicting video playback integrity provided by another embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION
[0043] To make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and comprehensively described below with reference to the accompanying drawings. Evidently, the embodiments as described are merely some, rather than all, of the embodiments of the present invention. Any other embodiments obtainable by persons ordinarily skilled in the art on the basis of the embodiments in the present invention without creative effort shall fall within the protection scope of the present invention. It should be noted that the wordings "plural/more/a plurality of" mean "two or more" in the description of the present invention, unless otherwise definitely and specifically defined.
[0044] In the method of and device for predicting video playback integrity provided by the embodiments of the present invention, the traditional CTR prediction method is modified by introducing a video playback integrity indicator. Video playback integrities of different users are predicted through a well-trained preset video playback integrity prediction model; from the prediction results, user interest data that is closer to reality is obtained with respect to watching time duration, an important aspect of the information flow. The precision of recognizing user interests is thereby enhanced, the real relevance of recommendations is hence enhanced, and a relatively great increase is achieved in the watching time duration and satisfaction of users. Accordingly, the method of and device for predicting video playback integrity are widely applicable to many network video application scenarios concerning user interest mining, user requirement matching or user recommendations.
[0045] Specific embodiments and accompanying drawings are combined below to describe in detail the method of and device for predicting video playback integrity provided by the embodiments of the present invention.

[0046] Fig. 1 is a flowchart illustrating the method of predicting video playback integrity provided by an embodiment of the present invention. As shown in Fig. 1, the method of predicting video playback integrity comprises the following steps:
[0047] 101 - inputting to-be-tested data of a user video playback feature vector;
[0048] 102 - calculating through a preset video playback integrity prediction model; and
[0049] 103 - outputting a video playback integrity value of the to-be-tested data.
[0050] Unlike conventional technology, in which only a small amount of collected information such as titles, video categories or click-through rate is used, the user video playback feature vector here at least includes user feature vectors and video feature vectors. The user feature includes user portraits, user historical playback records or other information relevant to users, and the video feature includes video categories, video time durations, video times, video playback integrity records or other information relevant to released videos. Besides user feature vectors and video feature vectors, the user video playback feature vector can further include such other information relevant to video playback as user client side classification information. In addition, the preset video playback integrity prediction model is obtained through training by user video playback training data, and the video playback integrity prediction model as specifically used can be either obtained by training a corresponding deep learning model designed and constructed according to requirements, or obtained by training any possible deep learning model available in the art, to which no particular restriction is made in the embodiments of the present invention.
[0051] Fig. 2 is a flowchart illustrating the method of predicting video playback integrity provided by another embodiment of the present invention. As shown in Fig. 2, the method of predicting video playback integrity comprises the following steps.
[0052] 201 - collecting user video playback information data.

[0053] Specifically, the user video playback information data containing user information, user playback historical information, video information and user client side information is obtained.
[0054] This process is a phase for collecting user video playback information data. The user video playback information mainly includes user information, user playback historical information, video information and user client side information. The user information mainly indicates user portrait information, including basic attribute information of a user (gender, age, etc.); the user playback historical information includes proportions of historical playbacks by the user on an hourly basis, proportions of various types of videos watched by the user, etc.; and the client side information includes user equipment types and operator types, etc. In addition, contextual information secondarily associated with videos played back by the user, such as the time at which the user watches each video and the user position information, can be further collected for the user video playback information according to requirements.
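By way of illustration only, the hourly and per-category proportions described in this paragraph might be derived from raw playback records as in the following sketch. The record type `PlayEvent` and its field names are hypothetical and are not taken from the source.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class PlayEvent:
    """One playback record; the field names are hypothetical."""
    hour: int       # hour of day at which playback started, 0..23
    category: str   # category of the video that was played

def hourly_proportions(events):
    """Return a 24-element list: share of playbacks falling in each hour."""
    total = len(events)
    if total == 0:
        return [0.0] * 24
    counts = Counter(e.hour for e in events)
    return [counts.get(h, 0) / total for h in range(24)]

def category_proportions(events):
    """Return {category: share of playbacks} for the categories seen."""
    total = len(events)
    counts = Counter(e.category for e in events)
    return {c: n / total for c, n in counts.items()}

history = [PlayEvent(20, "sports"), PlayEvent(20, "sports"),
           PlayEvent(21, "music"), PlayEvent(8, "sports")]
hourly = hourly_proportions(history)
by_category = category_proportions(history)
```

Both outputs are plain numeric features, so they can be appended directly to the other user features.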
[0055] It should be noted that the process of collecting user video playback information data in Step 201 can also be realized by modes other than the mode recited in the aforementioned step, and these specific modes are not restricted in the embodiments of the present invention.
[0056] 202 - screening the user video playback information data, and obtaining a screening result.
[0057] Specifically, the step of screening the user video playback information data, and obtaining a screening result includes: screening the user video playback information data by employing a multi-channel recalling mode including user collaboration, user searching, a topic model, popular recommendation, a user portrait and a video tag, and obtaining a screening result.
[0058] This process is a recalling phase in which the user video playback information data is coarsely screened; preferably, the screening is mainly directed to the video information in the user video playback information data. Since the number of videos is colossal, possibly on the order of several millions, inputting all videos directly into the model for data preprocessing would be extremely costly and extremely slow, so video information of higher quality, or in other words more likely to suit the taste of users, is coarsely selected in the recalling phase. Recalling is usually embodied as multi-channel recalling, such as through user collaboration, user searching, topic models, popular recommendations, user portraits and video tags, so as to select desirable candidate sets from the massive number of videos.
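The source does not state how the candidates of the individual recall channels are combined. The following is a minimal sketch under the assumption that each channel returns a ranked list of video identifiers and that the lists are merged round-robin with duplicates removed; the function name, the channel names and the `limit` parameter are illustrative only.

```python
def merge_recall_channels(channels, limit):
    """Merge per-channel candidate lists round-robin, dropping duplicates.

    channels: dict mapping channel name -> list of video ids, best first.
    limit: maximum size of the merged candidate set.
    """
    merged, seen = [], set()
    iterators = [iter(ids) for ids in channels.values()]
    while iterators and len(merged) < limit:
        still_active = []
        for it in iterators:
            video_id = next(it, None)
            if video_id is None:
                continue  # this channel is exhausted
            still_active.append(it)
            if video_id not in seen and len(merged) < limit:
                seen.add(video_id)
                merged.append(video_id)
        iterators = still_active
    return merged

channels = {
    "user_collaboration": ["v1", "v2", "v3"],
    "topic_model": ["v2", "v4"],
    "popular": ["v5", "v1"],
}
candidates = merge_recall_channels(channels, limit=4)
```

Round-robin merging gives every channel a share of the candidate set; weighting channels differently would be an equally plausible choice.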
[0059] It should be noted that the process of screening the user video playback information data in Step 202 can also be realized by modes other than the mode recited in the aforementioned step, and these specific modes are not restricted in the embodiments of the present invention.
[0060] 203 - performing a feature extraction on the screening result, and generating to-be-tested data of the user video playback feature vector.
[0061] Specifically, the step of performing a feature extraction on the screening result, and generating to-be-tested data of the user video playback feature vector includes: performing word segmentation on a video title and a video classification tag in the screening result by employing a word vector obtained by training a preset massive corpus through a word2vec model and IDF weight training, generating a video word vector, thereafter performing word vector calculation according to the user playback historical information in conjunction with time decay, and generating a user word vector. The user word vector and the video word vector here correspond to the aforementioned user feature vector and video feature vector.
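One plausible reading of the video word vector is an IDF-weighted mean of the word vectors of the segmented title, sketched below. The exact weighting is not given in the source; the smoothed IDF formula, the function names and the toy two-dimensional vectors (the embodiment uses 200 dimensions) are assumptions, and word segmentation and word2vec training are taken as already done.

```python
import math

def idf_table(tokenized_docs):
    """Smoothed IDF per word: log((1 + N) / (1 + df)) + 1 (assumed variant)."""
    n_docs = len(tokenized_docs)
    df = {}
    for doc in tokenized_docs:
        for word in set(doc):
            df[word] = df.get(word, 0) + 1
    return {w: math.log((1 + n_docs) / (1 + c)) + 1.0 for w, c in df.items()}

def video_vector(tokens, word_vectors, idf):
    """IDF-weighted mean of the word vectors of the known tokens."""
    dim = len(next(iter(word_vectors.values())))
    total, weight_sum = [0.0] * dim, 0.0
    for tok in tokens:
        if tok not in word_vectors:
            continue  # out-of-vocabulary words are skipped
        w = idf.get(tok, 1.0)
        weight_sum += w
        for i, x in enumerate(word_vectors[tok]):
            total[i] += w * x
    if weight_sum == 0.0:
        return total
    return [x / weight_sum for x in total]

# Toy corpus of segmented titles: "goal" is common, "final" is rare.
idf = idf_table([["goal", "final"], ["goal"], ["goal"]])
word_vectors = {"goal": [1.0, 0.0], "final": [0.0, 1.0]}
vec = video_vector(["goal", "final", "unknown"], word_vectors, idf)
```

Because the rarer word receives the larger IDF weight, the resulting vector leans toward "final", which is the emphasis on key information that the IDF weighting is meant to provide.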

[0062] This process is a feature engineering phase, as shown in Fig. 3. Preferably, word vectors with 200 dimensions per word are trained on the massive corpus through word segmentation and a word2vec model, so that the latent meanings of words are characterized in vectorized form and the relations between words are expressed. The word vector representation of a video is calculated by combining the word segmentation of the video title with the IDF information obtained by training. The word vector representation of the user is calculated from the word vector representations of the videos in the user's playback history in conjunction with time decay. In the process of calculating the user vector, only videos belonging to the user's top 3 tags whose proportion exceeds 10% are counted, according to video tag categories. Analysis of user playback histories shows that videos corresponding to video tags with lower proportions are not latent interest points of the user; they are usually played because they are hotspot videos or because of inadvertent clicking by the user, and they can be discarded during feature extraction.
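The tag filter and the time decay described above can be sketched as follows. The source gives the top 3 and 10% thresholds but not the form of the decay, so the exponential decay with a 7-day half-life, the dictionary keys and the function names are assumptions.

```python
from collections import Counter

def keep_tags(history, top_k=3, min_share=0.10):
    """Tags ranked in the user's top_k whose share of plays exceeds min_share."""
    counts = Counter(item["tag"] for item in history)
    total = sum(counts.values())
    return {tag for tag, n in counts.most_common(top_k) if n / total > min_share}

def user_vector(history, half_life_days=7.0):
    """Time-decayed mean of the vectors of videos that survive the tag filter.

    history items: {"tag": str, "age_days": float, "vector": list of float}.
    Exponential decay with a half-life is an assumption; the source only
    says that time decay is applied.
    """
    tags = keep_tags(history)
    kept = [h for h in history if h["tag"] in tags]
    if not kept:
        return []
    dim = len(kept[0]["vector"])
    total, weight_sum = [0.0] * dim, 0.0
    for h in kept:
        w = 0.5 ** (h["age_days"] / half_life_days)
        weight_sum += w
        for i, x in enumerate(h["vector"]):
            total[i] += w * x
    return [x / weight_sum for x in total]

# Nine sports plays and one news play: the news tag has a 10% share,
# which does not exceed the threshold, so that play is discarded.
history = [{"tag": "sports", "age_days": 0.0, "vector": [1.0, 0.0]}] * 9
history = history + [{"tag": "news", "age_days": 0.0, "vector": [0.0, 1.0]}]
profile = user_vector(history)
```

With this filter a single inadvertent click on a hotspot video leaves the user vector unchanged, which matches the motivation given in the paragraph.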
[0063] It should be noted that the process of performing a feature extraction on the screening result, and generating to-be-tested data of the user video playback feature vector in Step 203 can also be realized by modes other than the mode recited in the aforementioned step, and these specific modes are not restricted in the embodiments of the present invention.
[0064] 204 - inputting to-be-tested data of a user video playback feature vector.
[0065] The preset video playback integrity prediction model is obtained through training by inputting the user video playback training data, wherein the user video playback training data is an independent variable, while a user watching history video playback integrity value is a dependent variable, and the user video playback training data is a feature vector combined by a historical user vector and a historical video vector created according to the user playback historical information, for training to obtain a desirable preset video playback integrity prediction model.
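The assembly of training pairs described in this paragraph might look like the sketch below. The source does not define how a playback integrity value is computed; treating it as the watched fraction of the video, clipped to the range 0 to 1, is an assumption, as are the record keys and function names.

```python
def playback_integrity(watched_seconds, duration_seconds):
    """Fraction of the video that was played, clipped to [0, 1] (assumed definition)."""
    if duration_seconds <= 0:
        return 0.0
    return max(0.0, min(1.0, watched_seconds / duration_seconds))

def build_training_set(records):
    """Turn watch records into (features, labels).

    Each record is a dict with the hypothetical keys "user_vector",
    "video_vector", "watched_seconds" and "duration_seconds". The feature
    row is the historical user vector followed by the historical video vector.
    """
    features, labels = [], []
    for r in records:
        features.append(list(r["user_vector"]) + list(r["video_vector"]))
        labels.append(playback_integrity(r["watched_seconds"], r["duration_seconds"]))
    return features, labels

records = [{
    "user_vector": [0.1, 0.2],
    "video_vector": [0.3, 0.4],
    "watched_seconds": 15,
    "duration_seconds": 30,
}]
X, y = build_training_set(records)
```

Here `X` holds the independent variable and `y` the dependent variable in the sense of the paragraph above.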

[0066] Preferably, the preset video playback integrity prediction model comprises a DNN with three hidden layers, and the input information of the input layer includes: the word vector representation of the user (word vectors of the videos in the user's playback history are obtained by combining word segmentation with IDF weight calculation, and a word vector with 200 dimensions is subsequently calculated with time decay taken into account), the basic portrait of the user (gender, age, etc.), proportions of videos played back at various time periods (on an hourly basis), and proportions of various categories of videos, etc.; the word vector (200 dimensions) of the video, the quality of the video (average playback integrity, video hits, etc.), the releasing time of the video, and the video category; the equipment type and operator type; the region; the current time period, etc.
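A forward pass through a network with three hidden layers can be sketched in plain Python as follows. The source fixes neither the layer widths, the activation functions nor the output transformation; the toy sizes, the ReLU hidden activations and the sigmoid output (which keeps the predicted integrity between 0 and 1) are assumptions, and the weights here are random rather than trained.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer):
    """Affine transform: one output per weight row."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def predict(x, layers):
    """Forward pass: ReLU on the three hidden layers, sigmoid on the output."""
    for layer in layers[:-1]:
        x = [max(0.0, v) for v in dense(x, layer)]
    (logit,) = dense(x, layers[-1])
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)
sizes = [8, 16, 8, 4, 1]  # input, three hidden layers, one output (toy sizes)
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
score = predict([0.5] * 8, layers)
```

In practice the input would be the concatenated feature vector listed above, and the weights would be fitted to historical playback integrity values with a deep learning framework.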
[0067] It should be noted that the data content and form of the to-be-tested data of a user video playback feature vector input in Step 204 can also be realized by modes other than the mode recited in the aforementioned step, and these specific modes are not restricted in the embodiments of the present invention.
[0068] 205 - calculating through a preset video playback integrity prediction model.
[0069] 206 - outputting a video playback integrity value of the to-be-tested data.
[0070] Preferably, the following steps are further included after Step 206:
[0071] sorting video playback integrity values of the to-be-tested data in a decreasing order, obtaining topN video sorting results, and recommending the video sorting results to a corresponding user according to priority level, wherein N is an integer greater than 1. It should be noted that the step of sorting video playback integrity values may also, according to requirements, be arranged within the calculating process of the preset video playback integrity prediction model, as shown in Fig. 4; no particular restriction is made in this respect in the embodiments of the present invention.
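The topN selection in this step amounts to a partial sort of the predicted values. A minimal sketch follows, assuming the predictions are held in a dictionary keyed by video identifier; the names `top_n` and `predicted` are illustrative.

```python
import heapq

def top_n(scores, n):
    """Return the n (video_id, score) pairs with the highest predicted integrity,
    in decreasing order of score."""
    return heapq.nlargest(n, scores.items(), key=lambda pair: pair[1])

predicted = {"v1": 0.42, "v2": 0.91, "v3": 0.17, "v4": 0.66}
ranking = top_n(predicted, 2)
```

`heapq.nlargest` avoids sorting the full candidate set when N is much smaller than the number of candidates.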

[0072] Fig. 5 is a view schematically illustrating the structure of a device for predicting video playback integrity provided by an embodiment of the present invention. As shown in Fig. 5, the device for predicting video playback integrity comprises a model calculating module 1, and the model calculating module 1 is employed for: inputting to-be-tested data of a user video playback feature vector, calculating through a preset video playback integrity prediction model, and outputting a video playback integrity value of the to-be-tested data, wherein the preset video playback integrity prediction model is obtained through training by user video playback training data, and the user video playback feature vector at least includes a user feature vector and a video feature vector.
[0073] Fig. 6 is a view schematically illustrating the structure of a device for predicting video playback integrity provided by another embodiment of the present invention. As shown in Fig. 6, the device 2 for predicting video playback integrity comprises a data collecting module 21, a data screening module 22, a vector generating module 23, a model calculating module 24 and a data recommending module 25.
[0074] The data collecting module 21 collects user video playback information data. Specifically, the data collecting module 21 obtains user video playback information data containing user information, user playback historical information, video information and user client side information.
[0075] The data screening module 22 screens the user video playback information data, and obtains a screening result. Specifically, the data screening module 22 screens the user video playback information data by employing a multi-channel recalling mode including user collaboration, user searching, a topic model, popular recommendation, a user portrait and a video tag, and obtains a screening result.
[0076] The vector generating module 23 performs a feature extraction on the screening result, Date Recue/Date Received 2022-03-07 and generates the user video playback feature vector. Specifically, the vector generating module 23 performs a feature extraction on the screening result, and generates to-be-tested data of the user video playback feature vector, including: performing word segmentation on a video title and a video classification tag in the screening result by employing a word vector obtained by training a preset massive corpus through a word2vec model and IDF weight training, generating a video word vector, thereafter performing word vector calculation according to the user playback historical information in conjunction with time decay, and generating a user word vector. The user word vector and the video word vector here correspond to the following user feature vector and video feature vector.
[0077] The model calculating module 24 inputs to-be-tested data of a user video playback feature vector, calculates through a preset video playback integrity prediction model, and outputs a video playback integrity value of the to-be-tested data, wherein the preset video playback integrity prediction model is obtained through training by user video playback training data, and the user video playback feature vector at least includes a user feature vector and a video feature vector.
[0078] The data recommending module 25 sorts video playback integrity values of the to-be-tested data in a decreasing order, obtains topN video sorting results, and recommends the video sorting results to a corresponding user according to priority level, wherein N is an integer greater than 1.
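The sorting and top-N selection performed by the data recommending module can be illustrated with a minimal sketch. The function name `top_n_videos`, the video IDs and the predicted values below are hypothetical and serve only as an illustration; the embodiment does not prescribe a particular implementation.

```python
import heapq

def top_n_videos(predicted, n):
    """Return the n (video_id, integrity) pairs with the highest predicted integrity.

    `predicted` maps a hypothetical video ID to its predicted playback integrity.
    """
    if n <= 1:
        raise ValueError("N must be an integer greater than 1")
    # heapq.nlargest returns its results sorted in decreasing order of the key.
    return heapq.nlargest(n, predicted.items(), key=lambda item: item[1])

# Hypothetical predicted playback integrity values for four candidate videos.
predicted = {"v1": 0.42, "v2": 0.91, "v3": 0.15, "v4": 0.77}
ranked = top_n_videos(predicted, 2)
print(ranked)
```

The first element of the returned list would be recommended with the highest priority level.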
[0079] A preferred mode of execution for the method of and device for predicting video playback integrity provided by the embodiments of the present invention is introduced below.
[0080] Firstly, the word segmentation tool of the present embodiment carries its own lexicon, and entertainment stars, film and TV drama names, sports stars and team information are added as a supplementary lexicon. Netease news, Baidu encyclopedia and Wikipedia content obtained by a crawler system constitutes a massive corpus, and word segmentation and word vector training are performed on the corpus to finally obtain a word vector representation of each word (the word vectors have 200 dimensions, as determined by test effect, and normalization is then performed on the vectors).
[0081] On the aforementioned corpus, TF-IDF training is carried out to obtain IDF values, which are then normalized. The weights of the words in the supplementary lexicon are subsequently enhanced to 1 so that, similar to an attention mechanism, more attention is paid to these words.
[0082] See the following Table 1 for video information, in which are carried video IDs, video title information, classification tags, video tag information, and releasing times, etc. The video information is word-segmented, a word vector table of words is searched, and the word vector representation of the current video is obtained in combination with weighted calculation of an IDF value table (performing normalization).
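The IDF-weighted calculation of a video word vector can be sketched as follows. This is only an illustration under stated assumptions: the weighted calculation is taken to be an IDF-weighted sum followed by L2 normalization, the toy vectors have 2 dimensions rather than 200, and the names `video_vector`, `word_vectors` and `idf` as well as all numeric values are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def video_vector(tokens, word_vectors, idf, dim):
    """Combine the word vectors of a segmented title/tag list, weighted by IDF."""
    acc = [0.0] * dim
    for tok in tokens:
        if tok not in word_vectors:
            continue  # words missing from the word vector table are skipped
        weight = idf.get(tok, 0.0)
        for i, x in enumerate(word_vectors[tok]):
            acc[i] += weight * x
    return l2_normalize(acc)

# Toy 2-dimensional word vectors (the embodiment uses 200 dimensions).
word_vectors = {"wulei": [1.0, 0.0], "espanyol": [0.0, 1.0]}
# Normalized IDF values; "wulei" is assumed to be in the supplementary lexicon,
# so its weight is raised to 1.
idf = {"wulei": 1.0, "espanyol": 0.5}

vec = video_vector(["wulei", "espanyol", "unknown"], word_vectors, idf, dim=2)
print(vec)
```

Because the supplementary-lexicon word carries the largest weight, the resulting vector leans toward that word's direction.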
[0083] Table 1 Video Information Table
id: 3714869
title: Wu lei has become the Mr. Key, winning odds show Espanyol will bring surprise in the new season
cata name: athletics
tag names: athletics, European Championship, Wu lei, sports lottery, lottery information station, sports lottery video collection
release time: 2019-08-01 10:07:32
[0084] The phase of obtaining the user portrait is a process of calculating the user word vector. The target user group is the group of active users, namely recently active users (having playback records within the last 7 days) with a certain volume of playbacks (exceeding 10 videos) within a recent period (such as the last 30 days). Word vector calculation for each user is broken down by tag category. For instance, suppose a user played 100 videos within a recent period, of which 60 are relevant to athletics, 20 to finance and economics, 15 to amusement, 4 to society, and 1 to health. When the user portrait is drawn, a portrait is drawn for the user under the tag categories that rank in the top 3 by proportion and whose proportions each exceed 10%. This makes it possible to obtain the main interest points of the user and to remove the few operations caused by inadvertent clicking, as well as hotspot videos that do not represent the interest points of the user. In this example, athletics occupies 60%, finance and economics 20%, amusement 15%, society 4%, and health 1%, so the portrait is drawn for the current user in the three dimensions of athletics, finance and economics, and amusement, and word vector representations of the corresponding dimensions of the user are calculated.
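The category selection rule of this paragraph (top 3 categories by proportion, each exceeding 10%) can be sketched with the 100-video example given above. The function name `portrait_categories` and its parameter names are hypothetical and are used only for illustration.

```python
from collections import Counter

def portrait_categories(category_counts, top_k=3, min_share=0.10):
    """Select the tag categories under which a user portrait is drawn.

    A category is kept if it ranks in the top `top_k` by share of playbacks
    and its share strictly exceeds `min_share`.
    """
    total = sum(category_counts.values())
    if total == 0:
        return []
    ranked = Counter(category_counts).most_common(top_k)
    return [(cat, cnt / total) for cat, cnt in ranked if cnt / total > min_share]

# Playback counts from the example: 100 videos played within the recent period.
counts = {
    "athletics": 60,
    "finance and economics": 20,
    "amusement": 15,
    "society": 4,
    "health": 1,
}
selected = portrait_categories(counts)
print(selected)
```

Society (4%) and health (1%) fall outside the top 3 and below the 10% threshold, so they are treated as noise such as inadvertent clicks.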
[0085] During the process of calculating user word vectors under different tag categories of the user, the time decay factor (for instance, the decay period is 5 days, the decay coefficient is 0.95, taking for example a video played back 12 days before the current date, the video crosses two decay periods, and should be decayed by 0.95^2) is combined to calculate the word vector representations of the user.
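The time decay factor can be illustrated with a short sketch that reproduces the example above (12 days, decay period of 5 days, coefficient 0.95). How the decayed weights are combined into the user word vector is not detailed in this paragraph; the decay-weighted sum with normalization in `user_vector` is an assumption made for illustration, and both function names are hypothetical.

```python
import math

def decay_weight(days_ago, period_days=5, coefficient=0.95):
    """Weight of a playback that happened `days_ago` days before the current date."""
    periods_crossed = days_ago // period_days  # whole decay periods elapsed
    return coefficient ** periods_crossed

def user_vector(played, dim):
    """Decay-weighted sum of video vectors, L2-normalized (assumed combination).

    `played` is a list of (days_ago, video_vector) pairs for one tag category.
    """
    acc = [0.0] * dim
    for days_ago, vec in played:
        w = decay_weight(days_ago)
        for i, x in enumerate(vec):
            acc[i] += w * x
    norm = math.sqrt(sum(x * x for x in acc))
    return [x / norm for x in acc] if norm > 0 else acc

# A video played 12 days ago crosses two decay periods: 0.95 ** 2.
print(decay_weight(12))

# Toy 2-dimensional history: one video played today, one played 12 days ago.
u = user_vector([(0, [1.0, 0.0]), (12, [0.0, 1.0])], dim=2)
print(u)
```

In this toy history the more recent playback contributes more to the user vector, which is how the decay reflects a shift in user interest over time.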
[0086] In the feature engineering phase, the following features are taken into consideration: the user word vector (200 dimensions), the video word vector (200 dimensions), the proportion of each category watched by the user, the proportions of the user's historical playbacks on an hourly basis, user gender, user age (divided into the groups under 20, 20-30, 30-40, 40-50, and over 50, and one-hot coded), the current video classification tag, video time duration (unit: second), video releasing time (number of days from the current time), video average playback integrity (average playback integrity of videos played by the user within the recent 24 hours), hits level (divided into five levels according to the number of playback times, and one-hot coded), the time at which the video is watched by the user (day of the week and current time period, one-hot coded), position information (one-hot coded according to the Province), terminal type (one-hot coded), and operator type (one-hot coded).
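The one-hot coding and concatenation of features can be sketched as follows. Only a subset of the listed features is shown, and several details are assumptions: the first age group is taken to be "under 20", each boundary age (20, 30, 40, 50) is assigned to the higher group, the weekday is indexed 0 to 6, and the names `age_group_index`, `one_hot` and `build_features` are hypothetical.

```python
AGE_BOUNDS = [20, 30, 40, 50]  # groups: under 20, 20-30, 30-40, 40-50, over 50

def age_group_index(age):
    """Index of the age group; boundary ages go to the higher group (assumption)."""
    for i, bound in enumerate(AGE_BOUNDS):
        if age < bound:
            return i
    return len(AGE_BOUNDS)

def one_hot(index, size):
    """Vector of length `size` with a single 1.0 at position `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def build_features(user_vec, video_vec, age, weekday, duration_seconds):
    """Concatenate a subset of the features into one flat input vector."""
    return (
        list(user_vec)
        + list(video_vec)
        + one_hot(age_group_index(age), len(AGE_BOUNDS) + 1)
        + one_hot(weekday, 7)  # weekday index 0..6
        + [float(duration_seconds)]
    )

# 200-dimensional user and video word vectors (zeros as placeholders).
features = build_features([0.0] * 200, [0.0] * 200, age=25, weekday=2,
                          duration_seconds=90)
print(len(features))
```

The remaining features (gender, hits level, Province, terminal type, operator type and so on) would be appended in the same way.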
[0087] The above features are constructed according to the playback records of the user within a recent period (such as the last 30 days), and the deep learning model is trained in combination with the video playback integrity of the user.
[0088] For the result set recommended to the user in the recalling phase, the model predicts the likely playback integrity of videos not yet played back by the target user, and the videos are sorted in descending order of predicted playback integrity to generate the final recommended result set.
[0089] It should be noted that, when the device for predicting video playback integrity provided by this embodiment performs a video playback integrity predicting business, the division into the aforementioned functional modules is merely an example. In actual application, the functions may be assigned, as required, to different functional modules for completion; that is to say, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device for predicting video playback integrity provided by this embodiment pertains to the same conception as the method of predicting video playback integrity provided by the method embodiment; see the corresponding method embodiment for its specific realization process, which is not repeated here.
[0090] All the aforementioned optional technical solutions can be combined arbitrarily to form optional embodiments of the present invention, which are not described here one by one.

[0091] To sum it up, in comparison with prior-art technology, the method of and device for predicting video playback integrity provided by the embodiments of the present invention achieve the following advantageous effects.
[0092] 1. By modifying the traditional CTR prediction method, a video playback integrity indicator is introduced, and video playback integrities of different users are predicted through a well-trained preset video playback integrity prediction model. The prediction results yield user interest data that is closer to reality in such important information flow aspects as watching time duration, so the precision in recognizing user interests is enhanced, the real relevancy of recommendations is hence enhanced, and a relatively great enhancement is achieved in the watching time duration and satisfaction of users.
[0093] 2. Through vectorized representation of the user portrait, interest transfer of the user is reflected in combination with time decay of user behaviors, and hotspot videos and inadvertently clicked videos are filtered out in the process of drawing the user portrait, whereby interference with the actual interest of the user is avoided, and the user portrait is made more precise.
[0094] 3. By collecting such relevant data as user behavior data, video quality and video information, user features and video attributes are effectively given vectorized representations, together with the proportions of videos played back at various time periods, the proportions of various categories and other environment information. Different features and different data sources are merged in the short video recommendation sorting model through deep learning modeling, the potential playback integrities of videos not watched by users are predicted, an excellent effect is achieved, and the average watching time duration of users is enhanced.

[0095] 4. By creating such features as user features, video features, contextual features and client side classification, deep learning modeling is employed, the playback integrity prediction model is applied to a randomly selected group of 10% of users in an A/B test, and such indicators as CTR, daily average playback volume and user average playback integrity are compared in the final report. In the end, user average playback integrity and daily average playback volume are enhanced to a relatively great extent, with a slight decrease in the CTR.
[0096] 5. A TF-IDF algorithm is employed in terms of video recommendation, and key information of videos is effectively emphasized through IDF values.
[0097] 6. Reality relevancy of recommendations is enhanced through prediction of short video playback integrities, and increase in the time duration of stay of users is attempted.
[0098] As understandable by persons ordinarily skilled in the art, realization of the entire or partial steps of the aforementioned embodiments can be completed by hardware, or by a program instructing relevant hardware, the program can be stored in a computer-readable storage medium, and the storage medium can be a read-only memory, a magnetic disk, or an optical disk, etc.
[0099] The embodiments of the present application are described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product embodied in the embodiments of the present application. As should be understood, each flow and/or block in the flowcharts and/or block diagrams, and any combination of flow and/or block in the flowcharts and/or block diagrams can be realized by computer program instructions. These computer program instructions can be supplied to a general computer, a dedicated computer, an embedded processor or a processor of any other programmable data processing device to form a machine, so that the instructions executed by the computer or the processor of any other programmable data processing device generate a device for realizing the functions designated in one or more flow(s) of the flowcharts and/or one or more block(s) of the block diagrams.
[0100] These computer program instructions can also be stored in a computer-readable memory enabling a computer or any other programmable data processing device to operate by a specific mode, so that the instructions stored in the computer-readable memory generate a product containing instructing means, and this instructing means realizes the functions designated in one or more flow(s) of the flowcharts and/or one or more block(s) of the block diagrams.
[0101] These computer program instructions can also be loaded onto a computer or any other programmable data processing device, so as to execute a series of operations and steps on the computer or the any other programmable device to generate computer-realized processing, so that the instructions executed on the computer or the any other programmable device provide steps for realizing the functions designated in one or more flow(s) of the flowcharts and/or one or more block(s) of the block diagrams.
[0102] Although preferred embodiments in the embodiments of the present application have been described so far, it is still possible for persons skilled in the art to make additional modifications and amendments to these embodiments upon learning of the basic inventive conception. Accordingly, the attached Claims are meant to cover the preferred embodiments and all modifications and amendments that fall within the scope of the embodiments of the present application.
[0103] Apparently, persons skilled in the art can make various amendments and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if such amendments and modifications to the present invention fall within the Claims of the present invention and equivalent technology, the present invention is also meant to cover these amendments and modifications.

[0104] What is described above is merely directed to preferred embodiments of the present invention, and they are not meant to restrict the present invention. Any amendment, equivalent replacement and improvement makeable within the spirit and scope of the present invention shall all be covered within the protection scope of the present invention.


Claims (10)

What is claimed is:
1. A method of predicting video playback integrity, characterized in comprising:
inputting to-be-tested data of a user video playback feature vector;
calculating through a preset video playback integrity prediction model; and
outputting a video playback integrity value of the to-be-tested data;
wherein the preset video playback integrity prediction model is obtained through training by user video playback training data, and the user video playback feature vector at least includes a user feature vector and a video feature vector.
2. The method according to Claim 1, characterized in further comprising:
collecting user video playback information data;
screening the user video playback information data, and obtaining a screening result; and
performing a feature extraction on the screening result, and generating to-be-tested data of the user video playback feature vector.
3. The method according to Claim 2, characterized in that:
the step of collecting user video playback information data includes:
obtaining the user video playback information data containing user information, user playback historical information, video information and user client side information;
and/or that the step of screening the user video playback information data, and obtaining a screening result includes: screening the user video playback information data by employing a multi-channel recalling mode including user collaboration, user searching, a topic model, popular recommendation, a user portrait and a video tag, and obtaining a screening result;
and/or that the step of performing a feature extraction on the screening result, and generating to-be-tested data of the user video playback feature vector includes: performing word segmentation on a video title and a video classification tag in the screening result by employing a word vector obtained by training a preset massive corpus through a word2vec model and IDF weight training, generating a video word vector, thereafter performing word vector calculation according to the user playback historical information in conjunction with time decay, and generating a user word vector.
4. The method according to Claim 1, characterized in that the preset video playback integrity prediction model contains DNNs of three hidden layers.
5. The method according to Claim 4, characterized in that the preset video playback integrity prediction model is obtained through training by inputting the user video playback training data, wherein the user video playback training data is an independent variable, while a user watching history video playback integrity value is a dependent variable, and the user video playback training data is a feature vector combined by a historical user vector and a historical video vector created according to the user playback historical information.
6. The method according to Claim 1, characterized in further comprising:
sorting video playback integrity values of the to-be-tested data in a decreasing order, obtaining topN video sorting results, and recommending the video sorting results to a corresponding user according to priority level, wherein N is an integer greater than 1.
7. A device for predicting video playback integrity, characterized in that the device comprises a model calculating module, and that the model calculating module is employed for:
inputting to-be-tested data of a user video playback feature vector, calculating through a preset video playback integrity prediction model, and outputting a video playback integrity value of the to-be-tested data, wherein the preset video playback integrity prediction model is obtained through training by user video playback training data, and the user video playback feature vector at least includes a user feature vector and a video feature vector.
8. The device according to Claim 7, characterized in further comprising a data collecting module, a data screening module, and a vector generating module, of which the data collecting module collects user video playback information data, the data screening module screens the user video playback information data, and obtains a screening result, and the vector generating module performs a feature extraction on the screening result, and generates to-be-tested data of the user video playback feature vector.
9. The device according to Claim 8, characterized in that:
the data collecting module obtains the user video playback information data containing user information, user playback historical information, video information and user client side information;
and/or that the data screening module screens the user video playback information data by employing a multi-channel recalling mode including user collaboration, user searching, a topic model, popular recommendation, a user portrait and a video tag, and obtains a screening result;
and/or that the vector generating module performs a feature extraction on the screening result, and generates to-be-tested data of the user video playback feature vector, including: performing word segmentation on a video title and a video classification tag in the screening result by employing a word vector obtained by training a preset massive corpus through a word2vec model and IDF weight training, generating a video word vector, thereafter performing word vector calculation according to the user playback historical information in conjunction with time decay, and generating a user word vector.
10. The device according to Claim 7, characterized in further comprising a data recommending module for sorting video playback integrity values of the to-be-tested data in a decreasing order, obtaining topN video sorting results, and recommending the video sorting results to a corresponding user according to priority level, wherein N is an integer greater than 1.
CA3153598A 2019-09-05 2020-06-24 Method of and device for predicting video playback integrity Pending CA3153598A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201910845413.2 2019-09-05
CN201910845413.2A CN110704674B (en) 2019-09-05 2019-09-05 Video playing integrity prediction method and device
PCT/CN2020/097861 WO2021042826A1 (en) 2019-09-05 2020-06-24 Video playback completeness prediction method and apparatus

Publications (1)

Publication Number Publication Date
CA3153598A1 true CA3153598A1 (en) 2021-03-11

Family

ID=69195102

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3153598A Pending CA3153598A1 (en) 2019-09-05 2020-06-24 Method of and device for predicting video playback integrity

Country Status (3)

Country Link
CN (1) CN110704674B (en)
CA (1) CA3153598A1 (en)
WO (1) WO2021042826A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110704674B (en) * 2019-09-05 2022-11-25 苏宁云计算有限公司 Video playing integrity prediction method and device
CN111918136B (en) * 2020-07-04 2022-07-01 中信银行股份有限公司 Interest analysis method and device, storage medium and electronic equipment
CN111538912B (en) * 2020-07-07 2020-12-25 腾讯科技(深圳)有限公司 Content recommendation method, device, equipment and readable storage medium
CN111565316B (en) * 2020-07-15 2020-10-23 腾讯科技(深圳)有限公司 Video processing method, video processing device, computer equipment and storage medium
CN112035740A (en) * 2020-08-19 2020-12-04 广州市百果园信息技术有限公司 Project use duration prediction method, device, equipment and storage medium
CN112887795B (en) * 2021-01-26 2023-04-21 脸萌有限公司 Video playing method, device, equipment and medium
CN115086705A (en) * 2021-03-12 2022-09-20 北京字跳网络技术有限公司 Resource preloading method, device, equipment and storage medium
CN113132803B (en) * 2021-04-23 2022-09-16 Oppo广东移动通信有限公司 Video watching time length prediction method, device, storage medium and terminal
CN113220936B (en) * 2021-06-04 2023-08-15 黑龙江广播电视台 Video intelligent recommendation method, device and storage medium based on random matrix coding and simplified convolutional network
CN113312512B (en) * 2021-06-10 2023-10-31 北京百度网讯科技有限公司 Training method, recommending device, electronic equipment and storage medium
CN113873330B (en) * 2021-08-31 2023-03-10 武汉卓尔数字传媒科技有限公司 Video recommendation method and device, computer equipment and storage medium
CN114339417A (en) * 2021-12-30 2022-04-12 未来电视有限公司 Video recommendation method, terminal device and readable storage medium
CN114339402A (en) * 2021-12-31 2022-04-12 北京字节跳动网络技术有限公司 Video playing completion rate prediction method, device, medium and electronic equipment
CN115082301B (en) * 2022-08-22 2022-12-02 中关村科学城城市大脑股份有限公司 Customized video generation method, device, equipment and computer readable medium

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105100165B (en) * 2014-05-20 2017-11-14 深圳市腾讯计算机系统有限公司 Network service recommends method and apparatus
US10516906B2 (en) * 2015-09-18 2019-12-24 Spotify Ab Systems, methods, and computer products for recommending media suitable for a designated style of use
CN106028071A (en) * 2016-05-17 2016-10-12 Tcl集团股份有限公司 Video recommendation method and system
US10827221B2 (en) * 2016-06-24 2020-11-03 Sourse Pty Ltd Selectively playing videos
CN106227883B (en) * 2016-08-05 2019-09-13 北京数码视讯科技股份有限公司 A kind of the temperature analysis method and device of multimedia content
CN106446052A (en) * 2016-08-31 2017-02-22 北京魔力互动科技有限公司 Video-on-demand program recommendation method based on user set
CN107832437B (en) * 2017-11-16 2021-03-02 北京小米移动软件有限公司 Audio/video pushing method, device, equipment and storage medium
CN107948761B (en) * 2017-12-12 2021-01-01 上海哔哩哔哩科技有限公司 Bullet screen play control method, server and bullet screen play control system
CN108460085A (en) * 2018-01-19 2018-08-28 北京奇艺世纪科技有限公司 A kind of video search sequence training set construction method and device based on user journal
CN108260008A (en) * 2018-02-11 2018-07-06 北京未来媒体科技股份有限公司 A kind of video recommendation method, device and electronic equipment
CN110059221B (en) * 2019-03-11 2023-10-20 咪咕视讯科技有限公司 Video recommendation method, electronic device and computer readable storage medium
CN110012356B (en) * 2019-04-16 2020-07-10 腾讯科技(深圳)有限公司 Video recommendation method, device and equipment and computer storage medium
CN110704674B (en) * 2019-09-05 2022-11-25 苏宁云计算有限公司 Video playing integrity prediction method and device

Also Published As

Publication number Publication date
CN110704674A (en) 2020-01-17
CN110704674B (en) 2022-11-25
WO2021042826A1 (en) 2021-03-11


Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220916
