Disclosure of Invention
In order to solve the problem of low accuracy of the existing video screening method, the invention provides a video screening method and device based on big data.
A big data based video screening method comprising:
acquiring an initial video set related to user information according to the user information, wherein the initial video set comprises at least two initial videos;
for any one initial video in the initial video set, acquiring a first feature vector of the initial video according to a preset screening model;
acquiring a first loss prediction result of the first feature vector according to a preset loss prediction model;
screening a first intermediate video from the initial video set according to the first loss prediction result corresponding to each initial video;
acquiring user characteristic data of the user according to the user information; and
screening a second intermediate video from the first intermediate video by combining the first intermediate video with the user characteristic data.
Further, the user information includes a historical video play record of the user;
the acquiring an initial video set related to user information according to the user information specifically includes:
acquiring, according to the historical video play record, videos in the same fields as, and in fields related to, each video recorded in the historical video play record, the acquired videos forming the initial video set.
Further, the screening model includes at least two convolution layers, each capable of outputting a first feature vector;
the acquiring a first feature vector of the initial video according to a preset screening model specifically includes:
obtaining a first feature vector of the initial video from each convolution layer in the screening model;
the loss prediction model comprises at least two loss prediction sub-models and a classifier, wherein the loss prediction sub-models correspond one-to-one to the convolution layers, and the input of each loss prediction sub-model is the first feature vector output by the corresponding convolution layer;
the acquiring a first loss prediction result of the first feature vector according to a preset loss prediction model specifically includes:
for any one first feature vector, inputting the first feature vector into the loss prediction sub-model corresponding to the first feature vector, and obtaining the first vector output by the loss prediction sub-model for the first feature vector;
integrating the obtained first vectors to obtain a second vector; and
obtaining the first loss prediction result according to the second vector and the classifier.
Further, the first loss prediction result corresponding to each initial video comprises a prediction loss value corresponding to each initial video;
the screening a first intermediate video from the initial video set according to the first loss prediction result corresponding to each initial video specifically includes:
comparing the predicted loss value corresponding to each initial video with a preset loss threshold value, and acquiring, according to the comparison result, the initial videos whose predicted loss values are greater than or equal to the preset loss threshold value, the acquired initial videos constituting the first intermediate video.
Further, the acquiring the user characteristic data of the user according to the user information specifically includes:
acquiring video fields of possible interest to the user according to the user information, wherein the video fields of possible interest comprise the video fields the user has focused on and the associated video fields associated with those focused video fields;
The step of screening a second intermediate video from the first intermediate video by combining the first intermediate video with the user characteristic data specifically includes:
acquiring, from the first intermediate video and according to the video fields of possible interest, the videos belonging to those fields, the acquired videos constituting the second intermediate video.
Further, the acquiring process of the associated video field specifically includes:
acquiring, according to a preset video field knowledge graph, the association degrees between the video fields the user has focused on and the other video fields in the video field knowledge graph; and
comparing each association degree with a preset association degree threshold value, acquiring, according to the comparison results, the target association degrees that are greater than or equal to the preset association degree threshold value, and acquiring the video fields corresponding to the target association degrees, the acquired video fields being the associated video fields.
A big data based video screening device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the big data based video screening method described above when executing the computer program.
The beneficial effects of the invention are as follows: an initial video set related to user information is acquired according to the user information; feature vectors are then obtained according to a preset screening model, and loss prediction results are obtained according to a preset loss prediction model; first intermediate videos are screened out of the initial video set according to the first loss prediction results corresponding to the initial videos in the set; user characteristic data of the user are then acquired according to the user information; and finally, second intermediate videos are screened out of the first intermediate videos by combining the first intermediate videos with the user characteristic data. The big data based video screening method provided by the invention thus first acquires the initial video set, then screens the first intermediate videos from the initial video set by combining the preset screening model with the loss prediction model, and screens the second intermediate videos from the first intermediate videos according to the user characteristic data, carrying out two successive layers of screening on the initial video set according to different screening rules and thereby improving screening accuracy.
Detailed Description
The embodiment provides a video screening method based on big data, which can be used for a computer or intelligent terminal equipment.
As shown in fig. 1, the video screening method includes the following steps:
Step S1: according to user information, acquiring an initial video set related to the user information, wherein the initial video set comprises at least two initial videos:
and acquiring an initial video set related to the user information according to the user information, wherein the initial video set comprises at least two initial videos. It should be appreciated that the initial video set typically includes a large number of initial videos to be screened. The initial video set may be obtained from a corresponding background server.
The user information is set according to actual needs. In this embodiment, the user information includes a historical video play record of the user, and may also include personal information of the user, such as the video fields of interest entered by the user during registration. It should be understood that, after the user logs in to the corresponding account, each time a video is played the background records the play record of that video, thus forming the historical video play record. The length of the time period covered by the historical video play record is likewise set according to actual needs, for example half a year or one year.
Then, according to the historical video play record, the videos in the same fields as, and in fields related to, the videos recorded in that record are acquired, and the acquired videos form the initial video set. Specifically: according to the historical video play record, the fields of all videos recorded in it are acquired, and then the fields related to each of those video fields are acquired. The related fields can be obtained from a preset field relation database in the background server, where the field relation database records the relations among all currently known video fields. Any field related to a video field recorded in the historical video play record is thus taken as a related field.
Then, videos in the same field and related fields as each video recorded in the history video play record are acquired from the background server, and the acquired videos form an initial video set.
As other implementations, the initial video set may directly include all videos stored in the background server.
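Step S1 above can be sketched as follows. This is a minimal illustration, not the patented implementation: the field relation database and video library are stood in for by toy dictionaries (`FIELD_RELATIONS`, `VIDEO_LIBRARY`), and all field names are invented for the example.

```python
# Hypothetical stand-in for the preset field relation database in the
# background server: each field maps to its related fields.
FIELD_RELATIONS = {
    "sports": ["fitness"],
    "fitness": ["sports", "health"],
    "movies": ["tv-series"],
}

# Hypothetical stand-in for the videos stored on the background server,
# as (video_id, field) pairs.
VIDEO_LIBRARY = [
    ("v1", "sports"), ("v2", "fitness"), ("v3", "health"),
    ("v4", "movies"), ("v5", "cooking"),
]

def build_initial_video_set(play_history):
    """play_history: list of (video_id, field) tuples from the user's
    historical video play record. Returns the initial video set: every
    video whose field is a watched field or a field related to one."""
    watched_fields = {field for _, field in play_history}
    related_fields = set()
    for field in watched_fields:
        related_fields.update(FIELD_RELATIONS.get(field, []))
    target_fields = watched_fields | related_fields
    return [vid for vid, field in VIDEO_LIBRARY if field in target_fields]
```

For a user who has only watched a sports video, `build_initial_video_set([("v1", "sports")])` gathers the sports and fitness videos, mirroring the "same field and related fields" rule described above.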
Step S2: for any one initial video in the initial video set, acquiring a first feature vector of the initial video according to a preset screening model:
A screening model is preset and is used to obtain the corresponding first feature vector from an initial video. The screening model is therefore a network model for extracting feature vectors from videos; it can be constructed according to actual needs, or an existing network model with a feature vector extraction function can be used directly. The specific type of the screening model is set according to actual needs, for example a convolutional neural network model.
The screening model may include only one convolution layer, or at least two convolution layers may be set according to actual needs. In this embodiment, in order to improve the accuracy of the feature vectors and of the subsequent data processing, the screening model includes at least two convolution layers, and each convolution layer can output a first feature vector. The first feature vectors obtained from successive convolution layers can be regarded as feature vectors whose feature extraction depth gradually increases.
Since the data processing procedure is the same for every initial video in the initial video set, any one initial video is taken as an example below. The first feature vectors of the initial video are obtained from the respective convolution layers in the screening model.
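A minimal sketch of Step S2 follows. It is an illustrative assumption, not the patented model: for brevity each "convolution layer" is simplified to a dense layer with a ReLU activation, the toy weights `LAYERS` are random, and the input is a flat feature array standing in for a video. The point it shows is that every layer's output is kept as a first feature vector, so successive vectors carry increasingly deep features.

```python
import numpy as np

# Toy weights standing in for two convolution layers (shapes are assumptions).
rng = np.random.default_rng(42)
LAYERS = [rng.standard_normal((8, 16)), rng.standard_normal((8, 8))]

def extract_first_feature_vectors(video_features):
    """video_features: 1-D array of raw features for one initial video.
    Returns one first feature vector per layer of the screening model,
    ordered from shallow to deep."""
    vectors, x = [], video_features
    for w in LAYERS:
        x = np.maximum(w @ x, 0.0)  # ReLU(W x): stand-in for a convolution layer
        vectors.append(x)           # keep every layer's output, not just the last
    return vectors
```

A real implementation would use actual convolution layers over video frames; the multi-depth output structure is what matters for the loss prediction model in Step S3.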
Step S3: according to a preset loss prediction model, a first loss prediction result of the first feature vector is obtained:
The loss prediction model is preset and is used to obtain a first loss prediction result from the first feature vectors obtained by the screening model. The loss prediction model is therefore a network model that obtains loss prediction results from feature vectors; it can be constructed according to actual needs, or a network model with a loss prediction function can be used directly. The specific type of the loss prediction model is set according to actual needs, such as a deep learning neural network or a convolutional neural network, and it can comprise pooling layers, fully connected layers and nonlinear layers, where the number and specific structure of each layer are not limited.
Because the screening model comprises at least two convolution layers, each convolution layer can output a first feature vector, and correspondingly, the loss prediction model comprises at least two loss prediction sub-models and a classifier, the loss prediction sub-models are in one-to-one correspondence with the convolution layers, and the input of each loss prediction sub-model is the first feature vector output by the corresponding convolution layer.
Since the screening model produces as many first feature vectors as it has convolution layers, for any one first feature vector, the first feature vector is input into the loss prediction sub-model corresponding to it, and the first vector output by that loss prediction sub-model for the first feature vector is obtained.
Then, the obtained first vectors are integrated to obtain a second vector, for example: the first vectors are spliced to obtain the second vector, or the elements at the same position in the first vectors are averaged to obtain the second vector.
And finally, obtaining a first loss prediction result of the initial video according to the obtained second vector and the classifier.
Through the above process, the first loss prediction results of the other initial videos are obtained likewise.
In this embodiment, first feature vectors carrying features of different depths are combined to obtain the first loss prediction result, which avoids the one-sidedness caused by relying on a single feature vector and thus improves the accuracy of the loss prediction.
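Step S3 can be sketched as below. All weights and shapes are illustrative assumptions: each loss prediction sub-model is reduced to a single ReLU layer, the first vectors are integrated by element-wise averaging (splicing would work equally, as noted above), and a linear "classifier" maps the second vector to a scalar predicted loss.

```python
import numpy as np

# One toy sub-model per convolution layer of the screening model, plus a
# final linear classifier (all weights are random stand-ins).
rng = np.random.default_rng(7)
SUB_MODELS = [rng.standard_normal((4, 8)) for _ in range(2)]
CLASSIFIER = rng.standard_normal(4)

def predict_loss(first_feature_vectors):
    """first_feature_vectors: one first feature vector per convolution layer.
    Returns the scalar predicted loss value for the initial video."""
    # Each sub-model turns its first feature vector into a first vector.
    first_vectors = [np.maximum(w @ v, 0.0)
                     for w, v in zip(SUB_MODELS, first_feature_vectors)]
    # Integration by element-wise averaging yields the second vector.
    second_vector = np.mean(first_vectors, axis=0)
    # The classifier reduces the second vector to a single predicted loss.
    return float(CLASSIFIER @ second_vector)
```

The averaging step is where the features of different depths are combined, which is the mechanism credited above with avoiding one-sided predictions.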
Step S4: according to a first loss prediction result corresponding to each initial video, screening a first intermediate video from the initial video set:
and screening the first intermediate video from the initial video set according to the first loss prediction result corresponding to each initial video. In this embodiment, the first loss prediction result corresponding to each initial video includes a predicted loss value corresponding to each initial video, which may be also understood as: the first loss prediction result corresponding to each initial video is the prediction loss value corresponding to each initial video.
A loss threshold value is preset, the preset loss threshold value being set according to actual needs. The predicted loss value corresponding to each initial video is compared with the preset loss threshold value, and according to the comparison result, the initial videos whose predicted loss values are greater than or equal to the preset loss threshold value are acquired; the acquired initial videos are the first intermediate videos.
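The comparison in Step S4 amounts to a simple threshold filter; the sketch below assumes aligned lists of videos and predicted loss values, and the default threshold value is purely illustrative.

```python
def screen_first_intermediate(videos, predicted_losses, loss_threshold=0.5):
    """Keep the initial videos whose predicted loss value is greater than
    or equal to the preset loss threshold value (Step S4).
    `videos` and `predicted_losses` are aligned lists; the default
    threshold is an assumption for illustration only."""
    return [vid for vid, loss in zip(videos, predicted_losses)
            if loss >= loss_threshold]
```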
Step S5: according to the user information, user characteristic data of the user are obtained:
User characteristic data of the user are acquired according to the user information. The user characteristic data are characteristic data related only to the user, that is, data information specific to the user. As a specific implementation, the user characteristic data are the video fields of possible interest to the user, which comprise two parts: the video fields the user has focused on, and the associated video fields associated with those focused fields. The video fields the user has focused on are the video fields of interest filled in by the user during registration; the associated video fields are video fields that have a certain association with the focused fields.

As a specific embodiment, the acquisition procedure of the associated video fields is as follows. A video field knowledge graph is preset in the background server; the knowledge graph records the association degree between the known video fields, where a higher association degree represents a closer relationship between two video fields. According to the preset video field knowledge graph, the association degrees between the video fields the user has focused on and the other video fields in the knowledge graph are acquired. Each association degree is then compared with a preset association degree threshold value, which is set according to actual needs; according to the comparison results, the target association degrees that are greater than or equal to the preset threshold are acquired, and finally the video fields corresponding to the target association degrees are acquired, these being the associated video fields.
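The associated-video-field acquisition can be sketched as follows. This is an assumption-laden illustration: the knowledge graph is modeled as a dictionary mapping field pairs to an association degree in [0, 1], and all field names, degrees, and the default threshold are invented for the example.

```python
# Hypothetical stand-in for the preset video field knowledge graph:
# (field_a, field_b) -> association degree.
KNOWLEDGE_GRAPH = {
    ("sports", "fitness"): 0.9,
    ("sports", "cooking"): 0.2,
    ("movies", "tv-series"): 0.8,
}

def associated_fields(focused_fields, degree_threshold=0.6):
    """Return the associated video fields: every field whose association
    degree with some focused field meets the preset threshold."""
    associated = set()
    for (a, b), degree in KNOWLEDGE_GRAPH.items():
        if degree >= degree_threshold:  # this is a target association degree
            if a in focused_fields:
                associated.add(b)
            elif b in focused_fields:
                associated.add(a)
    return associated
```

With these toy degrees, a user focused on sports gains fitness as an associated field, while cooking is excluded because its degree falls below the threshold.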
Step S6: and screening a second intermediate video from the first intermediate video by combining the first intermediate video with the user characteristic data:
After the first intermediate videos and the user characteristic data are obtained, the second intermediate videos are screened out of the first intermediate videos. The fields of the videos in the first intermediate videos fall into two parts: one part lies within the video fields of possible interest, and the other part does not. The videos in the first intermediate videos whose fields lie within the video fields of possible interest are therefore acquired; these videos are the second intermediate videos, i.e. the finally required videos. These videos may then be pushed to the user.
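Step S6 reduces to one more membership filter; the sketch below assumes the first intermediate videos carry their field as a `(video_id, field)` pair, which is an illustrative representation rather than the patented data format.

```python
def screen_second_intermediate(first_intermediate, fields_of_interest):
    """first_intermediate: list of (video_id, field) pairs.
    fields_of_interest: the user's video fields of possible interest
    (focused fields plus associated fields). Returns the second
    intermediate videos, i.e. the finally required videos (Step S6)."""
    return [vid for vid, field in first_intermediate
            if field in fields_of_interest]
```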
The embodiment also provides a big data based video screening device, which comprises a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the big data based video screening method when executing the computer program. Since the big data based video screening method has been described in detail above, it is not repeated here.