CN114143612B - Video display method, device, electronic equipment, storage medium and program product - Google Patents


Info

Publication number: CN114143612B
Application number: CN202111476817.2A
Authority: CN (China)
Prior art keywords: video, long, target video, playing, type
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other versions: CN114143612A
Other languages: Chinese (zh)
Inventor: 张水发
Current and original assignee: Beijing Dajia Internet Information Technology Co Ltd (the listed assignee may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111476817.2A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/698Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a video display method, apparatus, electronic device, storage medium, and program product. The video display method includes the following steps: in response to a video playing request for a first target video among the videos to be played, acquiring the video play duration of the first target video; determining the play type of the first target video according to the video play duration; for a plurality of second target videos that follow the first target video among the videos to be played, updating the prediction result that each second target video belongs to the play type, according to the features of each second target video and the features of the most recent predetermined number of played videos belonging to the play type; updating the final prediction result of each second target video according to the updated prediction result that the second target video belongs to the play type; and ranking and displaying the plurality of second target videos according to the updated final prediction result of each second target video. With the method and apparatus of the present disclosure, the user's immediate needs can be better satisfied.

Description

Video display method, device, electronic equipment, storage medium and program product
Technical Field
The present disclosure relates generally to the field of electronics, and more particularly, to a video display method, apparatus, electronic device, storage medium, and program product.
Background
In order to reduce server load and the number of interactions between the client and the server, the server generally delivers multiple videos to the client at a time, and the client displays them to the user in sequence. However, this approach ignores the user's immediate needs: the order of the delivered videos does not track real-time changes in the user's interests.
Disclosure of Invention
Exemplary embodiments of the present disclosure are directed to a video presentation method, apparatus, electronic device, storage medium, and program product to solve at least the problems of the related art described above.
According to a first aspect of the embodiments of the present disclosure, there is provided a video display method, including: in response to a video playing request from a target account for a first target video among the videos to be played, acquiring the video play duration of the first target video; determining the play type of the first target video according to the video play duration; for a plurality of second target videos that follow the first target video among the videos to be played, updating the prediction result that each second target video belongs to the play type, according to the features of each second target video and the features of the most recent predetermined number of played videos belonging to the play type; updating the final prediction result of each second target video according to the updated prediction result that the second target video belongs to the play type; and ranking and displaying the plurality of second target videos according to the updated final prediction result of each second target video.
Optionally, the step of updating the prediction result that each second target video belongs to the play type according to the features of each second target video and the features of the most recent predetermined number of played videos belonging to the play type includes: when the play type is long-play, inputting the features of each second target video and the features of the most recent first predetermined number of played videos belonging to the long-play type into a first prediction model to obtain the long-play score of each second target video, where the long-play score indicates the likelihood that the video will be long-played; and when the play type is non-long-play, inputting the features of each second target video and the features of the most recent second predetermined number of played videos belonging to the non-long-play type into a second prediction model to obtain the non-long-play score of each second target video, where the non-long-play score indicates the likelihood that the video will not be long-played.
Optionally, the step of inputting the features of each second target video and the features of the most recent first predetermined number of played videos belonging to the long-play type into the first prediction model to obtain the long-play score of each second target video includes: inputting the features of each second target video and the features of the most recent first predetermined number of played videos belonging to the long-play type into a first attention network, to obtain the attention result of each of those played videos with respect to the second target video; using the attention result of each played video as the weight of that played video's features, performing a weighted summation over the features of the most recent first predetermined number of played videos belonging to the long-play type; inputting the weighted summation result and the features of the second target video into a first splicing layer; and passing the vector output by the first splicing layer through a first fully connected layer into a first classification layer, to obtain the long-play score of the second target video output by the first classification layer.
Optionally, the step of inputting the features of each second target video and the features of the most recent second predetermined number of played videos belonging to the non-long-play type into the second prediction model to obtain the non-long-play score of each second target video includes: inputting the features of each second target video and the features of the most recent second predetermined number of played videos belonging to the non-long-play type into a second attention network, to obtain the attention result of each of those played videos with respect to the second target video; using the attention result of each played video as the weight of that played video's features, performing a weighted summation over the features of the most recent second predetermined number of played videos belonging to the non-long-play type; inputting the weighted summation result and the features of the second target video into a second splicing layer; and passing the vector output by the second splicing layer through a second fully connected layer into a second classification layer, to obtain the non-long-play score of the second target video output by the second classification layer.
Optionally, the final prediction result of each second target video is the difference between the long-play score and the non-long-play score of the second target video.
Optionally, the video display method further includes: receiving the videos to be played and their features delivered by the server; and receiving the trained first prediction model and the trained second prediction model delivered by the server.
Optionally, the higher the final prediction result of the second target video, the higher the ranking.
According to a second aspect of the embodiments of the present disclosure, there is provided a video display apparatus, including: a play duration acquisition unit configured to acquire the video play duration of a first target video among the videos to be played, in response to a video playing request from a target account for the first target video; a play type determination unit configured to determine the play type of the first target video according to the video play duration; a play type prediction unit configured to, for a plurality of second target videos that follow the first target video among the videos to be played, update the prediction result that each second target video belongs to the play type according to the features of each second target video and the features of the most recent predetermined number of played videos belonging to the play type; a final prediction result acquisition unit configured to update the final prediction result of each second target video according to the updated prediction result that the second target video belongs to the play type; and a ranking and display unit configured to rank and display the plurality of second target videos according to the updated final prediction result of each second target video.
Optionally, the play type prediction unit is configured to: when the play type is long-play, input the features of each second target video and the features of the most recent first predetermined number of played videos belonging to the long-play type into a first prediction model to obtain the long-play score of each second target video, where the long-play score indicates the likelihood that the video will be long-played; and when the play type is non-long-play, input the features of each second target video and the features of the most recent second predetermined number of played videos belonging to the non-long-play type into a second prediction model to obtain the non-long-play score of each second target video, where the non-long-play score indicates the likelihood that the video will not be long-played.
Optionally, the play type prediction unit is configured to: when the play type is long-play, input the features of each second target video and the features of the most recent first predetermined number of played videos belonging to the long-play type into a first attention network, to obtain the attention result of each of those played videos with respect to the second target video; use the attention result of each played video as the weight of that played video's features, and perform a weighted summation over the features of the most recent first predetermined number of played videos belonging to the long-play type; input the weighted summation result and the features of the second target video into a first splicing layer; and pass the vector output by the first splicing layer through a first fully connected layer into a first classification layer, to obtain the long-play score of the second target video output by the first classification layer.
Optionally, the play type prediction unit is configured to: when the play type is non-long-play, input the features of each second target video and the features of the most recent second predetermined number of played videos belonging to the non-long-play type into a second attention network, to obtain the attention result of each of those played videos with respect to the second target video; use the attention result of each played video as the weight of that played video's features, and perform a weighted summation over the features of the most recent second predetermined number of played videos belonging to the non-long-play type; input the weighted summation result and the features of the second target video into a second splicing layer; and pass the vector output by the second splicing layer through a second fully connected layer into a second classification layer, to obtain the non-long-play score of the second target video output by the second classification layer.
Optionally, the final prediction result of each second target video is the difference between the long-play score and the non-long-play score of the second target video.
Optionally, the video display apparatus further includes: a receiving unit configured to receive the videos to be played and their features delivered by the server, and to receive the trained first prediction model and the trained second prediction model delivered by the server.
Optionally, the higher the final prediction result of the second target video, the higher the ranking.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device, comprising: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform a video presentation method as described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions which, when executed by at least one processor, cause the at least one processor to perform the video presentation method as described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer instructions which, when executed by at least one processor, implement a video presentation method as described above.
According to the embodiments of the present disclosure, the user's immediate needs can be captured from the user's real-time video consumption, and the display order of the videos can be adjusted accordingly, so that the displayed videos match real-time changes in the user's interests, satisfying the user's immediate needs and providing a better user experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
Fig. 1 illustrates a diagram of an application scenario of a video display method according to an exemplary embodiment of the present disclosure;
Fig. 2 illustrates a flowchart of a video display method according to an exemplary embodiment of the present disclosure;
Fig. 3 illustrates a flowchart of a method of deriving the long-play score of a video to be played, according to an exemplary embodiment of the present disclosure;
Fig. 4 illustrates a flowchart of a method of deriving the non-long-play score of a video to be played, according to an exemplary embodiment of the present disclosure;
Fig. 5 shows a block diagram of a video display apparatus according to an exemplary embodiment of the present disclosure;
Fig. 6 shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be noted that, in this disclosure, "at least one of the items" covers three parallel cases: "any one of the items", "any combination of the items", and "all of the items". For example, "including at least one of A and B" covers the following three cases: (1) including A; (2) including B; (3) including A and B. Likewise, "at least one of step one and step two is executed" covers: (1) executing step one; (2) executing step two; (3) executing both step one and step two.
Fig. 1 illustrates an example of an application scenario of a video presentation method according to an exemplary embodiment of the present disclosure.
Referring to fig. 1, when a user wants to watch videos at a client, the client may send a video acquisition request to the server. In response, the server may deliver a number of videos to be played, together with the features of each video, to the client. The client may rank the videos to be played and display them according to the ranking result. While displaying the videos, the client may perform the video display method according to an exemplary embodiment of the present disclosure: in response to a video playing request from the target account, acquiring the video play duration of a first target video among the videos to be played; determining the play type of the first target video according to the video play duration; for a plurality of second target videos that follow the first target video among the videos to be played, updating the prediction result that each second target video belongs to the play type according to the features of each second target video and the features of the most recent predetermined number of played videos belonging to the play type; updating the final prediction result of each second target video according to the updated prediction result; and ranking and displaying the plurality of second target videos according to the updated final prediction results. In this way, the displayed videos better match real-time changes in the user's interests and satisfy the user's immediate needs.
It should be appreciated that the video presentation method according to the exemplary embodiments of the present disclosure may be applied not only to the above-described scenario, but also to other suitable scenarios, to which the present disclosure is not limited.
Fig. 2 shows a flowchart of a video presentation method according to an exemplary embodiment of the present disclosure. As an example, the video presentation method may be performed by a client.
Referring to fig. 2, in step S101, in response to a video playing request from a target account for a first target video among the videos to be played, the video play duration of the first target video is acquired.
Specifically, the first target video may be played in response to the video playing request of the target account, and its video play duration may be obtained. For example, the video play duration of the first target video may be the total duration for which the first target video has been played when its playback ends.
In step S102, a play type of the first target video is determined according to the video play duration.
As an example, the play types of a video may include: long-play and non-long-play. It should be appreciated that the play types may also be divided along other dimensions based on play duration, which is not limited in this disclosure.
As an example, a play whose play duration exceeds a preset threshold belongs to the long-play type; otherwise, it belongs to the non-long-play type.
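The play-type rule above can be sketched as follows. The concrete threshold value is an illustrative assumption, not taken from the disclosure:

```python
# Hypothetical sketch of the play-type rule: a play is "long-play" when its
# watched duration exceeds a preset threshold. The 10-second value is an
# illustrative assumption only.
LONG_PLAY_THRESHOLD_S = 10.0

def classify_play_type(play_duration_s: float) -> str:
    """Return 'long' if the play duration exceeds the threshold, else 'non_long'."""
    return "long" if play_duration_s > LONG_PLAY_THRESHOLD_S else "non_long"
```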
In step S103, for a plurality of second target videos that follow the first target video among the videos to be played, the prediction result that each second target video belongs to the play type is updated according to the features of each second target video and the features of the most recent predetermined number of played videos belonging to the play type.
As an example, the features of a video may be the video's embedding features. It should be understood that other types of features are also possible; the disclosure is not limited in this regard.
As an example, the prediction result that a video belongs to the long-play type characterizes the likelihood that the video will be long-played. For example, this prediction result may be expressed as a score (e.g., referred to as a long-play score): the higher the likelihood that the video will be long-played, the higher the long-play score. A video being long-played can be understood as: the play duration of the video exceeds the preset threshold.
As an example, the prediction result that a video belongs to the non-long-play type characterizes the likelihood that the video will not be long-played. For example, this prediction result may be expressed as a score (e.g., referred to as a non-long-play score): the higher the likelihood that the video will not be long-played, the higher the non-long-play score. A video not being long-played can be understood as: the play duration of the video does not exceed the preset threshold.
In step S104, the final prediction result of each second target video is updated according to the updated prediction result that each second target video belongs to the play type.
As an example, when the play type is long-play, the features of each second target video and the features of the most recent first predetermined number of played videos belonging to the long-play type may be input into the first prediction model, to obtain the long-play score of the second target video as the updated prediction result that the second target video belongs to the long-play type. The long-play score indicates the likelihood that the video will be long-played.
As an example, when the play type is non-long-play, the features of each second target video and the features of the most recent second predetermined number of played videos belonging to the non-long-play type may be input into the second prediction model, to obtain the non-long-play score of the second target video as the updated prediction result that the second target video belongs to the non-long-play type. The non-long-play score indicates the likelihood that the video will not be long-played.
As an example, the final prediction result of each video may be derived from the prediction results that the video belongs to the respective play types. For example, when the play types include long-play and non-long-play, the final prediction result of each video may be obtained from its prediction result for the long-play type and its prediction result for the non-long-play type; specifically, the final prediction result of a video may be the difference (m - n) between its predicted long-play score m and its non-long-play score n.
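The difference-based final prediction result can be sketched as follows; the function name and sample values are illustrative:

```python
def final_prediction(long_score: float, non_long_score: float) -> float:
    """Final prediction m - n: higher means the video is more likely
    to be long-played and should rank earlier."""
    return long_score - non_long_score

# A candidate with a high long-play score and a low non-long-play score
# gets a high final prediction result.
result = final_prediction(0.75, 0.25)
```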
Specifically, after the prediction result that a second target video belongs to the play type is updated, the final prediction result of that second target video is updated according to the updated prediction result. For example, when the play type is long-play, the updated final prediction result of the second target video is obtained from the updated prediction result that the second target video belongs to the long-play type together with its most recently obtained prediction result for the non-long-play type.
As an example, the first prediction model may include: a first attention network, a first splicing layer, a first fully connected layer, and a first classification layer. It should be understood that this model structure is only an example, and the first prediction model may be constructed in other forms, which is not limited in this disclosure. For example, the first fully connected layer may comprise two fc layers. For example, the first classification layer may use a softmax function. An exemplary embodiment of step S103 will be described below in conjunction with fig. 3.
As an example, the second prediction model may include: a second attention network, a second splicing layer, a second fully connected layer, and a second classification layer. It should be understood that this model structure is only an example, and the second prediction model may be constructed in other forms, which is not limited in this disclosure. For example, the second fully connected layer may comprise two fc layers. For example, the second classification layer may use a softmax function. An exemplary embodiment of step S103 will be described below in conjunction with fig. 4.
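The model shape described above (attention network, splicing layer, fully connected layers, softmax classification) can be sketched as follows. This is a hedged illustration, not the patented implementation: all dimensions and weights are made-up placeholders, and the dot-product form of the attention is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16  # embedding dimension (assumed)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_score(target_feat, played_feats, w1, w2):
    """Return a score in (0, 1) for one candidate (second target) video.

    target_feat:  (DIM,) embedding of the candidate video
    played_feats: (K, DIM) embeddings of the last K played videos of one type
    w1: (2*DIM, H) weights of the first fully connected layer
    w2: (H, 2) weights of the second fully connected layer (2 classes)
    """
    # Attention: dot-product relevance of each played video to the candidate.
    attn = softmax(played_feats @ target_feat)        # (K,)
    # Weighted summation of played-video features, attention results as weights.
    history = attn @ played_feats                     # (DIM,)
    # Splicing layer: concatenate the history summary with the candidate feature.
    spliced = np.concatenate([history, target_feat])  # (2*DIM,)
    # Two fc layers, then a softmax classification layer.
    hidden = np.maximum(spliced @ w1, 0.0)            # ReLU
    probs = softmax(hidden @ w2)                      # (2,)
    return probs[1]                                   # probability of the positive class

# Example with random placeholder weights.
w1 = rng.normal(size=(2 * DIM, 8))
w2 = rng.normal(size=(8, 2))
score = predict_score(rng.normal(size=DIM), rng.normal(size=(5, DIM)), w1, w2)
```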
In step S105, the plurality of second target videos are ranked and displayed according to the updated final prediction result of each second target video.
Specifically, the plurality of second target videos are reordered according to the updated final prediction result of each second target video and displayed according to the ranking result, i.e., the earlier a video ranks, the more preferentially it is displayed.
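The re-ranking step can be sketched as a descending sort over the final prediction results; video ids and scores are illustrative:

```python
# Minimal sketch of the re-ranking step: candidates are sorted by their
# updated final prediction in descending order, so higher scores are shown first.
def rank_candidates(final_scores: dict) -> list:
    """Map of video id -> final prediction result, returned as ids in display order."""
    return sorted(final_scores, key=final_scores.get, reverse=True)

order = rank_candidates({"v1": 0.2, "v2": 0.9, "v3": -0.4})
# order == ["v2", "v1", "v3"]
```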
As an example, the higher the final prediction result of the second target video, the higher the ranking.
As an example, the video display method according to an exemplary embodiment of the present disclosure may further include: receiving the videos to be played and their features delivered by the server. For example, N videos to be played and the features of each may be received, where N is an integer greater than 0; for example, N may be an integer greater than or equal to 8. As an example, upon receiving the videos to be played from the server, the features of each video to be played and the features of the most recent first predetermined number of played videos belonging to the long-play type may be input into the first prediction model to obtain the initial long-play score of each video to be played; the features of each video to be played and the features of the most recent second predetermined number of played videos belonging to the non-long-play type may be input into the second prediction model to obtain the initial non-long-play score of each video to be played; the final prediction result of each video to be played is then obtained from its long-play score and non-long-play score; and the videos to be played are initially ranked and displayed according to these initial final prediction results.
According to an exemplary embodiment of the present disclosure, after the initial ranking and display of the videos to be played, each time a displayed video (i.e., a first target video) is long-played, the features of each video to be played that is ranked after the long-played video (i.e., each second target video) and the features of the most recent first predetermined number of played videos belonging to the long-play type are input into the first prediction model, yielding a re-predicted (i.e., updated) long-play score for each second target video. The final prediction result of each second target video is then updated based on its original non-long-play score and its re-predicted long-play score, and the second target videos are re-ranked and displayed according to the updated final prediction results. In other words, each time the user consumes (i.e., watches) a displayed video P, if video P is long-played, the long-play scores of the videos ranked after P need to be updated, because the sequence of the most recent first predetermined number of played videos belonging to the long-play type has changed: video P enters the sequence and, accordingly, the oldest video in the sequence is pushed out. This changes the final prediction results of the subsequent videos to be played, which are therefore re-ranked according to the final prediction results updated in real time and displayed to the user in the new order.
According to an exemplary embodiment of the present disclosure, after the videos to be played are initially ordered and presented, each time a presented video (i.e., a first target video) is not long-played, the features of each video to be played that is arranged after it (i.e., each second target video) and the features of the most recent second predetermined number of played videos belonging to the non-long-play type are input into the second prediction model, yielding a re-predicted non-long-play score (i.e., an updated non-long-play score) for each second target video; the final prediction result of each second target video is then updated based on its original long-play score and its re-predicted non-long-play score; and the plurality of second target videos are ordered and presented according to the updated final prediction results. In other words, each time the user consumes (i.e., views) a presented video Q, if Q is not long-played, the non-long-play scores of the videos to be played arranged after Q need to be updated, because the sequence of the most recent second predetermined number of played videos belonging to the non-long-play type has changed: Q is added to the sequence, and accordingly the oldest video in the sequence is pushed out. The final prediction results of the subsequent videos to be played are therefore affected, and those videos are re-ordered according to the final prediction results updated in real time and presented to the user in the new order.
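The real-time updates described in the two paragraphs above amount to maintaining two fixed-length sliding windows and re-scoring the remaining candidates whenever one window changes. A minimal sketch under assumptions: the window sizes, threshold, and the `rerank` callback (which would re-run the relevant prediction model) are all hypothetical.

```python
from collections import deque

FIRST_N = 8   # window size for long-played videos (assumed value)
SECOND_N = 8  # window size for non-long-played videos (assumed value)

long_window = deque(maxlen=FIRST_N)       # features of recent long-played videos
non_long_window = deque(maxlen=SECOND_N)  # features of recent non-long-played videos

def on_video_consumed(feat, play_duration, long_threshold, remaining, rerank):
    """Called after the user finishes one presented video (the first target video).

    Appending to a full deque pushes the oldest entry out, which is exactly why
    the scores of the remaining candidates must be re-predicted afterward.
    """
    if play_duration >= long_threshold:  # long-played: only the long window changes
        long_window.append(feat)
        return rerank(remaining, list(long_window), which="long")
    else:                                # not long-played: only the other window changes
        non_long_window.append(feat)
        return rerank(remaining, list(non_long_window), which="non_long")
```

Only the window matching the consumed video's play type is touched, so only the corresponding score of each subsequent video needs re-prediction; the other score stays as originally predicted.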
As an example, the video presentation method according to an exemplary embodiment of the present disclosure may further include: receiving, from the server side, the trained first prediction model and the trained second prediction model. For example, after the two trained models are initially received, the updated model parameters of the first prediction model may be received each time the first prediction model is updated, and the updated model parameters of the second prediction model may be received each time the second prediction model is updated.
Fig. 3 illustrates a flowchart of a method of deriving a long-play score of a video to be played, according to an exemplary embodiment of the present disclosure. Here, the first prediction model includes: a first attention network, a first splicing layer, a first full-connection layer, and a first classification layer.
Referring to fig. 3, in step S1011, the feature of each second target video and the features of the most recent first predetermined number of played videos belonging to the long-play type are input to the first attention network, obtaining the attention result of each of those played videos with respect to the second target video.
Specifically, the first attention network may combine, through an attention mechanism, the feature of each of the most recent first predetermined number of played videos belonging to the long-play type with the feature of the second target video, to obtain the attention result of each such played video with respect to the second target video.
In step S1012, the attention result of each played video is used as the weight of that video's features, and a weighted summation is performed over the features of the most recent first predetermined number of played videos belonging to the long-play type.
Specifically, the attention result of each played video is multiplied by that video's feature to obtain a product, and the products corresponding to the most recent first predetermined number of played videos belonging to the long-play type are summed.
In step S1013, the weighted sum operation result and the feature of the second target video are input to the first splicing layer.
That is, the feature of the second target video and the weighted summation result are concatenated (concat).
In step S1014, the vector output by the first splicing layer is input to the first classification layer after passing through the first full-connection layer, so as to obtain the long-play score of the second target video output by the first classification layer.
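Steps S1011 through S1014 can be sketched end to end in plain Python. This is an illustrative simplification, not the trained model: dot-product attention plus softmax stands in for the learned first attention network, and a single weight vector with a sigmoid stands in for the first full-connection and classification layers.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def long_play_score(target_feat, window_feats, fc_weights, fc_bias):
    # S1011: attention result of each played video with respect to the target.
    attn = softmax([dot(target_feat, f) for f in window_feats])
    # S1012: attention results used as weights in a weighted sum of window features.
    dim = len(target_feat)
    pooled = [sum(a * f[i] for a, f in zip(attn, window_feats)) for i in range(dim)]
    # S1013: concatenate the weighted sum with the target video's feature.
    concat = pooled + list(target_feat)
    # S1014: full-connection layer followed by a sigmoid classification layer.
    z = dot(concat, fc_weights) + fc_bias
    return 1.0 / (1.0 + math.exp(-z))  # long-play score in (0, 1)
```

The non-long-play score of fig. 4 follows the same structure with the second attention network, second splicing layer, and the non-long-play window.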
As an example, the first predictive model may be trained using a cross entropy loss function. It should be understood that other types of loss functions are possible, as the disclosure is not limited in this regard.
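For reference, the binary cross-entropy loss mentioned above has the following form (a stdlib-only sketch; the label is 1 when the training sample was actually long-played, 0 otherwise).

```python
import math

def binary_cross_entropy(predicted_score, label, eps=1e-12):
    # predicted_score: the model's long-play score in (0, 1); label: 1 or 0.
    p = min(max(predicted_score, eps), 1.0 - eps)  # clamp for numerical safety
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
```

The loss shrinks as the predicted score moves toward the true label, which is what drives the model parameters toward correct long-play predictions during training.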
Fig. 4 illustrates a flowchart of a method of deriving a non-long-play score of a video to be played, according to an exemplary embodiment of the present disclosure. The second prediction model may include: a second attention network, a second splicing layer, a second full-connection layer, and a second classification layer.
Referring to fig. 4, in step S1021, the feature of each second target video and the features of the most recent second predetermined number of played videos belonging to the non-long-play type are input to the second attention network, obtaining the attention result of each of those played videos with respect to the second target video.
In step S1022, the attention result of each played video is used as the weight of that video's features, and a weighted summation is performed over the features of the most recent second predetermined number of played videos belonging to the non-long-play type.
In step S1023, the weighted sum operation result and the feature of the second target video are input to the second splicing layer.
In step S1024, the vector output by the second splicing layer is input to the second classification layer after passing through the second full-connection layer, so as to obtain the non-long-play fraction of the second target video output by the second classification layer.
As an example, the second predictive model may be trained using a cross entropy loss function. It should be understood that other types of loss functions are possible, as the disclosure is not limited in this regard.
The technical solutions provided by the embodiments of the present disclosure bring at least the following beneficial effects:
the client can order the plurality of videos issued to it and present them according to the ordering result, which makes full use of the client's computing capacity and relieves pressure on the server;
in the ordering process, the user's real-time video consumption behavior, captured as it happens, can be used, so that the ordering result better matches the user's real-time changes in interest, meets the user's immediate needs, and yields a better user experience;
by adopting a feedback network, both the user's positive feedback and negative feedback on videos can be captured in real time, and the ordering of the videos issued to the client can be adjusted in real time accordingly, thereby better meeting user needs and improving the user experience.
Fig. 5 shows a block diagram of a video presentation device according to an exemplary embodiment of the present disclosure.
As shown in fig. 5, a video display apparatus 10 according to an exemplary embodiment of the present disclosure includes: a play duration acquiring unit 101, a play type determining unit 102, a play type prediction unit 103, a final prediction result obtaining unit 104, and a ranking presentation unit 105.
Specifically, the play duration acquiring unit 101 is configured to acquire, in response to a video play request of a target account for a first target video in the videos to be played, the video play duration of the first target video.
The play type determining unit 102 is configured to determine a play type of the first target video according to the video play duration.
The play type prediction unit 103 is configured to update, for a plurality of second target videos after the first target video in the videos to be played, the prediction result of each second target video belonging to the play type according to the feature of each second target video and the features of the most recent predetermined number of played videos belonging to the play type.
The final prediction result obtaining unit 104 is configured to update the final prediction result of each second target video according to the updated prediction result that each second target video belongs to the play type.
The ranking presentation unit 105 is configured to rank and present the plurality of second target videos according to the updated final prediction result of each second target video.
As an example, the play type prediction unit 103 may be configured to: when the playing type is long-playing, inputting the characteristics of each second target video and the characteristics of the latest first preset number of played videos belonging to the long-playing type into a first prediction model to obtain long-playing scores of each second target video, wherein the long-playing scores represent the possibility that the video can be long-played; and when the playing type is non-long playing, inputting the characteristics of each second target video and the characteristics of the latest second preset number of played videos belonging to the non-long playing type into a second prediction model to obtain a non-long playing score of each second target video, wherein the non-long playing score indicates the possibility that the video cannot be long played.
As an example, the play type prediction unit 103 may be configured to: when the playing type is long-playing, inputting the characteristics of each second target video and the characteristics of the latest first preset number of played videos belonging to the long-playing type into a first attention network to obtain the attention result of each played video in the latest first preset number of played videos belonging to the long-playing type on the second target video; taking the attention result of each broadcasted video as a weighted value of the characteristics of the broadcasted video, and carrying out weighted summation operation on the characteristics of the latest first preset number of broadcasted videos belonging to the long-broadcast type; inputting the weighted sum operation result and the characteristics of the second target video into a first splicing layer; and inputting the vector output by the first splicing layer into a first classification layer after passing through a first full-connection layer, and obtaining the long-play fraction of the second target video output by the first classification layer.
As an example, the play type prediction unit 103 may be configured to: when the playing type is non-long-playing, inputting the characteristics of each second target video and the characteristics of a second preset number of recently played videos belonging to the non-long-playing type into a second attention network to obtain the attention result of each recently played video in the second preset number of played videos belonging to the non-long-playing type on the second target video; taking the attention result of each broadcasted video as a weighted value of the characteristics of the broadcasted video, and carrying out weighted summation operation on the characteristics of the second latest preset number of broadcasted videos belonging to the non-long-cast type; inputting the weighted summation operation result and the characteristics of the second target video into a second splicing layer; and inputting the vector output by the second splicing layer into a second classification layer after passing through a second full-connection layer, and obtaining the non-long-play fraction of the second target video output by the second classification layer.
As an example, the final prediction result for each second target video may be the difference between the long-cast score and the non-long-cast score for that second target video.
As an example, the video presentation device 10 may further comprise a receiving unit (not shown) configured to receive, from the server side, the videos to be played and their features, and to receive the trained first prediction model and the trained second prediction model issued by the server side.
As an example, the higher the final prediction result of the second target video, the higher the ranking.
With respect to the video display apparatus 10 of the above embodiment, the specific manner in which each unit performs its operations has been described in detail in the method embodiments and will not be repeated here.
Furthermore, it should be understood that the various units of the video display device 10 in the above-described embodiments may be implemented as hardware components and/or software components. For example, depending on the processing each unit performs, the units may be implemented using a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC).
Fig. 6 shows a block diagram of an electronic device according to an exemplary embodiment of the present disclosure. Referring to fig. 6, the electronic device 20 includes: at least one memory 201 and at least one processor 202, the at least one memory 201 having stored therein a set of computer-executable instructions that, when executed by the at least one processor 202, perform the video presentation method as described in the above exemplary embodiments.
By way of example, the electronic device 20 may be a PC, a tablet device, a personal digital assistant, a smart phone, or another device capable of executing the above-described instruction set. The electronic device 20 is not necessarily a single device; it may be any apparatus or collection of circuits capable of executing the above-described instructions (or instruction set), individually or in combination. The electronic device 20 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In electronic device 20, processor 202 may include a Central Processing Unit (CPU), a Graphics Processor (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processor 202 may also include an analog processor, a digital processor, a microprocessor, a multi-core processor, a processor array, a network processor, and the like.
The processor 202 may execute instructions or code stored in the memory 201, wherein the memory 201 may also store data. The instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The memory 201 may be integrated with the processor 202, for example, RAM or flash memory disposed within an integrated circuit microprocessor or the like. In addition, the memory 201 may include a stand-alone device, such as an external disk drive, a storage array, or other storage device usable by any database system. The memory 201 and the processor 202 may be operatively coupled or may communicate with each other, such as through an I/O port, network connection, etc., such that the processor 202 is able to read files stored in the memory.
In addition, the electronic device 20 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device 20 may be connected to each other via a bus and/or a network.
According to an exemplary embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions, wherein the instructions, when executed by at least one processor, cause the at least one processor to perform the video presentation method described in the above exemplary embodiments. Examples of the computer-readable storage medium include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disk storage, hard disk drives (HDD), solid-state drives (SSD), card memory (such as multimedia cards, Secure Digital (SD) cards, or eXtreme Digital (xD) cards), magnetic tape, floppy disks, magneto-optical data storage devices, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide them to a processor or computer so that the processor or computer can execute the program. The computer program in the computer-readable storage medium can run in an environment deployed in a computer device, such as a client, a host, a proxy device, or a server. Further, in one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems so that they are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
According to an exemplary embodiment of the present disclosure, a computer program product may also be provided, instructions in the computer program product being executable by at least one processor to perform the video presentation method as described in the above exemplary embodiment.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (16)

1. A video presentation method, comprising:
responding to a video playing request of a first target video in the video to be played by a target account, and acquiring video playing time of the first target video;
Determining the play type of the first target video according to the video play duration, wherein the play type is a long-play type or a non-long-play type;
updating a prediction result of each second target video belonging to the playing type according to the characteristics of each second target video and the characteristics of the latest preset number of played videos belonging to the playing type aiming at a plurality of second target videos after the first target video in the video to be played;
updating the final prediction result of each second target video according to the updated prediction result of each second target video belonging to the play type;
and sequencing and displaying the plurality of second target videos according to the updated final prediction result of each second target video.
2. The video presentation method according to claim 1, wherein the step of updating the prediction result of each second target video belonging to the play type based on the characteristics of each second target video and the characteristics of the latest predetermined number of played videos belonging to the play type comprises:
when the playing type is long-playing, inputting the characteristics of each second target video and the characteristics of the latest first preset number of played videos belonging to the long-playing type into a first prediction model to obtain long-playing scores of each second target video, wherein the long-playing scores represent the possibility that the video can be long-played;
And when the playing type is non-long playing, inputting the characteristics of each second target video and the characteristics of the latest second preset number of played videos belonging to the non-long playing type into a second prediction model to obtain a non-long playing score of each second target video, wherein the non-long playing score indicates the possibility that the video cannot be long played.
3. The video presentation method according to claim 2, wherein the step of inputting the feature of each second target video and the most recent first predetermined number of features of the played video belonging to the long-cast type into the first prediction model to obtain the long-cast score of each second target video includes:
inputting the characteristics of each second target video and the characteristics of the latest first preset number of the played videos belonging to the long-cast type into a first attention network to obtain the attention result of each played video in the latest first preset number of the played videos belonging to the long-cast type on the second target video;
taking the attention result of each broadcasted video as a weighted value of the characteristics of the broadcasted video, and carrying out weighted summation operation on the characteristics of the latest first preset number of broadcasted videos belonging to the long-broadcast type;
Inputting the weighted sum operation result and the characteristics of the second target video into a first splicing layer;
and inputting the vector output by the first splicing layer into a first classification layer after passing through a first full-connection layer, and obtaining the long-play fraction of the second target video output by the first classification layer.
4. The video presentation method according to claim 2, wherein the step of inputting the feature of each second target video and the most recent second predetermined number of features of the broadcasted video belonging to the non-long-cast type into the second prediction model to obtain the non-long-cast score of each second target video comprises:
inputting the characteristics of each second target video and the characteristics of the latest second preset number of the played videos belonging to the non-long-cast type into a second attention network to obtain the attention result of each played video in the latest second preset number of the played videos belonging to the non-long-cast type on the second target video;
taking the attention result of each broadcasted video as a weighted value of the characteristics of the broadcasted video, and carrying out weighted summation operation on the characteristics of the second latest preset number of broadcasted videos belonging to the non-long-cast type;
inputting the weighted summation operation result and the characteristics of the second target video into a second splicing layer;
And inputting the vector output by the second splicing layer into a second classification layer after passing through a second full-connection layer, and obtaining the non-long-play fraction of the second target video output by the second classification layer.
5. The video presentation method of claim 2, wherein the final prediction result for each second target video is a difference between the long-cast score and the non-long-cast score of the second target video.
6. The video presentation method of claim 2, wherein the video presentation method further comprises:
receiving the video to be broadcast and the characteristics thereof issued by a server side;
and receiving the trained first prediction model and the trained second prediction model issued by the server side.
7. The video presentation method of claim 1, wherein the higher the final prediction result of the second target video, the higher the ranking.
8. A video display apparatus, comprising:
a play duration acquiring unit configured to acquire, in response to a video play request of a target account for a first target video in the videos to be played, a video play duration of the first target video;
a play type determining unit configured to determine a play type of the first target video according to the video play duration, wherein the play type is a long-play type or a non-long-play type;
A play type prediction unit configured to update, for a plurality of second target videos after the first target video in the video to be played, a prediction result of each second target video belonging to the play type according to a feature of each second target video and a latest predetermined number of features of the played videos belonging to the play type;
a final prediction result obtaining unit configured to update a final prediction result of each second target video according to the updated prediction result that each second target video belongs to the play type;
and the ordering display unit is configured to order and display the plurality of second target videos according to the updated final prediction result of each second target video.
9. The video presentation device of claim 8, wherein the play type prediction unit is configured to:
when the playing type is long-playing, inputting the characteristics of each second target video and the characteristics of the latest first preset number of played videos belonging to the long-playing type into a first prediction model to obtain long-playing scores of each second target video, wherein the long-playing scores represent the possibility that the video can be long-played;
And when the playing type is non-long playing, inputting the characteristics of each second target video and the characteristics of the latest second preset number of played videos belonging to the non-long playing type into a second prediction model to obtain a non-long playing score of each second target video, wherein the non-long playing score indicates the possibility that the video cannot be long played.
10. The video presentation device of claim 9, wherein the play type prediction unit is configured to:
when the playing type is long-playing, inputting the characteristics of each second target video and the characteristics of the latest first preset number of played videos belonging to the long-playing type into a first attention network to obtain the attention result of each played video in the latest first preset number of played videos belonging to the long-playing type on the second target video;
taking the attention result of each broadcasted video as a weighted value of the characteristics of the broadcasted video, and carrying out weighted summation operation on the characteristics of the latest first preset number of broadcasted videos belonging to the long-broadcast type;
inputting the weighted sum operation result and the characteristics of the second target video into a first splicing layer;
and inputting the vector output by the first splicing layer into a first classification layer after passing through a first full-connection layer, and obtaining the long-play fraction of the second target video output by the first classification layer.
11. The video presentation device of claim 9, wherein the play type prediction unit is configured to:
when the playing type is non-long-playing, inputting the characteristics of each second target video and the characteristics of a second preset number of recently played videos belonging to the non-long-playing type into a second attention network to obtain the attention result of each recently played video in the second preset number of played videos belonging to the non-long-playing type on the second target video;
taking the attention result of each broadcasted video as a weighted value of the characteristics of the broadcasted video, and carrying out weighted summation operation on the characteristics of the second latest preset number of broadcasted videos belonging to the non-long-cast type;
inputting the weighted summation operation result and the characteristics of the second target video into a second splicing layer;
and inputting the vector output by the second splicing layer into a second classification layer after passing through a second full-connection layer, and obtaining the non-long-play fraction of the second target video output by the second classification layer.
12. The video display device of claim 9, wherein the final prediction result for each second target video is a difference between the long-cast score and the non-long-cast score of the second target video.
13. The video display device of claim 9, wherein the video display device further comprises:
the receiving unit is configured to receive the video to be broadcast and the characteristics thereof issued by the server side; and receiving the trained first prediction model and the trained second prediction model issued by the server side.
14. The video presentation device of claim 8, wherein the higher the final prediction result of the second target video, the higher the ranking.
15. An electronic device, comprising:
at least one processor;
at least one memory storing computer-executable instructions,
wherein the computer executable instructions, when executed by the at least one processor, cause the at least one processor to perform the video presentation method of any one of claims 1 to 7.
16. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by at least one processor, cause the at least one processor to perform the video presentation method of any of claims 1 to 7.
CN202111476817.2A 2021-12-06 2021-12-06 Video display method, device, electronic equipment, storage medium and program product Active CN114143612B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111476817.2A CN114143612B (en) 2021-12-06 2021-12-06 Video display method, device, electronic equipment, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111476817.2A CN114143612B (en) 2021-12-06 2021-12-06 Video display method, device, electronic equipment, storage medium and program product

Publications (2)

Publication Number Publication Date
CN114143612A CN114143612A (en) 2022-03-04
CN114143612B true CN114143612B (en) 2024-03-15

Family

ID=80384401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111476817.2A Active CN114143612B (en) 2021-12-06 2021-12-06 Video display method, device, electronic equipment, storage medium and program product

Country Status (1)

Country Link
CN (1) CN114143612B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104486649A (en) * 2014-12-18 2015-04-01 北京百度网讯科技有限公司 Video content rating method and device
WO2015090133A1 (en) * 2013-12-19 2015-06-25 乐视网信息技术(北京)股份有限公司 Video information update method and electronic device
WO2018000624A1 (en) * 2016-06-29 2018-01-04 乐视控股(北京)有限公司 Video playing control method and device
CN109165347A (en) * 2018-08-20 2019-01-08 腾讯科技(深圳)有限公司 Data push method and device, storage medium and electronic device
CN109657154A (en) * 2018-12-28 2019-04-19 浙江省公众信息产业有限公司 Resource collator and resource ordering method based on scene
CN109996122A (en) * 2019-04-12 2019-07-09 北京奇艺世纪科技有限公司 Video recommendation method, device, server and storage medium
CN110149540A (en) * 2018-04-27 2019-08-20 腾讯科技(深圳)有限公司 Recommendation process method, apparatus, terminal and the readable medium of multimedia resource
CN110519621A (en) * 2019-09-20 2019-11-29 北京字节跳动网络技术有限公司 Video recommendation method, device, electronic equipment and computer-readable medium
CN111353068A (en) * 2020-02-28 2020-06-30 腾讯音乐娱乐科技(深圳)有限公司 Video recommendation method and device
CN112261448A (en) * 2020-10-09 2021-01-22 汉海信息技术(上海)有限公司 Method, device, equipment and medium for determining video playing time length
CN113111217A (en) * 2021-04-22 2021-07-13 北京达佳互联信息技术有限公司 Training method of playing duration prediction model, video recommendation method and device


Similar Documents

Publication Publication Date Title
WO2020011068A1 (en) Method and system for executing machine learning process
US11694726B2 (en) Automatic trailer detection in multimedia content
CN105556979A (en) Streaming Media
CN112541120B (en) Recommendation comment generation method, device, equipment and medium
CN113329261B (en) Video processing method and device
US9623331B2 (en) Method for providing game replay, server for providing game replay, and recording medium storing the same
CN112182281B (en) Audio recommendation method, device and storage medium
CN114115623B (en) Information display method and device and information transmission method and device
CN108319444B (en) Music drumbeat-based control terminal vibration method, storage device and computer device
CN111659125A (en) Game-based friend recommendation method and device and computer-readable storage medium
CN114143612B (en) Video display method, device, electronic equipment, storage medium and program product
CN106815285A (en) The method of the video recommendations based on video website, device and electronic equipment
CN113747233B (en) Music replacement method and device, electronic equipment and storage medium
CN115080856A (en) Recommendation method and device and training method and device of recommendation model
CN112269942B (en) Method, device and system for recommending object and electronic equipment
CN115129922A (en) Search term generation method, model training method, medium, device and equipment
CN113609311A (en) Method and device for recommending items
CN113223017A (en) Training method of target segmentation model, target segmentation method and device
CN113486214A (en) Music matching method and device, computer equipment and storage medium
CN116506691B (en) Multimedia resource processing method and device, electronic equipment and storage medium
CN113038281B (en) Video playing method, device, equipment and storage medium
CN113610713B (en) Training method of video super-resolution model, video super-resolution method and device
CN112507165B (en) Video recommendation method and device
CN112702511B (en) Method and device for outputting video
CN114168792A (en) Video recommendation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant