CN111753135A - Video display method, device, terminal, server, system and storage medium - Google Patents


Info

Publication number
CN111753135A
Authority
CN
China
Prior art keywords
video
target
keyword
playing
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010437842.9A
Other languages
Chinese (zh)
Other versions
CN111753135B (en)
Inventor
韩金泽 (Han Jinze)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202010437842.9A priority Critical patent/CN111753135B/en
Publication of CN111753135A publication Critical patent/CN111753135A/en
Application granted granted Critical
Publication of CN111753135B publication Critical patent/CN111753135B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 — Information retrieval of video data
    • G06F16/78 — Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 — Retrieval using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • G06F16/73 — Querying
    • G06F16/735 — Filtering based on additional data, e.g. user or group profiles
    • G06F16/738 — Presentation of query results

Abstract

The present disclosure relates to a video display method, apparatus, terminal, server, system, and storage medium. The video display method includes: acquiring a corresponding target keyword according to the playing state of a currently played target video; generating a corresponding keyword tag according to the target keyword and displaying the keyword tag in the playing area of the target video; and, in response to a trigger operation performed on the keyword tag, acquiring and displaying an associated video corresponding to the target keyword. By presenting, during playback, target keywords that correspond to the video's playing state, the disclosure gives a user watching the video a direct way to obtain the associated videos corresponding to those keywords, so that the user can quickly obtain associated videos meeting his or her needs without complex operations such as manual searching, improving the user's video interaction experience.

Description

Video display method, device, terminal, server, system and storage medium
Technical Field
The present disclosure relates to internet technologies, and in particular, to a video display method, apparatus, terminal, server, system, and storage medium.
Background
With the development of internet technology, more and more users hope to obtain information that meets their needs by watching videos. At present, however, a user who wants similar videos after watching one must determine keywords on his or her own and then search with those keywords, which is cumbersome and degrades the user experience.
Disclosure of Invention
The present disclosure provides a video display method, device, terminal, server, system, and storage medium, so as to at least solve the problem in the related art that acquiring a related video requires complicated operations. The technical solution of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a video display method, including:
acquiring a corresponding target keyword according to the playing state of the currently played target video;
generating a corresponding keyword tag according to the target keyword, and displaying the keyword tag in a playing area of the target video;
and responding to the trigger operation implemented on the keyword tag, and acquiring and displaying the associated video corresponding to the target keyword.
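The three steps above can be sketched as a minimal client-side flow. Every name and data structure here (`keywords_by_time`, `video_index`, the tag dictionary) is an illustrative assumption, not part of the disclosure:

```python
# Candidate keywords keyed by playing state (here: a playing time point, in seconds).
keywords_by_time = {5: "castle", 42: "castle interior"}

# A toy video index: keyword -> list of associated video ids.
video_index = {"castle": ["v101", "v102"], "castle interior": ["v103"]}

def get_target_keyword(play_time: int):
    """Step 1: obtain the keyword that corresponds to the current playing state."""
    return keywords_by_time.get(play_time)

def make_keyword_tag(keyword: str) -> dict:
    """Step 2: generate a tag object to be drawn in the playing area."""
    return {"text": keyword, "position": "top-right"}

def on_tag_triggered(tag: dict) -> list:
    """Step 3: on a trigger operation (tap/click), fetch the associated videos."""
    return video_index.get(tag["text"], [])

tag = make_keyword_tag(get_target_keyword(5))
print(on_tag_triggered(tag))  # -> ['v101', 'v102']
```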
Optionally, the obtaining a corresponding target keyword according to the playing state of the currently played target video includes:
determining playing content corresponding to the playing state according to the playing state of the currently played target video;
and selecting candidate keywords associated with the playing content from the candidate keywords acquired in advance as the target keywords.
Optionally, the method further includes:
performing image recognition on each frame in the target video, and determining the image content in each frame;
and extracting keywords corresponding to the image content from the image content of each frame to obtain candidate keywords corresponding to the target video.
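The per-frame extraction can be sketched as follows; `recognize_frame` stands in for a real image-recognition model and returns canned results purely for illustration:

```python
def recognize_frame(frame_index: int) -> list:
    """Hypothetical image recognition: return the content labels seen in a frame."""
    fake_results = {0: ["castle", "sky"], 1: ["castle"], 2: ["hall", "chandelier"]}
    return fake_results.get(frame_index, [])

def extract_candidate_keywords(frame_count: int) -> dict:
    """Recognize each frame and keep its keywords, keyed by frame index."""
    return {i: recognize_frame(i) for i in range(frame_count)}

candidates = extract_candidate_keywords(3)
print(candidates)
```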
Optionally, the method further includes:
acquiring user information corresponding to a user currently playing the target video;
and screening, according to the user information, the candidate keywords that are acquired in advance and correspond to the target video, to obtain candidate keywords associated with the user information as updated candidate keywords from which the target keyword is selected.
Optionally, the screening, according to the user information, the candidate keywords that are obtained in advance and correspond to the target video to obtain the candidate keywords associated with the user information includes:
taking the user information and the candidate keywords corresponding to the target video as input, processing through a keyword screening model, and outputting the candidate keywords associated with the user information;
wherein the user information at least comprises user portrait information and user historical behavior information;
the keyword screening model is obtained by training on a sample set with a machine learning algorithm of a preset type; each sample in the sample set comprises historical behavior data of a user account toward keywords displayed while playing videos, and each user account has a corresponding user portrait.
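The screening step can be sketched without a trained model by scoring each candidate keyword against the user information; the field names (`portrait_tags`, `clicked_keywords`) and the scoring rule are assumptions for illustration, not the disclosed model:

```python
def screen_keywords(user_info: dict, candidates: list) -> list:
    """Keep candidate keywords associated with the user, best-scoring first."""
    interests = set(user_info.get("portrait_tags", []))
    clicked = user_info.get("clicked_keywords", [])

    def score(kw: str) -> int:
        # Portrait matches weigh more than past clicks (arbitrary weights).
        return (kw in interests) * 2 + clicked.count(kw)

    # Drop keywords with no association to this user at all.
    return sorted((k for k in candidates if score(k) > 0), key=score, reverse=True)

user = {"portrait_tags": ["architecture"], "clicked_keywords": ["castle", "castle"]}
print(screen_keywords(user, ["castle", "architecture", "food"]))
# -> ['castle', 'architecture']
```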
Optionally, the playing status includes a playing time point;
the acquiring of the corresponding target keyword according to the playing state of the currently played target video includes:
and acquiring a target keyword corresponding to the playing time point according to the playing time point of the currently played target video.
Optionally, displaying the keyword tag in a playing area of the target video, including:
determining a video content area corresponding to the target keyword in the playing area, and displaying the keyword tag in the video content area; or
And displaying the keyword tag at a preset display position of a playing area of the target video.
Optionally, the obtaining and displaying the associated video corresponding to the target keyword includes:
searching in a preset video database according to a target keyword to obtain an associated video corresponding to the target keyword;
and displaying the associated video in the corresponding page display area.
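The lookup in the preset video database can be sketched as a filter over stored records, assuming each record keeps its keywords alongside the video (all record fields here are illustrative):

```python
# A toy stand-in for the preset video database.
VIDEO_DB = [
    {"id": "v1", "title": "Castles of Europe", "keywords": ["castle", "travel"]},
    {"id": "v2", "title": "Cooking at home", "keywords": ["food"]},
    {"id": "v3", "title": "Inside the castle", "keywords": ["castle", "interior"]},
]

def find_associated_videos(target_keyword: str) -> list:
    """Return every stored video whose keyword list contains the target keyword."""
    return [v for v in VIDEO_DB if target_keyword in v["keywords"]]

ids = [v["id"] for v in find_associated_videos("castle")]
print(ids)  # -> ['v1', 'v3']
```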
Optionally, the method further includes:
and displaying all target keywords corresponding to the target video after the target video is played.
Optionally, the displaying all the target keywords corresponding to the target video includes:
and displaying a mask layer on the target video, and displaying all target keywords corresponding to the target video on the mask layer.
Optionally, the method further includes:
when the target video is played, sending a keyword acquisition request to a server, and triggering the server to feed back a corresponding target keyword; and/or
The obtaining and displaying of the associated video corresponding to the target keyword comprises:
sending an associated video acquisition request to the server, and triggering the server to feed back the associated video corresponding to the target keyword;
and displaying the associated video.
According to a second aspect of the embodiments of the present disclosure, there is provided a video presentation method, including:
receiving a keyword acquisition request sent by a client, wherein the keyword acquisition request comprises the playing state of a target video currently played by the client;
acquiring a target keyword corresponding to the playing state, and feeding the target keyword back to the client, wherein the target keyword is used for displaying in a playing area of a target video of the client;
receiving an associated video acquisition request sent by the client, wherein the associated video acquisition request comprises the target keyword;
and acquiring an associated video corresponding to the target keyword, and feeding back the associated video to the client, wherein the associated video is used for displaying in the client.
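A minimal sketch of the two server-side exchanges described above: the keyword request and the associated-video request. The request/response shapes and the stored mappings are hypothetical:

```python
KEYWORDS_BY_STATE = {"t=10": "castle"}   # playing state -> stored keyword
ASSOCIATED = {"castle": ["v1", "v3"]}    # keyword -> associated video ids

def handle_keyword_request(request: dict) -> dict:
    """Client sends its current playing state; server feeds back the keyword."""
    state = request["play_state"]
    return {"target_keyword": KEYWORDS_BY_STATE.get(state)}

def handle_associated_video_request(request: dict) -> dict:
    """Client sends the target keyword; server feeds back the associated videos."""
    kw = request["target_keyword"]
    return {"videos": ASSOCIATED.get(kw, [])}

print(handle_keyword_request({"play_state": "t=10"}))
print(handle_associated_video_request({"target_keyword": "castle"}))
```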
Optionally, the method further includes:
carrying out image recognition on each frame in the target video in advance, and determining the image content in each frame;
extracting candidate keywords from the image content of each frame to obtain candidate keywords corresponding to the target video, and correspondingly storing the candidate keywords and the playing state of the target video;
the acquiring of the target keyword corresponding to the playing state includes:
and acquiring candidate keywords corresponding to the playing state as target keywords.
Optionally, the obtaining of the candidate keyword corresponding to the playing state as the target keyword includes:
acquiring user information corresponding to the client;
and screening candidate keywords corresponding to the playing state according to the user information, and acquiring candidate keywords associated with the user information as the target keywords.
Optionally, the user information includes user portrait information and user historical behavior information.
According to a third aspect of the embodiments of the present disclosure, there is provided a video presentation apparatus, including:
the target keyword acquisition module is configured to acquire a corresponding target keyword according to the playing state of a currently played target video;
the first keyword display module is configured to generate a corresponding keyword tag according to the target keyword, and display the keyword tag in a playing area of the target video;
and the associated video acquisition module is configured to respond to the triggering operation implemented on the keyword tag and acquire and display the associated video corresponding to the target keyword.
Optionally, the target keyword obtaining module includes:
a playing content determination unit configured to determine a playing content corresponding to a playing state according to the playing state of a currently played target video;
and the target keyword acquisition unit is configured to select candidate keywords associated with the playing content from the candidate keywords acquired in advance as the target keywords.
Optionally, the apparatus further comprises:
the image recognition module is configured to perform image recognition on each frame in the target video and determine image content in each frame;
and the keyword extraction module is configured to extract keywords corresponding to the image content from the image content of each frame to obtain candidate keywords corresponding to the target video.
Optionally, the apparatus further comprises:
the user information acquisition module is configured to acquire user information corresponding to a user currently playing the target video;
and the keyword screening module is configured to screen, according to the user information, the candidate keywords that are acquired in advance and correspond to the target video, and to obtain candidate keywords associated with the user information as updated candidate keywords from which the target keyword is selected.
Optionally, the keyword screening module is specifically configured to:
taking the user information and the candidate keywords corresponding to the target video as input, processing through a keyword screening model, and outputting the candidate keywords associated with the user information;
wherein the user information at least comprises user portrait information and user historical behavior information;
the keyword screening model is obtained by training on a sample set with a machine learning algorithm of a preset type; each sample in the sample set comprises historical behavior data of a user account toward keywords displayed while playing videos, and each user account has a corresponding user portrait.
Optionally, the playing status includes a playing time point;
the target keyword acquisition module is specifically configured to:
and acquiring a target keyword corresponding to the playing time point according to the playing time point of the currently played target video.
Optionally, the first keyword presentation module is specifically configured to:
determining a video content area corresponding to the target keyword in the playing area, and displaying the keyword tag in the video content area; or
And displaying the keyword tag at a preset display position of a playing area of the target video.
Optionally, the associated video obtaining module includes:
the associated video acquisition unit is configured to search a preset video database according to the target keyword and acquire the associated video corresponding to the target keyword;
and the associated video display unit is configured to display the associated video in the corresponding page display area.
Optionally, the apparatus further comprises:
and the second keyword display module is configured to display all target keywords corresponding to the target video after the target video is played.
Optionally, the second keyword display module is specifically configured to:
and after the target video is played, displaying a cover layer on the target video, and displaying all target keywords corresponding to the target video on the cover layer.
Optionally, the apparatus further comprises:
the keyword acquisition request sending module is configured to send a keyword acquisition request to a server when the target video is played, and trigger the server to feed back a corresponding target keyword; and/or
The associated video acquisition module comprises:
the associated video acquisition request sending unit is configured to send an associated video acquisition request to a server and trigger the server to feed back an associated video corresponding to the target keyword;
an associated video presentation unit configured to present the associated video.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a video display apparatus, comprising:
the system comprises a keyword acquisition request receiving module, a keyword acquisition request processing module and a keyword acquisition module, wherein the keyword acquisition request receiving module is configured to receive a keyword acquisition request sent by a client, and the keyword acquisition request comprises the playing state of a target video currently played by the client;
the keyword feedback module is configured to acquire a target keyword corresponding to the playing state and feed the target keyword back to the client, wherein the target keyword is used for displaying in a playing area of a target video of the client;
a relevant video acquisition request receiving module configured to receive a relevant video acquisition request sent by the client, where the relevant video acquisition request includes the target keyword;
and the associated video feedback module is configured to acquire an associated video corresponding to the target keyword and feed back the associated video to the client, wherein the associated video is used for displaying in the client.
Optionally, the apparatus further comprises:
the image recognition module is configured to perform image recognition on each frame in the target video in advance and determine image content in each frame;
the keyword extraction module is configured to extract candidate keywords from the image content of each frame to obtain candidate keywords corresponding to the target video, and correspondingly store the candidate keywords and the playing state of the target video;
the keyword feedback module comprises:
a keyword acquisition unit configured to acquire a candidate keyword corresponding to the play state as a target keyword.
Optionally, the keyword obtaining unit is specifically configured to:
acquiring user information corresponding to the client;
and screening candidate keywords corresponding to the playing state according to the user information, and acquiring candidate keywords associated with the user information as the target keywords.
Optionally, the user information includes user portrait information and user historical behavior information.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a terminal, including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video presentation method as described in the first aspect.
According to a sixth aspect of embodiments of the present disclosure, there is provided a server including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video presentation method according to the second aspect.
According to a seventh aspect of embodiments of the present disclosure, there is provided a storage medium, wherein instructions, when executed by a processor of a terminal, enable the terminal to perform the video presentation method according to the first aspect, or, when executed by a processor of a server, enable the server to perform the video presentation method according to the second aspect.
According to an eighth aspect of embodiments of the present disclosure, there is provided a computer program product comprising readable program code which, when executed by a processor of a terminal, enables the terminal to perform the video presentation method as described in the first aspect, or which, when executed by a processor of a server, enables the server to perform the video presentation method as described in the second aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
According to the embodiments of the disclosure, a target keyword corresponding to the playing state of the currently played target video is obtained; a keyword tag corresponding to the target keyword is generated and displayed in the playing area of the target video; and, in response to a trigger operation performed on the keyword tag, the associated video corresponding to the target keyword is obtained and displayed. By presenting, during playback, target keywords that correspond to the video's playing state, a user watching the video is given a direct way to obtain the associated videos corresponding to those keywords, so the user can quickly obtain associated videos meeting his or her needs without complicated operations such as searching manually, and the user's video interaction experience is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow diagram illustrating a video presentation method in accordance with an exemplary embodiment;
FIGS. 2a-2c are exemplary diagrams respectively illustrating keyword tags displayed in video content regions corresponding to target keywords in embodiments of the present disclosure;
FIG. 3 is an exemplary diagram illustrating display of all target keywords after completion of target video playback in an embodiment of the disclosure;
FIG. 4 is a flow diagram illustrating a method of video presentation in accordance with an exemplary embodiment;
FIG. 5 is an exemplary diagram of a video picture in an embodiment of the present disclosure;
FIG. 6 is a flow diagram illustrating a video push method in accordance with an exemplary embodiment;
FIG. 7 is a block diagram of a video presentation device according to an exemplary embodiment;
FIG. 8 is a block diagram illustrating a video push device in accordance with an exemplary embodiment;
FIG. 9 is a block diagram illustrating a terminal in accordance with an exemplary embodiment;
FIG. 10 is a block diagram illustrating a server in accordance with an exemplary embodiment;
FIG. 11 is a block diagram illustrating a video presentation system in accordance with an exemplary embodiment;
FIG. 12 is a flow chart illustrating a video presentation method according to an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
FIG. 1 is a flow chart illustrating a video presentation method according to an exemplary embodiment. As shown in FIG. 1, the method includes the following steps.
In step S11, a corresponding target keyword is obtained according to the playing status of the currently played target video.
Wherein, the playing state may include a playing time point and/or playing content of the target video.
In the process of playing the target video, the target keyword corresponding to the current playing state is acquired. The target keyword corresponding to a playing state may be obtained by recognizing, in advance, the playing content of the target video in that playing state.
In step S12, a corresponding keyword tag is generated according to the target keyword, and is displayed in the playing area of the target video.
After the target keyword corresponding to the playing state is acquired, the keyword tag corresponding to the target keyword is generated, and the keyword tag can be displayed in the playing area of the target video. The display position of the keyword tag in the playing area is not limited, and for example, the keyword tag may be displayed near the video content corresponding to the target keyword.
In an exemplary embodiment, the presenting the keyword tag in the playing area of the target video includes: determining a video content area corresponding to the target keyword in the playing area, and displaying the keyword tag in the video content area; or displaying the keyword tag at a preset display position of a playing area of the target video.
When displaying the keyword tag, a display position of the keyword tag is first determined, where the display position of the keyword tag may be a video content region corresponding to the target keyword, or may be a preset display position, for example, the preset display position may be a position such as an upper right corner of a playing region.
FIGS. 2a to 2c are exemplary diagrams showing keyword tags displayed in the video content region corresponding to a target keyword in embodiments of the present disclosure. As shown in FIG. 2a, the video picture corresponding to the playing state contains a castle, so the displayed target keyword may be "beautiful castle", with the keyword tag shown near the castle. As shown in FIG. 2b, if a medieval British castle has been recognized in advance in the corresponding video picture, then as the target video plays, the displayed target keyword is "British castle". As shown in FIG. 2c, as playback continues and the video content switches to the inside of the castle, the displayed target keyword becomes "castle interior".
Displaying the keyword tag in the video content area corresponding to the target keyword helps the user associate the target keyword with the video content, while displaying the keyword tag at a preset display position in the playing area of the target video prevents it from interfering with the user's viewing.
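One plausible way to realize the two placement strategies is to anchor the tag to the bounding box of the recognized content when one exists, and otherwise fall back to a preset corner. The coordinates and offsets below are illustrative assumptions:

```python
def tag_position(bbox, frame_w: int = 1280, frame_h: int = 720):
    """Return (x, y) for the tag; bbox is (x, y, w, h) of the recognized region, or None."""
    if bbox is None:
        # Preset display position: near the upper right corner of the playing area.
        return (frame_w - 200, 20)
    x, y, w, h = bbox
    # Place the tag just above the recognized region, clamped to the frame.
    return (x, max(y - 30, 0))

print(tag_position((400, 250, 300, 200)))  # -> (400, 220)
print(tag_position(None))                  # -> (1080, 20)
```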
In step S13, in response to the triggering operation performed on the keyword tag, an associated video corresponding to the target keyword is acquired and displayed.
The displayed keyword tag can receive a trigger operation from the user. When a trigger operation on the keyword tag is detected, the associated video corresponding to the target keyword is obtained and displayed. The associated video may be displayed by jumping to an associated-video display page, or in a preset area of the page on which the target video is currently playing. The associated video corresponding to the target keyword may be a video that includes the target keyword.
In an exemplary embodiment, the obtaining and displaying the associated video corresponding to the target keyword includes: searching in a preset video database according to a target keyword to obtain an associated video corresponding to the target keyword; and displaying the associated video in the corresponding page display area.
The preset video database may store a plurality of videos together with the keywords corresponding to each video, so videos that include the target keyword can be retrieved from the database to obtain the associated videos corresponding to the target keyword, which are then displayed. The associated videos may be displayed in the playing area of the target video, or on a dedicated display page reached by jumping from the current page. Searching the preset video database for the associated videos corresponding to the target keyword and displaying them in the corresponding page display area ensures that all associated videos for the keyword can be shown in time. When the associated videos are displayed on a dedicated display page and a trigger operation on the return button is detected, the playing page of the target video is restored and playback continues.
In the video display method provided by this exemplary embodiment, the target keyword corresponding to the playing state of the currently played target video is obtained; the keyword tag corresponding to the target keyword is generated and displayed in the playing area of the target video; and, in response to a trigger operation performed on the keyword tag, the associated video corresponding to the target keyword is obtained and displayed. By presenting target keywords that correspond to the playing state during playback, the method gives a user watching the video a direct way to obtain the corresponding associated videos, so the user can quickly obtain associated videos meeting his or her needs without complicated operations such as searching manually, improving the user's video interaction experience.
On the basis of the technical scheme, the playing state comprises a playing time point;
the acquiring of the corresponding target keyword according to the playing state of the currently played target video includes: and acquiring a target keyword corresponding to the playing time point according to the playing time point of the currently played target video.
For a given target video, the playing content at each playing time point is fixed, so a target keyword can be bound to a playing time point of the target video. While the target video is playing, the client queries, for the current playing time point, whether a corresponding target keyword exists, and obtains it if so. The target keyword corresponding to the current playing content can thus be obtained quickly and displayed in time.
On the basis of the technical scheme, the method further comprises the following steps:
and displaying all target keywords corresponding to the target video after the target video is played.
After the target video finishes playing, a replay button may be displayed, and all target keywords corresponding to the target video may be displayed at the same time. Each displayed target keyword can receive a trigger operation; if a trigger operation on one of the target keywords is detected, the associated videos corresponding to that keyword can be retrieved from the preset video database and displayed. Displaying all target keywords after playback finishes gives the viewer another way to obtain the associated videos corresponding to the target keywords, further improving the user's video interaction experience.
On the basis of the above technical solution, the displaying of all target keywords corresponding to the target video includes: displaying a mask layer on the target video, and displaying all target keywords corresponding to the target video on the mask layer.
As shown in fig. 3, after the target video finishes playing, a mask layer is displayed on the target video. The mask layer may cover the entire display screen; a replay button and all target keywords corresponding to the target video are displayed on the mask layer. Displaying the target keywords through the mask layer makes the displayed keywords clearer and avoids modifying the video picture of the target video.
On the basis of the above technical solution, the method further includes: when the target video is played, sending a keyword acquisition request to a server to trigger the server to feed back the corresponding target keyword; and/or
The obtaining and displaying of the associated video corresponding to the target keyword comprises:
sending an associated video acquisition request to a server to trigger the server to feed back the associated video corresponding to the target keyword; and displaying the associated video.
The target keywords and associated videos may be fed back by the server at the client's request. When the client plays the target video, it sends a keyword acquisition request to the server to trigger the server to feed back the target keywords corresponding to the target video; alternatively, the client generates a keyword acquisition request according to the playing state of the currently played target video and sends it to the server to trigger the server to feed back the target keywords corresponding to that playing state. When a triggering operation performed by the user on a keyword tag is detected, an associated video acquisition request is sent to the server; the request includes the target keyword corresponding to the keyword tag, so the server is triggered to feed back the associated video corresponding to the target keyword. Having the server feed back the target keywords and associated videos places the computation-heavy tasks on the server, which guarantees the response speed of the client.
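The two client-to-server requests described above might carry payloads along these lines. The field names and the JSON encoding are purely illustrative assumptions; the disclosure does not specify a wire format.

```python
import json

def build_keyword_request(video_id, play_time):
    """Hypothetical keyword acquisition request carrying the playing state."""
    return json.dumps({"type": "get_keywords",
                       "video_id": video_id,
                       "play_time": play_time})

def build_associated_video_request(target_keyword):
    """Hypothetical associated-video acquisition request for a tapped tag."""
    return json.dumps({"type": "get_associated_videos",
                       "keyword": target_keyword})
```

The server would parse the first request to look up keywords for the reported playing state, and the second to search its video database for the given keyword.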
Fig. 4 is a flow chart illustrating a video presentation method according to an exemplary embodiment, as shown in fig. 4, including the following steps.
In step S41, the playing content corresponding to the playing state is determined according to the playing state of the currently played target video.
In the process of playing the target video, the playing content corresponding to the playing state is determined. For example, if there is a basketball court in the picture of the currently played target video, the playing content corresponding to the playing state can be determined to be a basketball court.
In step S42, a candidate keyword associated with the playing content is selected from the candidate keywords acquired in advance as the target keyword.
The candidate keywords acquired in advance may be all candidate keywords corresponding to the target video. After the playing content corresponding to the playing state is determined, candidate keywords associated with that playing content can be selected from the candidate keywords acquired in advance as the target keywords, and keyword tags corresponding to the target keywords are displayed in the current video picture. For example, if the candidate keywords acquired in advance include "basketball court", "basketball", and the like, and the playing content corresponding to the playing state is a basketball court, the keyword "basketball court" associated with the playing content may be selected from the candidate keywords as the target keyword.
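The selection in step S42 can be sketched as a simple filter over the pre-fetched candidates. The function name and the example data are illustrative assumptions, not part of the disclosure.

```python
def select_target_keywords(playing_content, candidates):
    """Keep only pre-fetched candidates that match the recognized content.

    playing_content: set of labels recognized in the current video picture.
    candidates: all candidate keywords acquired in advance for the video.
    """
    return [kw for kw in candidates if kw in playing_content]
```

Matching the basketball-court example above, filtering `["basketball court", "basketball", "skiing"]` against the recognized content `{"basketball court", "crowd"}` keeps only "basketball court".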
In step S43, a corresponding keyword tag is generated according to the target keyword, and is displayed in the playing area of the target video.
In step S44, in response to the triggering operation performed on the keyword tag, an associated video corresponding to the target keyword is acquired and displayed.
According to the video display method provided by this exemplary embodiment, the playing content corresponding to the playing state is determined, and candidate keywords associated with that playing content are selected from the candidate keywords acquired in advance as the target keywords, so the determined target keywords are associated with the current playing content and different target keywords can be displayed for different playing content. The displayed keyword tags remind the user of the current playing content, and the associated video can be displayed directly in response to the user's triggering operation on a keyword tag. This simplifies the process in which the user would otherwise have to abstract keywords from the current playing content and search for associated videos on their own: the corresponding associated video can be obtained through the displayed keyword tag, which avoids the situation where keywords abstracted by the user fail to retrieve the corresponding associated video accurately.
On the basis of the above technical solution, the method further includes: performing image recognition on each frame in the target video to determine the image content in each frame; and extracting keywords corresponding to the image content from the image content of each frame to obtain the candidate keywords corresponding to the target video.
Image recognition is performed on each frame in the target video in advance to determine the image content in each frame; keyword extraction is then performed on the image content of each frame to obtain the keywords corresponding to each frame, and the per-frame keywords are de-duplicated to obtain the candidate keywords corresponding to the target video. Because the candidate keywords are determined through image recognition in advance, target keywords can be selected from them and displayed in time when the target video is played, avoiding the delayed display that would result from running image recognition during playback.
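The offline extraction and de-duplication described above can be sketched as follows. `recognize_frame` is a hypothetical stand-in for a real image-recognition model, and all names are illustrative.

```python
def recognize_frame(frame):
    # Hypothetical stand-in for an image-recognition model: here each
    # "frame" already carries the keywords the model would return.
    return frame["keywords"]

def candidate_keywords(frames):
    """Collect per-frame keywords and de-duplicate, preserving first-seen order."""
    seen, candidates = set(), []
    for frame in frames:
        for kw in recognize_frame(frame):
            if kw not in seen:          # de-duplicate across frames
                seen.add(kw)
                candidates.append(kw)
    return candidates
```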
When image recognition is performed on each frame, the recognition can be based on video tags, and the video tags can be hierarchical, so that a higher level is recognized first and then the specific image content. In a hierarchical tag set, for example, a primary tag may be sport; under it there may be a plurality of secondary tags such as diving, roller skating, sports, fitness, and skiing; under sports there may be a tertiary tag such as ball games, which in turn may include quaternary tags such as soccer, basketball, table tennis, baseball, and golf. Taking the video picture shown in fig. 5 as an example, when the content of the picture is recognized, it may be determined that the objects include people and a ball, that there are multiple people, and that the ball is a basketball; it may further be determined that the scene is a basketball court. The keywords corresponding to the video picture may therefore include multiple people, basketball, basketball court, and the like.
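The hierarchical-tag drill-down can be illustrated with a small tag tree modeled on the example above. The tree contents and function names are assumptions for illustration only.

```python
# Hypothetical hierarchical tag tree: primary tag at the root, more
# specific tags at deeper levels.
TAG_TREE = {
    "sports": {
        "ball games": {"soccer": {}, "basketball": {}, "table tennis": {},
                       "baseball": {}, "golf": {}},
        "diving": {}, "roller skating": {}, "fitness": {}, "skiing": {},
    }
}

def tag_path(tree, leaf, path=()):
    """Return the hierarchy path from the primary tag down to `leaf`."""
    for tag, children in tree.items():
        if tag == leaf:
            return path + (tag,)
        found = tag_path(children, leaf, path + (tag,))
        if found:
            return found
    return None
```

Recognizing "basketball" in a frame would thus also place it under the broader "ball games" and "sports" levels, matching the coarse-to-fine recognition order described above.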
On the basis of the above technical solution, the method further includes: acquiring user information corresponding to the user currently playing the target video; and screening, according to the user information, the candidate keywords acquired in advance for the target video to acquire the candidate keywords associated with the user information, obtaining updated candidate keywords from which the target keywords are selected.
Wherein the user information may include user portrait information and user historical behavior information.
The candidate keywords corresponding to the target video are keywords associated with the video content of the target video. For any given user, some of these keywords are of interest and some are not. The candidate keywords acquired in advance for the target video can therefore be screened according to the user information, keeping the keywords the current user is likely to click and obtaining the candidate keywords associated with the user information. When the target keywords are then selected from the updated candidate keywords, the displayed target keywords match the user's interests, which avoids interfering with viewing by displaying too many keywords the user does not care about.
In an exemplary embodiment, the screening, according to the user information, of the candidate keywords acquired in advance for the target video to obtain the candidate keywords associated with the user information includes: taking the user information and the candidate keywords corresponding to the target video as input, processing them through a keyword screening model, and outputting the candidate keywords associated with the user information. The user information at least includes user portrait information and user historical behavior information. The keyword screening model is obtained by training on a sample set through a preset type of machine learning algorithm; each sample in the sample set includes historical behavior data on keywords displayed while a user account played videos, and each user account has a corresponding user portrait.
The user historical behavior information may include user play records and user interaction behavior information. The user interaction behavior information may include at least one of like behavior information, comment behavior information, and share behavior information.
When the candidate keywords associated with the user information are screened from all candidate keywords corresponding to the target video, the screening can be based on a trained keyword screening model: the user information and all candidate keywords corresponding to the target video are input into the model, the model estimates the probability that each candidate keyword would be clicked by the current user, and the candidate keywords whose probability exceeds a threshold are determined to be the candidate keywords associated with the user information. Because the keyword screening model is obtained by training on a sample set through a preset type of machine learning algorithm, screening through the model both speeds up the screening and improves the accuracy of the screened candidate keywords.
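The screening step can be sketched as thresholding predicted click probabilities. The toy scoring function below stands in for the trained keyword screening model; all names and the 0.5 threshold are illustrative assumptions.

```python
def screen_keywords(user_info, candidates, score_fn, threshold=0.5):
    """Keep candidates whose predicted click probability exceeds threshold."""
    return [kw for kw in candidates
            if score_fn(user_info, kw) > threshold]

def toy_score(user_info, keyword):
    # Stand-in for the model's estimate: high probability if the keyword
    # matches one of the user's declared interests, low otherwise.
    return 1.0 if keyword in user_info["interests"] else 0.1
```

In a real system, `score_fn` would be the inference call of the trained model, consuming the user portrait and historical behavior features rather than a simple interest set.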
Fig. 6 is a flow chart illustrating a video presentation method according to an exemplary embodiment, as shown in fig. 6, including the following steps.
In step S61, a keyword obtaining request sent by a client is received, where the keyword obtaining request includes a playing status of a target video currently played by the client.
In the process of playing the target video, the client generates a keyword acquisition request corresponding to the current playing state and sends it to the server, and the server receives the keyword acquisition request sent by the client.
In step S62, a target keyword corresponding to the playing state is obtained and fed back to the client, where the target keyword is to be displayed in the playing area of the target video at the client.
The server can recognize the video content corresponding to each playing state of the target video in advance and thereby generate the keywords corresponding to each playing state. After receiving the keyword acquisition request sent by the client, the server acquires the target keyword corresponding to the client's current playing state and feeds it back to the client; after receiving the target keyword, the client can display it in the playing area of the target video.
In step S63, an associated video acquisition request sent by the client is received, where the associated video acquisition request includes the target keyword.
If the user triggers a target keyword displayed in the client, the client generates an associated video acquisition request and sends it to the server, so the server receives an associated video acquisition request that includes the target keyword.
In step S64, an associated video corresponding to the target keyword is obtained, and the associated video is fed back to the client, where the associated video is used for displaying in the client.
The server searches a preset video database for the associated video corresponding to the target keyword in the associated video acquisition request and feeds the associated video back to the client; after receiving it, the client displays the associated video.
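The server-side search in step S64 can be sketched as a keyword match over a preset video database. The in-memory list below is a stand-in for a real database query, and all identifiers are illustrative.

```python
# Hypothetical preset video database: each entry maps a video id to the
# keywords extracted from that video.
VIDEO_DB = [
    {"id": "v1", "keywords": {"basketball", "dunk"}},
    {"id": "v2", "keywords": {"skiing"}},
    {"id": "v3", "keywords": {"basketball court", "basketball"}},
]

def find_associated_videos(target_keyword):
    """Return ids of videos whose keyword sets contain the target keyword."""
    return [v["id"] for v in VIDEO_DB if target_keyword in v["keywords"]]
```

A production system would replace the linear scan with an inverted index or a database query keyed on the keyword.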
In the video display method provided by this exemplary embodiment, the server receives the keyword acquisition request sent by the client, acquires the target keyword corresponding to the playing state of the target video currently played by the client, and feeds the target keyword back to the client, so the client can display it in the playing area of the target video. The displayed target keyword can be operated by the user to generate an associated video acquisition request; after receiving that request, the server feeds the associated video corresponding to the target keyword back to the client for display. During video playing, the target keyword corresponding to the playing state thus gives the viewer a path for acquiring the associated videos of that keyword, so the user can quickly obtain associated videos that meet their own needs without cumbersome operations such as searching by themselves, which improves the user's video interaction experience.
On the basis of the above technical solution, the method further includes: performing image recognition on each frame in the target video in advance to determine the image content in each frame; extracting candidate keywords from the image content of each frame to obtain the candidate keywords corresponding to the target video, and storing the candidate keywords in correspondence with the playing states of the target video;
the acquiring of the target keyword corresponding to the playing state includes: and acquiring candidate keywords corresponding to the playing state as target keywords.
Image recognition is performed on each frame in the target video in advance to determine the image content in each frame; keyword extraction is performed on the image content of each frame to obtain the candidate keywords corresponding to each frame; the per-frame candidate keywords are de-duplicated to obtain the candidate keywords corresponding to the target video; and the correspondence between the candidate keywords and the playing states of the target video is derived from the correspondence between the candidate keywords and the frames, so the candidate keywords are stored together with the playing states of the target video. Because the candidate keywords are determined through image recognition in advance, when a client playing the target video requests keywords, target keywords can be selected from the candidate keywords in time and fed back to the client. This avoids both the delayed display that would result from running image recognition while the client plays the target video and the repeated image recognition that would otherwise be performed for every client's keyword acquisition request.
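Storing candidate keywords in correspondence with playing states, so the server can answer keyword acquisition requests with a lookup rather than online recognition, can be sketched as follows. Using time points as the playing states, and all names, are illustrative assumptions.

```python
def build_keyword_store(frame_keywords):
    """Build a playing-state -> keywords store from per-frame results.

    frame_keywords: iterable of (time_point, [keywords]) pairs, one per
    recognized frame.
    """
    store = {}
    for time_point, keywords in frame_keywords:
        bucket = store.setdefault(time_point, [])
        for kw in keywords:
            if kw not in bucket:        # de-duplicate within a playing state
                bucket.append(kw)
    return store

def lookup(store, play_time):
    """Answer a keyword acquisition request for the given playing state."""
    return store.get(play_time, [])
```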
When image recognition is performed on each frame, the recognition can be based on video tags, and the video tags can be hierarchical, so that a higher level is recognized first and then the specific image content. In a hierarchical tag set, for example, a primary tag may be sport; under it there may be a plurality of secondary tags such as diving, roller skating, sports, fitness, and skiing; under sports there may be a tertiary tag such as ball games, which in turn may include quaternary tags such as soccer, basketball, table tennis, baseball, and golf. Taking the video picture shown in fig. 5 as an example, when the content of the picture is recognized, it may be determined that the objects include people and a ball, that there are multiple people, and that the ball is a basketball; it may further be determined that the scene is a basketball court. The keywords corresponding to the video picture may therefore include multiple people, basketball, basketball court, and the like.
On the basis of the above technical solution, the obtaining of the candidate keyword corresponding to the play status as the target keyword includes: acquiring user information corresponding to the client; and screening candidate keywords corresponding to the playing state according to the user information, and acquiring candidate keywords associated with the user information as the target keywords.
Wherein the user information may include user portrait information and user historical behavior information. The user historical behavior information may include user play records and user interaction behavior information. The user interaction behavior information may include at least one of like behavior information, comment behavior information, and share behavior information.
The candidate keywords corresponding to the target video are keywords associated with the video content of the target video. For any given user, the video content behind some of these keywords is of interest and the content behind others is not. The candidate keywords corresponding to the playing state can therefore be screened according to the user information, keeping the candidate keywords the current user is likely to click and taking the candidate keywords associated with the user information as the target keywords. The target keywords displayed by the client then match the user's interests, which avoids interfering with viewing by displaying too many keywords the user does not care about.
Fig. 7 is a block diagram illustrating a video presentation device according to an example embodiment. Referring to fig. 7, the apparatus includes a target keyword acquisition module 71, a first keyword presentation module 72, and an associated video acquisition module 73.
The target keyword obtaining module 71 is configured to obtain a corresponding target keyword according to a playing state of a currently played target video;
the first keyword display module 72 is configured to generate a corresponding keyword tag according to the target keyword, and display the keyword tag in the playing area of the target video;
the associated video acquiring module 73 is configured to acquire an associated video corresponding to the target keyword for presentation in response to a triggering operation implemented on the keyword tag.
Optionally, the target keyword obtaining module includes:
a playing content determination unit configured to determine a playing content corresponding to a playing state according to the playing state of a currently played target video;
and the target keyword acquisition unit is configured to select candidate keywords associated with the playing content from the candidate keywords acquired in advance as the target keywords.
Optionally, the apparatus further comprises:
the image recognition module is configured to perform image recognition on each frame in the target video and determine image content in each frame;
and the keyword extraction module is configured to extract keywords corresponding to the image content from the image content of each frame to obtain candidate keywords corresponding to the target video.
Optionally, the apparatus further comprises:
the user information acquisition module is configured to acquire user information corresponding to a user currently playing the target video;
and the keyword screening module is configured to screen, according to the user information, the candidate keywords acquired in advance for the target video, acquire the candidate keywords associated with the user information, and obtain the updated candidate keywords from which the target keywords are selected.
Optionally, the keyword screening module is specifically configured to:
taking the user information and the candidate keywords corresponding to the target video as input, processing through a keyword screening model, and outputting the candidate keywords associated with the user information;
wherein the user information at least comprises user portrait information and user historical behavior information;
the keyword screening model is obtained by training a sample set through a preset type of machine learning algorithm, each sample included in the sample set comprises historical behavior data of a keyword displayed in a user account playing video, and the user account has a corresponding user portrait.
Optionally, the playing status includes a playing time point;
the target keyword acquisition module is specifically configured to:
and acquiring a target keyword corresponding to the playing time point according to the playing time point of the currently played target video.
Optionally, the first keyword presentation module is specifically configured to:
determining a video content area corresponding to the target keyword in the playing area, and displaying the keyword tag in the video content area; or
displaying the keyword tag at a preset display position of the playing area of the target video.
Optionally, the associated video obtaining module includes:
the associated video acquisition unit is configured to search a preset video database according to the target keyword and acquire the associated video corresponding to the target keyword;
and the associated video display unit is configured to display the associated video in the corresponding page display area.
Optionally, the apparatus further comprises:
and the second keyword display module is configured to display all target keywords corresponding to the target video after the target video is played.
Optionally, the second keyword display module is specifically configured to:
after the target video finishes playing, displaying a mask layer on the target video, and displaying all target keywords corresponding to the target video on the mask layer.
Optionally, the apparatus further comprises:
the keyword acquisition request sending module is configured to send a keyword acquisition request to a server when the target video is played, and trigger the server to feed back a corresponding target keyword; and/or
The associated video acquisition module comprises:
the associated video acquisition request sending unit is configured to send an associated video acquisition request to a server and trigger the server to feed back an associated video corresponding to the target keyword;
an associated video presentation unit configured to present the associated video.
The video display device provided by this exemplary embodiment acquires the target keyword corresponding to the playing state of the currently played target video, generates a keyword tag corresponding to the target keyword and displays it in the playing area of the target video, and, in response to a triggering operation performed on the keyword tag, acquires the associated video corresponding to the target keyword for display.
Fig. 8 is a block diagram illustrating a video presentation device according to an example embodiment. The device can be applied to a server. Referring to fig. 8, the apparatus includes a keyword acquisition request receiving module 81, a keyword feedback module 82, an associated video acquisition request receiving module 83, and an associated video feedback module 84.
The keyword obtaining request receiving module 81 is configured to receive a keyword obtaining request sent by a client, where the keyword obtaining request includes a playing state of a target video currently played by the client;
the keyword feedback module 82 is configured to obtain a target keyword corresponding to the playing status, and feed the target keyword back to the client, where the target keyword is used for displaying in a playing area of a target video of the client;
the associated video obtaining request receiving module 83 is configured to receive an associated video obtaining request sent by the client, where the associated video obtaining request includes the target keyword;
the associated video feedback module 84 is configured to obtain an associated video corresponding to the target keyword, and feed back the associated video to the client, where the associated video is used for displaying in the client.
Optionally, the apparatus further comprises:
the image recognition module is configured to perform image recognition on each frame in the target video in advance and determine image content in each frame;
the keyword extraction module is configured to extract candidate keywords from the image content of each frame to obtain candidate keywords corresponding to the target video, and correspondingly store the candidate keywords and the playing state of the target video;
the keyword feedback module comprises:
a keyword acquisition unit configured to acquire a candidate keyword corresponding to the play state as a target keyword.
Optionally, the keyword obtaining unit is specifically configured to:
acquiring user information corresponding to the client;
and screening candidate keywords corresponding to the playing state according to the user information, and acquiring candidate keywords associated with the user information as the target keywords.
Optionally, the user information includes user portrait information and user historical behavior information. The user portrait information includes at least one of the user's age, gender, geographic location, and interest types; the user historical behavior information includes the user's video-related behavior data in a preset historical time period, such as viewing frequency for different types of videos, video watching duration, video like frequency, video sharing frequency, and video comment frequency.
The video display apparatus provided by this exemplary embodiment receives the keyword acquisition request sent by the client, acquires the target keyword corresponding to the playing state of the target video currently played by the client, and feeds the target keyword back to the client, so the client can display it in the playing area of the target video. The displayed target keyword can be operated by the user to generate an associated video acquisition request; after receiving that request, the apparatus feeds the associated video corresponding to the target keyword back to the client for display. During video playing, the target keyword corresponding to the playing state thus gives the viewer a path for acquiring the associated videos of that keyword, so the user can quickly obtain associated videos that meet their own needs without cumbersome operations such as searching by themselves, which improves the user's video interaction experience.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 9 is a block diagram illustrating a terminal according to an example embodiment. For example, terminal 900 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, and the like.
Referring to fig. 9, terminal 900 can include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
Processing component 902 generally controls overall operation of terminal 900, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. Processing component 902 may include one or more processors 920 to execute instructions to perform all or a portion of the steps of the video presentation method described above. Further, processing component 902 can include one or more modules that facilitate interaction between processing component 902 and other components. For example, the processing component 902 can include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
Memory 904 is configured to store various types of data to support operation at terminal 900. Examples of such data include instructions for any application or method operating on terminal 900, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 904 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power components 906 provide power to the various components of the terminal 900. The power components 906 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal 900.
The multimedia components 908 include a screen providing an output interface between the terminal 900 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the terminal 900 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, audio component 910 includes a Microphone (MIC) configured to receive external audio signals when terminal 900 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 904 or transmitted via the communication component 916. In some embodiments, audio component 910 also includes a speaker for outputting audio signals.
I/O interface 912 provides an interface between processing component 902 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 914 includes one or more sensors for providing various aspects of state assessment for the terminal 900. For example, the sensor component 914 can detect an open/closed state of the terminal 900, the relative positioning of components such as the display and keypad of the terminal 900, a change in position of the terminal 900 or a component of the terminal 900, the presence or absence of user contact with the terminal 900, the orientation or acceleration/deceleration of the terminal 900, and a change in temperature of the terminal 900. The sensor component 914 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the terminal 900 and other devices. The terminal 900 can access a wireless network based on a communication standard, such as WiFi, an operator network (e.g., 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast-associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal 900 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described video presentation methods.
FIG. 10 is a block diagram illustrating a server in accordance with an example embodiment. Referring to fig. 10, server 1000 includes a processing component 1022 that further includes one or more processors and memory resources, represented by memory 1032, for storing instructions, such as application programs, that are executable by processing component 1022. The application programs stored in memory 1032 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1022 is configured to execute instructions to perform the video presentation methods described above.
The server 1000 may also include a power component 1026 configured to perform power management for the server 1000, a wired or wireless network interface 1050 configured to connect the server 1000 to a network, and an input/output (I/O) interface 1058. The server 1000 may operate based on an operating system stored in the memory 1032, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
Fig. 11 is a block diagram illustrating a video presentation system according to an exemplary embodiment. Referring to fig. 11, the video presentation system 1100 includes a terminal 900 and a server 1000, which interact to implement the presentation of a video. Fig. 12 is a flowchart illustrating a video presentation method implemented interactively by the terminal 900 and the server 1000 in the video presentation system 1100 according to an exemplary embodiment. As shown in fig. 12, the video presentation method includes the following steps.
In step S121, the client sends a keyword acquisition request to the server when playing the target video.
In step S122, the server screens the candidate keywords corresponding to the target video according to the user information of the user currently playing the target video, and retains the candidate keywords associated with that user information as the updated candidate keywords.
The candidate keywords corresponding to the target video are obtained by the server in advance: the server performs image recognition on each frame in the target video to determine the image content of each frame, and extracts corresponding keywords from the image content of each frame to obtain the candidate keywords corresponding to the target video.
The process of the server obtaining the candidate keyword associated with the user information is the same as the above embodiment, and is not described here again.
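The per-frame extraction described above can be sketched as follows. This is a minimal Python illustration only: the frame representation and the `recognize_frame` lookup are stand-ins for a real image-recognition model, which the disclosure does not specify.

```python
def recognize_frame(frame):
    """Hypothetical recognizer: returns labels describing a frame's image content.
    A real system would run an image-recognition model here."""
    return frame.get("labels", [])

def extract_candidate_keywords(frames):
    """Collect de-duplicated keywords across all frames of the target video,
    preserving the order in which they first appear."""
    candidates = []
    seen = set()
    for frame in frames:
        for label in recognize_frame(frame):
            if label not in seen:
                seen.add(label)
                candidates.append(label)
    return candidates

# Hypothetical pre-labeled frames of a target video.
frames = [
    {"t": 0.0, "labels": ["dog", "park"]},
    {"t": 1.0, "labels": ["dog", "frisbee"]},
]
print(extract_candidate_keywords(frames))  # ['dog', 'park', 'frisbee']
```

In practice this offline pass would run once per uploaded video, so the per-request work in S122 reduces to filtering an already-stored keyword list.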
In step S123, the server sends the correspondence between the updated candidate keywords and the playing states to the client.
In step S124, the client obtains a corresponding target keyword from the updated candidate keywords according to the playing status of the currently played target video.
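Step S124 can be illustrated with a small sketch. Here the correspondence sent in S123 is assumed, purely for illustration, to map each candidate keyword to a (start, end) playback interval in seconds; the disclosure does not fix a concrete representation.

```python
def target_keywords_for_state(correspondence, position_s):
    """Pick the candidate keywords whose playback interval covers the current
    playback position (the 'playing state' of the target video)."""
    return [kw for kw, (start, end) in correspondence.items()
            if start <= position_s < end]

# Hypothetical correspondence received from the server in S123.
correspondence = {"dog": (0, 30), "frisbee": (10, 20)}
print(target_keywords_for_state(correspondence, 15))  # ['dog', 'frisbee']
print(target_keywords_for_state(correspondence, 25))  # ['dog']
```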
In step S125, the client generates a corresponding keyword tag according to the target keyword, and displays the keyword tag in the playing area of the target video.
In step S126, the client sends an associated video acquisition request to the server in response to the triggering operation implemented on the keyword tag.
In step S127, the server feeds back the associated video corresponding to the target keyword.
In step S128, the client presents the associated video.
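The S121–S128 exchange can be sketched end to end with two in-process objects standing in for the client and the server. The screening rule used here (intersecting the video's candidate keywords with a user-interest set) is only an assumed concretization of "screening according to user information"; all names and data are hypothetical.

```python
class Server:
    def __init__(self, video_keywords, user_interests, videos_by_keyword):
        self.video_keywords = video_keywords        # video_id -> candidate keywords
        self.user_interests = user_interests        # user_id -> set of interests
        self.videos_by_keyword = videos_by_keyword  # keyword -> associated videos

    def get_keywords(self, video_id, user_id):
        # S122/S123: screen candidates against the user's information.
        interests = self.user_interests.get(user_id, set())
        return [k for k in self.video_keywords.get(video_id, []) if k in interests]

    def get_associated_videos(self, keyword):
        # S127: feed back the videos associated with the tapped keyword.
        return self.videos_by_keyword.get(keyword, [])

class Client:
    def __init__(self, server):
        self.server = server
        self.tags = []

    def play(self, video_id, user_id):
        # S121/S124/S125: request keywords and show them as tags in the play area.
        self.tags = self.server.get_keywords(video_id, user_id)
        return self.tags

    def tap_tag(self, keyword):
        # S126/S128: request and present the associated videos for a tapped tag.
        return self.server.get_associated_videos(keyword)

server = Server(
    video_keywords={"v1": ["dog", "park", "frisbee"]},
    user_interests={"u1": {"dog", "frisbee"}},
    videos_by_keyword={"dog": ["v7", "v9"]},
)
client = Client(server)
print(client.play("v1", "u1"))   # ['dog', 'frisbee']
print(client.tap_tag("dog"))     # ['v7', 'v9']
```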
The specific implementation manner of each step in this exemplary embodiment is the same as that of the relevant step in the above embodiments, and is not described here again.
In this exemplary embodiment, the client sends a keyword acquisition request to the server when playing the target video, the server sends the candidate keywords associated with the user information and the target video to the client, and the client displays the target keyword corresponding to the current playing state of the target video. The user can trigger the keyword tag corresponding to the target keyword to acquire the associated video corresponding to that keyword. Relevant keywords are thereby pushed according to the user information and displayed at the client, and the target keyword shown for the current playing state gives a user watching the video a direct way to obtain the corresponding associated video during playback. The user can thus quickly obtain associated videos that meet his or her own needs without complex operations such as manual retrieval, which improves the user's video interaction experience.
In an exemplary embodiment, there is also provided a storage medium comprising instructions, such as the memory 904 comprising instructions executable by the processor 920 of the terminal 900 to perform the video presentation method described above, or the memory 1032 comprising instructions executable by the processing component 1022 of the server 1000 to perform the video presentation method described above. Alternatively, the storage medium may be a non-transitory computer readable storage medium, which may be, for example, a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A method for video presentation, comprising:
acquiring a corresponding target keyword according to the playing state of the currently played target video;
generating a corresponding keyword tag according to the target keyword, and displaying the keyword tag in a playing area of the target video;
and responding to the trigger operation implemented on the keyword tag, and acquiring and displaying the associated video corresponding to the target keyword.
2. The method according to claim 1, wherein the obtaining the corresponding target keyword according to the playing status of the currently played target video comprises:
determining playing content corresponding to the playing state according to the playing state of the currently played target video;
and selecting candidate keywords associated with the playing content from the candidate keywords acquired in advance as the target keywords.
3. The method of claim 2, further comprising:
performing image recognition on each frame in the target video, and determining the image content in each frame;
and extracting keywords corresponding to the image content from the image content of each frame to obtain candidate keywords corresponding to the target video.
4. A method for video presentation, comprising:
receiving a keyword acquisition request sent by a client, wherein the keyword acquisition request comprises the playing state of a target video currently played by the client;
acquiring a target keyword corresponding to the playing state, and feeding the target keyword back to the client, wherein the target keyword is used for displaying in a playing area of a target video of the client;
receiving an associated video acquisition request sent by the client, wherein the associated video acquisition request comprises the target keyword;
and acquiring an associated video corresponding to the target keyword, and feeding back the associated video to the client, wherein the associated video is used for displaying in the client.
5. A video presentation apparatus, comprising:
the target keyword acquisition module is configured to acquire a corresponding target keyword according to the playing state of a currently played target video;
the first keyword display module is configured to generate a corresponding keyword tag according to the target keyword, and display the keyword tag in a playing area of the target video;
and the associated video acquisition module is configured to respond to the triggering operation implemented on the keyword tag and acquire and display the associated video corresponding to the target keyword.
6. A video presentation apparatus, comprising:
a keyword acquisition request receiving module configured to receive a keyword acquisition request sent by a client, wherein the keyword acquisition request comprises the playing state of a target video currently played by the client;
the keyword feedback module is configured to acquire a target keyword corresponding to the playing state and feed the target keyword back to the client, wherein the target keyword is used for displaying in a playing area of a target video of the client;
a relevant video acquisition request receiving module configured to receive a relevant video acquisition request sent by the client, where the relevant video acquisition request includes the target keyword;
and the associated video feedback module is configured to acquire an associated video corresponding to the target keyword and feed back the associated video to the client, wherein the associated video is used for displaying in the client.
7. A terminal, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video presentation method of any one of claims 1 to 3.
8. A server, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the video presentation method of claim 4.
9. A video presentation system, comprising:
a terminal according to claim 7 and a server according to claim 8.
10. A storage medium, wherein when instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform the video presentation method of any one of claims 1 to 3, and when the instructions are executed by a processor of a server, the server is enabled to perform the video presentation method of claim 4.
CN202010437842.9A 2020-05-21 2020-05-21 Video display method, device, terminal, server, system and storage medium Active CN111753135B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010437842.9A CN111753135B (en) 2020-05-21 2020-05-21 Video display method, device, terminal, server, system and storage medium


Publications (2)

Publication Number Publication Date
CN111753135A true CN111753135A (en) 2020-10-09
CN111753135B CN111753135B (en) 2024-02-06

Family

ID=72673919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010437842.9A Active CN111753135B (en) 2020-05-21 2020-05-21 Video display method, device, terminal, server, system and storage medium

Country Status (1)

Country Link
CN (1) CN111753135B (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106600343A (en) * 2016-12-30 2017-04-26 中广热点云科技有限公司 Method and system for managing online video advertisement associated with video content
CN107180058A (en) * 2016-03-11 2017-09-19 百度在线网络技术(北京)有限公司 A kind of method and apparatus for being inquired about based on caption information
CN109558513A (en) * 2018-11-30 2019-04-02 百度在线网络技术(北京)有限公司 A kind of content recommendation method, device, terminal and storage medium
CN110121093A (en) * 2018-02-06 2019-08-13 优酷网络技术(北京)有限公司 The searching method and device of target object in video
CN110688527A (en) * 2019-09-27 2020-01-14 北京达佳互联信息技术有限公司 Video recommendation method and device, storage medium and electronic equipment
CN111163348A (en) * 2020-01-08 2020-05-15 百度在线网络技术(北京)有限公司 Searching method and device based on video playing


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112528076A (en) * 2020-12-18 2021-03-19 浙江同花顺智能科技有限公司 Video recommendation method, device, equipment and storage medium
JP2022043273A (en) * 2021-03-25 2022-03-15 北京百度網訊科技有限公司 Method, apparatus, device, storage medium, and computer program product for generating caption
CN113111221A (en) * 2021-05-13 2021-07-13 北京字节跳动网络技术有限公司 Video-based searching method and device
CN113596562A (en) * 2021-08-06 2021-11-02 北京字节跳动网络技术有限公司 Video processing method, apparatus, device, medium, and computer program product
CN113779381A (en) * 2021-08-16 2021-12-10 百度在线网络技术(北京)有限公司 Resource recommendation method and device, electronic equipment and storage medium
CN113779381B (en) * 2021-08-16 2023-09-26 百度在线网络技术(北京)有限公司 Resource recommendation method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111753135B (en) 2024-02-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant