CN111857517B - Video information processing method and device, electronic equipment and storage medium

Info

Publication number: CN111857517B; application number CN202010740479.8A
Authority: CN (China)
Other versions: CN111857517A (application publication, in Chinese)
Prior art keywords: video, key, key content, time point, score
Inventor: 金肖莹
Original assignee: Tencent Technology (Shenzhen) Co., Ltd.
Current assignee: Shenzhen Yayue Technology Co., Ltd.
Legal status: Active (granted)
Events: application filed by Tencent Technology (Shenzhen) Co., Ltd. with priority to CN202010740479.8A; publication of application CN111857517A; grant and publication of CN111857517B

Classifications

    • G06F3/04845 — Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, for image manipulation, e.g. dragging, rotation, expansion or change of colour (G Physics › G06 Computing › G06F Electric digital data processing › G06F3/048 GUI interaction techniques)
    • G06F3/0482 — Interaction techniques based on graphical user interfaces [GUI], interaction with lists of selectable items, e.g. menus

Abstract

The invention provides a video information processing method and apparatus, an electronic device, and a storage medium. The method includes: presenting, in a playing interface of a video, a generation function item corresponding to a key content display diagram of the video; and presenting, in response to a trigger operation for the generation function item, a key content display diagram for representing the key content of the video, where the key content display diagram includes at least one key video frame of the video and text content associated with the key video frame. Through the method and the device, the key content of the video can be displayed to the user more vividly and in greater detail, so that the user can understand the video better and faster; meanwhile, the ways of introducing video content are enriched, which facilitates the spread of the video.

Description

Video information processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the technical field of artificial intelligence and media playing, and in particular, to a method and an apparatus for processing video information, an electronic device, and a storage medium.
Background
Artificial Intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making. With the continuous development of artificial intelligence technology, it has been increasingly applied to the intelligent playing of media videos, the intelligent processing of videos, and the like.
In the related art, if a user wants to learn the key content of a video, the user usually relies on the video's text synopsis, featurettes, and trailer. However, a synopsis is plain text and can convey only very limited information, while trailers and featurettes must be watched for a period of time and cannot be browsed quickly. As a result, the user's desire to learn about the video stays low, which is not conducive to the spread of the video.
Disclosure of Invention
The embodiment of the invention provides a video information processing method and apparatus, an electronic device, and a storage medium, which can show the key content of a video to a user more vividly and in greater detail, so that the user can understand the video better and faster; meanwhile, the ways of introducing video content are enriched, which facilitates the spread of the video.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a video information processing method, which comprises the following steps:
presenting a generating function item corresponding to a key content display diagram of a video in a playing interface of the video;
presenting a key content presentation graph for representing key content of the video in response to a triggering operation for the generating function item;
wherein the key content presentation graph comprises at least one key video frame of the video and text content associated with the key video frame.
An embodiment of the present invention further provides an information processing apparatus for a video, including:
the presentation module is used for presenting, in the playing interface of the video, the generation function item of the key content display diagram corresponding to the video;
a generating module, which is used for responding to the triggering operation aiming at the generating function item and presenting a key content display diagram for representing the key content of the video;
wherein the key content presentation graph comprises at least one key video frame of the video and text content associated with the key video frame.
In the foregoing solution, the generating module is further configured to respond to the triggering operation for the generating function item, present a video frame selection interface, and
presenting at least one key video frame in the video frame selection interface;
and in response to a video frame selection operation triggered based on the video frame selection interface, generating and presenting a key content display graph for representing key content of the video based on the key video frame selected by the video frame selection operation.
In the above scheme, the presentation module is further configured to present a storage function item corresponding to the key content display chart;
responding to the trigger operation aiming at the storage function item, and saving the key content display graph to a storage path associated with the storage function item;
and presenting prompt information that the key content display graph is stored to the storage path.
In the foregoing solution, the presenting module is further configured to present a first sharing function item corresponding to the key content display diagram;
responding to the triggering operation aiming at the first sharing function item, presenting a first sharing interface corresponding to the key content display diagram, and presenting a sharing object for selection in the first sharing interface;
and responding to a sharing object selection operation triggered based on the first sharing interface, and sharing the key content display diagram to the sharing object selected by the sharing object selection operation.
In the foregoing solution, the presenting module is further configured to present a second sharing function item corresponding to the key content display diagram;
responding to the triggering operation aiming at the second sharing function item, presenting a second sharing interface corresponding to the key content display diagram, and presenting a sharing mode for selection in the second sharing interface;
and responding to a sharing mode selection operation triggered based on the second sharing interface, and sharing the key content display diagram based on the sharing mode selected by the sharing mode selection operation.
In the above solution, the apparatus further includes:
the encoding module is used for acquiring a link corresponding to a playing interface of the video;
and coding the link to obtain a graphic code corresponding to the video, wherein the graphic code is used for jumping to a playing interface of the video when the electronic equipment triggers the scanning operation aiming at the graphic code.
In the above scheme, the generating module is further configured to obtain at least one key video frame of the video and text content associated with each key video frame, where the text content is used to describe video content of the corresponding key video frame;
obtaining a display diagram generation template corresponding to the video;
and generating a template based on the display graph, combining the text content with corresponding key video frames, and generating and presenting a key content display graph for representing the key content of the video based on the combination result.
In the foregoing solution, the generating module is further configured to, when the number of the key video frames is at least two, respectively obtain text contents associated with each of the key video frames, where the text contents are used to describe video contents of the corresponding key video frames;
respectively combining each key video frame with corresponding text content to obtain a combined video frame corresponding to each key video frame;
and splicing the obtained combined video frames corresponding to the key video frames to obtain and present the key content display image.
In the above scheme, the apparatus further comprises:
the receiving module is used for receiving and presenting a key content display diagram of a shared target video, wherein the key content display diagram of the target video comprises a graphic code corresponding to the target video;
and responding to the scanning operation aiming at the graphic code, jumping from the current page to the playing interface of the target video, and playing the target video based on the playing interface of the target video.
In the above scheme, the generating module is further configured to obtain video information corresponding to each playing time point in the video, where the video information includes at least one of a historical playing speed, historical interactive data, and background music;
identifying video clips corresponding to key characters in the video;
determining at least one key video frame from the video clip based on the acquired video information corresponding to each playing time point;
based on the determined at least one key video frame, generating and presenting a key content display diagram for characterizing key content of the video.
In the above scheme, the generating module is further configured to determine, based on the video information corresponding to each playing time point, a score of a video frame corresponding to the corresponding playing time point in the video; the score is used for representing the possibility that the video frame corresponding to the playing time point is a key video frame;
and determining at least one key video frame from the video clip based on the score of the video frame corresponding to each playing time point.
In the above scheme, the generating module is further configured to determine, based on the historical playing speed corresponding to each playing time point, a first score of a video frame corresponding to the corresponding playing time point in the video;
determining a second score of a video frame corresponding to the corresponding playing time point in the video based on the historical interactive data corresponding to each playing time point;
determining a third score of a video frame corresponding to the corresponding playing time point in the video based on the background music corresponding to each playing time point;
respectively obtaining weights corresponding to the first score, the second score and the third score;
and determining the scores of the video frames corresponding to the corresponding playing time points in the video based on the first score, the second score, the third score and the corresponding weights.
An embodiment of the present invention further provides an electronic device, including:
a memory for storing executable instructions;
and the processor is used for realizing the video information processing method provided by the embodiment of the invention when executing the executable instructions stored in the memory.
The embodiment of the invention also provides a computer-readable storage medium, which stores executable instructions, and when the executable instructions are executed by a processor, the method for processing the video information provided by the embodiment of the invention is realized.
The embodiment of the invention has the following beneficial effects:
A generation function item of the key content display diagram corresponding to a video is presented on the playing interface of the video, and when a trigger operation for the generation function item is received, a key content display diagram for representing the key content of the video is generated. Since the display diagram is generated by combining key video frames of the video with the text content associated with those frames, the key content of the video can be presented to the user more vividly and in greater detail, the user can understand the video better and faster, the ways of introducing video content are enriched, and the spread of the video is facilitated.
Drawings
Fig. 1 is a schematic view of an implementation scenario of an information processing method for a video according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of an information processing method for video according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the generation of a key content display according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating the generation of a key content display according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating a storage flow of a key content display according to an embodiment of the present invention;
fig. 6A is a first schematic view illustrating a sharing process of a key content display diagram according to an embodiment of the present invention;
fig. 6B is a schematic view illustrating a sharing process of a key content display diagram according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating a method for implementing a target video playback based on a key content presentation according to an embodiment of the present invention;
FIG. 8A is a schematic flow chart of generating a key content presentation based on a presentation generation template according to an embodiment of the present invention;
FIG. 8B is a first diagram of a key content presentation graph provided by an embodiment of the present invention;
FIG. 8C is a second diagram of a key content presentation graph provided by an embodiment of the present invention;
fig. 9 is a flowchart illustrating an information processing method for a video according to an embodiment of the present invention;
fig. 10 is a diagram illustrating a graph depicting the heat of video contents provided in the related art;
fig. 11 is a flowchart illustrating an information processing method for a video according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of a video tagged with artificial nodes according to an embodiment of the invention;
fig. 13 is a schematic structural diagram of an information processing apparatus for video according to an embodiment of the present invention;
fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
In the following description, the terms "first/second/third" are used only to distinguish similar objects and do not denote a particular order; it is understood that "first/second/third" may be interchanged in a particular order or sequence where permitted, so that the embodiments of the invention described herein can be implemented in an order other than that illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
Before the embodiments of the present invention are described in further detail, the terms and expressions mentioned in the embodiments are explained; the following interpretations apply to them.
1) "In response to" indicates the condition or state on which a performed operation depends; when the dependent condition or state is satisfied, the one or more operations performed may be executed in real time or with a set delay. Unless otherwise specified, there is no restriction on the order in which the operations are performed.
Based on the above explanations of the terms involved in the embodiments of the present invention, an implementation scenario of the video information processing method provided in the embodiments of the present invention is described below. Referring to fig. 1, fig. 1 is a schematic diagram of an implementation scenario of the video information processing method provided by an embodiment of the present invention. To support an exemplary application, terminals (including a terminal 200-1 and a terminal 200-2) are connected to a server 100 through a network 300, where the terminal 200-1 is on the video sharing side and the terminal 200-2 is on the video receiving side; the network 300 may be a wide area network, a local area network, or a combination of the two, and uses wireless or wired links to implement data transmission.
The terminal 200-1 is used for presenting a generating function item of a key content display diagram corresponding to a video in a video playing interface; generating a key content display diagram for representing the key content of the video by combining at least one key video frame of the video and the text content associated with the key video frame in response to the trigger operation aiming at the generation function item; sending a request indicating to share the key content presentation graph to the server 100;
the server 100 is used for receiving and responding to a request for indicating the sharing of the key content display map, and sharing the key content display map of the video to the terminal 200-2;
and the terminal 200-2 is used for receiving and presenting the key content display diagram of the video.
In some embodiments, a key content presentation graph of the video may also be generated by the terminal 200-2. Specifically, the terminal 200-2 presents a generation function item of a key content display diagram corresponding to the video in a video playing interface; generating a key content display diagram for representing the key content of the video by combining at least one key video frame of the video and the text content associated with the key video frame in response to the trigger operation aiming at the generation function item; sending a request indicating to share the key content presentation graph to the server 100; the server 100 receives and responds to a request for indicating sharing of the key content display diagram, and shares the key content display diagram of the video to the terminal 200-1; the terminal 200-1 receives and presents a key content presentation of the video.
In some embodiments, a key content presentation graph of a video may also be generated by the server 100. Specifically, a terminal (such as the terminal 200-1) presents a generation function item of a key content display diagram corresponding to a video in a video playing interface; in response to a trigger operation for generating a function item, sending a generation instruction of a key content display diagram of a corresponding video to the server 100; the server 100 receives and responds to the generation instruction of the key content display graph, and generates a key content display graph for representing the key content of the video by combining at least one key video frame of the video and the text content associated with the key video frame; returning the generated key content display graph to the terminal; a terminal, such as terminal 200-1, receives and presents a key content presentation of a video. In this embodiment, a terminal (e.g., the terminal 200-1) may also share the key content display map of the video received from the server 100 to other terminals.
In practical application, the server 100 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a smart television, and the like. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present application is not limited.
Based on the above description of the implementation scenario of the video information processing method according to the embodiment of the present invention, the following describes the video information processing method according to the embodiment of the present invention. Referring to fig. 2, fig. 2 is a schematic flowchart of an information processing method for video according to an embodiment of the present invention; in some embodiments, the video information processing method provided in the embodiments of the present invention may be implemented by a server or a terminal alone, or implemented by a server and a terminal in a cooperative manner, and taking the terminal as an example, the video information processing method provided in the embodiments of the present invention includes:
step 201: and the terminal presents the generation function item of the key content display chart corresponding to the video in the video playing interface.
Here, the terminal is provided with a client for video playback. When a user watches a video through a client arranged on a terminal, if the user needs to acquire key content of the video or share the video with others, a key content display diagram for representing the key content of the video can be generated.
In practical application, the terminal presents a generating function item of the key content display graph corresponding to the video through the playing interface, and a user can trigger the generating operation of the key content display graph through the generating function item.
Step 202: and presenting a key content display diagram for representing the key content of the video in response to the trigger operation aiming at the generation function item.
The key content display graph is obtained by combining at least one key video frame of the video and text content associated with the key video frame.
In practical application, after receiving a trigger operation for generating a function item, the terminal combines a key video frame of a video and text content associated with the key video frame in response to the trigger operation, generates and presents a key content display diagram for representing key content of the video.
Exemplarily, referring to fig. 3, fig. 3 is a schematic diagram for generating a key content illustration provided by an embodiment of the present invention. Here, the playing interface of the video displays the generation function item of the key content display diagram through the "generate long diagram" function button. The user can trigger the generation operation of the key content display chart by clicking the 'generate long chart' function button; and after receiving the clicking operation aiming at the 'generate long image' function button, the terminal generates a key content display image of the video, wherein the key content display image comprises key video frames and text content associated with the key video frames, such as subtitle information.
In some embodiments, the key content presentation graph may also be generated based on the concatenation of key video frames only, if the determined key video frames do not have associated text content.
In some embodiments, the terminal may generate a key content presentation graph of the key content of the video by: presenting a video frame selection interface in response to a trigger operation for generating a function item, and presenting at least one key video frame in the video frame selection interface; and in response to a video frame selection operation triggered based on the video frame selection interface, generating and presenting a key content display graph for representing the key content of the video based on the key video frames selected by the video frame selection operation.
In practical application, a video frame selection interface can be provided for a user, so that the user can generate a key content display diagram of the video according to personal needs. Based on the method, after receiving the trigger operation aiming at the generated function item, the terminal responds to the trigger operation and presents a video frame selection interface for the user to select the key video frame. At least one key video frame may be presented at the video frame selection interface for selection by a user. And after receiving a video frame selection operation triggered by a user based on a video frame selection interface, determining a key video frame selected by the user in response to the video frame selection operation, and presenting a key content display diagram of the video based on the selected key video frame.
Illustratively, referring to fig. 4, fig. 4 is a flow chart of generating a key content illustration provided by an embodiment of the present invention. Here, the terminal presents a video frame selection interface including three key video frames in response to a user's click operation with respect to the "generate long picture" function button, and correspondingly presents a selection function item "□" for each key video frame. And when the selection operation of the user for the key video frame 1 and the key video frame 2 is received and confirmed, generating a key content display diagram of the video based on the key video frame 1 and the key video frame 2.
In some embodiments, the terminal may present a storage function item corresponding to the key content presentation graph; responding to the trigger operation aiming at the storage function item, and saving the key content display graph to a storage path associated with the storage function item; and presenting the prompt message that the key content display graph is stored to the storage path.
In practical application, the terminal can also present a storage function item corresponding to the key content display picture. When receiving a trigger operation of a user for the storage function item, storing the generated key content display graph to a storage path associated with the storage function item; and simultaneously presenting prompt information that the key content display graph is successfully stored. Here, the terminal may also save the key content presentation graph to a preset storage path by default after generating the key content presentation graph.
Exemplarily, referring to fig. 5, fig. 5 is a schematic diagram illustrating a storage flow of a key content illustration according to an embodiment of the present invention. Here, while the play interface presents the key content presentation graph, a corresponding storage function item "save long graph" is also presented; and after receiving the click operation of the user for saving the long image, saving the key content display image to a storage path associated with the storage function item, and presenting prompt information of successful saving.
In some embodiments, the terminal may share the key content presentation graph by: presenting a first sharing function item corresponding to the key content presentation graph; responding to the triggering operation aiming at the first sharing function item, presenting a first sharing interface corresponding to the key content display diagram, and presenting a sharing object for selection in the first sharing interface; and in response to the selection operation of the sharing object triggered based on the first sharing interface, sharing the key content display diagram to the sharing object selected by the sharing object selection operation.
In practical application, after the terminal generates the key content display diagram of the video, the terminal can share the key content display diagram to other people. At this time, the terminal may further present a first sharing function item corresponding to the key content presentation diagram, where the first sharing function item may be associated with a corresponding sharing client by default, such as an instant messaging client. When a trigger operation aiming at the first sharing function item is received, responding to the trigger operation, and presenting a first sharing interface containing a sharing object selected by a user. And then after receiving a sharing object selection operation triggered based on the first sharing interface, sharing the key content display graph to the sharing object selected by the user based on the sharing object selection operation.
For example, referring to fig. 6A, fig. 6A is a first schematic view illustrating a sharing process of a key content display diagram according to an embodiment of the present invention. Here, the terminal presents a corresponding first sharing function item "share" while presenting a key content presentation diagram on the play interface; responding to the click operation aiming at the sharing function item, and jumping from the current page to a first sharing interface containing a sharing object to be selected; and receiving selection operations of the user for the sharing object 1 and the sharing object 2, and sharing the key content display picture to the sharing object 1 and the sharing object 2 after confirmation.
In some embodiments, the terminal may further share the key content display diagram by: presenting a second shared function item corresponding to the key content presentation graph; responding to the triggering operation aiming at the second sharing function item, presenting a second sharing interface corresponding to the key content display diagram, and presenting a sharing mode for selection in the second sharing interface; and responding to a sharing mode selection operation triggered based on the second sharing interface, and sharing the key content display diagram based on the sharing mode selected by the sharing mode selection operation.
In practical application, the terminal may further present a second sharing function item corresponding to the key content presentation diagram, and when receiving a trigger operation for the second sharing function item, present a second sharing interface including a sharing manner for the user to select in response to the trigger operation. And then after receiving a selection operation of the sharing mode triggered based on the second sharing interface, sharing the key content display diagram based on the sharing mode selected by the selection operation of the sharing mode.
For example, referring to fig. 6B, fig. 6B is a schematic view illustrating a sharing process of the key content display diagram according to the embodiment of the present invention. Here, the terminal presents a corresponding second sharing function item "share" while presenting a key content presentation diagram on the play interface; in response to the click operation for the "share" function item, presenting a second share interface containing a sharing manner for the user to select, such as the second share interface containing sharing manners such as "circle of friends", "QQ space", and "WeChat" shown in fig. 6B; receiving a selection operation of a user for a sharing mode 'friend circle', and jumping from a current page to an information editing page of 'friend circle'; or directly completing the sharing of the 'friend circle' of the key content display diagram, and presenting a friend circle browsing interface with the key content display diagram, as shown in fig. 6B.
In some embodiments, the terminal may generate the graphic code included in the key content presentation graph by: acquiring a link corresponding to a playing interface of a video; and coding the link to obtain a graphic code corresponding to the video.
In practical application, in order to facilitate the transmission and sharing of videos, graphic codes of the videos can be presented in a key content display diagram. The graphic code is used for jumping to a playing interface of a video when the electronic equipment triggers the scanning operation aiming at the graphic code. Specifically, the graphic code may be obtained based on a webpage link code corresponding to a video playing interface.
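To make the encoding step concrete, the following is a minimal sketch of turning a play-page link into a graphic code. It assumes the open-source Python qrcode package; the URL is a hypothetical placeholder, not a real play address.

```python
import qrcode  # open-source QR code generator

def video_graphic_code(play_url: str):
    # Encode the link of the video playing interface into a QR code image;
    # scanning the code later resolves back to this link.
    return qrcode.make(play_url)

# Hypothetical play-page link, used only for illustration.
img = video_graphic_code("https://video.example.com/play?vid=12345")
img.save("video_graphic_code.png")
```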
Based on this, in some embodiments, the terminal may implement playing of the target video based on the graphic code by: receiving and presenting a key content display graph of a shared target video, wherein the key content display graph of the target video comprises a graph code corresponding to the target video; and responding to the scanning operation aiming at the graphic code, jumping from the current page to the playing interface of the target video, and playing the target video based on the playing interface of the target video.
The key content display diagram of the shared target video received by the terminal includes a graphic code corresponding to the target video. The user can scan and identify the graphic code through an operation so as to watch the target video. After receiving the scanning operation for the graphic code, the terminal responds to it and jumps from the current page to the playing interface of the target video, so as to play the target video based on that interface.
Exemplarily, referring to fig. 7, fig. 7 is a schematic flowchart of playing a target video based on a key content display diagram according to an embodiment of the present invention. Here, while viewing the key content display diagram, the user finds that it contains a two-dimensional code, next to which the prompt information "scan and identify the two-dimensional code to watch the video" is presented. When the terminal receives the user's scanning operation for the two-dimensional code, it jumps from the current page to the playing page of the target video.
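The receiving side's jump can be sketched in the same spirit. This sketch assumes the pyzbar package for decoding and stands in for the in-app page jump with a browser call; neither choice is prescribed by the method itself.

```python
import webbrowser
from PIL import Image
from pyzbar.pyzbar import decode  # assumed decoder; any QR reader would do

def open_video_from_display_diagram(image_path: str) -> None:
    # Scan the graphic code embedded in a shared key content display diagram
    # and jump to the playing interface encoded in it.
    results = decode(Image.open(image_path))
    if results:
        play_url = results[0].data.decode("utf-8")
        webbrowser.open(play_url)  # stand-in for jumping to the playing page
```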
In some embodiments, the terminal may generate a key content presentation graph of the key content of the video by: acquiring at least one key video frame of a video and text content associated with each key video frame, wherein the text content is used for describing the video content of the corresponding key video frame; acquiring a display diagram generation template corresponding to the video; and generating a template based on the display graph, combining the text content with the corresponding key video frames to generate a key content display graph for representing the key content of the video based on the combination result.
In practical applications, a display diagram generation template for the key content display diagram may further be provided. The template may describe, among other things, the position of the video frame and the position of each type of text content within the diagram when the key content display diagram is generated.
After determining at least one key video frame of the video, acquiring the content of the key video frame and the text content associated with each key video frame, such as subtitle information, historical background information, key character information, plot information and the like. And simultaneously acquiring a display image generation template corresponding to the video.
And generating a template based on the acquired display diagram, and combining the text content with the corresponding key video frame so as to generate a key content display diagram of the video based on the combination result.
Exemplarily, referring to fig. 8A, fig. 8A is a schematic flowchart of generating a key content presentation graph based on a presentation graph generation template according to an embodiment of the present invention. Here, first, a key video frame, caption information, historical background information and key character information associated with the key video frame are acquired, and then a display map generation template of the video is acquired, wherein the display map generation template respectively describes the positions of the caption information, the historical background information and the key character information in the key video frame; and generating a template based on the display diagram, and filling corresponding text contents into corresponding positions of the key video frames respectively for combination to obtain a key content display diagram.
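As an illustration only, the combination step might look like the sketch below, where the template is reduced to a dictionary of text positions. The positions, the font file, and the field names are assumptions for the sake of the example, not the layout actually used by the method.

```python
from PIL import Image, ImageDraw, ImageFont

# Assumed template: where each type of text content is placed on a key frame.
DISPLAY_TEMPLATE = {
    "subtitle": (24, 660),           # caption strip near the bottom
    "background_info": (24, 24),     # historical background, top-left
    "key_person": (24, 64),          # key character information
}

def combine_with_template(frame: Image.Image, texts: dict) -> Image.Image:
    # frame: a key video frame; texts: {field name: text content}
    combined = frame.copy()
    draw = ImageDraw.Draw(combined)
    font = ImageFont.truetype("DejaVuSans.ttf", 24)  # assumed font file
    for field, text in texts.items():
        draw.text(DISPLAY_TEMPLATE[field], text, fill="white", font=font)
    return combined
```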
In some embodiments, the terminal may also generate a key content presentation graph of the key content of the video by: when the number of the key video frames is at least two, respectively acquiring text contents associated with the key video frames, wherein the text contents are used for describing the video contents of the corresponding key video frames; respectively combining each key video frame with corresponding text content to obtain a combined video frame corresponding to each key video frame; and splicing the obtained combined video frames corresponding to the key video frames to obtain a key content display picture.
In practical applications, when the number of the determined key video frames is at least two, the text content associated with each key video frame is acquired, such as the subtitle information and episode information corresponding to each key video frame. Each key video frame is then combined with its corresponding text content to obtain a combined video frame; specifically, the combination may be performed based on the display diagram generation template in the same manner as described above. After the combined video frames corresponding to the key video frames are obtained, they are spliced to generate the key content display diagram; specifically, the splicing may be performed according to the playing time of each key video frame.
Exemplarily, referring to fig. 8B, fig. 8B is a first schematic diagram of a key content presentation diagram provided by an embodiment of the present invention. Here, the key content display diagram is obtained by splicing two key video frames, specifically, two key video frames combined with text content (such as the subtitle information shown in fig. 8B) are spliced according to the playing time.
In practical application, when the number of the determined key video frames is one, the text content corresponding to the key video frames and the associated text content of the video clip in which the key video frames are located are obtained. The video clip is a video in a preset time period before and after the time point of the key video frame. And combining the key video frame with the associated text content to obtain a key content display diagram. Referring to fig. 8C, fig. 8C is a second schematic diagram of a key content display diagram provided in an embodiment of the invention. Here, the key content presentation map is obtained by combining a key video frame and the associated text content (such as the subtitle information shown in fig. 8C) of the video clip.
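As a concrete illustration of the splicing described above, here is a minimal sketch assuming PIL images: it stacks the combined frames vertically in playing-time order, matching the long-image form shown in fig. 8B.

```python
from PIL import Image

def splice_long_image(combined_frames: list) -> Image.Image:
    # combined_frames: PIL images already merged with their text content,
    # ordered by the playing time of the underlying key video frames.
    width = max(f.width for f in combined_frames)
    height = sum(f.height for f in combined_frames)
    long_image = Image.new("RGB", (width, height), "white")
    y = 0
    for frame in combined_frames:
        long_image.paste(frame, (0, y))
        y += frame.height
    return long_image
```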
In some embodiments, the terminal may generate a key content presentation graph of the key content of the video by: acquiring video information corresponding to each playing time point in a video, wherein the video information comprises at least one of historical playing speed, historical interactive data and background music; identifying video clips corresponding to key characters in the video; determining at least one key video frame from the video clip based on the acquired video information corresponding to each playing time point; based on the determined at least one key video frame, a key content presentation graph is generated for characterizing the key content of the video.
In practical applications, the terminal may obtain the video information at each playing time point in the video, such as at least one of the historical playing speed, historical interaction data, and background music. Here, the historical interaction data is the interactive behavior data performed by users on the video frame at each playing time point during historical playing of the video, such as bullet-screen (barrage) comments sent by users and likes given to those comments. Key character recognition is then performed on each video frame in the video, and the video is divided based on the recognized key characters to obtain video clips containing the key characters, as in the sketch below. Finally, at least one key video frame is determined from the video clips based on the acquired video information corresponding to each playing time point, so as to generate the key content display diagram of the key content of the video.
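The clip identification can be sketched as follows. The per-frame recognizer `contains_key_person` is a stand-in for whatever character-recognition model is used, and the minimum clip length is an illustrative assumption.

```python
def key_person_clips(num_frames: int, contains_key_person, min_len: int = 24):
    # Group consecutive frames that contain a key character into clips.
    # contains_key_person(i) -> bool is an assumed recognizer over frame i.
    clips, start = [], None
    for i in range(num_frames):
        if contains_key_person(i):
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                clips.append((start, i))
            start = None
    if start is not None and num_frames - start >= min_len:
        clips.append((start, num_frames))
    return clips  # list of (start_frame, end_frame) segments
```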
It can be seen that the premise of generating the key content display diagram of the video is acquiring the key video frames of the video accurately. In the embodiment of the present invention, the terminal determines the key video frames based on different aspects such as the historical playing speed, historical interaction data, background music, and key character recognition, so as to improve the accuracy with which the key video frames are acquired.
In some embodiments, the terminal may determine the at least one key video frame from the video clip by: determining the score of a video frame corresponding to each playing time point in the video based on the video information corresponding to each playing time point; determining at least one key video frame from the video clip based on the score of the video frame corresponding to each playing time point; wherein the score is used for representing the possibility that the video frame corresponding to the playing time point is a key video frame.
In some embodiments, the terminal may determine the score of the video frame corresponding to the corresponding playing time point in the video by: determining a first score of a video frame corresponding to each playing time point in the video based on the historical playing speed corresponding to each playing time point; determining a second score of a video frame corresponding to each playing time point in the video based on the historical interactive data corresponding to each playing time point; determining a third score of a video frame corresponding to each playing time point in the video based on the background music corresponding to each playing time point; respectively obtaining weights corresponding to the first score, the second score and the third score; and determining the score of the video frame corresponding to the corresponding playing time point in the video based on the first score, the second score, the third score and the corresponding weight.
In practical application, the terminal may determine the first score of the video frame corresponding to each playing time point in the video in the following manner. The terminal acquires the historical playing speeds at which users played the video, such as 2× speed or 0.5× speed. For each user, the target historical playing speed corresponding to the longest playing duration over the whole video is determined and taken as the reference for analysis. Specifically, since how highlight-worthy the video content is can be judged according to whether the user turns on speed adjustment, in the embodiment of the present invention the correspondence between playing speed and highlight level is preset; the correspondence includes 6 levels, as shown in Table 1:
    Playing speed:    0.5    1    1.25    1.5    2    Skipped
    Highlight level:   5     4     3       2     1       0

Table 1. Correspondence between playing speed and highlight level
In practical applications, the score of the video frame corresponding to the reference playing speed may be set to 1 and the score of a skipped video frame to 0, where the score represents the possibility that the video frame at the playing time point is a key video frame. The score Sn of a video frame corresponding to any other playing speed can be calculated by the following formula:

Sn = 1 + (Ln − L) × 0.2

where Ln is the highlight level corresponding to that playing speed and L is the level of the reference playing speed. For example, in a 40-minute video, if a user watches 30 minutes of content at 1.5× speed (the reference), watches 5 minutes at 1× speed, and skips 5 minutes, then the score of the 1.5×-speed video frames is 1, the score of the 1×-speed video frames is 1 + (4 − 2) × 0.2 = 1.4, and the score of the skipped video frames is 0.
Based on this, the score of the video frame of each playing time point can be obtained for each user, and finally the scores of the video frames of each playing time point corresponding to all the users are added and averaged to obtain the first score of the video frame of each playing time point of the video.
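As an illustration of the first score under the rules above, the following sketch averages per-user scores. The log format and the way the reference speed is chosen (the speed covering the most frames) are simplifying assumptions.

```python
# Highlight levels from Table 1, keyed by playing speed; skipped frames score 0.
HIGHLIGHT_LEVEL = {0.5: 5, 1.0: 4, 1.25: 3, 1.5: 2, 2.0: 1}

def first_scores(user_logs: list, num_frames: int) -> list:
    # user_logs: one dict per user mapping frame index -> playing speed,
    #            with None meaning the frame was skipped.
    totals = [0.0] * num_frames
    for speeds in user_logs:
        watched = [s for s in speeds.values() if s is not None]
        if not watched:
            continue  # this user skipped everything
        # Reference speed: the one covering the longest played duration.
        reference = HIGHLIGHT_LEVEL[max(set(watched), key=watched.count)]
        for i, s in speeds.items():
            if s is not None:
                totals[i] += 1 + (HIGHLIGHT_LEVEL[s] - reference) * 0.2
    return [t / len(user_logs) for t in totals]  # average over all users
```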
In practical application, the terminal may determine the third score of the video frame corresponding to each playing time point in the video as follows. The terminal identifies whether background music exists at each playing time point of the video to obtain a recognition result, and determines the third score of the video frame at each playing time point based on the recognition result. Specifically, the score of a video frame accompanied by background music may be set to 1 and the score of a video frame without background music to 0, so as to obtain the third score of the video frame at each playing time point of the video.
In practical applications, the terminal may determine the second score of the video frame corresponding to each playing time point in the video as follows. The terminal acquires the interaction information performed by users on the video frame at each playing time point during playing of the video, such as bullet-screen comments and likes on those comments, so as to determine the second score of the video frame at each playing time point based on the interaction information at that time point. Specifically, the second score may be determined according to the number of bullet-screen comments corresponding to the video frame at each playing time point and the number of likes each comment receives; for example, one bullet-screen comment counts 1 and one like counts 0.5. Meanwhile, to prevent the value calculated in this way from being too high and overwhelming the other scores, it is normalized, so as to obtain the second score of the video frame at each playing time point of the video.
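A sketch of the second score under those counting rules follows. The weights (1 per bullet-screen comment, 0.5 per like) come from the example above, and normalizing by the peak value is one of several reasonable choices.

```python
def second_scores(danmaku_counts: list, like_counts: list) -> list:
    # danmaku_counts[i]: bullet-screen comments sent at playing time point i;
    # like_counts[i]: likes those comments received.
    raw = [c * 1.0 + l * 0.5 for c, l in zip(danmaku_counts, like_counts)]
    peak = max(raw) if max(raw) > 0 else 1.0
    return [r / peak for r in raw]  # normalized so it cannot dominate
```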
After the first score, the second score and the third score of the video frame of each playing time point of the video are obtained, the weights corresponding to the first score, the second score and the third score are respectively obtained, so that the score of the video frame of each playing time point of the video is obtained based on the first score, the second score, the third score and the corresponding weight.
And finally, selecting at least one key video frame from the video clips obtained based on the key character recognition based on the obtained scores of the video frames of each playing time point in the video, so as to generate a key content display diagram of the key content of the video based on the selected at least one key video frame.
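Putting the pieces together, the final selection might look like the sketch below. The weights and the one-frame-per-clip policy are illustrative assumptions, not values fixed by the method.

```python
def select_key_frames(s1, s2, s3, clips, weights=(0.5, 0.3, 0.2), per_clip=1):
    # s1, s2, s3: per-frame first/second/third scores; clips: (start, end)
    # pairs from the key-character recognition step.
    scores = [weights[0]*a + weights[1]*b + weights[2]*c
              for a, b, c in zip(s1, s2, s3)]
    key_frames = []
    for start, end in clips:
        best = sorted(range(start, end), key=scores.__getitem__, reverse=True)
        key_frames.extend(sorted(best[:per_clip]))
    return key_frames  # frame indices used to build the display diagram
```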
By applying the embodiment of the invention, the generation function item of the key content display diagram corresponding to the video is presented on the playing interface of the video, and the key content display diagram for representing the key content of the video is generated when the trigger operation aiming at the generation function item is received.
In some embodiments, the video information processing method provided by the present invention may be cooperatively implemented by a first client, a second client, and a server, where the first client is configured to trigger a generation instruction of a key content display diagram, and share the key content display diagram to the second client after requesting the key content display diagram from the server, where the first client is disposed in a first terminal, and the second client is disposed in a second terminal. Referring to fig. 9, fig. 9 is a schematic flowchart of an information processing method for a video according to an embodiment of the present invention, where the information processing method for a video according to an embodiment of the present invention includes:
step 901: and the first terminal runs the first client and presents the generation function item of the key content display chart corresponding to the video in the playing interface of the video.
Here, the first terminal is provided with a first client for video playback. When a user watches a video through a client arranged on a terminal, if a key content display graph of the video needs to be generated, a generation function item displayed in a playing interface of the video can be clicked to trigger a generation instruction of the key content display graph corresponding to the video.
Step 902: and responding to the trigger operation aiming at the generating function item, and sending a generating instruction of the key content display graph of the corresponding video to the server.
Step 903: and the server receives and responds to the generation instruction, acquires video information corresponding to the video and identifies video clips corresponding to key people in the video.
Here, the video information includes at least one of a historical play speed, historical interaction data, and background music. The server can determine the score of the video frame corresponding to the corresponding playing time point in the video based on the video information corresponding to each playing time point, and the score is used for representing the possibility that the video frame corresponding to the playing time point is the key video frame.
Step 904: and determining at least one key video frame from the video clip based on the acquired video information corresponding to each playing time point.
Here, specifically, the server determines at least one key video frame from the video clips based on the score of the video frame corresponding to each playing time point.
Step 905: at least one key video frame of a video is obtained, and text content associated with each key video frame is obtained.
Step 906: the text content is combined with the corresponding key video frames to generate a key content presentation graph based on the combination result.
After determining at least one key video frame of the video, the server acquires the key video frame and text content associated with each key video frame, such as subtitle information, historical background information, key character information, plot information and the like.
And combining the key video frames and the text content associated with each key video frame to obtain a key content display diagram. Specifically, a presentation graph generation template corresponding to the video may be acquired, the text content and the corresponding key video frame may be combined based on the acquired presentation graph generation template, so as to generate a key content presentation graph based on the combination result.
Step 907: and sending the key content display graph of the video to the first terminal.
Step 908: the first terminal receives the key content exhibition 10 diagram returned by the server and shares the key content exhibition diagram to the second terminal.
Here, the playing interface also presents a sharing function item corresponding to the key content display diagram, and the user can share the key content display diagram to other people through the sharing function item. After receiving the sharing instruction triggered based on the sharing function item, the first terminal shares the generated key content display graph to other users (namely, sends the key content display graph to the second terminal) so as to realize the transmission and sharing of the video.
Step 909: the second terminal receives the key content display graph and receives scanning operation aiming at the graphic code in the key content display graph.
Here, the key content display diagram further includes a graphic code corresponding to the video, and the user can scan and identify the graphic code through operation, so that the video can be viewed.
Step 910: and responding to the scanning operation aiming at the graphic code, jumping to the playing interface of the video from the current page, and playing the video based on the playing interface of the video.
By applying this embodiment, a key content display graph representing the key content of the video can be generated quickly through the generation function item, helping users better understand the content of the video; the key content display graph also carries a graphic code corresponding to the video, so that users can quickly enter the video watching interface, which facilitates spreading the video.
An exemplary application of the embodiments of the present invention in a practical application scenario will be described below.
Currently, a user who wants to browse a video quickly can rely only on a trailer or a text introduction, both of which require manual work and are costly. There is no existing scheme for automatically generating a long introduction image for a video; although splicing a long image is technically simple, the critical part of generating it is interpreting the video content. In the related art, the heat of a video is represented by a curve, as shown in fig. 10, generated by judging heat according to whether users skip each segment. Although this provides some reference, as can be seen from fig. 10 the curve is very flat, and it is difficult for a user to learn the key content of the video from it. In addition, highlight shots of sports game videos can be analyzed through traditional shot segmentation, machine learning, and similar methods, but such schemes suit only videos whose scenes are uniform and easy to segment and detect; most videos (such as movies, TV series, and variety shows) process shots with various artistic techniques, so their key content is difficult to obtain through similar analysis.
Based on this, embodiments of the present invention provide a video information processing method to solve at least the above existing problems, and the following detailed description is provided. Referring to fig. 11, fig. 11 is a schematic flowchart of an information processing method for a video according to an embodiment of the present invention, where the information processing method for a video according to an embodiment of the present invention includes:
step 1101: and the terminal runs the client.
Here, the terminal is provided with a client for video playback.
Step 1102: and displaying the generation function item of the key content display diagram of the corresponding video in the playing interface of the video.
Step 1103: and responding to the trigger operation aiming at the generating function item, and sending a generating instruction of the key content display graph of the corresponding video to the server.
Here, when a user watching a video through the client installed in the terminal needs a key content display diagram of the video, a generation instruction for the key content display diagram can be triggered by clicking the generation function item presented in the playing interface of the video, as shown in fig. 3.
Step 1104: the server receives and responds to the generation instruction to acquire video information of the video.
Step 1105: and determining whether the video has the label of the artificial node, if so, executing step 1109, and if not, executing step 1106.
Referring to fig. 12, fig. 12 is a schematic view of a video labeled with artificial nodes according to an embodiment of the present invention, where the video content is labeled with nodes on the display progress bar of the video, and each node corresponds to a key video frame of the video. When a trigger operation of a user on a node is received, the annotation content corresponding to that node, such as 'XXX to XX song', can be presented. In practical applications, if the video is labeled with artificial nodes, the video frames corresponding to the nodes and the labeled text content can be combined directly to generate the key content display diagram.
Step 1106: and determining whether a trailer corresponding to the video exists, if so, executing step 1107, and if not, executing step 1108.
Because a video contains a large amount of content, in order to reduce computation time, in practical applications it can first be determined whether the video has a corresponding trailer. If it does, the trailer can be analyzed directly to obtain key video frames for generating the key content display diagram; if it does not, the content of the video itself is analyzed to obtain the key video frames.
Step 1107: and analyzing the content of the trailer corresponding to the video.
Step 1108: and performing content analysis on the video.
Here, the server may perform content analysis based on the playing speed, interaction information, background music, key character recognition, and the like; the following description takes content analysis of the video itself as an example.
First, the server performs content analysis on the video based on the playing speed. The server acquires the historical playing speeds at which users played the video, such as 2× speed or 0.5× speed. For each user, the target historical playing speed at which that user watched the video for the longest time is determined and taken as the reference for analysis. Since whether a user turns on double speed indicates how wonderful the content is, the embodiment of the present invention presets a correspondence between playing speed and highlight level, comprising 6 levels, as shown in Table 1:
Playing speed      0.5    1    1.25    1.5    2    Skip
Highlight level     5     4     3       2     1     0

Table 1: Correspondence between playing speed and highlight level
In practical applications, the score of the video frame corresponding to the reference playing speed may be set to 1, and the score of a skipped video frame may be set to 0, where the score represents the possibility that the video frame at the playing time point is a key video frame. The scores of video frames at other playing speeds can be calculated by the following formula:
Sn = 1 + (Ln − L) × 0.2

where Ln is the highlight level corresponding to the playing speed and L is the level of the reference playing speed. For example, in a 40-minute video, if a user watches 30 minutes at 1.5× speed, watches 5 minutes at 1× speed, and skips 5 minutes, then 1.5× is the reference speed: the score of the 1.5×-speed video frames is 1, the score of the 1×-speed video frames is 1 + (4 − 2) × 0.2 = 1.4, and the score of the skipped video frames is 0.
Based on this, the score of the video frame at each playing time point is obtained for each user; finally, the scores of all users at each playing time point are averaged to obtain the first score of the video frame at each playing time point of the video, as in the sketch below.
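As an illustration, the first-score computation can be sketched in Python as follows; the speed-to-level table follows Table 1, and the function and variable names are assumptions for illustration, not part of the embodiment:

    # Highlight levels from Table 1; a skipped segment carries no level.
    SPEED_LEVEL = {0.5: 5, 1.0: 4, 1.25: 3, 1.5: 2, 2.0: 1}

    def speed_scores(user_speeds, reference_speed):
        """user_speeds: playing speed per playing time point, None if skipped.
        reference_speed: the speed this user watched longest at (score 1)."""
        ref_level = SPEED_LEVEL[reference_speed]
        scores = []
        for speed in user_speeds:
            if speed is None:                      # skipped frame -> 0
                scores.append(0.0)
            else:                                  # Sn = 1 + (Ln - L) * 0.2
                scores.append(1.0 + (SPEED_LEVEL[speed] - ref_level) * 0.2)
        return scores

    def first_scores(all_user_speeds, reference_speeds):
        """Average the per-user scores at each playing time point."""
        per_user = [speed_scores(s, r)
                    for s, r in zip(all_user_speeds, reference_speeds)]
        return [sum(col) / len(per_user) for col in zip(*per_user)]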
Second, the server performs content analysis on the video based on the background music. The server identifies whether background music exists at each playing time point and determines the second score of the corresponding video frame from the identification result: specifically, the score of a video frame with background music may be set to 1 and that of a video frame without background music to 0, yielding the second score of the video frame at each playing time point of the video.
Third, the server performs content analysis on the video based on the interaction information. The server obtains the interaction information, such as bullet comments (danmaku) and likes on them, attached to the video frame at each playing time point during playback, and determines a third score of the video frame at each playing time point from it. Specifically, the third score may be computed from the number of bullet comments on the video frame at each playing time point and the number of likes each comment received; for example, one bullet comment scores 1 and one like scores 0.5. To prevent this score from growing too large and drowning out the other scores, it is normalized, giving the third score of the video frame at each playing time point of the video.
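A possible sketch of this third-score computation, assuming one bullet comment counts 1 and one like counts 0.5, with min-max normalization standing in for the unspecified normalization step (names are illustrative):

    def third_scores(danmaku_counts, like_counts):
        """Interaction score per playing time point, normalized to [0, 1]."""
        raw = [d * 1.0 + l * 0.5 for d, l in zip(danmaku_counts, like_counts)]
        lo, hi = min(raw), max(raw)
        if hi == lo:                    # flat interaction: nothing stands out
            return [0.0 for _ in raw]
        return [(r - lo) / (hi - lo) for r in raw]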
Fourth, the server performs content analysis on the video based on key character recognition. The server performs key character recognition on each video frame and divides the video according to the identified key characters, obtaining a video clip for each key character. For example, in a 20-minute video where the heroine appears in minutes 0-5, the hero in minutes 5-8, and both together in minutes 10-14, the video is divided into five clips: 0-5, 5-8, 8-10, 10-14, and 14-20.
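The division into clips can be sketched as follows, reproducing the 20-minute example above; representing appearances as (start, end) intervals is an assumption for illustration:

    def split_by_key_people(duration, person_intervals):
        """person_intervals: (start, end) spans where a key person is on screen."""
        cuts = {0, duration}
        for start, end in person_intervals:
            cuts.update((start, end))
        points = sorted(cuts)
        return list(zip(points, points[1:]))

    # Heroine 0-5, hero 5-8, both on screen 10-14, in a 20-minute video:
    print(split_by_key_people(20, [(0, 5), (5, 8), (10, 14)]))
    # -> [(0, 5), (5, 8), (8, 10), (10, 14), (14, 20)]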
Step 1109: and determining a key video frame corresponding to the video based on the analysis result.
Here, the score of the video frame at each playing time point of the video is obtained from the analysis results. Specifically, weights for the first score, the second score, and the third score are set according to manual surveys and analysis of actual conditions, and the score of the video frame at each playing time point is computed as the weighted sum of the three.
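In code this weighted combination is a one-line fold; the weight values below are placeholders, since the embodiment leaves them to manual surveys and tuning:

    def combined_scores(first, second, third, w1=0.5, w2=0.2, w3=0.3):
        """Score per playing time point as a weighted sum of the three scores."""
        return [w1 * a + w2 * b + w3 * c
                for a, b, c in zip(first, second, third)]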
Then, at least one key video frame is selected from the video clips based on the scores of the video frames at the playing time points. In practical applications, for each video clip obtained by key character recognition, given a minimum threshold N and a fluctuation range a, the video frame with the highest score S1, at time point t1, is selected; the clip is discarded if S1 is less than N. If S1 is greater than N, a time period [ta, tb] is selected such that t1 lies between ta and tb and the score of every video frame between ta and tb is greater than N − a. A second time period is selected by the same rule. After the two time periods are obtained, the video frame with the highest score, or the several video frames ranked highest by score, are taken as the key video frames of the video, as sketched below.
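A sketch of this selection rule, with scores indexed by playing time point within one clip; expanding outward from the peak while scores stay above N − a is one natural reading of the time-period rule, and the names and expansion strategy are assumptions:

    def peak_window(scores, N, a, excluded=frozenset()):
        """Select [ta, tb] around the highest-scoring time point, or None."""
        candidates = [i for i in range(len(scores)) if i not in excluded]
        if not candidates:
            return None
        t1 = max(candidates, key=lambda i: scores[i])
        if scores[t1] < N:                         # clip peak below threshold
            return None
        ta = tb = t1
        while ta > 0 and (ta - 1) not in excluded and scores[ta - 1] > N - a:
            ta -= 1
        while (tb < len(scores) - 1 and (tb + 1) not in excluded
               and scores[tb + 1] > N - a):
            tb += 1
        return ta, tb

    def key_frames_for_clip(scores, N, a, k=1):
        """Pick up to two time periods, then the top-k frames by score."""
        used, frames = set(), []
        for _ in range(2):
            window = peak_window(scores, N, a, frozenset(used))
            if window is None:
                break
            ta, tb = window
            used.update(range(ta, tb + 1))
            frames.extend(range(ta, tb + 1))
        return sorted(frames, key=lambda i: scores[i], reverse=True)[:k]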
Step 1110: based on the selected key video frames, a key content presentation graph of the video is generated.
Here, after obtaining the key video frames, text content associated with each key video frame, such as subtitle information, may be obtained, so that a key content display long image of the video is generated by combining and splicing the key video frames and the text content associated with the key video frames.
Specifically, the server may store a table of playing time points for each video; after the key video frames are determined, the key video frames and their associated text content can be spliced in order of the playing time point of each key video frame.
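Vertical splicing itself is straightforward; a minimal sketch with Pillow, stacking the key frames in playing-time order (caption rendering is omitted for brevity, and the paths and names are illustrative):

    from PIL import Image

    def splice_long_image(frame_paths, out_path="key_content.png"):
        """Stack frames top to bottom into one long key content image."""
        frames = [Image.open(p).convert("RGB") for p in frame_paths]
        width = max(f.width for f in frames)
        canvas = Image.new("RGB", (width, sum(f.height for f in frames)), "white")
        y = 0
        for frame in frames:
            canvas.paste(frame, (0, y))
            y += frame.height
        canvas.save(out_path)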
Meanwhile, when the long key content display graph is generated, the graphic code corresponding to the video can also be generated, so that when an electronic device triggers a scanning operation on the graphic code, the page jumps directly to the playing interface of the video, which facilitates spreading the video.
Step 1111: and sending the key content display graph of the video to the terminal.
Step 1112: and the terminal receives and presents the key content display graph returned by the server.
Here, the terminal receives the key content presentation graph returned by the server, and may perform operations such as storing and sharing on the key content presentation graph.
In addition, the key content display diagram also comprises a graphic code of the video, and a user can enter a playing interface of the video by scanning the graphic code to watch the video.
By applying the embodiment, the feedback information of the existing user for the video can be fully utilized to generate the key content display long picture for introducing the video, so that the user can better know the content of the video, and the propagation of the original video is facilitated.
Continuing to describe the video information processing apparatus according to the embodiment of the present invention, referring to fig. 13, fig. 13 is a schematic structural diagram of an information processing apparatus 1300 according to the embodiment of the present invention, where the information processing apparatus 1300 according to the embodiment of the present invention includes:
a presentation module 1310, configured to present, in a video playing interface, a generation function item corresponding to a key content display diagram of a video;
a generating module 1320, configured to generate and present a key content display diagram representing key content of the video in response to a triggering operation for the generating function item;
wherein the key content presentation graph is derived by combining at least one key video frame of the video with text content associated with the key video frame.
In some embodiments, the generating module 1320 is further configured to present a video frame selection interface in response to the triggering operation for the generating function item, and
presenting at least one key video frame in the video frame selection interface;
and in response to a video frame selection operation triggered based on the video frame selection interface, generating and presenting a key content display graph for representing key content of the video based on the key video frame selected by the video frame selection operation.
In some embodiments, the presenting module 1310 is further configured to present a storage function item corresponding to the key content illustration;
responding to the trigger operation aiming at the storage function item, and saving the key content display graph to a storage path associated with the storage function item;
and presenting prompt information that the key content display graph is stored to the storage path.
In some embodiments, the presenting module 1310 is further configured to present a first sharing function item corresponding to the key content presentation graph;
responding to the triggering operation aiming at the first sharing function item, presenting a first sharing interface corresponding to the key content display diagram, and presenting a sharing object for selection in the first sharing interface;
and responding to a sharing object selection operation triggered based on the first sharing interface, and sharing the key content display diagram to the sharing object selected by the sharing object selection operation.
In some embodiments, the presenting module 1310 is further configured to present a second sharing function item corresponding to the key content presentation graph;
responding to the triggering operation aiming at the second sharing function item, presenting a second sharing interface corresponding to the key content display diagram, and presenting a sharing mode for selection in the second sharing interface;
and responding to a sharing mode selection operation triggered based on the second sharing interface, and sharing the key content display diagram based on the sharing mode selected by the sharing mode selection operation.
In some embodiments, the apparatus further comprises:
the encoding module is used for acquiring a webpage link corresponding to the playing interface of the video;
and coding the webpage link to obtain a graphic code corresponding to the video, wherein the graphic code is used for jumping to a playing interface of the video when the electronic equipment triggers the scanning operation aiming at the graphic code.
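For illustration, encoding the playback-page link as a graphic code can be done with an off-the-shelf QR library; the third-party qrcode package below is an assumption, as the embodiment does not name a library:

    import qrcode

    def make_graphic_code(play_url, out_path="video_qr.png"):
        """Encode the playing-interface link as a scannable graphic code."""
        qrcode.make(play_url).save(out_path)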
In some embodiments, the generating module 1320 is further configured to obtain at least one key video frame of the video and text content associated with each key video frame, where the text content is used to describe video content of the corresponding key video frame;
acquiring a display diagram generation template corresponding to the video;
and combining the text content and the corresponding key video frames based on the display graph generation template so as to generate a key content display graph for representing the key content of the video based on the combination result.
In some embodiments, the generating module 1320 is further configured to, when the number of the key video frames is at least two, respectively obtain text content associated with each of the key video frames, where the text content is used to describe video content of the corresponding key video frame;
respectively combining each key video frame with corresponding text content to obtain a combined video frame corresponding to each key video frame;
and splicing the obtained combined video frames corresponding to the key video frames to obtain the key content display picture.
In some embodiments, the apparatus further comprises:
the receiving module is used for receiving and presenting a key content display diagram of a shared target video, wherein the key content display diagram of the target video comprises a graphic code corresponding to the target video;
and responding to the scanning operation aiming at the graphic code, jumping from the current page to the playing interface of the target video, and playing the target video based on the playing interface of the target video.
In some embodiments, the generating module 1320 is further configured to obtain video information corresponding to each playing time point in the video, where the video information includes at least one of a historical playing speed, historical interaction data, and background music;
identifying video clips corresponding to key characters in the video;
determining at least one key video frame from the video clip based on the acquired video information corresponding to each playing time point;
based on the at least one determined key video frame, a key content presentation graph is generated for characterizing key content of the video.
In some embodiments, the generating module 1320 is further configured to determine, based on the video information corresponding to each playing time point, a score of a video frame corresponding to the corresponding playing time point in the video; the score is used for representing the possibility that the video frame corresponding to the playing time point is a key video frame;
and determining at least one key video frame from the video clip based on the score of the video frame corresponding to each playing time point.
In some embodiments, the generating module 1320 is further configured to determine, based on the historical playing speed corresponding to each playing time point, a first score of a video frame corresponding to the corresponding playing time point in the video;
determining a second score of a video frame corresponding to the corresponding playing time point in the video based on the historical interactive data corresponding to each playing time point;
determining a third score of a video frame corresponding to the corresponding playing time point in the video based on the background music corresponding to each playing time point;
respectively obtaining weights corresponding to the first score, the second score and the third score;
and determining the scores of the video frames corresponding to the corresponding playing time points in the video based on the first score, the second score, the third score and the corresponding weights.
By applying the embodiment of the invention, the generation function item of the key content display diagram corresponding to the video is presented on the playing interface of the video, and the key content display diagram for representing the key content of the video is generated when the trigger operation aiming at the generation function item is received.
An embodiment of the present invention further provides an electronic device. Referring to fig. 14, fig. 14 is a schematic structural diagram of the electronic device 400 provided in the embodiment of the present invention. In practical application, the electronic device 400 may be the terminal or the server in fig. 1; taking the terminal shown in fig. 1 as an example, the electronic device implementing the video information processing method of the embodiment of the present invention includes:
a memory 450 for storing executable instructions;
the processor 410 is configured to implement the video information processing method provided by the embodiment of the present invention when executing the executable instructions stored in the memory.
Here, the processor 410 may be an integrated circuit chip having signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, where the general-purpose processor may be a microprocessor or any conventional processor.
The memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 450 optionally includes one or more storage devices physically located remote from processor 410.
The memory 450 includes either volatile memory or nonvolatile memory, and may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 450 described in embodiments of the invention is intended to comprise any suitable type of memory.
At least one network interface 420 and user interface 430 may also be included in some embodiments. The various components in electronic device 400 are coupled together by a bus system 440. It is understood that the bus system 440 is used to enable communications among the components. The bus system 440 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 440 in fig. 14.
Embodiments of the present invention also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device executes the information processing method of the video provided by the embodiment of the invention.
The embodiment of the invention also provides a computer-readable storage medium, which stores executable instructions, and when the executable instructions are executed by a processor, the method for processing the video information provided by the embodiment of the invention is realized.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories. The computer may be a variety of computing devices including intelligent terminals and servers.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
The above description is only an example of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present invention are included in the protection scope of the present invention.

Claims (10)

1. An information processing method for video, the method comprising:
presenting a generating function item corresponding to a key content display diagram of a video in a playing interface of the video;
responding to the trigger operation aiming at the generating function item, determining a video highlight grade corresponding to the historical playing speed of each playing time point in the video, and determining a first score corresponding to a video frame of the corresponding playing time point on the basis of the video highlight grade;
determining a result representing whether the background music of each playing time point exists or not, and determining a second score corresponding to the video frame of the corresponding playing time point based on the result;
determining a third score corresponding to the video frame of each playing time point based on historical interaction data of each playing time point, wherein the historical interaction data comprises at least one of the number of bullet comments and the number of likes on bullet comments;
determining a score corresponding to the video frame of each playing time point based on the first score, the second score, the third score and the corresponding weight of each playing time point, wherein the score is used for representing the possibility that the video frame of each playing time point is a key video frame;
determining key video frames from video clips of corresponding key people based on scores corresponding to the video frames of each playing time point, and generating and presenting a key content display graph for representing key contents of the video based on the key video frames;
wherein the key content presentation graph comprises at least one key video frame of the video and text content associated with the key video frame.
2. The method of claim 1, wherein the method further comprises:
presenting storage function items corresponding to the key content display pictures;
responding to the trigger operation aiming at the storage function item, and saving the key content display graph to a storage path associated with the storage function item;
and presenting prompt information that the key content display graph is stored to the storage path.
3. The method of claim 1, wherein the method further comprises:
presenting a first shared function item corresponding to the key content presentation graph;
in response to a trigger operation aiming at the first sharing function item, presenting a first sharing interface corresponding to the key content display diagram, and presenting a sharing object for selection in the first sharing interface;
and responding to a sharing object selection operation triggered based on the first sharing interface, and sharing the key content display diagram to the sharing object selected by the sharing object selection operation.
4. The method of claim 1, wherein the method further comprises:
presenting a second sharing function item corresponding to the key content display diagram;
responding to the triggering operation aiming at the second sharing function item, presenting a second sharing interface corresponding to the key content display diagram, and presenting a sharing mode for selection in the second sharing interface;
and responding to a sharing mode selection operation triggered based on the second sharing interface, and sharing the key content display diagram based on the sharing mode selected by the sharing mode selection operation.
5. The method of claim 1, wherein a key content presentation graph of the video comprises a graphic code corresponding to the video; the method further comprises the following steps:
acquiring a link corresponding to a playing interface of the video;
and coding the link to obtain a graphic code corresponding to the video, wherein the graphic code is used for jumping to a playing interface of the video when the electronic equipment triggers the scanning operation aiming at the graphic code.
6. The method of claim 1, wherein said generating and presenting a key content presentation map for characterizing key content of the video based on the key video frames comprises:
acquiring text content associated with the key video frames, wherein the text content is used for describing video content of the corresponding key video frames;
obtaining a display diagram generation template corresponding to the video;
and generating a template based on the display graph, combining the text content with corresponding key video frames, and generating and presenting a key content display graph for representing the key content of the video based on a combination result.
7. The method of claim 1, wherein the method further comprises:
receiving and presenting a key content display diagram of a shared target video, wherein the key content display diagram of the target video comprises a graphic code corresponding to the target video;
and responding to the scanning operation aiming at the graphic code, jumping from the current page to the playing interface of the target video, and playing the target video based on the playing interface of the target video.
8. An information processing apparatus for video, the apparatus comprising:
the display module is used for displaying the generation function item of the key content display graph corresponding to the video in the playing interface of the video;
the generating module is used for responding to the triggering operation aiming at the generating function item, determining a video highlight grade corresponding to the historical playing speed of each playing time point in the video, and determining a first score corresponding to a video frame of the corresponding playing time point based on the video highlight grade;
determining a result representing whether the background music of each playing time point exists or not, and determining a second score corresponding to the video frame of the corresponding playing time point based on the result;
determining a third score corresponding to the video frame of each playing time point based on historical interaction data of each playing time point, wherein the historical interaction data comprises at least one of the number of bullet comments and the number of likes on bullet comments;
determining a score corresponding to the video frame of each playing time point based on the first score, the second score, the third score and the corresponding weight of each playing time point, wherein the score is used for representing the possibility that the video frame of each playing time point is a key video frame;
determining key video frames from video clips of corresponding key people based on scores corresponding to the video frames of each playing time point, and generating and presenting a key content display graph for representing key content of the video based on the key video frames;
wherein the key content presentation graph comprises at least one key video frame of the video and text content associated with the key video frame.
9. An electronic device, characterized in that the electronic device comprises:
a memory for storing executable instructions;
a processor for implementing the information processing method of video according to any one of claims 1 to 7 when executing the executable instructions stored in the memory.
10. A computer-readable storage medium storing executable instructions for implementing an information processing method of a video according to any one of claims 1 to 7 when executed.
CN202010740479.8A 2020-07-28 2020-07-28 Video information processing method and device, electronic equipment and storage medium Active CN111857517B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010740479.8A CN111857517B (en) 2020-07-28 2020-07-28 Video information processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111857517A CN111857517A (en) 2020-10-30
CN111857517B true CN111857517B (en) 2022-05-17

Family

ID=72948814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010740479.8A Active CN111857517B (en) 2020-07-28 2020-07-28 Video information processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111857517B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113015009B (en) * 2020-11-18 2022-09-09 北京字跳网络技术有限公司 Video interaction method, device, equipment and medium
CN113784171B (en) * 2021-01-18 2024-05-17 北京沃东天骏信息技术有限公司 Video data processing method, device, computer system and readable storage medium
CN113569085A (en) * 2021-06-30 2021-10-29 北京达佳互联信息技术有限公司 Audio and video data display method, device, equipment, storage medium and program product
CN113965798A (en) * 2021-10-25 2022-01-21 北京百度网讯科技有限公司 Video information generating and displaying method, device, equipment and storage medium
CN114063863A (en) * 2021-11-29 2022-02-18 维沃移动通信有限公司 Video processing method and device and electronic equipment
CN116701686A (en) * 2022-02-28 2023-09-05 荣耀终端有限公司 Man-machine interaction method and device and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103262096A (en) * 2010-12-09 2013-08-21 诺基亚公司 Limited-context-ased identifying key frame from video sequence
WO2015038342A1 (en) * 2013-09-16 2015-03-19 Thomson Licensing Interactive ordered list of dynamic video abstracts as thumbnails with associated hypermedia links
CN105100961A (en) * 2015-07-23 2015-11-25 华为技术有限公司 Media preview generation method and generation apparatus
CN110213614A (en) * 2019-05-08 2019-09-06 北京字节跳动网络技术有限公司 The method and apparatus of key frame are extracted from video file
CN110798711A (en) * 2018-08-01 2020-02-14 腾讯科技(深圳)有限公司 Advertisement information processing method and device and computer equipment
CN111343496A (en) * 2020-02-21 2020-06-26 北京字节跳动网络技术有限公司 Video processing method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108255970A (en) * 2017-12-26 2018-07-06 努比亚技术有限公司 A kind of video retrieval method, terminal and computer readable storage medium
CN110830852B (en) * 2018-08-07 2022-08-12 阿里巴巴(中国)有限公司 Video content processing method and device

Also Published As

Publication number Publication date
CN111857517A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111857517B (en) Video information processing method and device, electronic equipment and storage medium
CN111143610B (en) Content recommendation method and device, electronic equipment and storage medium
US20180152767A1 (en) Providing related objects during playback of video data
US10897449B2 (en) Social media messaging platform for creating and sharing moments
US10555023B1 (en) Personalized recap clips
CN111327917B (en) Live content preview method, device, equipment and storage medium
CN110475140B (en) Bullet screen data processing method and device, computer readable storage medium and computer equipment
US10440435B1 (en) Performing searches while viewing video content
US10620801B1 (en) Generation and presentation of interactive information cards for a video
US20190230311A1 (en) Video interface display method and apparatus
CN113079417B (en) Method, device and equipment for generating bullet screen and storage medium
US20230421859A1 (en) Systems and methods for recommending content using progress bars
CN113468374A (en) Target display method and device, electronic equipment and storage medium
CN112801684A (en) Advertisement playing method and device
CN112073738B (en) Information processing method and device
CN115706833A (en) Comment processing method and device, electronic equipment and storage medium
CN111698563A (en) Content sending method and device based on AI virtual anchor and storage medium
CN116991530B (en) Implementation method for loading indicator with staggered shrinkage pattern
CN113836324B (en) Information recommendation method, electronic equipment and server
CN111079051B (en) Method and device for playing display content
CN118092741A (en) Page display method and device, electronic equipment and storage medium
CN117319751A (en) Video playing method and device, electronic equipment and storage medium
CN117880540A (en) Barrage interaction method and device, electronic equipment and storage medium
CN117956200A (en) Live content processing method and device, live audience terminal and storage medium
KR101631271B1 (en) Rotating Advertisement Display System, Method and Computer Readable Recoding Medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40030861

Country of ref document: HK

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221212

Address after: 1402, Floor 14, Block A, Haina Baichuan Headquarters Building, No. 6, Baoxing Road, Haibin Community, Xin'an Street, Bao'an District, Shenzhen, Guangdong 518133

Patentee after: Shenzhen Yayue Technology Co.,Ltd.

Address before: 518000 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 Floors

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.