CN111901633B - Video playing processing method and device, electronic equipment and storage medium


Info

Publication number
CN111901633B
CN111901633B
Authority
CN
China
Prior art keywords
video
time point
playing
information
introduction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010749122.6A
Other languages
Chinese (zh)
Other versions
CN111901633A (en)
Inventor
余自强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010749122.6A
Publication of CN111901633A
Application granted
Publication of CN111901633B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N 21/47217 End-user interface for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/239 Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H04N 21/2393 Interfacing the upstream path of the transmission network involving handling client requests
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N 21/258 Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N 21/25866 Management of end-user data
    • H04N 21/25891 Management of end-user data being end-user preferences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/485 End-user interface for client configuration
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/488 Data services, e.g. news ticker
    • H04N 21/4882 Data services for displaying messages, e.g. warnings, reminders

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a video playing processing method and apparatus, an electronic device, and a computer-readable storage medium. The method comprises: in response to a video playing operation, starting to play a video in a video playing page; when playback reaches a first playing time point, presenting introduction information for an object appearing at the first playing time point; and when playback reaches a second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than a forgetting duration, presenting the introduction information of the object again. With the method and apparatus, objects involved in a video can be introduced in a personalized and accurate manner.

Description

Video playing processing method and device, electronic equipment and storage medium
Technical Field
The present invention relates to internet technologies, and in particular, to a video playing processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the continuous development of internet technology, the popularity of online video has made it an important part of everyday life and entertainment. The relationships between the people in a video can be complex and the cast large, which places a heavy memory burden on viewers and can even lead to "face blindness".
The related art generally relies on manually recognizing the people in a video and inserting introduction information into the video in a uniform way. This not only consumes human and computing resources and cannot keep pace with the volume of video now being produced, but also, because users differ in the physiological characteristics of forgetting, cannot satisfy each user's personalized need for introduction information while watching a video.
Disclosure of Invention
The embodiment of the invention provides a video playing processing method and apparatus, an electronic device, and a computer-readable storage medium, which can introduce objects involved in a video in a personalized and accurate manner.
The technical scheme of the embodiment of the invention is realized as follows:
the embodiment of the invention provides a video playing processing method, which comprises the following steps:
in response to a video playing operation, starting to play a video in a video playing page;
when the video playing page is played to a first playing time point, presenting, for an object appearing at the first playing time point, introduction information of the object;
and when the video playing page is played to a second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than a forgetting duration, presenting the introduction information of the object.
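The three steps above reduce to a simple rule: introduce an object on its first appearance, and introduce it again only when it reappears after the forgetting duration has elapsed. The following is a minimal client-side sketch of that rule; all names (IntroductionController, should_introduce, etc.) are illustrative assumptions, not part of the patent.

```python
class IntroductionController:
    """Minimal sketch of the forgetting-duration rule (names are assumed)."""

    def __init__(self, forgetting_duration: float):
        self.forgetting_duration = forgetting_duration  # seconds
        self.last_introduced = {}  # object id -> play time of last introduction

    def should_introduce(self, object_id: str, play_time: float) -> bool:
        """Return True if the object's introduction info should be presented now."""
        last = self.last_introduced.get(object_id)
        if last is None:
            # First appearance: present the introduction information.
            self.last_introduced[object_id] = play_time
            return True
        if play_time - last > self.forgetting_duration:
            # The object reappears after the viewer is assumed to have
            # forgotten it: present the introduction information again.
            self.last_introduced[object_id] = play_time
            return True
        return False
```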
An embodiment of the present invention provides a video playing processing apparatus, including:
the video playing module is used for responding to video playing operation and starting to play video in a video playing page;
the introduction presentation module is used for presenting introduction information of the object aiming at the object appearing at a first playing time point when the video playing page is played to the first playing time point;
the introduction presentation module is further configured to present introduction information of the object when the video playing page is played to a second playing time point at which the object appears again, and an interval between the first playing time point and the second playing time point is greater than the forgetting time length.
In the above scheme, the introduction presentation module is further configured to present introduction information corresponding to the first appearing object at the first play time point; wherein the first appearing object of the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point; or presenting introduction information corresponding to the target introduction object at the first playing time point in response to introduction information viewing operation; and the target introduction object at the first playing time point is an object selected by the introduction information viewing operation in the objects appearing at the first playing time point.
In the above solution, the introduction presentation module is further configured to, starting from the first play time point, present the introduction information of the object at the presentation position of the object, and stop presenting the introduction information at a third play time point, or stop presenting the introduction information when a set introduction duration is reached; the third play time point is the time point at which the object moves out of the video playing page, and the set introduction duration is counted from the first play time point.
In the above scheme, the introduction presentation module is further configured to present the introduction information at a position where the object appears in the video playback page; or, presenting the introduction information and a position identifier in an edge area of the video playing page, where the position identifier is used to indicate a position where an object introduced by the introduction information appears in the video playing page.
In the foregoing solution, the video playing processing apparatus further includes: the acquisition module is used for acquiring the information of the object to be introduced when the video playing page is played to a first playing time point; wherein the type of the object to be introduced comprises at least one of the following types: a first occurrence object of the first play time point; a target introduction object of the first playing time point; wherein the first appearing object of the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point; the target introduction object of the first playing time point is an object selected by introduction information viewing operation in the objects appearing at the first playing time point; the information of the object to be introduced comprises position information and introduction information, wherein the position information is used for indicating the position of presenting the introduction information in the video playing page.
In the above scheme, the obtaining module is further configured to send an information obtaining request to a server, and receive information of the object to be introduced, sent by the server; the information obtaining request includes the first playing time point, and the information obtaining request is used for the server to search the information of the object to be introduced corresponding to the first playing time point in the cache corresponding to the video.
In the above scheme, the obtaining module is further configured to present an object introduction mode button in the video playing page; responding to the triggering operation of the object introduction mode button, switching to an object introduction mode for introducing the object in the video and presenting a forgetting duration setting page; and acquiring the forgetting duration set in the forgetting duration setting page.
In the foregoing aspect, the video playback processing apparatus further includes: the adjusting module is used for responding to the forgetting time length adjusting operation and acquiring the adjusted forgetting time length; the introduction presentation module is further configured to present introduction information of the object when the video playing page is played to a second playing time point at which the object appears again, and an interval between the first playing time point and the second playing time point is greater than the adjusted forgetting duration.
In the above scheme, the obtaining module is further configured to obtain historical video data; calling a neural network model to perform the following processing on the historical video data: extracting a feature vector of the historical video data; mapping the extracted feature vectors into probabilities corresponding to a plurality of candidate forgetting durations respectively, and determining the candidate forgetting duration corresponding to the maximum probability as the forgetting duration; wherein the sample historical video data used to train the neural network model comprises: the type of historical video; a viewing duration of the historical video; the number of interactions related to information consultation of the object to be introduced during the playing period of the historical video; a number of pauses during the playing of the historical video.
An embodiment of the present invention provides an electronic device, including:
a memory for storing computer executable instructions;
and the processor is used for realizing the video playing processing method provided by the embodiment of the invention when executing the computer executable instructions stored in the memory.
The embodiment of the invention provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the video playing processing method provided by the embodiment of the invention.
The embodiment of the invention has the following beneficial effects:
during the playing of a video, the frequency at which objects in the video are introduced is controlled by the forgetting duration, so that introductions are timed to accurately match the user's need to know the objects while watching. This realizes personalized and accurate recommendation, while reducing the resource consumption of the related devices by avoiding unnecessary introductions.
Drawings
Fig. 1 is a schematic structural diagram of a video playing processing system 100 according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an electronic device 500 according to an embodiment of the present invention;
fig. 3 is a schematic flowchart of a video playing processing method according to an embodiment of the present invention;
fig. 4 is a schematic flowchart of a video playing processing method according to an embodiment of the present invention;
fig. 5 is a schematic view of an application scenario of a video playing processing method according to an embodiment of the present invention;
fig. 6 is a schematic flowchart of a video playing processing method according to an embodiment of the present invention;
fig. 7 is a schematic view of an application scenario of a video playing processing method according to an embodiment of the present invention;
fig. 8 is a schematic flowchart of a video playing processing method according to an embodiment of the present invention;
fig. 9A and 9B are schematic application scenarios of a video playing processing method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used herein is for the purpose of describing embodiments of the invention only and is not intended to be limiting of the invention.
Before the embodiments of the present invention are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to these terms and expressions.
1) In response to: indicates the condition or state on which a performed operation depends; when the condition or state is satisfied, the operation(s) may be performed in real time or with a set delay. Unless otherwise specified, there is no restriction on the order in which the operations are performed.
2) Client: an application program running in a terminal to provide various services, such as a video client, a short video client, or a live streaming client.
3) Character introduction: a text description introducing a character appearing in a video.
4) Video frames: video content consists of a series of video frames, with a rate usually expressed in frames per second (FPS). Each video frame is a still image; when a sequence of video frames is played in order, a moving image, i.e., the video content, is produced.
5) Face-blind: as used herein, refers to a user losing the ability to recognize an object (e.g., a person or a thing) while watching a video; it is not limited to the loss of face recognition ability.
In the related art, when many European, American, or domestic dramas are produced, a character introduction is manually inserted into the video for each character's first appearance, one by one. However, manually identifying character information and inserting character introductions through video processing involves a heavy workload and cannot satisfy the high-frequency character introductions that users require in face-blind scenarios.
To solve the above technical problems, the embodiment of the invention performs face recognition on the characters in a video to obtain the role information of the corresponding characters in the drama, and presents the role information as text in the video display page, avoiding manual identification of character information. Inserting introduction information based on face recognition is fast, accurate, and involves little workload. Moreover, the frequency at which introduction information appears can be adjusted by the user, and the user is supported in clicking a character's avatar to obtain the introduction information directly, making it convenient to distinguish the characters while watching the drama.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a video playing processing system 100 according to an embodiment of the present invention. The video playing processing system 100 includes the server 200, the network 300, and the terminal 400, which will be described separately below.
The server 200 is a background server of the client 410, and is configured to respond to a video acquisition request of the client 410 and send a corresponding video to the client 410; and is further configured to send corresponding introduction information to the client 410 in response to the introduction information obtaining request of the client 410.
The network 300, which is used as a medium for communication between the server 200 and the terminal 400, may be a wide area network or a local area network, or a combination of both.
The terminal 400 is used for operating a client 410, and the client 410 is a client with a video playing function. The client 410 is configured to respond to a video playing operation of a user, send a video acquisition request to the server 200, receive a video sent by the server 200, and play the video in a video playing page; and is further configured to send an introduction information acquisition request to the server 200 in response to a user's trigger operation on the object introduction mode button, to receive the introduction information sent by the server 200, and to present the introduction information in the video playback page.
In some embodiments, the client 410 implements the video playing processing method provided by the embodiments of the present invention by running a computer program. The computer program may be a native program or a software module in an operating system; a native application (APP), i.e., a program that must be installed in the operating system to run, such as a video APP or a live streaming APP; an applet, i.e., a program that only needs to be downloaded into a browser environment to run; or a video or live-streaming applet that can be embedded into any APP. In general, the computer program may be any application, module, or plug-in in any form.
The embodiment of the invention can be implemented by means of cloud technology, a hosting technology that unifies hardware, software, network, and other resources in a wide area network or a local area network to realize the computation, storage, processing, and sharing of data.
Cloud technology is a general term for the network, information, integration, management-platform, and application technologies applied in the cloud computing business model; it can form a resource pool that is used on demand, flexibly and conveniently. Cloud computing technology will become an important support for technical network systems, whose background services, for example web portals for video playback, require large amounts of computing and storage resources.
As an example, the server 200 may be an independent physical server, may be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a web service, cloud communication, a middleware service, a domain name service, a security service, a CDN, and a big data and artificial intelligence platform. The terminal 400 may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. The terminal 400 and the server 200 may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present invention is not limited thereto.
Next, a structure of an electronic device according to an embodiment of the present invention is described, where the electronic device may be the terminal 400 shown in fig. 1, referring to fig. 2, fig. 2 is a schematic structural diagram of an electronic device 500 according to an embodiment of the present invention, and the electronic device 500 shown in fig. 2 includes: at least one processor 510, memory 550, at least one network interface 520, and a user interface 530. The various components in the electronic device 500 are coupled together by a bus system 540. It is understood that the bus system 540 is used to enable communications among the components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 540 in fig. 2.
The Processor 510 may be an integrated circuit chip having Signal processing capabilities, such as a general purpose Processor, a Digital Signal Processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like, wherein the general purpose Processor may be a microprocessor or any conventional Processor, or the like.
The user interface 530 includes one or more output devices 531 enabling presentation of media content, including one or more speakers and/or one or more visual display screens. The user interface 530 also includes one or more input devices 532, including user interface components to facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The memory 550 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. Memory 550 optionally includes one or more storage devices physically located remote from processor 510.
The memory 550 may comprise volatile memory or nonvolatile memory, and may also comprise both volatile and nonvolatile memory. The nonvolatile memory may be a Read Only Memory (ROM), and the volatile memory may be a Random Access Memory (RAM). The memory 550 described in connection with embodiments of the invention is intended to comprise any suitable type of memory.
In some embodiments, memory 550 can store data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a network communication module 552 for communicating to other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 including: bluetooth, wireless compatibility authentication (WiFi), and Universal Serial Bus (USB), etc.;
a presentation module 553 for enabling presentation of information (e.g., a user interface for operating peripherals and displaying content and information) via one or more output devices 531 (e.g., a display screen, speakers, etc.) associated with the user interface 530;
an input processing module 554 to detect one or more user inputs or interactions from one of the one or more input devices 532 and to translate the detected inputs or interactions.
In some embodiments, the video playing processing apparatus provided by the embodiments of the present invention may be implemented in software, and fig. 2 shows a video playing processing apparatus 555 stored in a memory 550, which may be software in the form of programs and plug-ins, and includes the following software modules: a video playing module 5551 and an introduction presentation module 5552, which are logical and thus can be arbitrarily combined or further split according to the implemented functions. The functions of the respective modules will be explained below.
The following description will take the terminal 400 in fig. 1 as an example to implement the video playing processing method provided by the embodiment of the present invention. Referring to fig. 3, fig. 3 is a flowchart illustrating a video playing processing method according to an embodiment of the present invention, and will be described with reference to the steps shown in fig. 3.
It should be noted that the method shown in fig. 3 can be executed by various forms of computer programs executed by the terminal 400, and is not limited to the client 410, such as the operating system 551, the software modules and the scripts described above, and therefore the client should not be considered as limiting the embodiments of the present invention.
In step S101, in response to a video play operation, video play is started in a video play page.
Here, the video playing operation may be various forms of operation that the operating system has set in advance and that does not conflict with the registered operation; or may be various forms of operations that are user-defined and that do not conflict with registered operations. The video playing operation comprises at least one of the following: click operations (e.g., single-finger click operations, multi-finger click operations, multiple continuous click operations, etc.); a sliding operation in a specific track or direction; performing voice operation; a motion sensing operation (e.g., an operation of moving up and down, a curved motion operation, or the like). Thus, the operation experience of the user can be improved.
In some embodiments, the client sends a video acquisition request to the server in response to a video playing operation, so as to receive a video sent by the server and play the video in a video playing page.
In step S102, when the video playback page is played to the first playback time point, the introduction information of the object is presented for the object appearing at the first playback time point.
Here, a time point may be the playing time point of a single video frame, that is, the point in the time axis (or progress bar) corresponding to the playing timestamp of the video frame; a time point may also be a period of fixed length (e.g., 1 second); a time point may also be a period of dynamically changing length, for example corresponding to one or more Groups of Pictures (GOPs), in which case the corresponding duration is the difference between the timestamps of the first and last frames of the group(s); a time point may also correspond to a period in the timeline during which one or more objects appear continuously.
The object may be a person, a place, an animal, or the like. The objects appearing at the first play time point may be one or more. The type of object includes at least one of: a first occurrence object of a first play time point; the object introduction object of the first play time point will be described in detail below.
In some embodiments, when the object is a person, the introductory information includes at least one of: role information played by a person in a video; relationship information between the role played by the person in the video and other roles; true identity information of the person. When the object is a place, the introduction information includes at least one of: a geographic location of the place; a landmark landscape of a place. When the object is an animal, the introductory information includes at least one of: the name of the animal; animal habit.
In some embodiments, when the video playing page is played to a first playing time point, presenting introduction information corresponding to a first occurrence object of the first playing time point; the first appearing object of the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point. Therefore, in the video playing process, the client can automatically present the introduction information of the object which is first shown, so that the user can fully know the object appearing in the video watching process, and the user does not need to pause playing the video to retrieve the corresponding object.
In other embodiments, when the video playing page is played to the first playing time point, introduction information corresponding to the target introduction object at the first playing time point is presented in response to an introduction information viewing operation; the target introduction object at the first playing time point is the object selected by the introduction information viewing operation among the objects appearing at the first playing time point.
As an example, at least one introduction information viewing button corresponding to an object is presented in the video playing page, and the operation that triggers an introduction information viewing button is the introduction information viewing operation for the object presented in the video playing page. For example, if a video frame includes an object A, an object B, and an object C, and each object corresponds to its own introduction information viewing button, then when the user triggers the button corresponding to object A, the introduction information of object A is presented. Alternatively, all objects may share the same introduction information viewing button, in which case triggering it presents the introduction information of object A, object B, and object C at the same time.
Taking the object as a character as an example, when the user is unfamiliar with the characters appearing in the video, the user can have the client present the corresponding introduction information by clicking the region where the character is located. The user's need to obtain introduction information is thus met through a simple operation, making it convenient for the user to correctly follow the development of the video's plot.
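As a concrete illustration of the viewing operation, the client must map a tap inside the video playing page to the object whose region was tapped. The sketch below is an assumption-laden illustration: the type names are hypothetical, and each frame is assumed to carry bounding-box regions for its objects.

```python
from dataclasses import dataclass

@dataclass
class ObjectRegion:
    object_id: str
    x: int
    y: int
    w: int
    h: int  # bounding box of the object in page coordinates

def object_at(tap_x: int, tap_y: int, regions: list[ObjectRegion]) -> str | None:
    """Return the id of the tapped object, or None if the tap hit no object."""
    for region in regions:
        if (region.x <= tap_x <= region.x + region.w
                and region.y <= tap_y <= region.y + region.h):
            return region.object_id
    return None
```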
In some embodiments, the introduction information of the object is presented at the presentation position of the object from the first play time point, and presentation of the introduction information is stopped at the third play time point.
Here, the third play time point is the time point at which the object starts to move out of, or has completely moved out of, the video playing page. This avoids the confusion of presenting an object's introduction information after the object has left the page.
In other embodiments, the introduction information of the object is presented at the presentation position of the object from the first play time point, and presentation of the introduction information is stopped when the set introduction duration is reached.
Here, the set introduction duration is counted from the first play time point. The introduction duration may be a default value, a value set by the user, or a value determined according to the total duration of the video, for example, one-thousandth of the total duration.
In some embodiments, the introductory information is presented at the location where the object appears in the video playback page.
Taking the object as a character as an example, face recognition processing is performed on the current video frame of the video to obtain the portrait area of the character in the frame; the character introduction information is then presented in a manner that avoids the portrait area (for example, above, below, to the left, or to the right of it), or is presented in a floating layer with transparency over the portrait area.
As an example, the face recognition processing may be implemented as follows: divide the video frame into a plurality of candidate boxes and extract a feature vector for each candidate box; determine, from each feature vector, the candidate boxes that include a portrait; and determine those candidate boxes as the portrait area.
For example, in fig. 7, introduction information 701 is presented in the vicinity of a position where a person appears. Therefore, the situation that the introduction information blocks people to influence the watching experience of the user can be avoided.
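One way to realize "presenting in a manner that avoids the portrait area" is to try the positions around the recognized bounding box in a fixed order. This is an illustrative sketch under that assumption, not the patent's prescribed algorithm.

```python
def intro_position(box, page_w, page_h, text_w, text_h):
    """box = (x, y, w, h) of the portrait area; returns (x, y) for the text."""
    x, y, w, h = box
    candidates = [
        (x, y - text_h),   # above the portrait area
        (x, y + h),        # below
        (x - text_w, y),   # to the left
        (x + w, y),        # to the right
    ]
    for cx, cy in candidates:
        # Take the first position that stays inside the video playing page.
        if 0 <= cx <= page_w - text_w and 0 <= cy <= page_h - text_h:
            return cx, cy
    # Fallback: overlay the portrait area in a floating layer with transparency.
    return x, y
```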
In other embodiments, the introduction information and the location identification are presented in the edge area of the video playing page.
Here, the client presents the introduction information on the video playing page in a manner of avoiding the position where the object appears in the video playing page. The edge area may be located in a fixed area of the video playing page, or may be changed according to a picture presented in the video playing page, for example, located in a blank area of the picture. The position identification is used for indicating the position of the object introduced by the introduction information in the video playing page.
Taking the example that the object is a character, a current frame presented in the video playing page includes a plurality of objects, and introduction information corresponding to each object and an identifier indicating a position where the object introduced by each introduction information appears in the current frame are presented in an edge area of the video playing page.
For example, an object a, an object B, and an object C are included from left to right in the current frame, and the introduction information of the object a and an identifier indicating a position where the object a is located (e.g., "left one" or "right three"), the introduction information of the object B and an identifier indicating a position where the object B is located (e.g., "left two" or "right two"), and the introduction information of the object C and an identifier indicating a position where the object C is located (e.g., "right one" or "left three") are presented in the edge region.
As an example, the drop point of the viewer's line of sight is determined in the video playing page, and an area centered on the drop point is determined as the focus area; the area of the video playing page other than the focus area is determined as the edge area; and the introduction information, together with the position at which the introduced object appears in the video playing page, is presented in the edge area.
Here, the region centered on the drop point may be a regular shape, such as a circle or a rectangle, or an irregular shape. The size of the region may be a default or user-defined fixed size, or may be determined according to the size of the video playing page, for example, proportional to it. The size of the region may also be enlarged or reduced in different proportions according to a particular action of the viewer (e.g., waving a hand or blinking): for example, when the viewer blinks three times in succession, the region is enlarged; when the viewer blinks twice in succession, the region is reduced; when the viewer waves a hand to the right, the region is enlarged; when the viewer waves a hand to the left, the region is reduced.
Specific implementations of determining the focus area are described below.
In some embodiments, a point of drop of the viewer's gaze is determined in the video playback page by the eye tracking system, and an area centered at the point of drop is determined as the focus area.
As an example, the client calls a camera device (e.g., a camera) of the terminal to acquire the positions of the viewer's pupil and of the reflective bright spot on the outer surface of the cornea of the eyeball, and determines, from these positions, the drop point corresponding to the viewer's line of sight in the video playing page.
Here, the reflective bright spot on the outer surface of the cornea of the eyeball refers to the Purkinje image, a bright spot on the cornea generated by corneal reflection (CR) of light entering the pupil.
The principle of determining the drop point corresponding to the viewer's line of sight from the positions of the pupil and the corneal reflective bright spot is as follows: because the positions of the terminal's camera and of the screen light source are fixed and the center position of the eyeball does not change, the absolute position of the Purkinje spot does not change as the eyeball rotates. Its position relative to the pupil and eyeball, however, changes constantly; for example, when the viewer gazes at the camera, the Purkinje spot lies between the viewer's pupils, and when the viewer lifts his head, the Purkinje spot lies just below the viewer's pupil.
Therefore, as long as the positions of the pupil and the Purkinje spot in the eye image are located in real time and the corneal reflection vector is calculated, the viewer's gaze direction can be estimated with a geometric model. Then, based on the relationship, established in a prior calibration process (i.e., having the viewer gaze at specific points on the terminal screen), between the viewer's eye features and the video playing page presented on the screen, the drop point corresponding to the viewer's line of sight can be determined in the video playing page.
For example, the client determines the corneal reflection vector of the viewer according to the positions of the pupil of the viewer and the reflective bright spot on the outer surface of the cornea of the eyeball; determining the sight direction of a viewer when the viewer watches a video playing page according to the corneal reflection vector of the viewer; and determining a drop point in the video playing page according to the sight line direction of the viewer when watching the video playing page. Therefore, the current focus area can be determined in real time and accurately according to the sight of the viewer, so that the introduction information is presented in the non-focus area, and the condition that the introduction information shields the focus area and influences the viewing experience of the viewer can be avoided.
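The following sketch condenses the gaze-estimation steps just described. It assumes that calibration yielded a simple linear mapping from the corneal reflection vector to page coordinates; a real system would use a richer geometric model.

```python
def gaze_drop_point(pupil, purkinje, calib):
    """pupil, purkinje: (x, y) positions located on the eye image;
    calib: (ax, bx, ay, by), assumed linear coefficients from calibration."""
    vx = pupil[0] - purkinje[0]  # corneal reflection vector, x component
    vy = pupil[1] - purkinje[1]  # corneal reflection vector, y component
    ax, bx, ay, by = calib
    return (ax * vx + bx, ay * vy + by)  # drop point in page coordinates

def in_focus_area(point, drop, radius):
    """Focus area: a circle of the given radius centered on the drop point;
    introduction information is presented outside this area."""
    dx, dy = point[0] - drop[0], point[1] - drop[1]
    return dx * dx + dy * dy <= radius * radius
```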
In step S103, when the video playing page is played to the second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than the forgetting duration, the introduction information of the object is presented.
Here, the objects appearing at the second play time point may be one or more.
The forgetting duration may be a value set by a user or a default value, and the following specifically describes a determination method of the forgetting duration.
In some embodiments, when the forgetting duration is a value set by the user, before step S103, the method further includes: presenting an object introduction mode button in the video playing page; in response to a trigger operation on the object introduction mode button, switching to an object introduction mode for introducing objects in the video, and presenting a forgetting duration setting page; and acquiring the forgetting duration set in the forgetting duration setting page.
Here, the client may switch from a mode in which objects in the video are not introduced to an object introduction mode in which objects are introduced according to the forgetting duration; it may also switch from a mode that introduces objects in the video, but not according to the forgetting duration, to the object introduction mode in which objects are introduced according to the forgetting duration.
Here, the forgetting duration setting page and the video playing page may be displayed simultaneously; for example, they may be displayed in split screen, or the forgetting duration setting page may be displayed above the video playing page as a floating layer with transparency so that it does not completely block the video playing page. Of course, the two pages may also not be displayed at the same time; for example, in response to the trigger operation on the object introduction mode button, the client switches from the video playing page to the forgetting duration setting page.
Here, the forgetting duration setting page further includes a forgetting duration adjustment button, and the user can adjust the forgetting duration by triggering it. Accordingly, before step S103, the method further includes: in response to a forgetting duration adjustment operation, acquiring the adjusted forgetting duration. Step S103 then becomes: when the video playing page is played to a second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than the adjusted forgetting duration, presenting the introduction information of the object.
As one example, the type of video is determined; and responding to the forgetting duration setting operation aiming at the type of the video, and acquiring the forgetting duration set corresponding to the type of the video.
Here, the type of the video includes at least one of: horror; comedy; suspense; tragedy. Because the plot of a suspense video unfolds faster than that of a tragedy, the user can set the forgetting duration for suspense videos to be greater than that for tragedies. In this way, the user can set different time intervals for presenting introduction information according to the video type, helping the user fully know the objects presented in the video and improving the viewing experience.
As another example, the type of the object is determined; in response to a forgetting duration setting operation for the type of the object, forgetting durations respectively set for different types of the object are acquired.
Here, the object may be a person, a place, an animal, or the like. Taking a person as an example, a person with distinctive characteristics (e.g., distinctive dress or appearance) is easier for the user to recognize than a person without them; therefore, the forgetting duration for a person with distinctive characteristics can be set longer than that for a person without them. In this way, the user can set different time intervals for presenting introduction information according to the object type, helping the user fully know the objects presented in the video and improving the viewing experience.
As yet another example, a degree of similarity between a plurality of objects in a video is detected; when the similarity degree between the plurality of objects is higher than the similarity threshold value, prompt information is presented.
Here, the similarity threshold may be a default value or a value set by the user. The prompt information is used for the user to set forgetting duration for the object with the similarity degree higher than the similarity threshold. In this way, the user can set different forgetting durations for the objects with high similarity, so that the user can distinguish the objects with high similarity, and the watching experience of the user is improved.
Taking the object as a person as an example, the degree of similarity between persons can be determined by detecting the dress and appearance of the persons.
As an example, the type of the video is determined, and a forgetting duration corresponding to the type of the video is obtained.
Here, the types of video include: horror, comedy, suspense, tragedy, and the like.
For example, historical video data is obtained, and the average of the time intervals at which the user viewed introduction information while watching videos of the same type as the current video is computed; the resulting average is determined as the forgetting duration for that video type. In this way, introduction information can be presented at different time intervals for different video types, helping the user fully know the objects presented in the video and improving the viewing experience.
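The statistic just described is a plain mean over historical intervals. A sketch follows, assuming the history is available as the play time points at which introduction information was viewed in each session; the data layout is hypothetical.

```python
def forgetting_duration_from_history(view_times_per_session: list[list[float]]) -> float:
    """Each inner list holds, in order, the play time points (seconds) at which
    the user viewed introduction information during one viewing session."""
    intervals = [
        later - earlier
        for session in view_times_per_session
        for earlier, later in zip(session, session[1:])
    ]
    # The mean interval is taken as the forgetting duration for this type.
    return sum(intervals) / len(intervals) if intervals else 0.0
```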
As another example, the type of the object is determined, and a forgetting duration corresponding to the type of the object is obtained.
Here, the object may be a person, a place, an animal, or the like.
For example, historical video data is obtained, and the average of the time intervals at which the user viewed introduction information while watching objects of the same type as the current object is computed; the resulting average is determined as the forgetting duration for that object type. In this way, introduction information can be presented at different time intervals for different object types, helping the user fully know the objects presented in the video and improving the viewing experience.
As yet another example, a degree of similarity between each object and the remaining objects in the video is determined; and determining the forgetting duration of each object according to the similarity between each object and the rest objects.
Here, the similarity threshold may be a default value or a value set by the user. The similarity between an object and the remaining objects is inversely proportional to the forgetting duration of that object, so a shorter forgetting duration can be set for objects with high similarity, helping the user distinguish them and improving the viewing experience.
As yet another example, a forgetting duration set by the social friend is obtained, and the forgetting duration set by the social friend is determined as the forgetting duration. Therefore, the forgetting duration set by using the social friends can be inherited, and the operation of the user is reduced.
As yet another example, historical video data is obtained; calling a neural network model to perform the following processing on the historical video data: extracting a feature vector of historical video data; and mapping the extracted feature vectors into probabilities corresponding to a plurality of candidate forgetting durations respectively, and determining the candidate forgetting duration corresponding to the maximum probability as the forgetting duration.
Here, the sample historical video data used to train the neural network model includes: the type of historical video; the viewing duration of the historical video; the number of interactions related to information consultation of the object to be introduced during the playing period of the historical video; number of pauses during play of the historical video. Therefore, the forgetting duration which meets the user requirements can be accurately determined through machine learning, so that the viewing experience of the user is met, and the waste of service resources caused by the presentation of too much introduction information is avoided.
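A minimal sketch, assuming PyTorch, of the model shape described above: a feature vector built from the four listed sample fields is mapped to probabilities over a fixed set of candidate forgetting durations, and the candidate with the maximum probability is selected. The candidate values, feature layout, and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

CANDIDATE_DURATIONS = [60.0, 300.0, 900.0, 1800.0]  # seconds (assumed values)

class ForgettingDurationModel(nn.Module):
    def __init__(self, feature_dim: int = 4):
        super().__init__()
        # Assumed feature vector: [historical video type id, viewing duration,
        #                          consultation interaction count, pause count]
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 32),
            nn.ReLU(),
            nn.Linear(32, len(CANDIDATE_DURATIONS)),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)  # logits over the candidate durations

def predict_forgetting_duration(model: ForgettingDurationModel,
                                features: torch.Tensor) -> float:
    probs = torch.softmax(model(features), dim=-1)
    # The candidate duration with the maximum probability is the result.
    return CANDIDATE_DURATIONS[int(probs.argmax())]
```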
Referring to fig. 4, fig. 4 is a schematic flowchart of a video playing processing method according to an embodiment of the present invention, and based on fig. 3, step S104 may be further included before step S102.
In step S104, when the video playing page is played to the first playing time point, information of the object to be introduced is acquired.
Here, the type of the object to be introduced includes at least one of: a first occurrence object of a first play time point; and a target introduction object of the first playing time point. The first appearing object at the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point; the target introduction object at the first play time point is an object selected by the introduction information viewing operation among objects appearing at the first play time point. The information of the object to be introduced comprises position information and introduction information, and the position information is used for indicating the position of presenting the introduction information in the video playing page.
In some embodiments, the client may invoke a corresponding service (e.g., a target recognition service) of the terminal, and the process of target recognition is completed by the terminal. The client can also call a corresponding service (for example, a target identification service) of the server, and the target identification process is completed through the server.
The following description will take an example of a process in which the client calls a corresponding service of the server and the server completes the target identification.
In some embodiments, an information acquisition request is sent to a server, and information of an object to be introduced sent by the server is received; the information acquisition request comprises a first playing time point, and the information acquisition request is used for the server to search the information of the object to be introduced corresponding to the first playing time point in the cache of the corresponding video.
Here, the client could obtain the information of the corresponding object to be introduced from the server at every time point at which an object appears, but calling the server for the corresponding introduction information at every time point would generate large communication overhead.
Therefore, as an alternative to step S104: when the video playing page is played to the first playing time point, introduction information is acquired for all objects appearing in the video playing interface from the first playing time point up to a fourth playing time point, or for all objects appearing within a set acquisition duration starting from the first playing time point. The client then identifies the objects appearing in each video frame, determines, from the acquired introduction information, the entries matching the objects in the current frame, and presents the matching introduction information. In this way, the client can present introduction information without relying on communication with the server, avoiding the situation in which introduction information cannot be presented while the client is offline.
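The sketch below illustrates this prefetching alternative: one request fetches the introduction information for every object appearing in a window of the video, and matching then happens locally. The endpoint URL and the request and response shapes are assumptions for illustration, not the patent's protocol.

```python
import json
from urllib import request

def prefetch_introductions(video_id: str, start: float, end: float) -> dict:
    """Fetch introduction info for all objects appearing in [start, end)."""
    req = request.Request(
        "https://example.com/introductions",  # hypothetical server endpoint
        data=json.dumps({
            "video_id": video_id,
            "from_time": start,
            "to_time": end,
        }).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        # Assumed response shape: {object_id: {"intro": str, "position": [x, y]}}
        return json.load(resp)
```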
The following describes a process in which the server acquires information of an object to be introduced.
As an example, the server performs object recognition processing on the current picture of the video to obtain object feature information and the area of each object in the current picture; searches a database for introduction information matching the object feature information; and determines, from the area of the object in the current picture, the position information indicating where the introduction information is to be presented in the video playing page.
Specifically, the target recognition processing may proceed as follows: the server divides the current picture into a plurality of candidate boxes and extracts a feature vector for each candidate box; according to the feature vectors, it determines which candidate boxes contain an object, as well as the object feature information of the object contained in each such box.
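The matching stage of this recognition process can be sketched as follows (Python/NumPy; the detector that produces the candidate boxes and feature vectors is assumed given, and the cosine-similarity measure and threshold are illustrative assumptions):

```python
import numpy as np

def recognize_objects(candidate_boxes, feature_db, threshold=0.8):
    """candidate_boxes: list of (box, feature_vector) pairs produced by a
    detector (region proposal plus feature extraction), assumed given here.
    feature_db: dict mapping object_id -> reference feature vector.
    Returns (box, object_id) pairs for boxes whose best match in the
    database exceeds the similarity threshold."""
    results = []
    for box, feat in candidate_boxes:
        feat = feat / np.linalg.norm(feat)
        best_id, best_sim = None, threshold
        for obj_id, ref in feature_db.items():
            sim = float(np.dot(feat, ref / np.linalg.norm(ref)))
            if sim > best_sim:
                best_id, best_sim = obj_id, sim
        if best_id is not None:
            results.append((box, best_id))
    return results
```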
It should be noted that the process in which the client invokes the corresponding service of the terminal and the terminal completes target recognition is similar to the above, and will not be described again.
In the embodiment of the invention, the introduction information of the object to be introduced corresponding to the current picture is obtained in advance, so the corresponding introduction information can be presented in the video in time without making the user wait, improving the user's viewing experience. If target recognition is completed by the server, the computing resources of the terminal are saved and the hardware requirements on the terminal are lowered. If target recognition is completed by the terminal, the computing resources of the server are saved and the introduction information can be displayed faster.
In some embodiments, after step S103, the method may further include: and storing the introduction information of the object into the blockchain network so that the blockchain network responds to the introduction information acquisition request aiming at the object according to the stored introduction information.
Referring to fig. 5, fig. 5 is a schematic diagram of an application scenario of the video playing processing method provided in an embodiment of the present invention, and includes a blockchain network 600 (exemplarily showing a consensus node 610-1, a consensus node 610-2, and a consensus node 610-3), a certificate authority 700, and a business entity 800, which are described below.
The type of the blockchain network 600 is flexible and may be, for example, any of a public chain, a private chain, or a federation chain. Taking the public chain as an example, any electronic device (e.g., the client 410) of a business entity may access the blockchain network 600 as a client node without authorization; taking a federation chain as an example, after being authorized, a business entity may connect the electronic devices under its jurisdiction to the blockchain network 600 as client nodes.
As an example, when the blockchain network 600 is a federation chain, the business entity 800 registers with the certificate authority 700 to obtain a digital certificate, which includes the public key of the business entity 800 and a digital signature issued by the certificate authority 700 over that public key and the identity information of the business entity 800. When initiating a transaction (e.g., for uplink storage of introduction information or a query of introduction information), the business entity appends the digital certificate, together with its own digital signature for the transaction, and sends the transaction to the blockchain network 600. The blockchain network 600 then takes the digital certificate and the digital signature out of the transaction, verifies the authenticity of the transaction (i.e., that it has not been tampered with) and the identity information of the business entity sending the message, and checks the identity, for example, whether the entity has the right to initiate the transaction.
In some embodiments, the client node may act only as a watcher of the blockchain network 600, that is, it supports the business entity in initiating transactions, while the functions of the consensus nodes of the blockchain network 600, such as the ordering function, the consensus service, and the ledger function, may be implemented by default or selectively delegated (e.g., according to the specific business requirements of the business entity). In this way, the data and business processing logic of the business entity can be migrated to the blockchain network 600 to the greatest extent, and the credibility and traceability of the data and business processing are achieved through the blockchain network 600.
The consensus nodes in the blockchain network 600 receive transactions submitted by the client nodes of different business entities (e.g., the business entity 800 shown in fig. 5), execute the transactions to update or query the ledger, and may return the various intermediate or final results of executing the transactions to the client nodes of the business entities for display.
An exemplary application of the blockchain network is described below, taking as an example a first client uploading introduction information to the blockchain network. The first client may be a client belonging to the business entity 800 in fig. 5 and acts as the client node 810 of the blockchain network. Illustratively, the first client may be the client 410 in fig. 1.
First, the uplink logic for the introduction information is set on the client node 810. For example, when the introduction information is obtained and is to be sent to the blockchain network 600, the client node 810 generates a corresponding transaction, which includes: the smart contract that needs to be invoked to upload the introduction information, and the parameters passed to the smart contract. The transaction also carries the digital certificate of the client node 810 and a signed digital signature, and the transaction is broadcast to the consensus nodes in the blockchain network 600.
Then, when a consensus node in the blockchain network 600 receives the transaction, it verifies the digital certificate and the digital signature carried in the transaction. After that verification succeeds, it determines, according to the identity of the business entity 800 carried in the transaction, whether the business entity 800 has the right to initiate the transaction; failure of either the signature verification or the permission check causes the transaction to fail. After successful verification, the consensus node appends its own digital signature (e.g., obtained by encrypting a digest of the transaction with the private key of node 610-1) and continues broadcasting in the blockchain network 600.
Finally, after the consensus nodes in the blockchain network 600 receive the successfully verified transaction, the transaction is filled into a new block and broadcast. When a consensus node in the blockchain network 600 receives the broadcast new block, it verifies the block, for example, checking whether the digital signatures of the transactions in the new block are valid. If the verification succeeds, the node appends the new block to the tail of the blockchain it stores and updates the state database according to the transaction results, executing the transactions in the new block: for a committed transaction that stores introduction information, a key-value pair containing the introduction information is added to the state database.
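As a rough illustration of this uplink flow, the transaction assembled by the client node 810 might look like the following Python sketch; the contract name, method, and field layout are hypothetical, and `sign_fn` stands in for signing with the client node's private key:

```python
import hashlib
import json
import time

def build_uplink_transaction(introduction, node_certificate, sign_fn):
    """Assemble a transaction that invokes a (hypothetical) smart contract
    to store introduction information in the blockchain network."""
    payload = {
        "contract": "IntroductionStore",   # smart contract to invoke (assumed name)
        "method": "put",
        "params": {"introduction": introduction},
        "timestamp": time.time(),
        "certificate": node_certificate,   # digital certificate issued by the CA
    }
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    payload["signature"] = sign_fn(digest)  # digital signature over the digest
    return payload  # broadcast to the consensus nodes in the blockchain network
```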
An exemplary application of the blockchain network is next described, taking as an example a second client querying introduction information in the blockchain network 600. As an example, the second client may be a client belonging to the business entity 800 in fig. 5, acting as the client node 820 of the blockchain network. Illustratively, the second client may be the client 410 in fig. 1.
Here, it is assumed that the second client is a client that needs to present the introduction information, and the introduction information presented by the first client and the second client is the same (which will be explained below in a detailed example).
In some embodiments, the types of data that the client node 820 can query in the blockchain network 600 may be restricted by the consensus nodes by limiting which transactions the client of a business entity is authorized to initiate. When the client node 820 has the authority to initiate a query for introduction information, the client node 820 may generate a transaction for querying the introduction information and submit it to the blockchain network 600; a consensus node executes the transaction, queries the corresponding introduction information from the state database, and returns it to the client node 820.
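Under the same assumptions as the uplink sketch above, the query transaction generated by the client node 820 could be assembled as follows:

```python
import hashlib
import json

def build_query_transaction(video_id, time_point, node_certificate, sign_fn):
    """Transaction querying introduction information; a consensus node
    executes it against the state database and returns the result.
    Contract and method names are hypothetical."""
    payload = {
        "contract": "IntroductionStore",
        "method": "get",
        "params": {"video_id": video_id, "time_point": time_point},
        "certificate": node_certificate,
    }
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    payload["signature"] = sign_fn(digest)
    return payload
```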
Taking a video viewing scenario as an example, the business entity 800 is a video company, and the client node 810 and the client node 820 are clients belonging to the video company (e.g., the first client and the second client described above) that serve different users. The videos played by the first client and the second client are the same, so the required introduction information is the same. Therefore, the second client can obtain the introduction information directly through the blockchain network 600, avoiding a request to the background server and reducing the consumption of the background server's service resources.
Next, a video playing processing method provided by an embodiment of the present invention and implemented cooperatively by the terminal 400 and the server 200 in fig. 1 is described as an example. Referring to fig. 6, fig. 6 is a flowchart of a video playing processing method according to an embodiment of the present invention, described with reference to the steps shown in fig. 6.
In step S601, the client starts playing a video in the video playing page in response to a video playing operation.
In step S602, the client presents an object introduction mode button in the video playback page.
In step S603, the client switches to the object introduction mode for introducing the object in the video in response to the trigger operation for the object introduction mode button, and presents a forgetting duration setting page.
In step S604, the client acquires the forgetting duration set in the forgetting duration setting page.
In step S605, the client acquires introduction information of the object to be introduced, which is transmitted by the server.
In step S606, when the video playing page is played to the first playing time point, the client presents the introduction information of the object.
In step S607, when the video playing page is played to the second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than the forgetting duration, the client presents the introduction information of the object.
It should be noted that the specific implementation manner in steps S601 to S607 is similar to the embodiment included in steps S101 to S104, and will not be described again here.
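The presentation condition at the heart of steps S606 and S607 can be expressed compactly; the following Python sketch (with assumed names) presents an object's introduction on its first appearance, or when it reappears after more than the forgetting duration:

```python
def should_present(last_shown, object_id, t, forgetting_duration):
    """last_shown maps object_id -> time its introduction was last presented.
    Returns True on first appearance (step S606) or when the object appears
    again after more than forgetting_duration (step S607)."""
    last = last_shown.get(object_id)
    if last is None or t - last > forgetting_duration:
        last_shown[object_id] = t
        return True
    return False

# Example usage:
last_shown = {}
assert should_present(last_shown, "obj-1", 12.0, 300.0)       # first appearance
assert not should_present(last_shown, "obj-1", 100.0, 300.0)  # within forgetting duration
assert should_present(last_shown, "obj-1", 400.0, 300.0)      # forgotten again
```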
In the embodiment of the invention, the server has stronger computing capability and higher computing speed than the terminal, so completing the target recognition process on the server can increase the speed at which the terminal acquires the introduction information and reduce the consumption of the terminal's computing resources.
Next, a video playing processing method according to an embodiment of the present invention is described, taking the case where the above object is a character as an example.
The embodiment of the invention uses face recognition technology to store, in the server, the time points at which all main characters in the drama appear, their face positions, and the character introductions (also called character information, i.e., the introduction information described above). The server returns the related data to the client, and when the user watches the drama, the character introduction corresponding to the drama is displayed at regular intervals for characters who appear repeatedly, where the time interval supports manual adjustment by the user. In addition, the user can trigger the presentation of a character introduction by manually clicking the character's face region.
The embodiment of the invention can solve the problem that a user cannot clearly remember faces when watching a drama. Because the server caches the appearance times of the characters, the character information corresponding to a face in the video can be displayed in real time, improving the efficiency of face detection during video playing.
Referring to fig. 7, fig. 7 is a schematic diagram of an application scenario of the video playing processing method according to an embodiment of the present invention. In fig. 7, a face-blind mode button 702 (i.e., the object introduction mode button described above) is provided for the user. When the user turns on the face-blind mode (i.e., the object introduction mode described above), a face-blind interval adjustment button 703 is provided, so that the user can customize the time interval at which the introduction information 701 is displayed.
Here, to enrich the content of the introduction information, the faction to which a character belongs in the drama, an icon, and the like may also be displayed to enhance the user's viewing experience.
In some embodiments, the introduction information is presented when a character first appears in the video, or when the character appears again and the interval since the introduction information of the same object was last presented exceeds the face-blind interval (i.e., the forgetting duration described above).
In other embodiments, when the condition for triggering the display of introduction information is not satisfied, the user can directly display the corresponding introduction information by clicking the character's face region. In addition, in a television scenario, the display of all introduction information in the current video frame can be triggered through a remote control signal.
Referring to fig. 8, fig. 8 is a schematic flowchart of a video playing processing method according to an embodiment of the present invention.
The server performs face recognition on each video frame in the drama through a face recognition model, compares each face in a video frame with the face information of the main actors in the drama, and, if a match exists, records the position of the face and the name of the corresponding actor.
Here, the effect of face recognition is shown in fig. 9A. Fig. 9A is a schematic diagram of an application scenario of the video playing processing method according to an embodiment of the present invention. In fig. 9A, the server performs face recognition on a video frame and can determine the actor name 901 corresponding to a face included in the video frame, so that all the characters in a drama can be recognized from its video frames.
The server acquires the drama character information corresponding to the identified actor names from the character information database.
Here, the character information is preset information that already exists for the video, as shown in fig. 9B. Fig. 9B is a schematic diagram of an application scenario of the video playing processing method according to an embodiment of the present invention. In fig. 9B, the character information may include not only the character avatar 902, the character name 904 in the drama, and the actor name 903, but also related information such as the character's position or race in the drama.
The server caches the information of all important characters in the drama in a mapping data format of "appearance time => {character information, face position}".
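A server-side cache with this mapping format might be sketched as follows (Python; the key granularity and field names are illustrative assumptions):

```python
from collections import defaultdict

class CharacterCache:
    """Cache in the format: appearance time point => {character info, face position}."""

    def __init__(self):
        self.by_time = defaultdict(list)

    def add(self, time_point, character_info, face_box):
        # Keys are rounded to 0.1 s so lookups by playing time point hit reliably.
        self.by_time[round(time_point, 1)].append(
            {"character": character_info, "face": face_box}
        )

    def query(self, time_point):
        return self.by_time.get(round(time_point, 1), [])

    def next_appearance_after(self, character_name, t, blind_interval):
        # Earliest time point later than t + blind_interval at which the
        # character appears again; used to schedule the client's timed pull.
        times = sorted(
            tp for tp, entries in self.by_time.items()
            if any(e["character"].get("name") == character_name for e in entries)
        )
        for tp in times:
            if tp > t + blind_interval:
                return tp
        return None
```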
The client pulls from the server, according to the currently played time point, the character information of the persons to be introduced in the current video frame (e.g., a person appearing for the first time, or a person selected by the user). Meanwhile, the server returns the next time point at which the time difference to the person's next appearance exceeds the face-blind interval (hereinafter referred to as the next appearance time point). The client uses the next appearance time point as a timer to pull character information at the scheduled time.
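The client's timed pull can then be as simple as the following sketch (the `pull_character_info` method on the client is a hypothetical name):

```python
import threading

def schedule_next_pull(client, next_appearance_time, current_time):
    """Use the next appearance time point returned by the server as a timer:
    when playback reaches it, the client pulls character information again."""
    delay = max(0.0, next_appearance_time - current_time)
    threading.Timer(delay, client.pull_character_info,
                    args=(next_appearance_time,)).start()
```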
Here, if the user readjusts the face-blind interval, or clicks a face position to identify the character, the corresponding character information under the latest face-blind interval is immediately pulled from the server again.
When the client plays the video, it determines, for each video frame containing a face, whether the time difference between the face's appearance time and the time at which the character information of the same face was last presented is greater than the face-blind interval set by the user. If it is greater, or the character has never appeared before in the current video, the character information is inserted near the position corresponding to the face, as shown in fig. 7.
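This per-frame decision can be sketched as follows (Python; the face dictionaries and the overlay callback are assumptions for illustration):

```python
def overlay_character_info(frame_faces, last_shown, t, blind_interval, overlay_fn):
    """For each recognized face in the current frame, insert the character
    info near the face if the face is new to the video or its info was last
    presented more than blind_interval ago (cf. fig. 7)."""
    for face in frame_faces:  # each face: {"name": ..., "info": ..., "box": ...}
        last = last_shown.get(face["name"])
        if last is None or t - last > blind_interval:
            last_shown[face["name"]] = t
            overlay_fn(face["info"], near=face["box"])  # draw near the face box
```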
The embodiment of the invention can reduce the complexity of manually inserting character introduction content for video editors. Meanwhile, displaying character introductions in the video at intervals makes it convenient for users to recognize the characters in a drama while watching, avoids the situation where face blindness affects the understanding of the plot, and improves the user's viewing experience.
An exemplary structure of the video playing processing apparatus 555 provided by an embodiment of the present invention implemented as software modules is described below with reference to fig. 2. In some embodiments, as shown in fig. 2, the software modules of the video playing processing apparatus 555 stored in the memory 550 may include:
the video playing module 5551 is configured to respond to a video playing operation, and start playing a video in a video playing page;
an introduction presentation module 5552, configured to, when the video playback page is played to a first playback time point, present introduction information of an object appearing at the first playback time point;
the introduction presentation module 5552 is further configured to present introduction information of the object when the video playing page is played to a second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than the forgetting duration.
In the above solution, the introduction presentation module 5552 is further configured to present introduction information corresponding to the first occurrence object at the first playing time point; wherein the first appearing object of the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point; or presenting introduction information corresponding to the target introduction object at the first playing time point in response to introduction information viewing operation; and the target introduction object at the first playing time point is an object selected by the introduction information viewing operation in the objects appearing at the first playing time point.
In the above solution, the introduction presentation module 5552 is further configured to, starting from the first playing time point, present introduction information of the object at the presentation position of the object, and stop presenting the introduction information at a third playing time point, or stop presenting the introduction information when a set introduction duration is reached; the third playing time point is the time point when the object moves out of the video playing page, and the set introduction time length is counted from the first playing time point.
In the above solution, the introduction presentation module 5552 is further configured to present the introduction information at a position where the object appears in the video playback page; or, presenting the introduction information and a position identifier in an edge area of the video playing page, where the position identifier is used to indicate a position where an object introduced by the introduction information appears in the video playing page.
In the above solution, the video playing processing device 555 further includes: the acquisition module is used for acquiring the information of the object to be introduced when the video playing page is played to a first playing time point; wherein the type of the object to be introduced comprises at least one of the following types: a first-appearing object at the first playing time point; a target introduction object at the first playing time point; wherein the first-appearing object at the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point; the target introduction object at the first playing time point is the object selected by the introduction information viewing operation from among the objects appearing at the first playing time point; the information of the object to be introduced comprises position information and introduction information, wherein the position information is used for indicating the position at which the introduction information is presented in the video playing page.
In the above scheme, the obtaining module is further configured to send an information obtaining request to a server, and receive information of the object to be introduced, sent by the server; the information obtaining request includes the first playing time point, and the information obtaining request is used for the server to search the information of the object to be introduced corresponding to the first playing time point in the cache corresponding to the video.
In the above scheme, the obtaining module is further configured to present an object introduction mode button in the video playing page; responding to the triggering operation of the object introduction mode button, switching to an object introduction mode for introducing the object in the video and presenting a forgetting duration setting page; and acquiring the forgetting duration set in the forgetting duration setting page.
In the above solution, the video playback processing device 555 further includes: the adjusting module is used for responding to the forgetting time length adjusting operation and acquiring the adjusted forgetting time length; the introduction presentation module is further configured to present introduction information of the object when the video playing page is played to a second playing time point at which the object appears again, and an interval between the first playing time point and the second playing time point is greater than the adjusted forgetting duration.
In the above scheme, the obtaining module is further configured to obtain historical video data; calling a neural network model to perform the following processing on the historical video data: extracting a feature vector of the historical video data; mapping the extracted feature vectors into probabilities corresponding to a plurality of candidate forgetting durations respectively, and determining the candidate forgetting duration corresponding to the maximum probability as the forgetting duration; wherein the sample historical video data used to train the neural network model comprises: the type of historical video; a viewing duration of the historical video; the number of interactions related to information consultation of the object to be introduced during the playing period of the historical video; a number of pauses during the playing of the historical video.
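The described mapping from historical video data to a forgetting duration can be illustrated with a toy model; the candidate durations, feature encoding, and linear layer below are assumptions standing in for the trained neural network:

```python
import numpy as np

CANDIDATE_DURATIONS = [60.0, 180.0, 300.0, 600.0]  # candidate forgetting durations (s)

def predict_forgetting_duration(features, W, b):
    """Map a feature vector of historical video data (e.g., video type,
    viewing duration, consultation interactions, pause count) to
    probabilities over candidate durations and take the argmax."""
    logits = W @ features + b
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                      # softmax over candidates
    return CANDIDATE_DURATIONS[int(np.argmax(probs))]

# Example with stand-in (untrained) parameters:
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 4)), np.zeros(4)
features = np.array([1.0, 42.5, 3.0, 2.0])  # [type_id, hours watched, consults, pauses]
print(predict_forgetting_duration(features, W, b))
```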
Embodiments of the present invention provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the video playing processing method according to the embodiment of the present invention.
Embodiments of the present invention provide a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, cause the processor to execute a video playing processing method provided by an embodiment of the present invention, for example, the video playing processing methods shown in fig. 3, 4, 6, and 8, where the computer includes various computing devices including an intelligent terminal and a server.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disc, or CD-ROM; or it may be any of various devices including one or any combination of the above memories.
In some embodiments, the computer-executable instructions may be in the form of programs, software modules, scripts or code written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and they may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, computer-executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, e.g., in one or more scripts in a hypertext markup language document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, computer-executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
In summary, the embodiments of the present invention have the following beneficial effects:
(1) in the playing process of the video, the introduction information of the objects appearing in the video is displayed to the user, so that the user can be helped to know the playing content of the video and master the development of the video scenario, and the watching experience of the user is improved; and the same introduction information is presented to the user again at intervals of forgetting, so that the situation that the user cannot master the development of the plot due to forgetting can be avoided.
(2) In the video playing process, the client can automatically present the introduction information of an object when it first appears, so that the user can fully understand the objects appearing in the video, without pausing playback to look up the corresponding object.
(3) Through simple operation, the requirement of obtaining the introduction information of the user is met, so that the user can correctly master the development of the video scenario.
(4) The current focus area is accurately determined in real time according to the viewer's gaze, so that the introduction information is presented in the non-focus area; this avoids the situation where the introduction information blocks the focus area and impairs the viewer's experience.
(5) The forgetting duration meeting the user requirements is accurately determined through machine learning, so that the viewing experience of the user is met, and the waste of service resources caused by the presentation of too much introduction information is avoided.
(6) The introduction information of the object to be introduced corresponding to the current picture is obtained in advance, so the corresponding introduction information can be presented in the video in time without making the user wait, improving the user's viewing experience. Completing target recognition on the server saves the computing resources of the terminal and lowers the hardware requirements on the terminal. Completing target recognition on the terminal saves the computing resources of the server and speeds up the display of the introduction information.
The above description is only an example of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present invention are included in the protection scope of the present invention.

Claims (10)

1. A video playing processing method is characterized by comprising the following steps:
responding to the video playing operation, and starting to play the video in the video playing page;
acquiring historical video data;
calling a neural network model to perform the following processing on the historical video data:
extracting a feature vector of the historical video data;
mapping the extracted feature vectors into probabilities corresponding to a plurality of candidate forgetting durations respectively, and determining the candidate forgetting duration corresponding to the maximum probability as a forgetting duration;
wherein the sample historical video data used to train the neural network model comprises:
the type of historical video; a viewing duration of the historical video; the number of interactions related to information consultation of the object to be introduced during the playing period of the historical video; a number of pauses during play of the historical video;
when the video playing page is played to a first playing time point, presenting, for the object appearing at the first playing time point, introduction information of the object;
and when the video playing page is played to a second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than the forgetting duration, presenting the introduction information of the object.
2. The method according to claim 1, wherein the presenting, for the object appearing at the first play time point, the introduction information of the object comprises:
presenting introduction information corresponding to the first appearing object at the first playing time point;
wherein the first appearing object of the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point;
or,
presenting introduction information corresponding to the target introduction object at the first playing time point in response to introduction information viewing operation;
and the target introduction object at the first playing time point is an object selected by the introduction information viewing operation in the objects appearing at the first playing time point.
3. The method according to claim 1, wherein the presenting, for the object appearing at the first play time point, the introduction information of the object comprises:
presenting introduction information of the object at the presentation position of the object from the first playing time point, and stopping presenting the introduction information at a third playing time point, or stopping presenting the introduction information when a set introduction duration is reached;
the third playing time point is the time point when the object moves out of the video playing page, and the set introduction time length is counted from the first playing time point.
4. The method of claim 3, wherein the presenting the introduction information of the object at the presentation position of the object comprises:
presenting the introduction information at the position where the object appears in the video playing page; or,
presenting the introduction information and a position identifier in an edge area of the video playing page, wherein the position identifier is used for indicating a position where an object introduced by the introduction information appears in the video playing page.
5. The method of claim 1, further comprising:
when the video playing page is played to the first playing time point, acquiring information of an object to be introduced;
wherein the type of the object to be introduced comprises at least one of the following types: a first occurrence object of the first play time point; a target introduction object of the first playing time point;
wherein the first appearing object of the first playing time point is an object which appears at the first playing time point and does not appear before the first playing time point; the target introduction object of the first playing time point is an object selected by introduction information viewing operation in the objects appearing at the first playing time point;
the information of the object to be introduced comprises position information and introduction information, wherein the position information is used for indicating the position of presenting the introduction information in the video playing page.
6. The method according to claim 5, wherein the obtaining information of the object to be introduced comprises:
sending an information acquisition request to a server, and receiving the information of the object to be introduced, which is sent by the server;
the information obtaining request includes the first playing time point, and the information obtaining request is used for the server to search the information of the object to be introduced corresponding to the first playing time point in the cache corresponding to the video.
7. The method of claim 1, wherein after the beginning of playing the video in the video playing page, the method further comprises:
responding to the forgetting time length adjusting operation, and acquiring the adjusted forgetting time length;
when the video playing page is played to a second playing time point when the object appears again, and the interval between the first playing time point and the second playing time point is greater than the forgetting duration, presenting the introduction information of the object includes:
and when the video playing page is played to a second playing time point when the object appears again, and the interval between the first playing time point and the second playing time point is greater than the adjusted forgetting time length, presenting the introduction information of the object.
8. A video playback processing apparatus, comprising:
the video playing module is used for responding to video playing operation and starting to play video in a video playing page;
the acquisition module is used for acquiring historical video data; calling a neural network model to perform the following processing on the historical video data: extracting a feature vector of the historical video data; mapping the extracted feature vectors into probabilities corresponding to a plurality of candidate forgetting durations respectively, and determining the candidate forgetting duration corresponding to the maximum probability as a forgetting duration; wherein the sample historical video data used to train the neural network model comprises: the type of historical video; a viewing duration of the historical video; the number of interactions related to information consultation of the object to be introduced during the playing period of the historical video; a number of pauses during play of the historical video;
the introduction presentation module is used for presenting, for the object appearing at a first playing time point, introduction information of the object when the video playing page is played to the first playing time point;
the introduction presentation module is further configured to present introduction information of the object when the video playing page is played to a second playing time point at which the object appears again, and the interval between the first playing time point and the second playing time point is greater than the forgetting duration.
9. An electronic device, comprising:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 7 when executing executable instructions stored in the memory.
10. A computer-readable storage medium having stored thereon executable instructions for causing a processor, when executed, to implement the method of any one of claims 1 to 7.
CN202010749122.6A 2020-07-30 2020-07-30 Video playing processing method and device, electronic equipment and storage medium Active CN111901633B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010749122.6A CN111901633B (en) 2020-07-30 2020-07-30 Video playing processing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111901633A CN111901633A (en) 2020-11-06
CN111901633B true CN111901633B (en) 2021-12-17

Family

ID=73183721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010749122.6A Active CN111901633B (en) 2020-07-30 2020-07-30 Video playing processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111901633B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113398582A (en) * 2021-07-15 2021-09-17 网易(杭州)网络有限公司 Game fighting picture display method and device, computer equipment and storage medium
CN114329254A (en) * 2021-12-31 2022-04-12 北京字节跳动网络技术有限公司 Search result display method and device, computer equipment and storage medium
CN114640863A (en) * 2022-03-04 2022-06-17 广州方硅信息技术有限公司 Method, system and device for displaying character information in live broadcast room and computer equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107105340A (en) * 2017-03-21 2017-08-29 百度在线网络技术(北京)有限公司 People information methods, devices and systems are shown in video based on artificial intelligence
CN108174270A (en) * 2017-12-28 2018-06-15 广东欧珀移动通信有限公司 Data processing method, device, storage medium and electronic equipment
CN108401176A (en) * 2018-02-06 2018-08-14 北京奇虎科技有限公司 A kind of method and apparatus for realizing video personage mark

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8209223B2 (en) * 2007-11-30 2012-06-26 Google Inc. Video object tag creation and processing
US20150033109A1 (en) * 2013-07-26 2015-01-29 Alex Marek Presenting mutlimedia objects with annotations

Also Published As

Publication number Publication date
CN111901633A (en) 2020-11-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant