CN115278292B

CN115278292B - Video reasoning information display method and device and electronic equipment

Info

Publication number: CN115278292B
Application number: CN202210768673.6A
Authority: CN
Inventors: 许国军
Original assignee: Beijing IQIYI Science and Technology Co Ltd
Current assignee: Beijing IQIYI Science and Technology Co Ltd
Priority date: 2022-06-30
Filing date: 2022-06-30
Publication date: 2023-12-05
Anticipated expiration: 2042-06-30
Also published as: CN115278292A

Abstract

The embodiment of the invention provides a method, a device and electronic equipment for displaying video reasoning information, wherein the method is applied to a server of a video reasoning information display system; the video reasoning information display system also comprises a client; the method comprises the following steps: receiving an inference information acquisition request aiming at a target video and sent by a client; responding to the reasoning information acquisition request to acquire a target timestamp; determining a timestamp matched with the target timestamp in a preset database as a timestamp to be utilized; wherein, each time stamp and corresponding video reasoning information are recorded in the preset database; acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result; and sending the information acquisition result to the client so that the client performs video reasoning information display based on the information acquisition result. By the scheme, occupation of computing resources when a user knows the reasoning information in the video in real time can be reduced.

Description

Video reasoning information display method and device and electronic equipment

Technical Field

The present invention relates to the field of data transmission technologies, and in particular, to a method and an apparatus for displaying video inference information, and an electronic device.

Background

At present, users want to learn, in real time, about inference information in a video, such as character information, actor information, information on what characters are wearing, and information on food and appliances shown in the video. To meet this need, related technologies perform real-time image analysis on the video frame specified by the user to obtain the inference information in that frame. However, this approach requires significant real-time computing resources.

Therefore, how to reduce the occupation of real-time computing resources when users know the reasoning information in the video in real time is a problem to be solved.

Disclosure of Invention

The embodiment of the invention aims to provide a method and a device for displaying video reasoning information and electronic equipment, so as to reduce occupation of real-time computing resources when a user knows the reasoning information in a video in real time. The specific technical scheme is as follows:

in a first aspect of the present invention, a method for displaying video reasoning information is provided, which is applied to a server of a video reasoning information display system; the video reasoning information display system also comprises a client; the method comprises the following steps:

receiving an inference information acquisition request aiming at a target video, which is sent by the client;

responding to the reasoning information acquisition request to acquire a target timestamp; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning;

determining a time stamp matched with the target time stamp in a preset database as a time stamp to be utilized; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp;

acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result;

and sending the information acquisition result to the client so that the client performs video reasoning information display based on the information acquisition result.

Optionally, the target timestamp is a timestamp generated based on a first designated timestamp, and the first designated timestamp is a playing time of the target video in the client when the reasoning information acquisition request is generated.

Optionally, the obtaining, from the predetermined database, the video inference information corresponding to the timestamp to be utilized, as an information obtaining result, includes:

determining video reasoning information corresponding to the time stamp to be utilized and video reasoning information corresponding to a plurality of time stamps in a preset time period before the time stamp to be utilized from the preset database as information acquisition results.

Optionally, the determining manner of the target timestamp includes:

determining the first specified timestamp as a target timestamp;

the sending the information acquisition result to the client so that the client performs video reasoning information display based on the information acquisition result, including:

and sending the information acquisition result to the client so that the client can display video reasoning information outside a playing interface of the target video played by the client based on the information acquisition result.

Optionally, the determining manner of the target timestamp includes:

selecting a time stamp which is not earlier than the first appointed time stamp and has the shortest time interval with the first appointed time stamp from all time stamps recorded in the preset database as a target time stamp;

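the sending the information acquisition result to the client so that the client performs video reasoning information display based on the information acquisition result, including:

sending the information acquisition result to the client so that the client can display video reasoning information in a snapshot picture of the target video played by the client based on the information acquisition result;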

the snapshot picture is a picture obtained by performing picture snapshot on a video frame corresponding to a second designated timestamp in the target video, and the second designated timestamp is a timestamp which is not earlier than the first designated timestamp and has the shortest time interval with the first designated timestamp in all the timestamps recorded in the predetermined database.

Optionally, the determining the timestamp matching with the target timestamp in the predetermined database, as the timestamp to be utilized, includes:

and selecting a timestamp with the shortest time difference with the target timestamp or a timestamp which is not earlier than the target timestamp and has the shortest time interval from all the timestamps recorded in the preset database as a timestamp to be utilized.

Optionally, the construction mode of the predetermined database includes:

extracting a plurality of video frames from the target video according to a preset video frame extraction rule;

identifying video reasoning information of each video frame;

constructing a predetermined database containing each piece of video inference information and its corresponding timestamp.
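The three construction steps above can be sketched as follows. This is only a minimal illustration under stated assumptions: frames are supplied as already-extracted (timestamp, frame) pairs, `recognize` is a hypothetical stand-in for the image-analysis step, and the predetermined database is modeled as an in-memory list sorted by timestamp. None of these names come from the patent.

```python
from typing import Callable, Iterable, List, Tuple

# One record of the (hypothetical) predetermined database:
# (timestamp in seconds, inference items recognized in that frame).
Record = Tuple[float, List[str]]


def build_database(
    frames: Iterable[Tuple[float, object]],
    recognize: Callable[[object], List[str]],
) -> List[Record]:
    """Run offline recognition on each extracted frame and store the
    results with the frame's timestamp, sorted by time."""
    records = [(ts, recognize(frame)) for ts, frame in frames]
    records.sort(key=lambda rec: rec[0])
    return records


def fake_recognize(frame: object) -> List[str]:
    # Stand-in for real image analysis: in this sketch a "frame" is
    # just a string of comma-separated labels.
    return [label for label in str(frame).split(",") if label]


extracted = [(0.12, "actor:A"), (0.0, "actor:A,hat"), (0.24, "")]
database = build_database(extracted, fake_recognize)
```

Because recognition runs once per extracted frame at construction time, none of this work has to be repeated when a request arrives.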

Optionally, the extracting a plurality of video frames from the target video according to a preset video frame extraction rule includes:

selecting a specified number of expected times for a time period of a predetermined unit duration;

based on each expected time, extracting video frames from video frames in each target time period of the target video, wherein each target time period has the preset unit time length.

Optionally, each video frame within the predetermined unit duration of time has a respective distribution time;

the extracting video frames from video frames in each target time period of the target video based on each expected time comprises: for each expected time, determining the distribution time meeting the first matching condition in the time period of the preset unit time length as the extraction time corresponding to the expected time; wherein the first matching condition includes the shortest time interval from the expected time;

and extracting the video frames at the extraction time corresponding to each expected time from the video frames in each time period of the target video.

In a second aspect of the implementation of the present invention, a method for displaying video reasoning information is also provided, which is applied to a client of a video reasoning information display system; the video reasoning information display system also comprises a server; the method comprises the following steps:

when an inference information request operation for the target video is detected, generating an inference information acquisition request for the target video;

sending the reasoning information acquisition request to the server so that the server receives a reasoning information acquisition request sent by the client for a target video; responding to the reasoning information acquisition request, and acquiring a target timestamp corresponding to the reasoning information acquisition request; determining a time stamp matched with the target time stamp in a preset database as a time stamp to be utilized; based on the timestamp to be utilized, acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result; sending the information acquisition result to the client; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp;

and receiving the information acquisition result, and displaying video reasoning information based on the information acquisition result.

In a third aspect of the present invention, a method for displaying video inference information is provided, which is applied to a client; the method comprises the following steps:

when the reasoning information request operation aiming at the target video is detected, a target timestamp is acquired; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning;

and displaying video reasoning information based on the information acquisition result.

Optionally, the determining manner of the target timestamp includes:

determining the first specified timestamp as a target timestamp;

the video reasoning information display based on the information acquisition result comprises the following steps:

and based on the information acquisition result, video reasoning information display is carried out outside a playing interface of the target video played by the client.

Optionally, the determining manner of the target timestamp includes:

based on the information acquisition result, video reasoning information display is carried out in a snapshot picture of the target video played by the client;

In a fourth aspect of the present invention, there is also provided a video inference information presentation system, the video inference information presentation system comprising: a client and a server;

the client is used for generating an inference information acquisition request for the target video when the inference information request operation for the target video is detected;

the server is used for responding to the reasoning information acquisition request and acquiring a target timestamp; determining a time stamp matched with the target time stamp in a preset database as a time stamp to be utilized; acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp; sending the information acquisition result to the client;

the client is also used for receiving the information acquisition result and displaying the video reasoning information based on the information acquisition result.

In a fifth aspect of the present invention, there is also provided a video inference information presentation apparatus, which is applied to a server of a video inference information presentation system; the video reasoning information display system also comprises a client; the device comprises:

the receiving module is used for receiving an inference information acquisition request aiming at a target video, which is sent by the client;

the response module is used for responding to the reasoning information acquisition request and acquiring a target timestamp; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning;

the determining module is used for determining a timestamp matched with the target timestamp in a preset database and taking the timestamp as a timestamp to be utilized; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp;

the acquisition module is used for acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result;

and the sending module is used for sending the information acquisition result to the client so that the client can display video reasoning information based on the information acquisition result.

In yet another aspect of the present invention, there is also provided an electronic device including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory perform communication with each other through the communication bus;

a memory for storing a computer program;

and the processor is used for realizing the steps of the video reasoning information display method when executing the program stored in the memory.

In still another aspect of the implementation of the present invention, there is also provided a computer readable storage medium, in which a computer program is stored, the computer program implementing the above-mentioned method for displaying video inference information when being executed by a processor.

In yet another aspect of the present invention, there is also provided a computer program product containing instructions that, when run on a computer, cause the computer to perform the method of presenting video reasoning information as described above.

The display method of the video reasoning information provided by the embodiment of the invention is applied to a server of a video reasoning information display system; the video reasoning information display system also comprises a client; the method comprises the following steps: receiving an inference information acquisition request aiming at a target video and sent by a client; responding to the reasoning information acquisition request to acquire a target timestamp; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning; determining a timestamp matched with the target timestamp in a preset database as a timestamp to be utilized; each time stamp is a time stamp of a video frame of the target video, and the video inference information corresponding to each time stamp is inference information of the video frame with the time stamp; acquiring video reasoning information corresponding to the timestamp to be utilized from a preset database, and taking the video reasoning information as an information acquisition result; and sending the information acquisition result to the client so that the client performs video reasoning information display based on the information acquisition result. In the scheme, video reasoning information corresponding to each time stamp is recorded in the preset database in advance, when a reasoning information acquisition request is received, the matched time stamp to be utilized can be determined from the preset database directly based on the target time stamp, and the corresponding video reasoning information is acquired, so that real-time computing resources are not required to be occupied for image analysis, and therefore, the occupation of computing resources when a user knows the reasoning information in the video in real time can be reduced through the scheme.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.

FIG. 1 is a flow chart of a method for displaying video reasoning information in an embodiment of the invention;

FIG. 2 is a flowchart of a method for constructing a predetermined database according to an embodiment of the present invention;

FIG. 3 is another flowchart of a method for displaying video reasoning information in an embodiment of the present invention;

FIG. 4 is another flowchart of a method for displaying video reasoning information in an embodiment of the present invention;

FIG. 5 is another flow chart of a method for displaying video reasoning information in an embodiment of the present invention;

FIG. 6 is another flow chart of a method for displaying video reasoning information in an embodiment of the invention;

FIG. 7 is a schematic structural diagram of a system for displaying video inference information according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of a device for displaying video reasoning information according to an embodiment of the present invention;

FIG. 9 is a schematic diagram of a construction apparatus for a predetermined database according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described below with reference to the accompanying drawings in the embodiments of the present invention.

As video entertainment becomes ever more widespread, the information conveyed by video content grows richer, and users place increasingly high demands on extracting information from video content and on the completeness and timeliness of that information. For example, when watching a television drama, a user can not only see the characters on screen, but may also want to learn, in real time, about further inference information such as actor information behind a character (for example, biography, acting experience, representative works, and online reviews), character wearing information (for example, brand details, price, and purchase links for lipstick, tops, trousers, skirts, and caps), and product information (for example, introductions to food and appliance categories).

To meet this demand, a solution in the related art performs real-time image analysis on the video frame specified by the user, that is, uses image recognition technology to identify inference information in the frame, for example: character information, clothing information, food information, text, side information, and the like. However, this identification method depends heavily on the cloud to provide high-performance real-time computing resources, and the analysis has relatively high latency, so users obtain inference information slowly and the user experience leaves room for improvement. In addition, constrained by timeliness requirements and limited real-time computing resources, the results identified by this method are often neither accurate nor comprehensive enough.

Based on the above, in order to reduce the occupation of computing resources when a user knows the reasoning information in the video in real time, the invention provides a method and a device for displaying the video reasoning information and electronic equipment.

The following first describes a method for displaying video reasoning information provided by the embodiment of the present invention.

The video reasoning information display method provided by the embodiment of the invention can be applied to a server of a video reasoning information display system; the video reasoning information presentation system also includes a client. The client may be a software program with a video playing function applied to the terminal device, or the terminal device itself.

The method for displaying the video reasoning information provided by the embodiment of the invention can comprise the following steps:

In the scheme, video reasoning information corresponding to each time stamp is recorded in the preset database in advance, when a reasoning information acquisition request is received, the matched time stamp to be utilized can be determined from the preset database directly based on the target time stamp, and the corresponding video reasoning information is acquired, so that real-time computing resources are not required to be occupied for image analysis, and therefore, the occupation of computing resources when a user knows the reasoning information in the video in real time can be reduced through the scheme.

The following describes a method for displaying video reasoning information provided by the embodiment of the invention with reference to the accompanying drawings.

As shown in fig. 1, the method for displaying video inference information provided by the embodiment of the present invention includes the following steps S101 to S105:

S101, receiving an inference information acquisition request aiming at a target video, which is sent by the client;

The inference information acquisition request may be generated by the client when the client detects an inference information request operation for the target video. The inference information request operation may be issued by the user and may take various forms. For example, the user may click a specified button in the playing interface in which the client displays the target video, or may perform a predetermined gesture operation on the playing interface of the target video, such as a two-finger click; the operation is of course not limited to these. After generating the inference information acquisition request, the client may send it to the server.

S102, responding to the reasoning information acquisition request, and acquiring a target timestamp; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning;

The target timestamp may be obtained in either of two ways: after receiving the inference information acquisition request, the server may determine, from the client, the target timestamp of the video frame to be subjected to information reasoning; or the client may determine the target timestamp when sending the inference information acquisition request and send it to the server, which thereby obtains it.

In one implementation, the target timestamp is a timestamp generated based on a first specified timestamp that is a playing time of the target video in the client when the inference information acquisition request is generated.

The playing time of the target video in the client is the position on the video track time axis to which the target video has currently been played. The target timestamp is a time determined based on the first specified timestamp. It may be determined by the client based on the first specified timestamp, or by the server after the server obtains the first specified timestamp.

Alternatively, in one implementation, the first specified timestamp may be determined as a target timestamp; at this time, the first specified timestamp is directly taken as the target timestamp.

Alternatively, in another implementation, a timestamp that matches the first specified timestamp may be selected from the timestamps of the plurality of video frames as the target timestamp. With respect to such an implementation, the description of the predetermined database is presented below.

S103, determining a timestamp matched with the target timestamp in a preset database as a timestamp to be utilized; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp;

In this scheme, frames can be extracted from the target video in advance and image analysis performed on each extracted video frame to obtain its video inference information. The video inference information is then recorded in a cloud database together with the timestamp of the corresponding video frame, that timestamp being the time of the video frame on the video track time axis, thereby constructing the predetermined database.

Video frame content is continuous, that is, two adjacent video frames of a video are very likely to be similar. Therefore, to save storage resources, video inference information may be identified only after frames have been extracted from the target video. For example, for a video with a frame rate of 25 (that is, each second of video consists of 25 video frames), frames may be sampled at intervals of 0.125 seconds, so that on average 8 frames are extracted per second.
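The 25 frames per second example can be worked through in a short sketch. It assumes a unit period of one second, integer millisecond timestamps, and a tie-breaking rule (the earlier frame wins) that the text does not specify; the function name is hypothetical.

```python
from typing import List


def extraction_times_ms(frame_rate: int, expected_ms: List[int]) -> List[int]:
    """For each expected time, pick the frame time within one unit
    period of 1000 ms that is closest to it. Ties go to the earlier
    frame (an assumption of this sketch)."""
    # Times of the frames inside one second, in milliseconds.
    frame_ms = [k * 1000 // frame_rate for k in range(frame_rate)]
    # min() returns the first minimal element, so the earlier frame
    # is chosen when two frames are equally close.
    return [min(frame_ms, key=lambda t: abs(t - e)) for e in expected_ms]


# 8 expected times per second, one every 0.125 s.
expected = [i * 125 for i in range(8)]
per_second = extraction_times_ms(25, expected)

# Reuse the same offsets in every one-second period of a 3-second video.
all_times = [sec * 1000 + off for sec in range(3) for off in per_second]
```

At 25 frames per second the frames fall on multiples of 40 ms, so the expected time of 125 ms maps to the frame at 120 ms, and 500 ms is equally far from the frames at 480 ms and 520 ms, which is where the tie-breaking assumption matters.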

When the target timestamp is acquired, a timestamp to be utilized in the predetermined database that matches the target timestamp may be determined. For example, the timestamp to be utilized that matches the target timestamp may be determined from the predetermined database according to a preset time matching rule. The video frame extraction method and the method of determining the timestamp to be utilized are described below by way of example.

S104, acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result;

In the predetermined database, each time stamp has corresponding video reasoning information. In one implementation, the video inference information corresponding to the timestamp to be utilized may be used as an information acquisition result.

In another implementation manner, the obtaining, from the predetermined database, the video inference information corresponding to the timestamp to be utilized, as an information obtaining result, may also include: determining, from the predetermined database, the video inference information corresponding to the timestamp to be utilized together with the video inference information corresponding to a plurality of timestamps within a predetermined time period before the timestamp to be utilized, as the information acquisition result.

Using the video inference information corresponding to a plurality of timestamps within a predetermined time period before the timestamp to be utilized as part of the information acquisition result for final presentation enriches the content to be presented. In addition, users' hand operations lag behind what their eyes see, so including the video inference information for a plurality of timestamps within the predetermined time period before the timestamp to be utilized helps ensure that the result covers the information the user wants to acquire. The predetermined time period may be set according to the specific case.
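Collecting the results for the timestamps in the preceding time period can be sketched as a range lookup over sorted timestamps. This is an illustration only: the function name, the millisecond unit, and the dictionary layout are assumptions, and the window is taken as the closed interval ending at the timestamp to be utilized.

```python
from bisect import bisect_left, bisect_right
from typing import Dict, List


def collect_window(
    timestamps: List[int],
    info: Dict[int, List[str]],
    to_use: int,
    window: int,
) -> List[str]:
    """Return inference items for every stored timestamp in
    [to_use - window, to_use], oldest first.

    `timestamps` must be sorted in ascending order."""
    lo = bisect_left(timestamps, to_use - window)
    hi = bisect_right(timestamps, to_use)
    items: List[str] = []
    for ts in timestamps[lo:hi]:
        items.extend(info[ts])
    return items


stored = [0, 120, 240, 360, 480]
stored_info = {0: ["a"], 120: ["b"], 240: ["c"], 360: ["d"], 480: ["e"]}
window_items = collect_window(stored, stored_info, to_use=360, window=250)
```

Here the window covers 110 ms to 360 ms, so the entries at 120, 240 and 360 ms are returned and the one at 0 ms is left out.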

S105, sending the information acquisition result to the client so that the client can display video reasoning information based on the information acquisition result.

By way of example, this step may include the following two implementations:

Implementation one: sending the information acquisition result to the client so that the client can display video reasoning information outside a playing interface of the target video played by the client based on the information acquisition result.

For example, character profile information recognized from the current video picture is displayed in an added area outside the playing interface of the target video. In this scenario, the first specified timestamp can be determined as the target timestamp, and the timestamp to be utilized is determined from the target timestamp. The video inference information corresponding to the timestamp to be utilized, together with the video inference information corresponding to a plurality of timestamps within a predetermined time period before it, then undergoes operations such as data deduplication, sorting, and detail supplementation so that the client can present it better, and is then displayed outside the video picture.
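The post-processing mentioned here (deduplication, sorting, detail supplementation) might look like the sketch below. The alphabetical ordering and the `details` lookup table are placeholders chosen for illustration; the text does not say which ordering rule or detail source is used.

```python
from typing import Dict, List


def prepare_for_display(
    items: List[str],
    details: Dict[str, str],
) -> List[Dict[str, str]]:
    """Deduplicate, order, and enrich inference items before display."""
    # Remove duplicates, then sort alphabetically (placeholder rule).
    unique = sorted(set(items))
    # Attach extra detail where the lookup table has an entry.
    return [{"name": n, "detail": details.get(n, "")} for n in unique]


raw_items = ["hat", "actor:A", "hat"]
display_items = prepare_for_display(raw_items, {"actor:A": "bio"})
```

Deduplication matters in this scenario because the same actor or item usually appears in several of the timestamps collected from the preceding time period.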

Implementation two: sending the information acquisition result to the client so that the client can display video reasoning information in a snapshot picture of the target video played by the client based on the information acquisition result;

In this case, a picture snapshot needs to be taken of the video frame corresponding to the target timestamp and stored locally. The video inference information corresponding to the timestamp to be utilized is acquired from the predetermined database, and, after secondary assembly processing of the data (such as supplementing data details), the determined information acquisition result is applied to the snapshot picture and displayed on it. For example, the position of a character's face identified in the snapshot picture is highlighted with a frame, which improves the interactive experience.

The method stores the reasoning information of each video frame in a predetermined database in advance by means of offline computation. Identifying video inference information by real-time computation is clearly slower than matching it from results pre-saved by offline computation. In practical applications, real-time identification may take 5 times as long as offline acquisition or more; for example, real-time identification may require 300 milliseconds while offline acquisition requires only 60 milliseconds. Therefore, the video reasoning information display method provided by the embodiment of the invention can increase the speed at which the user acquires video reasoning information while reducing the occupation of computing resources, effectively improving the user experience.

In this embodiment, the video inference information corresponding to each timestamp is recorded in the predetermined database in advance, when the inference information acquisition request is received, the matched timestamp to be utilized can be determined from the predetermined database directly based on the target timestamp, and the corresponding video inference information is acquired, and the real-time computing resource is not required to be occupied for image analysis, so that the occupation of the computing resource when the user learns the inference information in the video in real time can be reduced through the scheme.
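Steps S101 to S105 can be tied together in one small server-side sketch. Everything concrete here is an assumption made for illustration: the request fields `video_id` and `play_time_ms`, the in-memory `DATABASE`, and the choice of the nearest stored timestamp as the match.

```python
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for the predetermined database:
# video id -> {timestamp in ms -> inference items}.
DATABASE: Dict[str, Dict[int, List[str]]] = {
    "video-1": {0: ["actor:A"], 120: ["actor:A", "hat"], 240: ["actor:B"]},
}


def handle_request(request: Dict[str, Any]) -> Dict[str, Any]:
    """Receive the request (S101), derive the target timestamp (S102),
    match it (S103), look up the stored result (S104) and return it
    for sending to the client (S105)."""
    video = DATABASE.get(str(request["video_id"]), {})
    # S102: the first specified timestamp is used as the target here.
    target = int(request["play_time_ms"])
    if not video:
        return {"timestamp_ms": None, "items": []}
    # S103: nearest stored timestamp; the earlier one wins a tie.
    to_use = min(sorted(video), key=lambda ts: abs(ts - target))
    # S104/S105: no image analysis happens at request time.
    return {"timestamp_ms": to_use, "items": video[to_use]}


response = handle_request({"video_id": "video-1", "play_time_ms": 130})
```

The request path contains only a lookup, which is the point of the scheme: the costly recognition work was done when the database was built.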

It should be noted that, in a specific application, the method for displaying video inference information provided by the present invention mainly includes two application scenarios, namely, a scenario that does not strictly require picture data alignment, and a scenario that strictly requires picture data alignment.

In a scenario that does not strictly require picture data alignment, the inference information of an adjacent video frame may be used as the inference information of the current video frame (i.e. the video frame targeted by the inference information acquisition request for the target video). A typical scenario, as in the above example, is displaying the video inference information outside the playing interface of the target video played by the client, based on the determined information acquisition result.

In a scenario that strictly requires picture data alignment, the inference information of an adjacent video frame may not be used as the inference information of the current video frame. A typical scenario, as in the above example, is displaying the video inference information in a snapshot picture of the target video played by the client, based on the determined information acquisition result. In this scenario, the first specified timestamp is finely time-shifted to determine the target timestamp.

Optionally, in another embodiment of the present invention, for a scenario that does not strictly require frame data alignment, the determining the target timestamp includes:

The first specified timestamp is determined as a target timestamp.

At this time, the determining, as the timestamp to be utilized, a timestamp in the predetermined database that matches the target timestamp includes:

selecting, from the timestamps recorded in the predetermined database, either the timestamp with the shortest time difference from the target timestamp, or the timestamp that is not earlier than the target timestamp and has the shortest time interval from it, as the timestamp to be utilized. In one implementation, the timestamp in the predetermined database having the shortest time difference from the target timestamp may be determined as the timestamp to be utilized.

It will be appreciated that, in another implementation, the timestamp in the predetermined database that is not earlier than the target timestamp and has the shortest time interval from it may be taken as the timestamp to be utilized, taking into account a behavioral characteristic of the user, namely that the operation of the hand lags behind what the eye sees. In this case, the video inference information corresponding to the timestamps within a preset time period before the timestamp to be utilized is also determined as part of the information acquisition result for final display, which further ensures that the first specified timestamp at which the inference information acquisition request was generated is covered.

Aiming at the scene with non-strict requirement on picture data alignment, the implementation mode can be combined with the actual scene requirement, so that the expected reasoning information is displayed to the user.
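The two matching modes described above, and the preset time period before the timestamp to be utilized, can be illustrated with a minimal sketch. The function names, the millisecond unit, and the tie-breaking rule (the earlier timestamp wins when two are equally close) are assumptions made for illustration and are not specified by the embodiment.

```python
from bisect import bisect_left, bisect_right


def match_timestamp(recorded, target, not_earlier=False):
    """Return the recorded timestamp (ms) matched to `target`.

    `recorded` must be a non-empty ascending list of timestamps.
    not_earlier=False: pick the closest timestamp (assumed tie rule:
    the earlier one wins).
    not_earlier=True: pick the closest timestamp that is not earlier
    than `target`, or None when no such timestamp is recorded.
    """
    i = bisect_left(recorded, target)  # first index with recorded[i] >= target
    if not_earlier:
        return recorded[i] if i < len(recorded) else None
    if i == 0:
        return recorded[0]
    if i == len(recorded):
        return recorded[-1]
    before, after = recorded[i - 1], recorded[i]
    return before if target - before <= after - target else after


def window_before(recorded, to_use, window_ms):
    """Recorded timestamps in [to_use - window_ms, to_use], ascending."""
    lo = bisect_left(recorded, to_use - window_ms)
    hi = bisect_right(recorded, to_use)
    return recorded[lo:hi]
```

With recorded timestamps of 0, 160, 280, ... milliseconds, a target of 170 matches 160 in the closest mode and 280 in the not-earlier mode, which is the difference the operation-lag consideration is meant to capture.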

Optionally, in another embodiment of the present invention, for a scenario that strictly requires frame data alignment, the determining manner of the target timestamp may include:

at this time, the snapshot picture is a picture obtained by performing picture snapshot on a video frame corresponding to a second specified timestamp in the target video, where the second specified timestamp is a timestamp that is not earlier than the first specified timestamp and has a shortest time interval with the first specified timestamp among the timestamps recorded in the predetermined database.

Picture snapshots are also taken of the video frames corresponding to the timestamps, in the predetermined database, that fall within a preset time period before the timestamp to be utilized, and the video reasoning information corresponding to those timestamps is displayed in the corresponding snapshot pictures, which ensures that the first specified timestamp at which the reasoning information acquisition request was generated is covered.

In this scenario, the target timestamp is the timestamp to be utilized.

In addition, it will be appreciated that, in an alternative implementation, the determination of the target timestamp may also include: taking the timestamp with the shortest interval with the first designated timestamp as a target timestamp, and carrying out picture snapshot on the video frame corresponding to the target timestamp for displaying video reasoning information. In this scenario, the target timestamp is the timestamp to be utilized.

Aiming at the scenes with strict requirements for picture data alignment, the implementation mode can be combined with actual scene requirements, and the expected reasoning information can be displayed to the user.

According to the method for displaying video inference information, in another embodiment of the present invention, as shown in fig. 2, the method for constructing the predetermined database may include the following steps:

S201, extracting a plurality of video frames from the target video according to a preset video frame extraction rule;

the preset video frame extraction rule may be full-frame extraction, sampling extraction, or another rule, all of which are reasonable. Specifically, in one implementation, a specified number of desired times may be selected for a time period of a predetermined unit time length; the predetermined unit time length is generally one second, but is not limited to this;

video frames are then extracted from the video frames in each target time period of the target video based on each desired time, wherein each target time period has the predetermined unit time length.

In one implementation, each video frame within the predetermined unit time period has a respective distribution time;

Taking a video with a frame rate of 25 as an example, 8 desired times are selected and the time period of the predetermined unit time length is one second, namely 1000 milliseconds. The 25 video frames are uniformly distributed on a time axis with a length of 1000 milliseconds and are located at 0, 40, 80, 120, 160, 200, 240, 280, 320, 360, 400, 440, 480, 520, 560, 600, 640, 680, 720, 760, 800, 840, 880, 920, 960 (milliseconds).

If the 8 desired times are evenly distributed over the 1-second time period, the chosen time points are 0, 125, 250, 375, 500, 625, 750, 875 (milliseconds). Obviously, the 8 desired times cannot all coincide with distribution times of video frames, so the extraction time corresponding to each desired time must be found according to a matching condition. For example, the video frame distribution time with the shortest time interval from the desired time may be selected as the extraction time, or the video frame distribution time that is not earlier than the desired time and has the shortest time interval from it may be selected as the extraction time. In the above example, using the latter condition, the first desired time is 0 milliseconds, so the video frame at 0 milliseconds is selected; the second desired time is 125 milliseconds, so the target period is traversed and the video frame distribution time that is at or after 125 milliseconds and closest to it, namely 160 milliseconds, is selected; the third desired time is 250 milliseconds, so 280 milliseconds is selected; and so on. The distribution times of the 8 video frames finally selected are 0, 160, 280, 400, 520, 640, 760, 880 (milliseconds).
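The worked example above can be reproduced with a short sketch. The function names and the use of integer milliseconds are assumptions for illustration; the fallback to the last frame when no frame lies at or after a desired time is also an assumption, and the sketch assumes the number of desired times does not exceed the frame rate.

```python
def distribution_times(fps, unit_ms=1000):
    """Start time (ms) of each frame inside one unit-length period."""
    return [i * unit_ms // fps for i in range(fps)]


def desired_times(count, unit_ms=1000):
    """`count` evenly spaced desired times (ms) inside one period."""
    return [k * unit_ms // count for k in range(count)]


def extraction_times(fps, count, not_earlier=True, unit_ms=1000):
    """Map each desired time to the distribution time of a real frame.

    not_earlier=True: closest frame at or after the desired time.
    not_earlier=False: closest frame on either side.
    """
    frames = distribution_times(fps, unit_ms)
    picked = []
    for want in desired_times(count, unit_ms):
        if not_earlier:
            # Assumed fallback: use the last frame if none is late enough.
            candidates = [t for t in frames if t >= want] or frames[-1:]
        else:
            candidates = frames
        # min() returns the first minimal element, so the earlier frame
        # wins when two frames are equally close.
        picked.append(min(candidates, key=lambda t: abs(t - want)))
    return picked
```

Calling `extraction_times(25, 8)` yields the eight times listed above; passing `not_earlier=False` illustrates the other matching condition, and `extraction_times(25, 25)` degenerates to full-frame extraction.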

The above scheme also applies to other extraction modes, for example extracting 3, 10, or 16 frames per second from a target video with a frame rate of 24, which are not described in detail here. In addition, the preset video frame extraction rule may be full-frame extraction, for example extracting all 25 frames per second from a video with a frame rate of 25, which is also reasonable.

S202, identifying video reasoning information of each video frame;

in one implementation, the cloud database stores various video reasoning information, and after extracting a plurality of video frames, each video frame can be matched with the cloud stored reasoning information through an image recognition technology, so that the video reasoning information of each video frame is recognized.

S203, constructing a preset database containing each video reasoning information and corresponding time stamp.

That is, each video inference information and the corresponding time stamp are stored in a database of the cloud, thereby constructing a predetermined database.
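Steps S202 and S203 can be pictured with a minimal sketch in which the predetermined database is an in-memory mapping from timestamp to inference information. The `recognize` callback, the `fake_recognize` stand-in, and the representation of a frame as a string of names are all hypothetical; a real system would use an image recognition service and persistent cloud storage.

```python
def build_database(frames, recognize):
    """Map each extracted frame's timestamp (ms) to its inference info.

    `frames` is an iterable of (timestamp_ms, frame) pairs.
    `recognize` is a caller-supplied function (hypothetical here) that
    returns the inference information for a single frame.
    """
    database = {}
    for timestamp_ms, frame in frames:
        database[timestamp_ms] = recognize(frame)
    return database


def fake_recognize(frame):
    """Stand-in recognizer: treats a 'frame' as a string of names."""
    return sorted(set(frame.split()))


db = build_database(
    [(0, "alice bob"), (160, "bob"), (280, "carol alice")],
    fake_recognize,
)
```

Keeping the timestamp as the key is what later allows a lookup by target timestamp without any image analysis at request time.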

In this embodiment, according to a preset video frame extraction rule, extracting a plurality of video frames from the target video; and identifying video inference information for each video frame; constructing a preset database containing each video reasoning information and corresponding time stamps; therefore, when receiving the reasoning information acquisition request, the corresponding video reasoning information can be directly acquired from the preset database, and the real-time computing resource is not required to be occupied for image analysis, so that the occupation of the computing resource when the user knows the reasoning information in the video in real time can be reduced through the scheme.

In order to more clearly understand the content of the embodiment of the present invention, the following describes a specific implementation procedure of the method for displaying video inference information according to the present invention with reference to fig. 3 and fig. 4.

For a scene with non-strict requirement on picture data alignment, as shown in fig. 3, a cloud video production system is used for generating a target video; the video content identification system is used for converting the target time stamp into a time stamp to be utilized, acquiring an information acquisition result and returning the information acquisition result to the client; the offline inference system is used for construction of a predetermined database. In this scenario, the specific implementation process of the video reasoning information display method of the present invention may include the following steps:

step 1, a video production system generates a target video and triggers offline analysis operation of video reasoning information;

step 2, after the offline reasoning system acquires the target video, acquiring a video frame extraction rule from the video content recognition system to sample and extract, for example, 8 video frames are selected from each second of the target video to analyze and recognize, and the recognized video reasoning information and the corresponding time stamp are written into a predetermined database of the cloud;

step 3, the client side generates an inference information acquisition request for the target video when detecting that a user triggers an inference information request operation for the target video on a video playing interface;

Step 4, the client side transmits the playing time of the target video in the client, namely, the target timestamp to a video content identification system of the cloud;

step 5, according to the delivered target timestamp and the manner of determining the timestamp to be utilized, the video content recognition system converts the target timestamp into the corresponding timestamp to be utilized in the predetermined database, performs secondary assembly processing on the offline analysis result, namely the information acquisition result (such as de-duplication, sorting, and detail supplementation of the analysis data, for friendlier display on the client), and returns it to the client;

step 6, the client displays the information acquisition result obtained after the secondary assembly processing to the user outside the video picture.
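The server-side part of this flow (matching the delivered timestamp, then secondary assembly) might look like the following sketch. The dictionary-based database, the label lists, the returned field names, and the tie rule are assumptions for illustration; detail supplementation is omitted because the embodiment does not describe its content.

```python
def assemble(results):
    """Secondary assembly: merge, de-duplicate, and sort labels."""
    return sorted({label for labels in results for label in labels})


def handle_request(database, target_ms):
    """Match the closest recorded timestamp, then assemble its result.

    `database` maps timestamp (ms) to a list of labels. On a tie the
    earlier timestamp is used (an assumed rule).
    """
    if not database:
        return {"timestamp": None, "items": []}
    to_use = min(database, key=lambda ts: (abs(ts - target_ms), ts))
    return {"timestamp": to_use, "items": assemble([database[to_use]])}
```

Because `assemble` accepts a list of results, the same helper could merge the results of several timestamps when a preset time period before the timestamp to be utilized is also returned.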

In a strict requirement picture data alignment scene, as shown in fig. 4, the cloud video production system is used for generating a target video; the offline reasoning system is used for constructing a preset database; the video content recognition system is used for sending an offset strategy to the client, screening information acquisition results in a preset database, performing secondary assembly processing of data (such as analysis data detail supplement and the like for more friendly display of the client), and returning the data to the client. In this scenario, the specific implementation process of the video reasoning information display method of the present invention may include the following steps:

step 2, after the offline inference system acquires the target video, it acquires a video frame extraction rule from the video content recognition system; the rule may be full-frame extraction, for example extracting all 25 frames per second from a video with a frame rate of 25, or sampling extraction, for example extracting 8 video frames per second from a video with a frame rate of 25; the extracted video frames are analyzed to obtain the corresponding video reasoning information, which is written into a predetermined database of the cloud;

step 3, the client side obtains a time offset strategy from a video content identification system of the cloud end when the application is started or at regular time;

step 4, the client side generates an inference information acquisition request for the target video when detecting that a user triggers an inference information request operation for the target video on a video playing interface;

step 5, the client determines a first specified timestamp of the target video in the client, and converts the first specified timestamp into a corresponding target timestamp according to the offset strategy; at this time, the target timestamp is the timestamp to be utilized;

step 6, the client side performs picture snapshot on the video frame corresponding to the target timestamp and stores the picture snapshot in the local; meanwhile, the target time stamp is transmitted to a video content identification system of the cloud;

Step 7, the video content recognition system screens the corresponding offline analysis result, namely the information acquisition result, in the database according to the delivered target time stamp, and returns the result to the client after the secondary assembly processing of the data;

step 8, the client displays the acquired information acquisition result to the user on the snapshot picture.
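The embodiment does not spell out the form of the time offset strategy. As one hedged illustration, assume the strategy is delivered to the client as the ascending list of sampled offsets within each unit-length period (for example, the extraction times of an 8-frames-per-second sampling). Under that assumption, the client-side conversion of the first specified timestamp could be sketched as follows; the function name is hypothetical, and handling of the end of the video is left out.

```python
def apply_offset_strategy(first_ms, offsets, unit_ms=1000):
    """Shift `first_ms` to the closest sampled time that is not earlier.

    `offsets` is the ascending list of sampled offsets (ms) inside each
    unit-length period; this form of the strategy is an assumption.
    """
    base, position = divmod(first_ms, unit_ms)
    for offset in offsets:
        if offset >= position:
            return base * unit_ms + offset
    # No sampled frame remains in this period: roll over to the next one.
    return (base + 1) * unit_ms + offsets[0]
```

Since the shifted timestamp always lands on a sampled frame, the snapshot taken by the client and the record looked up by the server refer to the same video frame, which is the alignment this scenario requires.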

In this embodiment, each video inference information and the corresponding timestamp are recorded in the predetermined database in advance, and when the inference information acquisition request is received, the video inference information can be directly selected from the predetermined database based on the target timestamp, and the real-time computing resource is not required to be occupied for image analysis, so that the occupation of the computing resource when the user knows the inference information in the video in real time can be reduced through the scheme.

The embodiment of the invention also provides a display method of the video reasoning information, which is applied to the client of the video reasoning information display system; the video reasoning information display system also comprises a server; as shown in fig. 5, the method may include the steps of:

S501, when an inference information request operation for a target video is detected, generating an inference information acquisition request for the target video;

S502, sending the reasoning information acquisition request to the server so that the server receives a reasoning information acquisition request sent by the client for a target video; responding to the reasoning information acquisition request, and acquiring a target timestamp corresponding to the reasoning information acquisition request; determining a time stamp matched with the target time stamp in a preset database as a time stamp to be utilized; based on the timestamp to be utilized, acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result; sending the information acquisition result to the client; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp;

S503, receiving the information acquisition result, and displaying video reasoning information based on the information acquisition result.

In this embodiment, when detecting an inference information request operation for a target video, a client generates an inference information acquisition request for the target video, and sends the inference information acquisition request to a server, so that the server acquires a target timestamp corresponding to the inference information acquisition request; determines a timestamp matched with the target timestamp in a preset database as a timestamp to be utilized; acquires, based on the timestamp to be utilized, video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result; and sends the information acquisition result to the client. After the client receives the information acquisition result, it displays the video reasoning information based on the information acquisition result.

In this embodiment, video inference information corresponding to each timestamp is recorded in a predetermined database in advance. After the client generates an inference information acquisition request for a target video and sends it to the server, the server can directly determine the matched timestamp to be utilized from the predetermined database based on the target timestamp and acquire the corresponding video inference information, without occupying real-time computing resources for image analysis.
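The embodiment does not define a wire format for the request and the result exchanged between client and server. Purely as an illustration, the sketch below models them as small records serialized to JSON; the class names, field names, and the choice of JSON are all assumptions.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InferenceRequest:
    """Hypothetical inference information acquisition request."""
    video_id: str
    target_ms: int


@dataclass
class InferenceResponse:
    """Hypothetical information acquisition result."""
    video_id: str
    used_ms: int
    items: list


def encode(message):
    """Serialize a request or response to a JSON string."""
    return json.dumps(asdict(message), sort_keys=True)


def decode_request(payload):
    """Rebuild an InferenceRequest from its JSON string."""
    return InferenceRequest(**json.loads(payload))
```

Carrying the target timestamp in the request is what allows the server to answer from pre-computed records instead of analyzing an image uploaded by the client.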

The embodiment of the invention also provides a display method of the video reasoning information, which is applied to the client in the case that the preset database is built at the client. The client may be a program with a video playing function in an electronic device, or the electronic device itself; the electronic device may be a terminal device, for example a smartphone, a tablet computer, or a desktop computer, and of course may also be a server.

As shown in fig. 6, the method may include the steps of:

S601, when an inference information request operation aiming at a target video is detected, a target timestamp is acquired; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning;

S602, determining a timestamp matched with the target timestamp in a preset database as a timestamp to be utilized; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp;

S603, acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result;

S604, video reasoning information display is carried out based on the information acquisition result.

In this embodiment, the predetermined database is built locally at the client, and when the operation of requesting for the inference information of the target video is detected, the client may directly determine the timestamp matching with the target timestamp in the predetermined database, as the timestamp to be utilized, obtain the video inference information corresponding to the timestamp to be utilized from the predetermined database, as the information obtaining result, and then display the video inference information based on the information obtaining result. Therefore, according to the scheme, when the inference information acquisition request is received, the matched time stamp to be utilized can be determined from the preset database directly based on the target time stamp, the corresponding video inference information is acquired, and real-time computing resources are not required to be occupied for image analysis, so that the occupation of computing resources when a user knows the inference information in the video in real time can be reduced.

The embodiment of the invention also provides a video reasoning information display system, as shown in fig. 7, which comprises: a client 710 and a server 720;

a client 710 for generating an inference information acquisition request for a target video when an inference information request operation for the target video is detected;

a server 720 for acquiring a target timestamp in response to the inference information acquisition request; determining a time stamp matched with the target time stamp in a preset database as a time stamp to be utilized; acquiring video reasoning information corresponding to the timestamp to be utilized from the preset database as an information acquisition result; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp; sending the information acquisition result to the client 710;

the client 710 is further configured to receive the information acquisition result, and perform video inference information presentation based on the information acquisition result.

The embodiment of the invention also provides a display device of the video reasoning information, which is applied to a server of the video reasoning information display system; the video reasoning information display system also comprises a client; as shown in fig. 8, the apparatus includes:

a receiving module 810, configured to receive an inference information acquisition request for a target video sent by the client;

a response module 820, configured to obtain a target timestamp in response to the inference information obtaining request; the target time stamp is a time stamp used for representing a video frame to be subjected to information reasoning;

a determining module 830, configured to determine a timestamp in a predetermined database that matches the target timestamp, as a timestamp to be utilized; each time stamp and corresponding video reasoning information are recorded in the preset database, each time stamp is a time stamp of a video frame of the target video, and each video reasoning information corresponding to each time stamp is a reasoning information of the video frame with the time stamp;

An obtaining module 840, configured to obtain, from the predetermined database, video inference information corresponding to the timestamp to be utilized as an information obtaining result;

and the sending module 850 is configured to send the information acquisition result to the client, so that the client performs video reasoning information display based on the information acquisition result.

Optionally, the acquiring module is specifically configured to:

Optionally, the determining manner of the target timestamp includes:

determining the first specified timestamp as a target timestamp;

the sending module is specifically configured to:

Optionally, the determining manner of the target timestamp includes:

the sending module is specifically configured to:

Optionally, the determining module is specifically configured to:

The embodiment of the invention also provides a device for constructing the preset database, as shown in fig. 9, the device for constructing the preset database comprises:

the extracting module 910 is configured to extract a plurality of video frames from the target video according to a preset video frame extracting rule;

an identifying module 920 for identifying video inference information of each video frame;

a construction module 930 is configured to construct a predetermined database including each video inference information and a corresponding timestamp.

Optionally, the extraction module includes:

the selecting submodule is used for selecting a specified number of expected times for a time period of a preset unit duration;

and the extraction sub-module is used for extracting video frames from video frames in each target time period of the target video based on each expected time, wherein each target time period has the preset unit time length.

the extraction submodule comprises:

a determining unit configured to determine, for each expected time, a distribution time that satisfies a first matching condition within a time period of the predetermined unit time length as an extraction time corresponding to the expected time; wherein the first matching condition includes the shortest time interval from the expected time;

and the extraction unit is used for extracting the video frame at the extraction time corresponding to each expected time from the video frames in each time period of the target video.

The embodiment of the invention also provides an electronic device, as shown in fig. 10, which comprises a processor 1001, a communication interface 1002, a memory 1003 and a communication bus 1004, wherein the processor 1001, the communication interface 1002 and the memory 1003 complete communication with each other through the communication bus 1004,

a memory 1003 for storing a computer program;

the processor 1001 is configured to implement the method for displaying video inference information when executing the program stored in the memory 1003.

The communication bus mentioned for the above electronic device may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. The communication bus may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the electronic device and other devices.

The memory may include random access memory (Random Access Memory, RAM) or non-volatile memory (non-volatile memory), such as at least one disk memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.

The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU for short), a network processor (Network Processor, NP for short), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP for short), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), a field-programmable gate array (Field-Programmable Gate Array, FPGA for short) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In still another embodiment of the present invention, a computer readable storage medium is provided, where a computer program is stored, where the computer program is executed by a processor to implement the method for displaying video inference information in the foregoing embodiment.

In yet another embodiment of the present invention, a computer program product containing instructions is also provided; when run on a computer, the instructions cause the computer to perform the method for displaying video reasoning information in the above embodiment.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), etc.

It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.

The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.

Claims

1. The display method of the video reasoning information is characterized by being applied to a server of a video reasoning information display system; the video reasoning information display system also comprises a client; the method comprises the following steps:

the information acquisition result is sent to the client so that the client can display video reasoning information based on the information acquisition result;

The construction mode of the preset database comprises the following steps:

selecting a specified number of expected times for a time period of a predetermined unit duration; each video frame within the predetermined unit duration of time has a respective distribution time;

for each expected time, determining the distribution time meeting the first matching condition in the time period of the preset unit time length as the extraction time corresponding to the expected time; wherein the first matching condition includes the shortest time interval from the expected time;

extracting video frames at extraction time corresponding to each expected time from video frames in each time period of the target video;

identifying video reasoning information of each video frame;

2. The method of claim 1, wherein the target timestamp is a timestamp generated based on a first specified timestamp that is a play time of a target video in the client when the inference information acquisition request is generated.

3. The method according to claim 1, wherein acquiring, from the preset database, the video reasoning information corresponding to the timestamp to be utilized as the information acquisition result comprises:

4. The method according to claim 2, wherein the determining of the target timestamp comprises:

determining the first specified timestamp as a target timestamp;

5. The method according to claim 2, wherein the determining of the target timestamp comprises:

6. The method according to claim 1 or 3, wherein determining the timestamp in the preset database that matches the target timestamp as the timestamp to be utilized comprises:

7. A method for displaying video reasoning information, characterized in that the method is applied to a client of a video reasoning information display system; the video reasoning information display system further comprises a server; the method comprises the following steps:

receiving the information acquisition result, and displaying video reasoning information based on the information acquisition result;

the construction mode of the preset database comprises the following steps:

identifying video reasoning information of each video frame;

8. A method for displaying video reasoning information, characterized in that the method is applied to a client; the method comprises the following steps:

displaying video reasoning information based on the information acquisition result;

the construction mode of the preset database comprises the following steps:

identifying video reasoning information of each video frame;

9. The method of claim 8, wherein the target timestamp is a timestamp generated based on a first specified timestamp that is a play time of a target video in the client when the inference information acquisition request is generated.

10. The method according to claim 9, wherein the determining of the target timestamp comprises:

determining the first specified timestamp as a target timestamp;

11. The method according to claim 9, wherein the determining of the target timestamp comprises:

12. A video reasoning information presentation system, the video reasoning information presentation system comprising: a client and a server;

the client is further configured to receive the target video reasoning information and to display the video reasoning information based on the information acquisition result;

the construction mode of the preset database comprises the following steps:

selecting a specified number of expected times for a time period of a predetermined unit duration; each video frame within the time period of the predetermined unit duration has a respective distribution time;

for each expected time, determining the distribution time that satisfies a first matching condition within the time period of the predetermined unit duration as the extraction time corresponding to the expected time; wherein the first matching condition includes having the shortest time interval from the expected time;

extracting, from the video frames in each time period of the target video, the video frames at the extraction time corresponding to each expected time;

identifying video reasoning information of each video frame;

constructing the preset database containing each piece of video reasoning information and its corresponding timestamp.
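
As a non-limiting illustration, the preset database and the server-side lookup of a timestamp matching the target timestamp may be sketched as follows. The class `InferenceDatabase`, its method names and the sample records are hypothetical; in particular, treating the recorded timestamp closest to the target timestamp as the match is an assumption made for this sketch, not a rule taken from the claims.

```python
import bisect
from typing import Dict, List, Optional


class InferenceDatabase:
    """In-memory stand-in for the preset database (hypothetical)."""

    def __init__(self, records: Dict[int, str]) -> None:
        # records maps a timestamp in milliseconds to the video reasoning
        # information identified for the frame extracted at that timestamp.
        self._timestamps: List[int] = sorted(records)
        self._info: Dict[int, str] = dict(records)

    def match_timestamp(self, target_ms: int) -> Optional[int]:
        """Return the recorded timestamp closest to target_ms.

        Nearest-neighbour matching is an assumed rule for this sketch.
        """
        if not self._timestamps:
            return None
        pos = bisect.bisect_left(self._timestamps, target_ms)
        # Only the entries just before and at the insertion point can be nearest.
        neighbours = self._timestamps[max(pos - 1, 0): pos + 1]
        return min(neighbours, key=lambda ts: abs(ts - target_ms))

    def get_info(self, target_ms: int) -> Optional[str]:
        """Return the information acquisition result for a target timestamp."""
        ts = self.match_timestamp(target_ms)
        return None if ts is None else self._info[ts]
```

In this sketch the server would call `get_info` with the target timestamp taken from the client's request and send the returned value back to the client; because the reasoning information is read from precomputed records, no frame recognition runs at request time.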

13. A device for displaying video reasoning information, characterized in that the device is applied to a server of a video reasoning information display system; the video reasoning information display system further comprises a client; the device comprises:

a sending module, configured to send the information acquisition result to the client so that the client displays video reasoning information based on the information acquisition result;

14. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

a memory for storing a computer program;

a processor for carrying out the method steps of any one of claims 1-11 when executing the program stored in the memory.

15. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-11.