CN115019231A - Video frame identification method and device, electronic equipment and storage medium - Google Patents

Video frame identification method and device, electronic equipment and storage medium

Info

Publication number
CN115019231A
CN115019231A (application CN202210634500.5A)
Authority
CN
China
Prior art keywords
video
video frame
target
target video
identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210634500.5A
Other languages
Chinese (zh)
Inventor
杨子江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd filed Critical Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202210634500.5A priority Critical patent/CN115019231A/en
Publication of CN115019231A publication Critical patent/CN115019231A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention relates to a video frame identification method, a video frame identification device, electronic equipment and a storage medium, wherein the method is applied to a server and comprises the following steps: receiving a video frame identification request from a client, wherein the client sends the video frame identification request to the server when detecting a trigger operation on a currently played video, and the request comprises video information and playing progress information of the currently played video; responding to the video frame identification request, acquiring a video file of the currently played video according to the video information, and determining a target video frame to be identified from the video file according to the playing progress information; identifying the target video frame to obtain an identification result; and returning the identification result to the client. Therefore, the clarity of the video frame picture can be improved, the influence of factors such as product and version on the video frame identification result is reduced, and the accuracy of the video frame identification result is improved.

Description

Video frame identification method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a video frame identification method and device, electronic equipment and a storage medium.
Background
Currently, users often watch videos through a mobile phone client (e.g., an Android client, an iOS client, etc.), and in some scenarios a user may want to obtain information such as a person, an article, or a line of dialogue appearing in a certain video picture.
In the prior art, when a user wants to obtain information such as persons, articles, or lines of dialogue in a video picture, the user takes a screenshot of the picture while the video is playing at the mobile phone client, and then image recognition is performed on the screenshot by an image recognition algorithm built into the server or the mobile phone client to obtain the desired information.
However, a picture obtained by screenshot may be blurry or unclear, which affects the accuracy of the picture identification result; in addition, taking screenshots may also degrade the performance of the mobile phone.
Disclosure of Invention
In view of this, to solve the above technical problems that a picture obtained by screenshot may be blurry or unclear, thereby affecting the accuracy of the picture recognition result, and that taking screenshots may also degrade the performance of the mobile phone, embodiments of the present invention provide a video frame identification method and apparatus, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present invention provides a video frame identification method, where the method is applied to a server, and the method includes:
receiving a video frame identification request from a client, wherein the client sends the video frame identification request to a server when detecting a trigger operation on a currently played video, and the video frame identification request carries video information and playing progress information of the currently played video;
responding to the video frame identification request, acquiring a video file of the currently played video according to the video information, and determining a target video frame to be identified from the video file according to the playing progress information;
identifying the target video frame to obtain an identification result;
and returning the identification result to the client.
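The four server-side steps above can be sketched as a minimal handler. Everything here (the class and function names, the dict-based video store, the recognizer callable) is a hypothetical illustration of the stated flow, not the patented implementation:

```python
# Minimal sketch of the server-side flow (steps 1-4 above).
# All names are hypothetical illustrations, not the patented implementation.

class VideoFile:
    """A stored video file modeled as (time_in_seconds, frame) pairs."""
    def __init__(self, frames):
        self.frames = sorted(frames)

    def frame_at(self, progress_sec):
        # Return the frame whose time point is closest to the playing progress.
        return min(self.frames, key=lambda tf: abs(tf[0] - progress_sec))[1]

def handle_identification_request(request, video_store, recognizer):
    # Step 1: the request carries video info and playing progress information.
    video_info = request["video_info"]
    progress = request["progress_seconds"]
    # Step 2: acquire the video file and determine the target video frame.
    video_file = video_store[video_info]
    frame = video_file.frame_at(progress)
    # Steps 3 and 4: identify the frame and return the result to the client.
    return recognizer(frame)
```

Any recognition backend can be plugged in as the `recognizer` callable, mirroring the patent's point that the concrete image recognition algorithm is not limited.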
In a possible implementation manner, the determining, from the video file, a target video frame to be identified according to the play progress information includes:
determining a target video clip from a plurality of video clips included in the video file according to the playing progress information, wherein a playing time point represented by the playing progress information is located in a playing time period corresponding to the target video clip;
and determining one or more key video frames in the target video clip as target video frames to be identified.
In a possible embodiment, the determining one or more key video frames in the target video segment as the target video frames to be identified includes:
determining a target time interval from the playing time intervals corresponding to the target video clips according to the playing progress information;
and determining the key video frame of the target video clip, of which the corresponding playing time point is located in the target time interval, as the target video frame to be identified.
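Selecting the key video frames whose playing time points fall within the target time interval can be sketched as follows (the function name and the tuple layout are illustrative assumptions):

```python
def select_target_key_frames(key_frames, interval):
    """key_frames: list of (time_in_seconds, frame_id) for the target clip.
    interval: (start_sec, end_sec) target time interval.
    Returns the key frames whose time points lie within the interval."""
    start, end = interval
    return [frame for t, frame in key_frames if start <= t <= end]
```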
In a possible embodiment, the method further comprises:
determining whether the target video clip meets a preset condition;
under the condition that the target video clip is determined to meet the preset condition, storing the identification result corresponding to the target video frame to a specified storage medium;
before the identifying the target video frame to obtain an identification result, the method further includes:
searching the appointed storage medium according to the target video frame;
and if the identification result corresponding to the target video frame is not found from the specified storage medium, executing the step of identifying the target video frame to obtain the identification result.
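The cache-first lookup described in these steps can be sketched as follows; the specified storage medium is modeled as a plain dict, and all names are illustrative assumptions:

```python
def identify_with_cache(frame_key, cache, recognize):
    """Search the specified storage medium (here, a dict) first;
    only on a miss is the recognition step actually executed."""
    cached = cache.get(frame_key)
    if cached is not None:
        return cached
    return recognize(frame_key)
```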
In a possible embodiment, the determining that the target video segment satisfies the preset condition includes:
determining the on-demand times of the target video clip in a set historical time period;
comparing the on-demand times with a preset time threshold;
and if the on-demand times are larger than or equal to the time threshold value through comparison, determining that the target video clip meets a preset condition.
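A minimal sketch of this preset-condition check, comparing a clip's on-demand count within the set historical period against a threshold (the threshold value and all names are illustrative assumptions):

```python
def meets_preset_condition(on_demand_counts, clip_id, threshold=100):
    """on_demand_counts: clip_id -> number of on-demand plays in the
    set historical period. The clip satisfies the preset condition when
    its count is greater than or equal to the threshold."""
    return on_demand_counts.get(clip_id, 0) >= threshold
```

Caching only the recognition results of frequently played clips keeps the specified storage medium small while serving most repeat requests.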
In a second aspect, an embodiment of the present invention provides a video frame identification method, where the method is applied to a client, and the method includes:
when the triggering operation of the currently played video is detected, sending a video frame identification request to a server, wherein the video frame identification request carries video information and playing progress information of the currently played video, so that the server responds to the video frame identification request to obtain a video file of the currently played video, determines a target video frame from the video file, and identifies the target video frame to obtain an identification result;
and receiving the identification result returned by the server.
In a third aspect, an embodiment of the present invention provides a video frame identification apparatus, where the apparatus is applied to a server, and the apparatus includes:
a request receiving module, configured to receive a video frame identification request from a client, wherein the client sends the video frame identification request to the server when detecting a trigger operation on a currently played video, and the request carries video information and playing progress information of the currently played video;
the request response module is used for responding to the video frame identification request, acquiring the video file of the currently played video according to the video information, and determining a target video frame to be identified from the video file according to the playing progress information;
the video frame identification module is used for identifying the target video frame to obtain an identification result;
and the result returning module is used for returning the identification result to the client.
In a possible implementation manner, the request response module includes:
the video clip determining submodule is used for determining a target video clip from a plurality of video clips included in the video file according to the playing progress information, wherein the playing time point represented by the playing progress information is positioned in the playing time period corresponding to the target video clip;
and the video frame determining submodule is used for determining one or more key video frames in the target video clip as the target video frames to be identified.
In a possible implementation manner, the video frame determination sub-module is specifically configured to:
determining a target time interval from the playing time intervals corresponding to the target video clips according to the playing progress information;
and determining the key video frame of the target video clip, of which the corresponding playing time point is positioned in the target time interval, as the target video frame to be identified.
In a possible embodiment, the apparatus further comprises:
the determining module is used for determining whether the target video clip meets a preset condition;
the result storage module is used for storing the identification result corresponding to the target video frame to a specified storage medium under the condition that the target video clip is determined to meet the preset condition;
the device further comprises:
the searching module is used for searching the appointed storage medium according to the target video frame before the target video frame is identified and an identification result is obtained;
and the execution module is used for executing the step of identifying the target video frame to obtain an identification result if the identification result corresponding to the target video frame is not found in the specified storage medium.
In a possible implementation manner, the result storage module is specifically configured to:
determining the on-demand times of the target video clip in a set historical time period;
comparing the on-demand times with a preset time threshold;
and if the on-demand times are larger than or equal to the time threshold value through comparison, determining that the target video clip meets a preset condition.
In a fourth aspect, an embodiment of the present invention provides a video frame identification apparatus, where the apparatus is applied to a client, and the apparatus includes:
the request sending module is used for sending a video frame identification request to a server when the triggering operation of the currently played video is detected, wherein the video frame identification request carries video information and playing progress information of the currently played video, so that the server responds to the video frame identification request to obtain a video file of the currently played video, determines a target video frame from the video file, and identifies the target video frame to obtain an identification result;
and the result receiving module is used for receiving the identification result returned by the server.
In a fifth aspect, an embodiment of the present invention provides an electronic device, including: a processor and a memory, the processor being configured to execute a video frame identification program stored in the memory to implement the video frame identification method of any one of the first aspect or the second aspect.
In a sixth aspect, an embodiment of the present invention provides a storage medium storing one or more programs, where the one or more programs are executable by one or more processors to implement the video frame identification method according to any one of the first aspect or the second aspect.
According to the technical solution provided by the embodiments of the present invention, when the client detects a trigger operation on the currently played video, the client sends a video frame identification request to the server, where the request carries the video information and playing progress information of the currently played video. The server then responds to the request, acquires the video file of the currently played video according to the video information, determines the target video frame to be identified from the video file according to the playing progress information, identifies the target video frame to obtain an identification result, and returns the result to the client. Because the server obtains the target video frame from the video file of the video currently played at the client, the client does not need to take a screenshot of the video picture, which avoids the impact of screenshots on client performance; moreover, compared with a screenshot, a video frame taken from the video file is clearer, so the identification result obtained by identifying the target video frame is more accurate.
Drawings
Fig. 1 is a schematic diagram of a video frame recognition system according to an embodiment of the present invention;
fig. 2 is a flowchart illustrating an embodiment of a video frame recognition method according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating another method for identifying video frames according to an embodiment of the present invention;
FIG. 4 is a flowchart of an embodiment of a method for identifying a video frame according to another embodiment of the present invention;
FIG. 5 is a flowchart illustrating a further method for identifying video frames according to an embodiment of the present invention;
fig. 6 is a flowchart of an embodiment of a video frame identification method according to the present invention;
FIG. 7 is a block diagram of an embodiment of a video frame recognition apparatus according to the present invention;
FIG. 8 is a block diagram of an embodiment of an apparatus for identifying video frames according to the present invention;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For the understanding of the embodiments of the present invention, the following description first illustrates the architecture of the video frame recognition system according to the present invention with reference to the accompanying drawings:
fig. 1 is a schematic diagram of an architecture of a video frame recognition system according to an embodiment of the present invention. The system shown in fig. 1 comprises: client 101, server 102.
The client 101 is an application program that provides services for a user, and may be an application running in its own process, a sub-application (applet) embedded in a host client independent of its main page, a web application (WebApp) running in a browser, an applet embedded in an email, or the like. It may also be a hardware device that provides services to the user, such as a smart phone, tablet, laptop, or desktop computer; fig. 1 illustrates the client 101 as a smart phone by way of example only.
The server 102 is a background server corresponding to the application program and providing background services for the application program, where the background services include, but are not limited to: data transmission services, data processing services, video services, and the like.
It should be noted that the number of clients and servers in fig. 1 is merely an exemplary illustration, and the embodiment of the present invention is not limited thereto.
The following further explains the video frame recognition method provided by the present invention with specific embodiments in conjunction with the drawings, and the embodiments do not limit the embodiments of the present invention.
Referring to fig. 2, a flowchart of an embodiment of a video frame identification method according to an embodiment of the present invention is provided. For one embodiment, the method may be applied to a server, such as server 102 illustrated in FIG. 1. As shown in fig. 2, the process may include the following steps:
step 201, a server receives a video frame identification request from a client, wherein the client sends the video frame identification request to the server when detecting a triggering operation on a currently played video, and the video frame identification request carries video information and playing progress information of the currently played video.
In an embodiment, in a process of playing a video (hereinafter referred to as a currently played video) at a client, if a user wants to acquire information such as a person, an article, or a speech in a current video picture, the user may perform a trigger operation on the currently played video. When detecting the triggering operation of the current playing video, the client generates a video frame identification request for indicating the identification of the current video picture, and sends the video frame identification request to the server.
The trigger operation may be a click operation (a single click or a double click) on the video picture, a long-press operation, or a click operation (a single click or a double click) on a designated icon in the video picture, which is not limited in the embodiments of the present invention.
The video frame identification request may carry video information and playing progress information of the currently played video. Optionally, the video information may be the name, ID (identifier), or video stream address of the currently played video. The playing progress information may be a playing time point.
For example, if the name of the currently playing video is video A, and the client detects a trigger operation when the video has played for 5 minutes 20 seconds, the video information may be video A and the playing progress information may be 5 minutes 20 seconds.
For another example, assuming that the video stream address of the currently playing video is rtmp://58.200.131.2/livetv/tv, and that the client detects the trigger operation when the video has played for 10 minutes 20 seconds, the video information may be the video stream address rtmp://58.200.131.2/livetv/tv, and the playing progress information may be 10 minutes 20 seconds.
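For illustration only, a request payload carrying these two fields might be modeled as follows; the field names are hypothetical, not taken from the patent:

```python
# Hypothetical shape of a video frame identification request payload.
request = {
    "video_info": "video_a",            # a name, ID, or stream address
    "progress_seconds": 10 * 60 + 20,   # 10 min 20 s, expressed in seconds
}
```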
Step 202, the server responds to the video frame identification request, acquires a video file of the currently played video according to the video information, and determines a target video frame to be identified from the video file according to the playing progress information.
Step 203, the server identifies the target video frame to obtain an identification result.
And step 204, the server returns the identification result to the client.
Steps 202 to 204 are collectively described below:
as can be seen from the description in step 201, the video frame identification request carries the video information and the playing progress information of the currently playing video, so that the server can analyze the video information and the playing progress information of the currently playing video from the client after receiving the video frame identification request.
Further, the server may obtain the video file of the currently played video from a local storage medium according to the parsed video information. Optionally, the video file may be an M3U8 (Moving Picture Experts Group Audio Layer 3 Uniform Resource Locator) format file; an M3U8 file is a playlist file in UTF-8 encoding.
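For illustration, an M3U8 playlist records each media segment's duration in #EXTINF tags, so the segment containing a given playing time point can be located by accumulating those durations. A minimal sketch under that assumption (the function name is hypothetical, and a real implementation would parse the durations out of the playlist file):

```python
def segment_for_progress(durations, progress_sec):
    """durations: per-segment durations in seconds, in playlist order
    (as would be read from #EXTINF tags). Returns the index of the
    segment containing the playing time point, or None if out of range."""
    elapsed = 0.0
    for index, duration in enumerate(durations):
        if elapsed <= progress_sec < elapsed + duration:
            return index
        elapsed += duration
    return None
```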
Furthermore, the server can determine one or more target video frames to be identified from the obtained video file according to the analyzed playing progress information, and then the server identifies the target video frames to obtain an identification result and returns the identification result to the client, so that the client outputs the identification result and meets the user requirements.
In an embodiment, the server may call a built-in image recognition algorithm module to recognize the target video frame, so as to obtain a recognition result. Here, the image recognition algorithm module may be software, or may be a module combining software and hardware, which is not limited in this embodiment of the present invention.
It should be clear that the above is only one example of implementing recognition on a target video frame to obtain a recognition result, and in practical applications, the recognition may also be implemented by other ways (for example, an image recognition model, etc.), and the embodiment of the present invention is not limited to this.
So far, the description about the flow shown in fig. 2 is completed.
As can be seen from the flow shown in fig. 2, in the technical solution of the present invention, when the client detects a trigger operation on the currently played video, it sends a video frame identification request to the server, where the request carries the video information and playing progress information of the currently played video. The server then responds to the request, acquires the video file of the currently played video according to the video information, determines the target video frame to be identified from the video file according to the playing progress information, identifies the target video frame to obtain an identification result, and returns the result to the client. Because the server obtains the target video frame from the video file of the video currently played at the client, the client does not need to take a screenshot of the video picture, which avoids the impact of screenshots on client performance; moreover, compared with a screenshot, a video frame taken from the video file is clearer, so the identification result obtained by identifying the target video frame is more accurate.
Referring to fig. 3, a flowchart of another embodiment of a video frame identification method according to an embodiment of the present invention is provided. The flow shown in fig. 3 is based on the flow shown in fig. 2, and describes how to determine the target video frame to be identified from the video file according to the playing progress information. As shown in fig. 3, the process may include the following steps:
step 301, the server determines a target video clip from a plurality of video clips included in the video file according to the playing progress information, wherein the playing time point represented by the playing progress information is located within a playing time period corresponding to the target video clip.
As can be known from the related description in the flow shown in fig. 2, the video frame identification request carries playing progress information, so in this step 301, when the server receives the video frame identification request, the server can directly analyze the playing progress information of the currently playing video of the client from the video frame identification request, and further determine a target video segment from a plurality of video segments included in the video file according to the playing progress information. Here, the playing time point represented by the playing progress information is located within the playing time period corresponding to the target video clip.
For example, assume that the playing time point represented by the playing progress information is 5 minutes 20 seconds, and that the file of video A includes three video clips whose playing periods are: 0 to 10 minutes 10 seconds (hereinafter the first video clip), 10 minutes 11 seconds to 20 minutes 10 seconds (hereinafter the second video clip), and 20 minutes 11 seconds to 30 minutes 10 seconds (hereinafter the third video clip). Since the playing time point falls within the playing period corresponding to the first video clip, the first video clip is determined as the target video clip.
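Using the periods from this example expressed in seconds (0 to 610, 611 to 1210, 1211 to 1810), the target-clip lookup can be sketched as follows; the function name and data layout are illustrative assumptions:

```python
def find_target_clip(clips, play_point_sec):
    """clips: list of (start_sec, end_sec, clip_id) playing periods.
    Returns the clip whose playing period contains the playing time point."""
    for start_sec, end_sec, clip_id in clips:
        if start_sec <= play_point_sec <= end_sec:
            return clip_id
    return None
```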
Step 302, the server determines one or more key video frames in the target video clip as target video frames to be identified.
In an embodiment, a specific implementation of determining one or more key video frames in the target video clip as the target video frames to be identified may include: determining a target time interval from the playing period corresponding to the target video clip according to the playing progress information, and determining the key video frames of the target video clip whose corresponding playing time points are located within the target time interval as the target video frames to be identified.
The target time interval may be an interval extending a given duration (for example, 2 seconds) before and after the playing time point, centered on that point; or it may be any time interval within the playing period of the target video clip that contains the playing time point and has a preset duration (for example, 2 seconds), which is not limited in the embodiments of the present invention.
For example, assuming that the playing time point represented by the playing progress information is 10 minutes 10 seconds and the playing period corresponding to the target video clip is 0 to 10 minutes 20 seconds, then, taking the playing time point as the center with 2 seconds on either side, the target time interval is 10 minutes 08 seconds to 10 minutes 12 seconds.
For another example, assuming that the playing time point represented by the playing progress information is 10 minutes 10 seconds and the playing period corresponding to the target video clip is 10 minutes 09 seconds to 13 minutes 20 seconds, any interval within that playing period that contains the playing time point and has the preset duration may be determined as the target time interval, for example 10 minutes 10 seconds to 10 minutes 12 seconds, or 10 minutes 09 seconds to 10 minutes 11 seconds.
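A minimal sketch of computing a target interval centered on the playing time point and clamped to the clip's playing period, which covers both worked examples above (names and the default half-width are illustrative assumptions):

```python
def target_interval(play_point, clip_start, clip_end, half_width=2):
    """All arguments in seconds. Returns the interval extending half_width
    seconds on each side of the playing time point, clamped so it stays
    within the target clip's playing period."""
    return (max(clip_start, play_point - half_width),
            min(clip_end, play_point + half_width))
```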
So far, the description about the flow shown in fig. 3 is completed.
In the technical solution of the present invention, the target video clip is determined from the plurality of video clips included in the video file according to the playing progress information, where the playing time point represented by the playing progress information is located within the playing period corresponding to the target video clip. One or more key video frames in the target video clip are then determined as the target video frames to be identified. Because the target video frame is obtained from the video file of the currently played video according to the playing progress information, its playing time point is more accurate than that of a frame obtained by screenshot, so the exact video frame whose information the user wants can be identified, and the user's needs are met; meanwhile, because the target video frame is a key video frame of the target video clip, information such as human actions and articles in the frame can be identified accurately and clearly.
Referring to fig. 4, a flowchart of another embodiment of a video frame identification method according to an embodiment of the present invention is provided. The process shown in fig. 4 may include the following steps based on the processes shown in fig. 2 and fig. 3:
step 401, a server receives a video frame identification request from a client, wherein the client sends the video frame identification request to the server when detecting a trigger operation on a currently played video, and the video frame identification request carries video information and playing progress information of the currently played video.
Step 402, the server responds to the video frame identification request and acquires a video file of the currently played video according to the video information.
For the detailed description of step 401 and step 402, refer to the related description of the flow shown in fig. 2, which is not repeated here.
Step 403, the server determines a target video segment from the plurality of video segments included in the video file according to the playing progress information, wherein the playing time point represented by the playing progress information is located within the playing time period corresponding to the target video segment.
Step 404, the server determines a target time interval from the playing time intervals corresponding to the target video clips according to the playing progress information.
Step 405, the server determines the key video frame of the target video clip, of which the corresponding playing time point is located in the target time period, as the target video frame to be identified.
For the detailed description of step 403 to step 405, refer to the related description in the flow shown in fig. 3, which is not described herein again.
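Step 405 can be sketched as a simple filter: keep the key video frames whose playing time points fall inside the target time period. Inclusive bounds and the function name are assumptions for illustration:

```python
def select_target_frames(key_frame_times_s, period_start_s, period_end_s):
    """Return the key video frames (by playing time point, in seconds)
    located within the target time period."""
    return [t for t in key_frame_times_s
            if period_start_s <= t <= period_end_s]

# Hypothetical key-frame time points of the target video clip, filtered
# against the target period 610 s to 612 s (10:10 to 10:12).
print(select_target_frames([605, 608, 610, 611, 615], 610, 612))  # [610, 611]
```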
Step 406, the server searches a specified storage medium according to the target video frame; if the identification result corresponding to the target video frame is not found in the specified storage medium, step 407 is performed, and if it is found, step 408 is performed.
Step 407, the server identifies the target video frame to obtain an identification result.
Step 408, the server returns the recognition result to the client.
Step 409, the server determines whether the target video clip meets a preset condition.
Step 410, when the target video clip meets the preset condition, the server stores the identification result corresponding to the target video frame in the specified storage medium.
Steps 406 to 410 are described collectively below:
in the embodiment of the present invention, to save the time needed to determine the identification result corresponding to the target video frame, the identification result can be stored in a specified storage medium. The server can then search the specified storage medium according to the target video frame and retrieve the identification result from it, without performing the step of identifying the target video frame. The specified storage medium may be in the cloud (e.g., a network disk) or local (e.g., a local gallery); the embodiment of the present invention is not limited in this respect.
Specifically, the search can be performed in the specified storage medium according to the video frame information (e.g., the video file, the target video clip, the target time period, or the video progress information) represented by the target video frame. If the identification result corresponding to the target video frame is not found in the specified storage medium, step 407 is performed; if it is found, step 408 is performed.
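A minimal sketch of steps 406 to 410, using an in-memory dict as a stand-in for the specified storage medium; `identify()` is a placeholder for the real (expensive) recognition step, and all names are illustrative assumptions:

```python
cache = {}  # specified storage medium: video frame information -> result

def identify(frame_info):
    # Placeholder for the actual video frame recognition.
    return "result-for-" + frame_info

def get_recognition_result(frame_info, segment_is_hot):
    result = cache.get(frame_info)        # step 406: search the medium
    if result is None:                    # not found in the medium
        result = identify(frame_info)     # step 407: identify the frame
        if segment_is_hot:                # step 409: preset condition met?
            cache[frame_info] = result    # step 410: store the result
    return result                         # step 408: return to the client
```

On a later request for the same frame of a hot segment, the cached result is returned directly, matching the time-saving behavior described above.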
For example, assume that the specified storage medium stores therein the recognition results as shown in the following table 1:
TABLE 1
(Table 1, which maps the video frame information of stored key video frames to their identification results, is reproduced only as an image in the original publication.)
Assume the target video frame is a key video frame in the first segment of video A, with the target time period from 10 minutes 18 seconds to 10 minutes 20 seconds. Table 1 is searched according to the video frame information represented by the target video frame; since no matching entry exists, it is determined that the identification result corresponding to the target video frame is not found in the specified storage medium, so the server identifies the target video frame to obtain the identification result and returns it to the client.
Further, assume the target video frame is a key video frame in the first segment of video A, with the target time period from 5 minutes 08 seconds to 5 minutes 14 seconds. Table 1 is searched according to the video frame information represented by the target video frame. Because a key video frame whose video frame information matches that of the target video frame can be found in table 1, it is determined that the identification result found in the specified storage medium is the second identification result, and the server then returns the second identification result to the client.
Through this processing, the identification result corresponding to the target video frame can be looked up directly in the specified storage medium and returned to the client, saving the time needed to identify the target video frame.
As can be seen from the above description of step 409 and step 410, for the recognition result of the target video frame, the server may store the recognition result to a specified storage medium when the target video segment meets the preset condition.
In one embodiment, because a hot video segment that is played frequently is more likely than other videos or video segments to be selected by the user for video frame identification, after the target video frame is identified and the identification result is obtained, it may be determined whether the target video segment corresponding to the identification result meets the preset condition for a hot video segment. If the target video clip is determined to meet the preset condition, the identification result can be stored in the specified storage medium. Later, when the same target video frame to be identified is determined from the video file again, the specified storage medium can be searched directly according to the target video frame, and the identification result is returned to the client once found, which reduces the user's waiting time and improves efficiency.
Optionally, determining that the target video segment meets the preset condition may be implemented as follows: determine the on-demand times of the target video clip within a set historical time period, compare the on-demand times with a preset times threshold, and if the on-demand times are greater than or equal to the threshold, determine that the target video clip meets the preset condition. The set historical time period may be one week, one month, one year, and so on; the embodiment of the present invention is not limited in this respect.
In addition, if the comparison shows that the on-demand times are smaller than the times threshold, it is determined that the target video clip does not meet the preset condition.
For example, assume the on-demand times of the target video segment within the set historical time period are 15 and the preset times threshold is 10. Then, as described above, the on-demand times are compared with the threshold; since 15 is greater than 10, the target video segment is determined to meet the preset condition.
For another example, assume the on-demand times of the target video segment within the set historical time period are 5 and the preset times threshold is 10. Then, as described above, the on-demand times are compared with the threshold; since 5 is smaller than 10, the target video segment is determined not to meet the preset condition.
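The hot-segment check reduces to a single predicate; the function name is an illustrative assumption, and the default threshold follows the text's examples:

```python
def segment_meets_condition(on_demand_times, times_threshold=10):
    """True when the clip's on-demand times within the set historical
    time period reach the preset times threshold."""
    return on_demand_times >= times_threshold

print(segment_meets_condition(15))  # True  (15 >= 10)
print(segment_meets_condition(5))   # False (5 < 10)
```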
Through this processing, the identification result corresponding to the target video frame can be looked up directly in the specified storage medium, which reduces the video frame identification time and improves efficiency.
In addition, in practice a user may also select video frame identification for a cold video clip. Therefore, whether the identification result corresponding to the target video frame should be stored in the specified storage medium may also be determined according to whether the number of times the target video frame has been identified satisfies another preset condition (hereinafter, the first preset condition).
Optionally, the number of times the target video frame has been identified may be determined and compared with a preset identification times threshold. If the identified times are greater than or equal to the threshold, it is determined that the target video frame meets the first preset condition; if they are smaller, it is determined that it does not. It should be noted that the preset identification times threshold may be set large enough to reduce the storage pressure on the specified storage medium.
For example, assume the preset identification times threshold is 20 and the target video frame has been identified 30 times. Then, as described above, the identified times are compared with the threshold; since 30 is greater than 20, the target video frame is determined to meet the first preset condition.
For another example, assume the preset identification times threshold is 20 and the target video frame has been identified 5 times. Then, as described above, the identified times are compared with the threshold; since 5 is smaller than 20, the target video frame is determined not to meet the first preset condition.
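A sketch of the first preset condition: store the result only once a frame has been identified at least a threshold number of times. The `Counter`-based bookkeeping and all names are illustrative assumptions:

```python
from collections import Counter

identified_counts = Counter()  # times each target video frame was identified

def meets_first_condition(frame_info, id_threshold=20):
    """Record one identification of frame_info and report whether the
    first preset condition (identified times >= threshold) now holds."""
    identified_counts[frame_info] += 1
    return identified_counts[frame_info] >= id_threshold
```

With a low threshold of 3 for illustration, the first two identifications of a frame return `False` and the third returns `True`, at which point the result would become eligible for storage.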
Referring to fig. 5, a flowchart of another embodiment of a video frame identification method according to an embodiment of the present invention is provided. As an embodiment, the method may be applied to a client, such as the client 101 illustrated in fig. 1. As shown in fig. 5, the process may include the following steps:
step 501, when detecting a trigger operation on the currently played video, the client sends a video frame identification request to the server. The request carries the video information and playing progress information of the currently played video, so that the server, in response to the request, obtains the video file of the currently played video, determines a target video frame from the video file, and identifies the target video frame to obtain an identification result.
Step 502, the client receives the identification result returned by the server.
Step 501 and step 502 are collectively described below:
in the embodiment of the present invention, while a video is playing on the client (hereinafter the currently played video), a user who wants to obtain information such as a person, an article, or text in the current video picture may perform a trigger operation on the currently played video. When detecting the trigger operation, the client generates a video frame identification request indicating that the current video picture is to be identified, and sends it to the server.
For example, suppose the currently played video B of a client has played to 10 minutes 10 seconds and the user wants to obtain character information in the current video picture. The user performs a trigger operation on the currently played video; the client detects the operation and generates a video frame identification request indicating that the current video picture is to be identified, where the request carries the video information of the currently played video (for example, the video name, video B) and the video progress information (for example, the playing time, 10 minutes 10 seconds).
The client then sends the video frame identification request to the server, so that the server, after receiving it, responds to the request, obtains the video file of the currently played video, determines the target video frame from the video file, and identifies the target video frame to obtain an identification result. For how the server responds to the video frame identification request, obtains the video file of the currently played video, determines the target video frame from the video file, and determines the identification result of the target video frame, refer to the related descriptions of the flows shown in fig. 2 to fig. 4, which are not repeated here.
After obtaining the identification result of the target video frame, the server can return the identification result to the client, so that the client can receive the identification result of the target video frame returned by the server.
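The client side of this exchange can be sketched as assembling a request payload on the trigger operation. The field names and JSON shape are assumptions for illustration, not the actual protocol:

```python
import json

def build_recognition_request(video_name, play_time_s):
    """Carry the video information and playing progress information of the
    currently played video, per step 501 (illustrative field names)."""
    return json.dumps({
        "video_info": {"name": video_name},
        "progress_info": {"play_time_s": play_time_s},
    })

# Triggered while video B is playing at 10 minutes 10 seconds (610 s).
request_body = build_recognition_request("video B", 610)
print(request_body)
```

The server would parse this payload, locate the video file by the video information, and use the progress information to pick the target video frame as in the flows above.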
So far, the description about the flow shown in fig. 5 is completed.
In the technical scheme of the present invention, when a trigger operation on the currently played video is detected, a video frame identification request is sent to the server, where the request carries the video information and playing progress information of the currently played video. The server responds to the request by obtaining the video file of the currently played video, determining a target video frame from the video file, and identifying the target video frame to obtain an identification result, which the client then receives. In this process, the client does not need to capture a screenshot of the video picture of the currently played video; instead, the server obtains the target video frame to be identified from the video file of the video currently played by the client, which avoids the performance impact that taking screenshots has on the client. At the same time, because the target video frame identified by the server comes from the video file of the video currently played by the client, the video picture is clearer than a screenshot, so the identification result the client receives from the server is more accurate.
Referring to fig. 6, a flowchart of an embodiment of a video frame identification method according to an embodiment of the present invention is provided. As shown in fig. 6, the process may include the following steps:
step 601, when detecting the trigger operation of the currently played video, the client sends a video frame identification request to the server, where the video frame identification request carries video information and playing progress information of the currently played video.
Step 602, the server receives a video frame identification request from the client.
Step 603, the server responds to the video frame identification request, acquires a video file of the currently played video according to the video information, and determines a target video frame to be identified from the video file according to the playing progress information.
And step 604, the server identifies the target video frame to obtain an identification result.
Step 605, the server returns the identification result to the client.
Step 606, the client receives the identification result returned by the server.
For the detailed description of step 601 to step 606, refer to the related description in the flows shown in fig. 2 to fig. 5, which is not repeated herein.
In the technical scheme of the present invention, when the client detects a trigger operation on the currently played video, it sends a video frame identification request to the server, where the request carries the video information and playing progress information of the currently played video. The server receives the video frame identification request from the client, responds to it, obtains the video file of the currently played video according to the video information, and determines the target video frame to be identified from the video file according to the playing progress information. The server then identifies the target video frame to obtain an identification result and returns it to the client, which receives it. In this way, a target video frame extracted from the video file of the video currently played by the client replaces the screenshot, avoiding the performance impact that taking screenshots has on the client. At the same time, compared with a screenshot, a video frame obtained from the video file is clearer, so the identification result obtained by identifying the target video frame is more accurate.
Corresponding to the foregoing embodiments of the video frame identification method, the present invention further provides a block diagram of an embodiment of an apparatus.
Referring to fig. 7, a block diagram of an embodiment of a video frame identification apparatus according to an embodiment of the present invention is provided. As an embodiment, the apparatus may be applied to a server. As shown in fig. 7, the apparatus includes:
a request receiving module 701, configured to receive a video frame identification request from a client, where the client sends the video frame identification request to a server when detecting a trigger operation on a currently played video, and the video frame identification request carries video information and playing progress information of the currently played video;
a request response module 702, configured to respond to the video frame identification request, obtain a video file of the currently played video according to the video information, and determine a target video frame to be identified from the video file according to the playing progress information;
a video frame identification module 703, configured to identify the target video frame to obtain an identification result;
a result returning module 704, configured to return the identification result to the client.
In a possible implementation, the request response module 702 includes (not shown in the figure):
the video clip determining submodule is used for determining a target video clip from a plurality of video clips included in the video file according to the playing progress information, wherein the playing time point represented by the playing progress information is positioned in the playing time period corresponding to the target video clip;
and the video frame determining submodule is used for determining one or more key video frames in the target video clip as the target video frames to be identified.
In a possible implementation manner, the video frame determination sub-module is specifically configured to:
determining a target time interval from the playing time intervals corresponding to the target video clips according to the playing progress information;
and determining the key video frame of the target video clip, of which the corresponding playing time point is located in the target time interval, as the target video frame to be identified.
In a possible embodiment, the device further comprises (not shown in the figures):
the determining module is used for determining whether the target video clip meets a preset condition;
the result storage module is used for storing the identification result corresponding to the target video frame to a specified storage medium under the condition that the target video clip is determined to meet the preset condition;
the device further comprises (not shown in the figures):
the searching module is used for searching the appointed storage medium according to the target video frame before the target video frame is identified and an identification result is obtained;
and the execution module is used for executing the step of identifying the target video frame to obtain an identification result if the identification result corresponding to the target video frame is not found in the specified storage medium.
In a possible implementation manner, the result storage module is specifically configured to:
determining the on-demand times of the target video clip in a set historical time period;
comparing the on-demand times with a preset time threshold;
and if the on-demand times are larger than or equal to the time threshold value through comparison, determining that the target video clip meets a preset condition.
Referring to fig. 8, a block diagram of another embodiment of a video frame recognition apparatus according to an embodiment of the present invention is provided. As an embodiment, the apparatus may be applied to a client. As shown in fig. 8, the apparatus includes:
a request sending module 801, configured to send a video frame identification request to a server when a trigger operation on a currently played video is detected, where the video frame identification request carries video information and playing progress information of the currently played video, so that the server responds to the video frame identification request to obtain a video file of the currently played video, determine a target video frame from the video file, and identify the target video frame to obtain an identification result;
a result receiving module 802, configured to receive the identification result returned by the server.
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, where the electronic device 900 shown in fig. 9 includes: at least one processor 901, memory 902, at least one network interface 904, and a user interface 903. Various components in the electronic device 900 are coupled together by a bus system 905. It is understood that the bus system 905 is used to enable communications among the components. The bus system 905 includes a power bus, a control bus, and a status signal bus, in addition to a data bus. For clarity of illustration, however, the various buses are labeled in fig. 9 as bus system 905.
The user interface 903 may include, among other things, a display, a keyboard or pointing device (e.g., a mouse, trackball), a touch pad or touch screen, etc.
It is to be understood that the memory 902 in embodiments of the present invention may be volatile memory or nonvolatile memory, or may include both. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The memory 902 described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, memory 902 stores elements, executable units or data structures, or a subset thereof, or an expanded set thereof: an operating system 9021 and application programs 9022.
The operating system 9021 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is configured to implement various basic services and process hardware-based tasks. The application 9022 includes various applications, such as a media player (MediaPlayer), a Browser (Browser), and the like, for implementing various application services. A program implementing the method of an embodiment of the present invention may be included in application 9022.
In the embodiment of the present invention, by calling a program or an instruction stored in the memory 902, specifically a program or an instruction stored in the application 9022, the processor 901 is configured to perform the method steps provided by the method embodiments illustrated in fig. 2 to 6.
the method disclosed in the above embodiments of the present invention may be applied to the processor 901, or implemented by the processor 901. The processor 901 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be implemented by integrated logic circuits of hardware or instructions in the form of software in the processor 901. The processor 901 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic device, or discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software elements in the decoding processor. The software elements may be located in ram, flash, rom, prom, or eprom, registers, among other storage media that are well known in the art. The storage medium is located in the memory 902, and the processor 901 reads the information in the memory 902, and in combination with the hardware thereof, performs the steps of the method.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented by means of units performing the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
The electronic device provided in this embodiment may be the electronic device shown in fig. 9, and may perform all the steps of the video frame identification method shown in fig. 2 to 6, so as to achieve the technical effects of the video frame identification method shown in fig. 2 to 6, and for brevity, reference is specifically made to the descriptions related to fig. 2 to 6, which are not repeated herein.
The embodiment of the invention also provides a storage medium (computer readable storage medium). The storage medium herein stores one or more programs. Among others, the storage medium may include volatile memory, such as random access memory; the memory may also include non-volatile memory, such as read-only memory, flash memory, a hard disk, or a solid state disk; the memory may also comprise a combination of memories of the kind described above.
When one or more programs in the storage medium are executable by one or more processors, the above-described method for identifying video frames as illustrated in fig. 2 to 6 is performed on the electronic device side.
The processor is configured to execute the video frame identification program stored in the memory to implement the relevant steps of the above-mentioned video frame identification method as exemplified in fig. 2 to 6 performed on the electronic device side.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative components and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in Random Access Memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A video frame identification method is applied to a server and comprises the following steps:
receiving a video frame identification request from a client, wherein the client sends the video frame identification request to a server when detecting a trigger operation on a currently played video, and the video frame identification request carries video information and playing progress information of the currently played video;
responding to the video frame identification request, acquiring a video file of the currently played video according to the video information, and determining a target video frame to be identified from the video file according to the playing progress information;
identifying the target video frame to obtain an identification result;
and returning the identification result to the client.
2. The method according to claim 1, wherein determining the target video frame to be identified from the video file according to the playing progress information comprises:
determining a target video clip from a plurality of video clips included in the video file according to the playing progress information, wherein the playing time point indicated by the playing progress information falls within the playing time period corresponding to the target video clip;
and determining one or more key video frames in the target video clip as the target video frame to be identified.
3. The method according to claim 2, wherein determining one or more key video frames in the target video clip as the target video frame to be identified comprises:
determining a target time interval from the playing time period corresponding to the target video clip according to the playing progress information;
and determining, as the target video frame to be identified, the key video frame of the target video clip whose playing time point falls within the target time interval.
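Claims 2 and 3 amount to an interval lookup: find the clip whose playing time period covers the progress point, then pick the key frames whose playing times fall within the target interval. A minimal sketch under assumed data shapes (clip start times and key-frame timestamps in milliseconds, both sorted):

```python
import bisect

def find_target_clip(clip_starts, progress_ms):
    """Index of the clip whose [start, next_start) playing period covers progress_ms."""
    # clip_starts is sorted ascending; bisect finds the last start <= progress_ms
    return bisect.bisect_right(clip_starts, progress_ms) - 1

def pick_key_frames(key_frame_times, interval):
    """Key video frames whose playing time point lies in the target interval (claim 3)."""
    lo, hi = interval
    return [t for t in key_frame_times if lo <= t <= hi]
```

With sorted clip boundaries the lookup is O(log n), which matters when a long video is split into many clips.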
4. The method according to claim 2, further comprising:
determining whether the target video clip satisfies a preset condition;
and if it is determined that the target video clip satisfies the preset condition, storing the identification result corresponding to the target video frame in a specified storage medium;
wherein before identifying the target video frame to obtain the identification result, the method further comprises:
searching the specified storage medium for an identification result corresponding to the target video frame;
and if no identification result corresponding to the target video frame is found in the specified storage medium, performing the step of identifying the target video frame to obtain the identification result.
5. The method according to claim 4, wherein determining that the target video clip satisfies the preset condition comprises:
determining the number of times the target video clip has been played on demand within a set historical time period;
comparing the on-demand count with a preset count threshold;
and if the comparison shows that the on-demand count is greater than or equal to the count threshold, determining that the target video clip satisfies the preset condition.
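Claims 4 and 5 describe a popularity-gated cache: the storage medium is searched before recognition runs, and results are stored only for clips whose on-demand count in the historical period reaches the threshold. A hedged sketch, with an in-memory dict standing in for the "specified storage medium" and all names chosen for illustration:

```python
class ResultCache:
    """Cache of identification results, gated by clip popularity (claims 4-5)."""

    def __init__(self, count_threshold):
        self.count_threshold = count_threshold  # preset on-demand-count threshold
        self._store = {}                        # stands in for the specified storage medium

    def lookup(self, frame_key):
        """Search the storage medium before recognizing; None means a cache miss."""
        return self._store.get(frame_key)

    def maybe_store(self, frame_key, result, on_demand_count):
        """Store only if the clip's on-demand count meets the threshold."""
        if on_demand_count >= self.count_threshold:
            self._store[frame_key] = result
            return True
        return False
```

Gating on popularity keeps the cache small while still skipping repeated recognition for the clips that are actually requested often.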
6. A video frame identification method, applied to a client, the method comprising:
upon detecting a trigger operation on a currently played video, sending a video frame identification request to a server, wherein the request carries video information and playing progress information of the currently played video, so that the server, in response to the request, acquires a video file of the currently played video, determines a target video frame from the video file, and identifies the target video frame to obtain an identification result;
and receiving the identification result returned by the server.
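The client side of claim 6 is a single request-response exchange. A minimal sketch with the transport abstracted into a callable (the field names and the `send_request` parameter are hypothetical; the patent does not specify a wire format):

```python
def on_trigger(video_id, progress_ms, send_request):
    """Client side (claim 6): on a trigger operation, send the identification
    request carrying video information and playing progress, and return the
    identification result that the server sends back."""
    request = {"video_id": video_id, "progress_ms": progress_ms}
    return send_request(request)  # transport (e.g. HTTP) is abstracted away
```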
7. A video frame identification apparatus, applied to a server, the apparatus comprising:
a request receiving module, configured to receive a video frame identification request from a client, wherein the client sends the video frame identification request to the server upon detecting a trigger operation on a currently played video, and the request carries video information and playing progress information of the currently played video;
a request response module, configured to, in response to the video frame identification request, acquire a video file of the currently played video according to the video information and determine, from the video file according to the playing progress information, a target video frame to be identified;
a video frame identification module, configured to identify the target video frame to obtain an identification result;
and a result returning module, configured to return the identification result to the client.
8. A video frame identification apparatus, applied to a client, the apparatus comprising:
a request sending module, configured to send a video frame identification request to a server upon detecting a trigger operation on a currently played video, wherein the request carries video information and playing progress information of the currently played video, so that the server, in response to the request, acquires a video file of the currently played video, determines a target video frame from the video file, and identifies the target video frame to obtain an identification result;
and a result receiving module, configured to receive the identification result returned by the server.
9. An electronic device, comprising a processor and a memory, wherein the processor is configured to execute a video frame identification program stored in the memory to implement the video frame identification method according to any one of claims 1-5 or claim 6.
10. A storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the video frame identification method according to any one of claims 1-5 or claim 6.
CN202210634500.5A 2022-06-06 2022-06-06 Video frame identification method and device, electronic equipment and storage medium Pending CN115019231A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210634500.5A CN115019231A (en) 2022-06-06 2022-06-06 Video frame identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210634500.5A CN115019231A (en) 2022-06-06 2022-06-06 Video frame identification method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115019231A 2022-09-06

Family

ID=83073876

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210634500.5A Pending CN115019231A (en) 2022-06-06 2022-06-06 Video frame identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115019231A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908280A (en) * 2022-11-03 2023-04-04 广东科力新材料有限公司 Data processing-based performance determination method and system for PVC calcium zinc stabilizer

Similar Documents

Publication Publication Date Title
US10846460B2 (en) Method and apparatus for launching application page, and electronic device
CN108804299B (en) Application program exception handling method and device
KR20140091555A (en) Measuring web page rendering time
US20170026721A1 (en) System and Methods Thereof for Auto-Playing Video Content on Mobile Devices
CN110209975B (en) Method, apparatus, device and storage medium for providing object
US9946712B2 (en) Techniques for user identification of and translation of media
CN110609966B (en) Page display method, device and equipment thereof
CN115019231A (en) Video frame identification method and device, electronic equipment and storage medium
CN110209557B (en) User operation recording and restoring method, device and equipment and readable storage medium
CN110659435A (en) Page data acquisition processing method and device, computer equipment and storage medium
CN111506747B (en) File analysis method, device, electronic equipment and storage medium
CN112307386A (en) Information monitoring method, system, electronic device and computer readable storage medium
US8751508B1 (en) Contextual indexing of applications
CN111046308A (en) Page loading method and device
CN111314767A (en) Interactive component loading method and device for interactive video
WO2015039585A1 (en) Method and device for testing software reliability
CN111563153B (en) Recommendation method and terminal based on clipboard information sharing implementation
CN111124627A (en) Method, device, terminal and storage medium for determining application program caller
CN113407879B (en) Data reporting method, device and readable medium
CN114817585A (en) Multimedia resource processing method and device, electronic equipment and storage medium
CN113962316A (en) Model training method and device and electronic equipment
CN113849674A (en) Method and device for identifying disguised user agent information and electronic equipment
CN113190697A (en) Image information playing method and device
CN114491096A (en) Multimedia preloading method, device, equipment and storage medium
CN112579938A (en) Page loading method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination