CN113691758A - Frame insertion method and device, equipment and medium - Google Patents

Frame insertion method and device, equipment and medium

Info

Publication number
CN113691758A
CN113691758A
Authority
CN
China
Prior art keywords
frames
video frame
frame
frame images
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110968308.5A
Other languages
Chinese (zh)
Inventor
Ma Yongrui (马咏芮)
Tong Chaoyu (童超宇)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen TetrasAI Technology Co Ltd
Original Assignee
Shenzhen TetrasAI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen TetrasAI Technology Co Ltd filed Critical Shenzhen TetrasAI Technology Co Ltd
Priority to CN202110968308.5A priority Critical patent/CN113691758A/en
Publication of CN113691758A publication Critical patent/CN113691758A/en
Withdrawn legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/01Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0135Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Systems (AREA)

Abstract

The application discloses a frame interpolation method, apparatus, device, and medium, wherein the frame interpolation method includes: acquiring two consecutive video frame images; detecting whether the two video frame images satisfy a frame interpolation condition; predicting an intermediate frame between the two video frame images in response to the two video frame images satisfying the frame interpolation condition; and performing frame interpolation processing on the two video frame images using the intermediate frame. According to this scheme, the effect of video frame interpolation can be improved.

Description

Frame insertion method and device, equipment and medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a frame interpolation method, apparatus, device, and medium.
Background
Currently, most videos, including movies and television series, are low-frame-rate videos at 30 FPS (Frames Per Second) or below, while the screens of current display terminals have reached 60 FPS or even 120 FPS. Video at conventional frame rates has not kept pace with the development of screen refresh rates. Accordingly, video frame interpolation techniques have been developed to convert low-frame-rate video to high-frame-rate video by predicting intermediate frames.
Existing video frame interpolation techniques interpolate between all video frames, but the video after interpolation is not necessarily better than before interpolation; that is, a large amount of the interpolation may be invalid.
Disclosure of Invention
The application at least provides a frame interpolation method, a device, equipment and a medium.
The application provides a frame interpolation method, which comprises the following steps: acquiring two continuous frames of video frame images, wherein the two frames of video frame images are used for playing in sequence; detecting whether two frames of video frame images meet frame interpolation conditions or not; predicting an intermediate frame between the two frames of video frame images in response to the two frames of video frame images satisfying the frame interpolation condition; and performing frame interpolation processing on the two frames of video frame images by using the intermediate frame.
Therefore, whether the two video frame images satisfy the frame interpolation condition is detected before the intermediate frame between them is predicted, so that the intermediate frame is predicted only when the condition is satisfied. This screens the frame pairs to be interpolated and improves the frame interpolation effect. Moreover, because the intermediate frame is predicted only when the two video frame images satisfy the condition, rather than predicted blindly and screened afterwards, the resource consumption of intermediate frame prediction can be reduced.
Wherein the frame interpolation condition includes at least one of the following: the scene state of the two video frame images is not a preset state, where the preset state is a static scene state or a scene switching state; and/or the picture motion category of the two video frame images does not belong to a preset motion category, where the motion degree of the preset motion category is greater than a preset motion degree.
Therefore, when the scene state of the two video frame images is a static scene state or a scene switching state, predicting and inserting an intermediate frame would not improve the interpolation effect and would waste resources. In addition, when the picture motion category of the two video frame images belongs to the preset motion category, the large degree of motion may degrade the quality of the intermediate frame and thus the interpolation effect; therefore, the intermediate frame is predicted only when the picture motion degree of the two video frame images is less than or equal to the preset motion degree, which improves the frame interpolation effect.
Under the condition that the frame interpolation condition includes that the scene state of the two frames of video frame images is not a preset state, detecting whether the two frames of video frame images meet the frame interpolation condition or not, wherein the method comprises the following steps: judging whether the scene state of the two frames of video frame images is a preset state or not by using the parallax of the two frames of video frame images and/or the number of the matched characteristic point pairs; if the scene state of the two frames of video frame images is not a preset state, determining that the two frames of video frame images meet the frame interpolation condition; under the condition that the frame interpolation condition includes that the picture motion category in the two frames of video frame images is not in the preset motion category, detecting whether the two frames of video frame images meet the frame interpolation condition or not, wherein the method comprises the following steps: determining the picture motion category of the two frames of video frame images based on the optical flow information between the two frames of video frame images; and determining that the two frames of video frame images meet the frame inserting condition in response to the picture motion category not belonging to the preset motion category.
Therefore, the scene state of the two frames of video frame images is determined through the parallax of the two frames of video frame images and the number of the matching feature point pairs, and the accuracy of the determined scene state can be improved. In addition, the image motion type of the two-frame video frame image can be determined through the optical flow information between the two-frame video frame images.
The method for judging whether the scene state of the two frames of video frame images is the preset state or not by utilizing the parallax of the two frames of video frame images and/or the number of the matched feature point pairs includes the following steps: determining the scene state of the two frames of video frame images as a static scene state in response to the number of the matched feature point pairs being greater than or equal to a first preset number; or, in response to the parallax being less than or equal to a first preset parallax, determining that the scene state of the two frames of video frame images is a static scene state; or, in response to the number of the matched feature point pairs being less than or equal to a second preset number, determining the scene state of the two frames of video frame images as a scene switching state; or, in response to the parallax being greater than or equal to a second preset parallax, determining the scene state of the two frames of video frame images as a scene switching state; the first preset number is larger than the second preset number, and the first preset parallax is smaller than the second preset parallax.
Therefore, by matching the feature point pairs and the combination of the parallax, the scene state of the two frames of video frame images is determined, and the accuracy of the determined scene state can be improved.
The method for determining the picture motion category of the two frames of video frame images based on the optical flow information between the two frames of video frame images comprises the following steps: acquiring at least one optical flow information between two frames of video frame images, wherein the at least one optical flow information comprises one or more of optical flow information about a foreground, optical flow information about a background, and optical flow information difference between the foreground and the background; and determining the picture motion category of the two frames of video frame images based on the motion category corresponding to each type of optical flow information.
Therefore, the corresponding picture motion category is determined by the optical flow information of the foreground, the optical flow information about the background and the optical flow information difference between the foreground and the background, so that the determined picture motion category is more accurate.
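As a non-authoritative illustration of the check above, the following sketch maps a foreground flow magnitude, a background flow magnitude, and their difference to a coarse picture motion category. All function names, category labels, and the 20-pixel threshold are hypothetical assumptions, not values from the patent.

```python
def motion_category(fg_flow, bg_flow, fg_bg_diff=None, large_thresh=20.0):
    """Map optical-flow magnitudes (in pixels) between two frames to a
    coarse picture-motion category. Threshold is illustrative."""
    if fg_bg_diff is None:
        fg_bg_diff = abs(fg_flow - bg_flow)
    # A large foreground/background gap suggests fast relative motion
    # of an object against the background (or vice versa).
    if fg_bg_diff > large_thresh:
        return "large_relative_motion"
    # Otherwise, large flow everywhere suggests fast global motion.
    if max(fg_flow, bg_flow) > large_thresh:
        return "large_global_motion"
    return "moderate_motion"

def satisfies_motion_condition(category,
                               preset=("large_relative_motion",
                                       "large_global_motion")):
    # The frame pair qualifies for interpolation only when its picture
    # motion category is NOT one of the preset (too-fast) categories.
    return category not in preset
```

For example, a pair with foreground flow 30 and background flow 2 would be classified as large relative motion and skipped, while a pair with flows 5 and 3 would qualify for interpolation.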
The frame interpolation condition comprises that the scene state of two frames of video frame images is not a preset state, and the picture motion category in the two frames of video frame images is not in a preset motion category; detecting whether two frames of video frame images meet the frame interpolation condition or not, wherein the method comprises the following steps: detecting whether the scene state of two frames of video frame images is a preset state or not; detecting whether picture motion categories in two frames of video frame images are in a preset motion category or not in response to the scene state not being in the preset state; and determining that the two frames of video frame images meet the frame inserting condition in response to the fact that the picture motion category is not in the preset motion category.
Therefore, performing the two detections in sequence can improve the final frame interpolation effect.
The frame interpolation condition comprises that the picture motion category in the two frames of video frame images is not in the preset motion category, and before detecting that the picture motion category in the two frames of video frame images is not in the preset motion category, the method further comprises the following steps: inquiring whether two frames of historical video frame images used in the (i-1) th frame insertion meet the frame insertion condition or not, wherein the (i-1) th frame insertion is represented as the previous frame insertion of the current ith frame insertion, and i is a positive integer greater than 1; and in response to the two frames of historical video frame images used in the (i-1) th time of frame interpolation meeting the frame interpolation condition, performing the step of detecting that the picture motion category in the two frames of video frame images is not in the preset motion category.
Therefore, by using the history information, the efficiency of detecting whether the two video frame images satisfy the frame interpolation condition can be improved.
Wherein, inquiring whether the two frames of historical video frame images used in the (i-1) th frame interpolation meet the frame interpolation condition comprises: inquiring whether the state of the scene change mark is a first preset state or not; if the state of the scene change mark is a first preset state, determining that the two frames of historical video frame images do not meet the frame interpolation condition; the method further comprises the following steps: responding to the fact that the two frames of historical video frame images do not meet the frame inserting condition, and switching the state of the scene change mark to a second preset state, wherein the second preset state is different from the first preset state; and acquiring two frames of video frame images required to be used in the (i +1) th frame insertion in response to the two frames of historical video frame images not meeting the frame insertion condition.
Therefore, by using the history information, the efficiency of detecting whether the two video frame images satisfy the frame interpolation condition can be improved.
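The scene change mark logic above can be sketched as a small state machine. The state names and the `handle_round` helper are illustrative assumptions about one way to realize the described behavior.

```python
class SceneChangeFlag:
    """Sketch of the scene change mark. In the first preset state, the
    frame pair of round i-1 failed the frame interpolation condition."""
    FIRST = "first_preset"
    SECOND = "second_preset"

    def __init__(self, state=SECOND):
        self.state = state

    def history_failed(self):
        # Query: the first preset state means the two historical video
        # frame images did not meet the frame interpolation condition.
        return self.state == self.FIRST

def handle_round(flag):
    """Decide the action for round i based on the (i-1)-th result."""
    if flag.history_failed():
        # Historical pair failed: reset the flag to the second preset
        # state and go fetch the frame pair for round i + 1.
        flag.state = SceneChangeFlag.SECOND
        return "fetch_next_pair"
    # Otherwise proceed to the motion-category detection for round i.
    return "detect_motion_category"
```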
Wherein, the method further comprises: and in response to that the two frames of historical video frame images do not meet the frame interpolation condition, determining and saving optical flow information between the two frames of video frame images in the current ith frame interpolation by using the optical flow information between the two frames of historical video frame images, wherein the optical flow information between the two frames of video frame images is used for determining the optical flow information between the two frames of video frame images used by the (i +1) th frame interpolation.
Therefore, the optical flow information of the current time is stored, so that the optical flow information between two frames of video frames used in the frame interpolation process of the current time can be combined in the next frame interpolation process to determine the optical flow information.
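One plausible reading of the optical flow reuse described above is a cache that seeds the next round's flow estimate with the flow saved from the current round. `FlowCache` and the `compute_flow` callable below are hypothetical names, not part of the patent.

```python
class FlowCache:
    """Sketch: save the optical flow of round i so that round i + 1 can
    determine its flow with the saved flow as a starting point."""

    def __init__(self):
        self.saved = None  # flow stored by the most recent round

    def save(self, flow):
        self.saved = flow

    def estimate(self, compute_flow, prev_frame, next_frame):
        # Hand the cached flow (or None) to the flow estimator as an
        # initialization; the estimator decides how to use it.
        return compute_flow(prev_frame, next_frame, self.saved)
```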
Before the frame interpolation processing is performed on the two frames of video frame images by using the intermediate frame, the method further comprises the following steps: acquiring image quality information of an intermediate frame; and performing the step of performing frame interpolation processing on the two frames of video frame images by using the intermediate frame in response to the image quality information meeting the quality requirement.
Therefore, when the image quality information of the intermediate frame meets the quality requirement, the frame interpolation is carried out, and the frame interpolation effect can be improved.
The two frames of video frame images comprise a first video frame image and a second video frame image, and intermediate frames between the two frames of video frame images comprise a first intermediate frame obtained based on the first video frame image and a second intermediate frame obtained based on the second video frame image; acquiring image quality information of an intermediate frame, comprising: acquiring a first matching error between the first intermediate frame and the second intermediate frame; judging whether the image quality information meets the quality requirement or not, comprising the following steps: judging whether the first matching error is smaller than or equal to a preset error threshold value or not; and if the first matching error is smaller than or equal to a preset error threshold, determining that the image quality information meets the quality requirement.
Therefore, the first intermediate frame and the second intermediate frame are obtained by respectively combining the first video frame image and the second video frame image with the optical flow information, and if the matching error of the first intermediate frame and the second intermediate frame is small, the quality of the two intermediate frames can be considered to meet the requirement.
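A minimal sketch of this quality gate, assuming the first matching error is a mean absolute pixel difference between the two intermediate frames; the patent does not fix the error metric, and the threshold value is illustrative.

```python
def match_error(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized frames,
    given as nested lists of grayscale values (an assumed metric)."""
    total, count = 0, 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count

def quality_ok(first_if, second_if, threshold=10.0):
    # Interpolate only when the two independently predicted intermediate
    # frames agree closely enough (error at most the preset threshold).
    return match_error(first_if, second_if) <= threshold
```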
Wherein obtaining the first matching error between the first intermediate frame and the second intermediate frame includes: dividing, in a preset division manner, the first intermediate frame into a third preset number of first regions and the second intermediate frame into a third preset number of second regions; calculating a second matching error between each first region and the corresponding second region; and determining the first matching error based on the third preset number of second matching errors.
Therefore, the first intermediate frame and the second intermediate frame are divided into a plurality of areas to obtain the matching errors, so that the first matching errors of the two intermediate frames are obtained, and the accuracy of the obtained first matching errors can be improved.
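The region-wise error computation above can be sketched as follows, assuming a uniform grid division and mean aggregation; both choices are illustrative, since the patent leaves the division manner and the aggregation of the second matching errors open.

```python
def region_match_error(frame_a, frame_b, grid=2):
    """Split each frame into grid x grid regions (grid * grid being the
    'third preset number'), compute a per-region 'second' matching
    error, and aggregate them into the overall 'first' matching error.
    Frames are nested lists of grayscale values of equal size."""
    h, w = len(frame_a), len(frame_a[0])
    rh, rw = h // grid, w // grid
    second_errors = []
    for gy in range(grid):
        for gx in range(grid):
            total = 0
            for y in range(gy * rh, (gy + 1) * rh):
                for x in range(gx * rw, (gx + 1) * rw):
                    total += abs(frame_a[y][x] - frame_b[y][x])
            second_errors.append(total / (rh * rw))
    # First matching error: here, the mean of the per-region errors.
    return sum(second_errors) / len(second_errors)
```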
Wherein, the method further comprises: predicting to obtain a first intermediate frame based on optical flow information between the first video frame image and the two frames of video frame images, and predicting to obtain a second intermediate frame based on optical flow information between the second video frame image and the two frames of video frame images; the method for performing frame interpolation processing on two frames of video frame images by using the intermediate frame comprises the following steps: and acquiring a final intermediate frame based on the first intermediate frame and the second intermediate frame, and inserting the final intermediate frame between the two video frame images.
Therefore, two intermediate frames are acquired through the optical flow information and the two video frame image frames, and the final intermediate frame is acquired by utilizing the two intermediate frames, so that the accuracy of the final intermediate frame can be improved.
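A minimal sketch of obtaining the final intermediate frame from the two predicted intermediate frames, assuming per-pixel weighted blending as the fusion rule; this is one plausible choice, as the patent does not specify the fusion.

```python
def fuse_intermediate(first_if, second_if, w=0.5):
    """Fuse the intermediate frame predicted from the first video frame
    image and the one predicted from the second into the final
    intermediate frame by per-pixel blending (assumed fusion rule)."""
    return [[w * a + (1 - w) * b
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first_if, second_if)]
```

With the default weight of 0.5 this is a simple average; an asymmetric weight could favor the temporally closer source frame.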
The application provides a frame interpolation apparatus, including: an image acquisition module configured to acquire two consecutive video frame images; a detection module configured to detect whether the two video frame images satisfy the frame interpolation condition; a prediction module configured to predict an intermediate frame between the two video frame images in response to the two video frame images satisfying the frame interpolation condition; and a frame interpolation module configured to perform frame interpolation processing on the two video frame images using the intermediate frame.
The application provides an electronic device comprising a memory and a processor, wherein the processor is used for executing program instructions stored in the memory so as to realize the frame interpolation method.
The present application provides a computer readable storage medium having stored thereon program instructions that, when executed by a processor, implement the above-described frame interpolation method.
According to the above scheme, whether the two video frame images satisfy the frame interpolation condition is detected before the intermediate frame between them is predicted, so that the intermediate frame is predicted only when the condition is satisfied. This screens the frame pairs to be interpolated and improves the frame interpolation effect. Moreover, because the intermediate frame is predicted only when the two video frame images satisfy the condition, rather than predicted blindly and screened afterwards, the resource consumption of intermediate frame prediction can be reduced.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the application.
FIG. 1 is a flowchart illustrating an embodiment of a frame interpolation method according to the present application;
fig. 2 is a schematic flowchart illustrating step S12 in an embodiment of the frame interpolation method of the present application;
FIG. 3 is another flow chart illustrating a frame insertion method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a frame interpolation apparatus according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment of an electronic device of the present application;
FIG. 6 is a schematic structural diagram of an embodiment of a computer-readable storage medium of the present application.
Detailed Description
The following describes in detail the embodiments of the present application with reference to the drawings attached hereto.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular system structures, interfaces, techniques, etc. in order to provide a thorough understanding of the present application.
The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
Referring to fig. 1, fig. 1 is a flowchart illustrating a frame interpolation method according to an embodiment of the present application. Specifically, the method may include the steps of:
step S11: two consecutive video frame images are acquired.
The two video frame images may be images captured by a camera assembly, or may be drawn using drawing software, video production software, and the like. That is, the frame interpolation method provided by the embodiments of the present disclosure may interpolate frames for a captured video, or for a video such as an animation. Therefore, the specific form of the video frame images is not specifically limited here.
By continuous it is meant that the two frames of video images may be adjacent two frames of images in a segment of video, such as the first frame and the second frame in a segment of video.
Step S12: and detecting whether the two frames of video frame images meet the frame interpolation condition.
The detection can be performed according to image information corresponding to two frames of video frame images. Specifically, it may be detected whether the image information of the two video frame images both satisfy the frame interpolation condition, or it may be detected whether the image information between the two video frame images satisfies the frame interpolation condition.
Step S13: and predicting an intermediate frame between the two frames of video frame images in response to the two frames of video frame images satisfying the frame interpolation condition.
When the two video frame images do not satisfy the frame interpolation condition, the intermediate frame between them is not predicted, and in this case no frame interpolation processing is performed on the two video frame images. There are various ways to predict the intermediate frame between two video frame images; for example, the intermediate frame may be obtained using a video frame interpolation model. How to obtain an intermediate frame from two frame images may refer to common techniques and is not described in detail here.
Step S14: and performing frame interpolation processing on the two frames of video frame images by using the intermediate frame.
Specifically, an intermediate frame is inserted between two video frame images to obtain a new video. And then, the video can be displayed according to the corresponding display relation, thereby improving the frame rate of the video.
According to the scheme, whether the two frames of video frame images meet the frame interpolation condition is detected before the intermediate frame between the two frames of video frame images is predicted, so that the intermediate frame is predicted under the condition that the two frames of video frame images meet the condition, the intermediate frame can be screened, and the frame interpolation effect is improved. And when the two frames of video frame images meet the frame insertion condition, the prediction of the intermediate frame is carried out, and the intermediate frame is screened after the intermediate frame prediction is not carried out blindly, so that the resource consumption of the intermediate frame prediction can be reduced.
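Steps S11 to S14 can be sketched end to end as follows. The two callables stand in for the frame interpolation condition check and the intermediate frame predictor, whose internals are described elsewhere in this application; the function name is illustrative.

```python
def interpolate_video(frames, meets_condition, predict_intermediate):
    """Walk consecutive frame pairs (S11), test the frame interpolation
    condition (S12), and insert a predicted intermediate frame (S13,
    S14) only for qualifying pairs. Returns the new frame sequence."""
    out = []
    for prev, nxt in zip(frames, frames[1:]):
        out.append(prev)
        if meets_condition(prev, nxt):
            out.append(predict_intermediate(prev, nxt))
    out.append(frames[-1])  # last frame has no successor to pair with
    return out
```

For a three-frame input whose pairs both qualify, the output contains five frames, roughly doubling the frame rate; pairs that fail the condition pass through unchanged.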
In some disclosed embodiments, the manner of acquiring the two frames of video frame images may be:
and respectively acquiring two adjacent frames of images from the target video as two frames of video frame images. The frame interpolation may be performed from the direction in which the target video is displayed, or may be performed from the direction opposite to the direction in which the target video is displayed. By acquiring two adjacent frame images as two frame video frame images, compared with the method of performing intermediate frame prediction and performing frame interpolation through any two video frames, the method can screen the intermediate frames, so that the frame interpolation effect is improved.
Before detecting whether the two frames of video frame images meet the frame interpolation condition, the method further comprises the following steps: and adjusting the resolution of the two video frame images to be consistent. Generally, images captured using the same camera module have the same resolution. The specific adjustment manner may be to reduce the resolution of the high-resolution video frame image to the resolution of another video frame image, or to increase the resolution of the low-resolution video frame image to the resolution of another video frame image, or to simultaneously adjust the resolutions of two video frame images to a third resolution, and the adjustment manner regarding the resolutions is not specifically specified here. By adjusting the resolution of the two video frame images to be consistent, the quality of the intermediate frame can be improved, and the frame interpolation effect is improved.
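The three resolution-alignment options described above can be sketched as a target-size selection; the function name and the mode strings are illustrative, and the actual resizing of pixel data is left to an image library.

```python
def align_resolution(size_a, size_b, mode="downscale"):
    """Pick the common target resolution (width, height) for the two
    frames before the condition check: shrink the higher-resolution
    frame, enlarge the lower-resolution one, or use a third size."""
    area = lambda s: s[0] * s[1]
    if mode == "downscale":  # reduce the high-resolution frame
        return min(size_a, size_b, key=area)
    if mode == "upscale":    # enlarge the low-resolution frame
        return max(size_a, size_b, key=area)
    # Third option from the text: caller supplies an explicit size.
    raise ValueError("pass an explicit third resolution instead")
```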
In some disclosed embodiments, the frame insertion condition includes at least one of:
the first frame interpolation condition may be that the scene status of the two frames of video frame images is not a preset status. Wherein the scene state is determined based on a change between scenes of the two video frame images. Wherein the change between scenes of the two video frame images can be determined based on image information between the two video frame images. The frame interpolation condition is judged according to the scene state of the two frames of video frame images and the motion type of the picture, so that whether to interpolate the frame or not can be determined based on the scene and/or the motion type of the video frame images.
Specifically, the preset state is a static scene state or a scene switching state. The static scene state may be understood as the change between the scenes of the two video frame images being smaller than a first preset change, and the scene switching state as that change being larger than a second preset change, where the first preset change is smaller than the second preset change. For example, a static scene may be one in which the image information contained in the two video frame images is identical, and a scene switching state one in which it is completely different. When the scene state of the two video frame images is a static scene state or a scene switching state, predicting and inserting intermediate frames does not improve the interpolation effect and causes a waste of resources.
Optionally, when the frame interpolation condition includes that the scene state of the two frames of video frame images is not a preset state, the manner of detecting whether the two frames of video frame images satisfy the frame interpolation condition may be: and judging whether the scene state of the two frames of video frame images is a preset state or not by utilizing the parallax of the two frames of video frame images and/or the number of the matched characteristic point pairs. Specifically, in some application scenes, the disparity of two video frame images is used to determine whether the scene state of the two video frame images is the preset state, and in other application scenes, the number of the matching feature point pairs of the two video frame images is used to determine whether the scene state of the two video frame images is the preset state. In some application scenes, the disparity of the two frames of video frame images and the number of the matched feature point pairs are used for judging whether the scene state of the two frames of video frame images is a preset state or not. When the pictures of the two video frame images are in the preset motion category, the quality of the intermediate frame may be poor due to the large motion degree, and the frame interpolation effect may be reduced, so that the intermediate frame is predicted only when the picture motion degree of the two video frame images is less than or equal to the preset motion degree, and the frame interpolation effect can be improved.
Specifically, the manner of determining whether the scene state of the two frames of video frame images is the preset state by using the number of the matching feature point pairs of the two frames of video frame images may be:
and determining the scene state of the two frames of video frame images as a static scene state in response to the number of the matched feature point pairs being greater than or equal to a first preset number. Or, in response to the number of the matched feature point pairs being less than or equal to a second preset number, determining the scene state of the two frames of video frame images as a scene switching state. The first preset number is larger than the second preset number. The first preset number and the second preset number may be set according to specific needs, and if it is determined that a strict static scene state or a scene switching state is required, the first preset number may be set to be slightly larger, and the second preset number may be set to be slightly smaller, for example, the first preset number may be set to be 30 or more, and the second preset number may be set to be 3 or less. Therefore, the specific determination regarding the first predetermined number and the second predetermined number is not limited herein.
The detection process specifically comprises: performing feature point detection on the two frames of video frame images to obtain two sets of feature points, and then performing feature matching on the two sets of feature points to obtain several groups of matched feature point pairs.
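The classification by match count described above can be sketched as follows. This is an illustrative sketch, not the patent's exact implementation: the function names are hypothetical, the thresholds follow the example values given in the text (30 and 3), and the matched pairs themselves could come from any feature detector and matcher.

```python
# Illustrative sketch: classify the scene state of two frames from the number
# of matched feature point pairs. Thresholds follow the example values in the
# text; function names are hypothetical.

def classify_scene_by_match_count(num_matched_pairs,
                                  first_preset_number=30,
                                  second_preset_number=3):
    """Return 'static', 'switch', or 'undetermined' from the match count."""
    if num_matched_pairs >= first_preset_number:
        return "static"          # many matches -> frames nearly identical
    if num_matched_pairs <= second_preset_number:
        return "switch"          # almost no matches -> scene cut
    return "undetermined"        # fall back to other cues (e.g. disparity)

# A frame pair is eligible for interpolation only when the scene state is
# neither static nor a scene switch:
def satisfies_first_condition(num_matched_pairs):
    return classify_scene_by_match_count(num_matched_pairs) == "undetermined"
```

A stricter static/switch determination, as the text notes, corresponds simply to raising `first_preset_number` and lowering `second_preset_number`.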
In some application scenes, the mode of determining whether the scene state of two frames of video frame images is the preset state by using the parallax of the two frames of video frame images may be:
In response to the disparity being less than or equal to a first preset disparity, the scene state of the two frames of video frame images is determined to be a static scene state. Or, in response to the disparity being greater than or equal to a second preset disparity, the scene state of the two frames of video frame images is determined to be a scene switching state. The first preset disparity is smaller than the second preset disparity, and the specific values of both can be set according to actual requirements. The disparity may be determined by obtaining the disparity of each matched feature point pair and then averaging the disparities of all the matched pairs to obtain the final disparity; or by taking the median of the per-pair disparities as the final disparity; or by taking the maximum per-pair disparity as the final disparity of the two frames of video frame images. Of course, in other embodiments, the disparity may also be determined from optical flow information and the like, and therefore the manner of acquiring the disparity is not specifically limited here.
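The three aggregation strategies just described (mean, median, or maximum of the per-pair disparities) can be sketched as follows. This is an illustrative sketch with hypothetical function names and placeholder thresholds; each matched pair contributes the Euclidean distance between its two point positions as its disparity.

```python
import statistics

# Illustrative sketch: aggregate per-pair disparities into a single disparity
# for the frame pair, then classify the scene state against the two preset
# disparities. Thresholds are placeholder values, not from the patent.

def pair_disparity(pt_a, pt_b):
    """Disparity of one matched pair: Euclidean distance between the points."""
    return ((pt_a[0] - pt_b[0]) ** 2 + (pt_a[1] - pt_b[1]) ** 2) ** 0.5

def frame_disparity(matched_pairs, strategy="mean"):
    disparities = [pair_disparity(a, b) for a, b in matched_pairs]
    if strategy == "mean":
        return sum(disparities) / len(disparities)
    if strategy == "median":
        return statistics.median(disparities)
    if strategy == "max":
        return max(disparities)
    raise ValueError(strategy)

def classify_scene_by_disparity(disparity, first_preset=1.0, second_preset=50.0):
    if disparity <= first_preset:
        return "static"
    if disparity >= second_preset:
        return "switch"
    return "undetermined"
```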
In some application scenes, the manner of determining whether the scene state of two frames of video frame images is the preset state by using the disparity of the two frames of video frame images and the number of the matching feature point pairs may be as follows:
In response to the number of matched feature point pairs being greater than or equal to the first preset number and/or the disparity being less than or equal to the first preset disparity, the scene state of the two frames of video frame images is determined to be a static scene state. Or, in response to the number of matched feature point pairs being less than or equal to the second preset number and/or the disparity being greater than or equal to the second preset disparity, the scene state of the two frames of video frame images is determined to be a scene switching state. That is, the scene state is considered static when the number of matched feature point pairs is greater than or equal to the first preset number, or the disparity is less than or equal to the first preset disparity, or both conditions hold at the same time. Likewise, the scene state is determined to be the scene switching state when the number of matched feature point pairs is less than or equal to the second preset number, or the disparity is greater than or equal to the second preset disparity, or both conditions hold at the same time.
In some application scenes, in a case where the number of matched feature point pairs is detected to be greater than or equal to the first preset number, the disparity between the two frames of video frame images is further acquired, and if the disparity is less than or equal to the first preset disparity, the scene state of the two frames of video frame images is determined to be a static scene state. Or, in a case where the number of matched feature point pairs is less than or equal to the second preset number, the disparity between the two frames of video frame images is acquired, and if the disparity is greater than or equal to the second preset disparity, the scene state of the two frames of video frame images is determined to be the scene switching state. Determining the scene state by combining the matched feature point pairs and the disparity can improve the accuracy of the determined scene state.
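The cascaded variant just described, where the match count is examined first and the disparity is only consulted to confirm the tentative decision, can be sketched as follows. Function names and thresholds are hypothetical placeholders.

```python
# Illustrative sketch of the cascaded check: the match count gates the
# decision, and the disparity confirms it. get_disparity is a callable so
# the (more expensive) disparity is computed only on demand.

def combined_scene_state(num_pairs, get_disparity,
                         first_num=30, second_num=3,
                         first_disp=1.0, second_disp=50.0):
    if num_pairs >= first_num:
        if get_disparity() <= first_disp:
            return "static"        # many matches AND tiny disparity
    elif num_pairs <= second_num:
        if get_disparity() >= second_disp:
            return "switch"        # few matches AND huge disparity
    return "undetermined"
```

Passing the disparity as a callable reflects the ordering in the text: the disparity is acquired only after the match-count condition has been detected.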
The second frame interpolation condition may be that the picture motion category of the two frames of video frame images is not in the preset motion category. Specifically, the picture motion category of the two frames of video frame images is determined based on the optical flow information between them, and the two frames of video frame images are determined to satisfy the frame interpolation condition in response to the picture motion category not belonging to the preset motion category. Specifically, at least one type of optical flow information between the two frames of video frame images is acquired, the at least one type comprising one or more of optical flow information about the foreground, optical flow information about the background, and the difference in optical flow information between the foreground and the background. The optical flow information may be obtained by inputting the two frames of video frame images into an optical flow prediction network model; the model can refer to a general model, which is not described herein. Then, the picture motion category of the two frames of video frame images is determined based on the motion category corresponding to each type of optical flow information.
In some application scenarios, the corresponding motion category is determined from the optical flow information of the background. The background and the foreground may be divided according to the optical flow information; specifically, the motion represented by the optical flow of the background is less severe than that represented by the optical flow of the foreground. If the motion intensity represented by the background optical flow information is greater than a first preset intensity, the motion intensity of the foreground is considered to be greater than the first preset intensity as well; therefore, the picture motion category can be classified according to the background optical flow information.
In other application scenes, only the motion intensity of the foreground may be of concern. For example, when watching a video, a viewer generally pays more attention to the foreground than to the background. If the intensity represented by the optical flow information corresponding to the foreground of the two frames of video frame images is greater than a second preset intensity, the object of attention is considered to change severely, and in that case the picture motion category of the two frames of video frame images is considered to be in the preset motion category. The second preset intensity is greater than or equal to the first preset intensity.
In other application scenes, the motion category corresponding to the optical flow information can be determined according to the difference in optical flow information between the foreground and the background, and the picture motion category of the two frames of video frame images can be determined according to that motion category. In still other application scenes, the picture motion category of the two frames of video frame images can be determined comprehensively from the optical flow information of the foreground, the optical flow information of the background, and the difference in optical flow information between the foreground and the background. Determining the picture motion category jointly from these three cues makes the determined picture motion category more accurate.
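The three cues above can be combined as in the following sketch. All names and threshold values are hypothetical; in practice the per-pixel flow magnitudes would come from an optical flow prediction network, and the foreground/background split from the flow itself as the text describes.

```python
# Illustrative sketch: decide whether the picture motion falls in the preset
# (too-violent) motion category from the mean flow magnitude of the
# foreground, of the background, and their difference. Thresholds are
# placeholders, not values from the patent.

def mean_magnitude(flow_magnitudes):
    return sum(flow_magnitudes) / len(flow_magnitudes)

def in_preset_motion_category(fg_mags, bg_mags,
                              first_preset_intensity=8.0,
                              second_preset_intensity=12.0,
                              diff_threshold=10.0):
    fg = mean_magnitude(fg_mags)
    bg = mean_magnitude(bg_mags)
    return (bg > first_preset_intensity          # background already violent
            or fg > second_preset_intensity      # foreground violent
            or fg - bg > diff_threshold)         # foreground far exceeds background

# Interpolation proceeds only when the motion is NOT in the preset category:
def satisfies_second_condition(fg_mags, bg_mags):
    return not in_preset_motion_category(fg_mags, bg_mags)
```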
In the embodiment of the present disclosure, the frame interpolation condition includes that the scene state of the two frames of video frame images is not a preset state and that the picture motion category in the two frames of video frame images is not in a preset motion category. Determining the scene state of the two frames of video frame images through their disparity and the number of matched feature point pairs can improve the accuracy of the determined scene state. In addition, the picture motion category of the two frames of video frame images can be determined through the optical flow information between them.
Referring to fig. 2, fig. 2 is a flowchart illustrating step S12 in an embodiment of the frame interpolation method of the present application. As shown in fig. 2, in the embodiment of the present disclosure, the step S12 includes the following sub-steps:
step S121: and detecting whether the scene state of the two frames of video frame images is a preset state or not.
That is, the preset state is a static scene state or a scene switching state. For the method for detecting whether the scene state of the two frames of video frame images is the preset state, please refer to the above description, which is not repeated herein.
Step S122: and detecting whether the picture motion category in the two frames of video frame images is in a preset motion category or not in response to the scene state not being in the preset state.
That is, if the scene state is the preset state, the two frames of video frame images are directly considered not to satisfy the frame interpolation condition, and the step of predicting the intermediate frame between the two frames of video frame images is no longer performed. The manner of detecting whether the picture motion category in the two frames of video frame images is in the preset motion category is not described herein again.
Step S123: and determining that the two frames of video frame images meet the frame inserting condition in response to the fact that the picture motion category is not in the preset motion category.
That is, in the embodiment of the present disclosure, the scene state requiring two frames of video frame images is not the preset state, and the picture motion category is not in the preset motion category. And if the picture motion type is in the preset motion type, the two frames of video frame images are considered not to meet the frame interpolation condition, and the step of predicting the intermediate frame between the two frames of video frame images is not executed. By carrying out detection twice in sequence, the final frame interpolation effect can be improved.
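The two-stage check of sub-steps S121 to S123 can be sketched as follows; the function name is hypothetical, and the two boolean inputs are assumed to be produced by the scene-state and motion-category detections described above.

```python
# Illustrative sketch of sub-steps S121-S123: the scene state is checked
# first, and the picture motion category is checked only when the scene
# state is not the preset state.

def satisfies_interpolation_condition(scene_is_preset_state,
                                      motion_in_preset_category):
    # Steps S121/S122: a static scene or a scene switch fails immediately,
    # and the motion check is skipped entirely.
    if scene_is_preset_state:
        return False
    # Step S123: interpolate only when the motion is moderate.
    return not motion_in_preset_category
```

Short-circuiting on the scene state is what lets the second, flow-based detection be avoided when the first check already fails.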
In some disclosed embodiments, the frame interpolation condition includes that the picture motion category in the two frames of video frame images is not in the preset motion category. Before detecting whether the picture motion category in the two frames of video frame images is in the preset motion category, the method further includes the following steps:
Whether the two frames of historical video frame images used in the (i-1)-th frame interpolation satisfy the frame interpolation condition is queried. The (i-1)-th frame interpolation denotes the frame interpolation preceding the current i-th frame interpolation, and i is a positive integer greater than 1. For example, if the two frames of video frame images used in the current frame interpolation are the second and third video frames of a video segment, the two frames of historical video frame images may be the first and second video frames of the same segment. That is, the earlier of the two frames used in the current frame interpolation is the later of the two frames used in the previous frame interpolation. Whether the two frames of historical video frame images satisfy the frame interpolation condition may be queried by checking whether the state of a scene change flag is a first preset state. If the state of the scene change flag is the first preset state, it is determined that the two frames of historical video frame images do not satisfy the frame interpolation condition; otherwise, it is determined that they satisfy it. The first preset state may be true; for example, if the state of the scene change flag is true, the two frames of historical video frame images are determined not to satisfy the frame interpolation condition, and otherwise they are determined to satisfy it. In the previous frame interpolation process, if the two frames of historical video frame images failed one or both of the two frame interpolation conditions, that process further includes the step of modifying the state of the scene change flag to true.
In the process of querying whether the two frames of historical video frame images used in the (i-1)-th frame interpolation satisfy the frame interpolation condition, the frame interpolation condition may include only the first frame interpolation condition, only the second frame interpolation condition, or both.
In response to the two frames of historical video frame images satisfying the frame interpolation condition, the step of detecting whether the picture motion category in the two frames of video frame images is in the preset motion category is performed. In response to the two frames of historical video frame images not satisfying the frame interpolation condition, that step is not performed. By combining the historical information, the efficiency of detecting whether the two frames of video frame images satisfy the frame interpolation condition can be improved.
In response to the two frames of video frame images not satisfying the frame interpolation condition, the state of the scene change flag is switched to a second preset state, which is different from the first preset state. For example, if the first preset state is true, the second preset state may be false.
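The scene change flag bookkeeping described above can be sketched as follows. This is an illustrative sketch with hypothetical names: after each interpolation attempt the flag records whether that pair failed the condition, so the next attempt can skip the motion check when the previous pair already failed.

```python
# Illustrative sketch of the scene change flag: first preset state = true
# (previous pair failed the condition), second preset state = false.

class SceneChangeFlag:
    FIRST_PRESET = True
    SECOND_PRESET = False

    def __init__(self):
        self.state = self.SECOND_PRESET

    def previous_pair_satisfied(self):
        """Query step: true unless the flag is in the first preset state."""
        return self.state != self.FIRST_PRESET

    def record(self, pair_satisfied_condition):
        """Set the flag to the first preset state on failure, else reset it."""
        self.state = (self.FIRST_PRESET if not pair_satisfied_condition
                      else self.SECOND_PRESET)

flag = SceneChangeFlag()
flag.record(pair_satisfied_condition=False)       # previous pair failed
run_motion_check = flag.previous_pair_satisfied() # False: skip the check
```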
In some disclosed embodiments, in response to the two frames of historical video frame images not satisfying the frame interpolation condition, the optical flow information between the two frames of historical video frame images is utilized to determine and save the optical flow information between the two frames of video frame images in the current i-th frame interpolation. The saved optical flow information is in turn used to determine the optical flow information between the two frames of video frame images used in the (i+1)-th frame interpolation, where the (i+1)-th frame interpolation denotes the frame interpolation following the current i-th one. Specifically, the optical flow information between the two frames of historical video frame images and the two frames of video frame images may be input into an optical flow prediction network model to obtain the optical flow information between the two frames of video frame images.
The optical flow information of the current time is stored, so that the optical flow information between two video frame image frames used in the frame interpolation process of the current time can be combined in the next frame interpolation process to determine the optical flow information.
In response to the two frames of historical video frame images not satisfying the frame interpolation condition, the two frames of video frame images to be used in the (i+1)-th frame interpolation are acquired. Then, whether these two frames of video frame images satisfy the frame interpolation condition is detected. If they do, an intermediate frame between them is predicted, and frame interpolation processing is then performed on the two frames of video frame images by using the intermediate frame. For the specific (i+1)-th frame interpolation process, please refer to the current i-th frame interpolation process, which is not repeated herein.
In some disclosed embodiments, after predicting an intermediate frame between two video frame images, the method further comprises the steps of:
Image quality information of the intermediate frame is acquired. The two frames of video frame images include a first video frame image and a second video frame image, and the intermediate frame between them includes a first intermediate frame obtained based on the first video frame image and a second intermediate frame obtained based on the second video frame image. Specifically, the first intermediate frame may be predicted from the optical flow information between the two frames of video frame images and the image information of the first video frame image, and the second intermediate frame may be predicted from the same optical flow information and the image information of the second video frame image. The intermediate frames may be predicted from the optical flow information and the corresponding image information by using a frame interpolation prediction network model; the specific model can refer to a general model, which is not described herein. The image quality information of the intermediate frame may be acquired as follows: a first matching error between the first intermediate frame and the second intermediate frame is obtained by matching the two intermediate frames, and this first matching error serves as the image quality information of the intermediate frame. Then, whether the first matching error is less than or equal to a preset error threshold is judged; if so, it is determined that the image quality information meets the quality requirement.
The first matching error indicates the matching degree between the two intermediate frames: the smaller the first matching error, the higher the matching degree, and vice versa.
The first matching error between the first intermediate frame and the second intermediate frame may be obtained as follows. According to a preset dividing manner, the first intermediate frame is divided into a third preset number of first regions, and the second intermediate frame is divided into a third preset number of second regions. Optionally, the dividing manner may be to equally divide the first intermediate frame and the second intermediate frame into several regions of a preset size, for example, several regions of size 6 × 6. Then, a second matching error between each group of corresponding first and second regions is calculated, where a corresponding first region and second region are the regions located at the same image position in the first intermediate frame and the second intermediate frame; for example, if a first region is located in the first row and first column of the first intermediate frame, the corresponding second region is located in the first row and first column of the second intermediate frame. The second matching error between each first region and its corresponding second region may be acquired in a sliding-window manner. Specifically, the second matching error between a single first region and a single second region may be obtained by acquiring the matched feature point pairs between the two regions and determining the second matching error based on the position differences between the matched feature points in their respective regions.
The second matching error may also be obtained by acquiring optical flow information between the two regions and then determining the matching error based on that optical flow information. Of course, any other manner capable of measuring the image matching degree may be used, and the manner of obtaining the second matching error is not specifically limited here. Finally, the first matching error is determined based on the third preset number of second matching errors; optionally, all the second matching errors are added to obtain the first matching error. Dividing the first intermediate frame and the second intermediate frame into several regions and aggregating the per-region matching errors into the first matching error can improve the accuracy of the obtained first matching error.
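The region-wise quality check can be sketched as follows. This is an illustrative sketch only: as a stand-in for the feature-pair or optical-flow based region error in the text, each region's second matching error here is simply the mean absolute pixel difference, and the region size and error threshold are placeholder values.

```python
# Illustrative sketch: split both intermediate frames into equal regions,
# compute a second matching error per region pair, sum the second matching
# errors into the first matching error, and compare it to a threshold.

def split_regions(frame, rh, rw):
    """Split a 2D list into non-overlapping rh x rw regions (row-major)."""
    h, w = len(frame), len(frame[0])
    return [[row[x:x + rw] for row in frame[y:y + rh]]
            for y in range(0, h, rh) for x in range(0, w, rw)]

def region_error(r1, r2):
    """Stand-in second matching error: mean absolute pixel difference."""
    diffs = [abs(a - b) for row1, row2 in zip(r1, r2)
             for a, b in zip(row1, row2)]
    return sum(diffs) / len(diffs)

def first_matching_error(frame1, frame2, rh=2, rw=2):
    regions1 = split_regions(frame1, rh, rw)
    regions2 = split_regions(frame2, rh, rw)
    return sum(region_error(a, b) for a, b in zip(regions1, regions2))

def meets_quality_requirement(frame1, frame2, preset_error_threshold=1.0):
    return first_matching_error(frame1, frame2) <= preset_error_threshold
```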
And performing the step of performing frame interpolation processing on the two frames of video frame images by using the intermediate frame in response to the image quality information meeting the quality requirement. Or, in response to the image quality information not meeting the quality requirement, the step of performing frame interpolation processing on the two frames of video frame images by using the intermediate frame is not performed. When the image quality information of the intermediate frame meets the quality requirement, the frame interpolation is carried out, and the frame interpolation effect can be improved.
In some disclosed embodiments, there may be more than one preset error threshold. The preset error threshold used in the current frame interpolation process may be selected according to the normal motion category to which the picture motion category of the two frames of video frame images belongs. In the embodiment of the present disclosure, normal motion categories may be graded according to the severity of motion; for example, the more severe the motion, the higher the level of the motion category. The normal motion categories do not include the above-mentioned preset motion category. Each normal motion category corresponds to one preset error threshold. Optionally, the value of the preset error threshold is positively correlated with the level of the normal motion category; that is, the higher the level of the normal motion category, the larger the preset error threshold.
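The per-category threshold selection can be as simple as a lookup table, as in the following sketch; the levels and threshold values are hypothetical, and only the positive correlation between level and threshold comes from the text.

```python
# Illustrative sketch: one preset error threshold per normal motion category
# level, positively correlated with the level (more severe motion tolerates
# a larger first matching error). Values are placeholders.

PRESET_ERROR_THRESHOLDS = {1: 0.5, 2: 1.0, 3: 2.0}  # level -> threshold

def select_error_threshold(motion_level):
    return PRESET_ERROR_THRESHOLDS[motion_level]
```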
In other disclosed embodiments, when the two frames of video frame images satisfy the frame interpolation condition, the first intermediate frame is predicted based on the first video frame image and the optical flow information between the two frames of video frame images, and the second intermediate frame is predicted based on the second video frame image and the same optical flow information. The above-described step S14 is then executed.
In some disclosed embodiments, step S14 is performed when the first intermediate frame and the second intermediate frame meet the quality requirement. Specifically, step S14 includes: acquiring a final intermediate frame based on the first intermediate frame and the second intermediate frame, and then inserting the final intermediate frame between the two frames of video frame images. Obtaining two intermediate frames through the optical flow information and the two frames of video frame images, and then obtaining the final intermediate frame from the two intermediate frames, can improve the accuracy of the final intermediate frame.
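A minimal sketch of step S14 follows. The patent does not fix the fusion rule for the two intermediate frames; a pixel-wise weighted average (weight 0.5 for the temporal midpoint) is shown here as one common choice, and the function names are hypothetical.

```python
# Illustrative sketch: fuse the first and second intermediate frames into
# the final intermediate frame, then insert it between the two source frames.

def fuse_intermediate_frames(first_inter, second_inter, w=0.5):
    """Pixel-wise weighted average of the two predicted intermediate frames."""
    return [[w * a + (1.0 - w) * b for a, b in zip(row1, row2)]
            for row1, row2 in zip(first_inter, second_inter)]

def insert_between(frames, index, final_inter):
    """Insert the final intermediate frame between frames[index] and frames[index+1]."""
    return frames[:index + 1] + [final_inter] + frames[index + 1:]
```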
For better understanding of the technical solutions proposed in the embodiments of the present disclosure, please refer to the following examples. Please refer to fig. 3, fig. 3 is another flowchart illustrating a frame interpolation method according to an embodiment of the present application. As shown in fig. 3, the frame interpolation method provided by the embodiment of the present disclosure includes the following steps:
step S21: two consecutive video frame images are acquired.
The manner of obtaining the two continuous frames of video frame images is as described above, and is not described herein again.
Step S22: and preprocessing the two frames of video frame images.
The mode of preprocessing the two frames of video frame images may be to unify the resolutions of the two frames of video frame images, and the specific mode is as described above, which is not described herein again.
Step S23: and detecting whether the scene state of the two frames of video frame images is a preset state or not.
The manner of detecting whether the scene state of the two frames of video frame images is the preset state is as described above, and details are not repeated here.
Step S24: it is detected whether the state of the scene change flag is true.
For a specific way of detecting whether the state of the scene change flag is true, please refer to the above description, which is not repeated herein.
Step S25: determining optical flow information between two video frame images, and setting a state of a scene change flag to false.
The manner of determining the optical flow information between two video frame images is as described above, and is not described herein again. And, the manner of setting the state of the scene change flag to false is as described above, and is not described herein again.
Step S26: optical flow information between two video frame images is determined.
The specific manner of determining the optical flow information between two video frame images is not described herein again.
Step S27: and detecting whether the picture motion category of the two frames of video frame images is in a preset motion category.
The manner for detecting whether the picture motion category of the two frames of video frame images is in the preset motion category is as described above, and is not described herein again.
Step S28: an intermediate frame between two video frame images is predicted.
The manner of predicting the inter frame between two video frame images is as described above, and is not described herein again.
Step S29: and judging whether the image quality information of the intermediate frame meets the quality requirement.
The manner for determining whether the image quality information of the intermediate frame meets the quality requirement is as described above, and is not described herein again.
Step S30: and acquiring a final intermediate frame.
The manner of obtaining the final intermediate frame is as described above, and is not described herein again.
Step S31: and performing frame interpolation processing on the two frames of video frame images by using the final intermediate frame.
The manner of performing frame interpolation on the two frames of video frame images by using the final intermediate frame is as described above, and is not described herein again.
According to the scheme, whether the two frames of video frame images meet the frame interpolation condition is detected before the intermediate frame between the two frames of video frame images is predicted, so that the intermediate frame is predicted under the condition that the two frames of video frame images meet the condition, the intermediate frame can be screened, and the frame interpolation effect is improved. And when the two frames of video frame images meet the frame insertion condition, the prediction of the intermediate frame is carried out, and the intermediate frame is screened after the intermediate frame prediction is not carried out blindly, so that the resource consumption of the intermediate frame prediction can be reduced.
For example, the frame interpolation method may be executed by a terminal device, a server, or other processing device, where the terminal device may be User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, and the like. In some possible implementations, the frame interpolation method may be implemented by a processor calling computer-readable instructions stored in a memory.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an embodiment of a frame interpolation apparatus according to the present application. The frame interpolation apparatus 40 includes an image acquisition module 41, a detection module 42, a prediction module 43, and a frame interpolation module. The image acquisition module 41 is configured to acquire two consecutive frames of video frame images; the detection module 42 is configured to detect whether the two frames of video frame images satisfy a frame interpolation condition; the prediction module 43 is configured to predict an intermediate frame between the two frames of video frame images in response to the two frames of video frame images satisfying the frame interpolation condition; and the frame interpolation module is configured to perform frame interpolation processing on the two frames of video frame images by using the intermediate frame.
According to the scheme, whether the two frames of video frame images meet the frame interpolation condition is detected before the intermediate frame between the two frames of video frame images is predicted, so that the intermediate frame is predicted under the condition that the two frames of video frame images meet the condition, the intermediate frame can be screened, and the frame interpolation effect is improved. And when the two frames of video frame images meet the frame insertion condition, the prediction of the intermediate frame is carried out, and the intermediate frame is screened after the intermediate frame prediction is not carried out blindly, so that the resource consumption of the intermediate frame prediction can be reduced.
In some disclosed embodiments, the frame insertion condition includes at least one of: the scene state of the two frames of video frame images is not a preset state, and the picture motion category in the two frames of video frame images is not in a preset motion category, wherein the preset state is a static scene state or a scene switching state; and/or the degree of motion of the preset motion category is greater than the preset degree of motion.
According to the scheme, when the scene state of the two frames of video frame images is the static scene state or the scene switching state, the intermediate frames of the two frames of video frame images are continuously predicted and interpolated, the frame interpolation effect cannot be improved, and resource waste can be caused. In addition, when the picture motion types of the two video frame images are in the preset motion type, the quality of the intermediate frame may be poor due to a large motion degree, and the frame interpolation effect may be reduced, so that the intermediate frame is predicted only when the picture motion type motion degree of the two video frame images is less than or equal to the preset motion degree, and the frame interpolation effect can be improved.
In some disclosed embodiments, when the frame interpolation condition includes that the scene state of the two video frames is not the preset state, the detection module 42 detects whether the condition is satisfied as follows: it determines, from the disparity between the two frames and/or the number of matched feature point pairs, whether the scene state is the preset state; if not, the two frames satisfy the frame interpolation condition. When the condition includes that the picture motion category of the two frames is not a preset motion category, the detection proceeds as follows: the picture motion category is determined from the optical flow information between the two frames, and the frames are judged to satisfy the condition in response to that category not belonging to the preset motion category.
In this scheme, determining the scene state from the disparity between the two video frames and the number of matched feature point pairs improves the accuracy of the determined state. In addition, the picture motion category of the two frames can be determined from the optical flow information between them.
In some disclosed embodiments, the preset state is a static-scene state or a scene-switching state, and the detection module 42 determines whether the scene state of the two video frames is the preset state from the disparity and/or the number of matched feature point pairs as follows: in response to the number of matched feature point pairs being greater than or equal to a first preset number, or the disparity being less than or equal to a first preset disparity, the scene state is determined to be the static-scene state; in response to the number of matched feature point pairs being less than or equal to a second preset number, or the disparity being greater than or equal to a second preset disparity, the scene state is determined to be the scene-switching state. The first preset number is greater than the second preset number, and the first preset disparity is smaller than the second preset disparity.
In this scheme, determining the scene state by combining the matched feature point pairs with the disparity improves the accuracy of the determined state.
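A minimal sketch of this threshold scheme follows. The threshold values are invented for illustration; the disclosure only requires that the first preset number exceed the second and the first preset disparity be below the second:

```python
def classify_scene(num_matched_pairs, disparity,
                   first_num=500, second_num=20,
                   first_disp=0.5, second_disp=30.0):
    """Classify the scene state of a frame pair from the number of matched
    feature point pairs and the disparity between the frames."""
    assert first_num > second_num and first_disp < second_disp
    if num_matched_pairs >= first_num or disparity <= first_disp:
        return "static"   # frames are nearly identical
    if num_matched_pairs <= second_num or disparity >= second_disp:
        return "switch"   # frames share almost nothing: a scene cut
    return "normal"       # neither preset state: interpolation may proceed
```

For example, a pair with 800 matched pairs would be classified as static, while a pair with only 5 matched pairs would be classified as a scene switch.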
In some disclosed embodiments, the detection module 42 determines the picture motion category of the two video frames from the optical flow information between them as follows: it acquires at least one kind of optical flow information between the two frames, including one or more of optical flow information about the foreground, optical flow information about the background, and the optical-flow difference between foreground and background; it then determines the picture motion category from the motion category corresponding to each kind of optical flow information.
In this scheme, deriving the picture motion category from the foreground optical flow, the background optical flow, and the optical-flow difference between them makes the determined category more accurate.
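One simple way to realize this mapping is to compare mean optical-flow magnitudes of foreground and background against thresholds. The category names and threshold values below are assumptions made for the sketch, not taken from the disclosure:

```python
def picture_motion_category(fg_mag, bg_mag,
                            global_fast=15.0, local_fast=10.0):
    """Map mean optical-flow magnitudes of foreground (fg_mag) and
    background (bg_mag) to a coarse picture motion category."""
    diff = abs(fg_mag - bg_mag)           # foreground/background flow difference
    if bg_mag >= global_fast:
        return "global_fast"              # whole frame moves: fast pan or shake
    if diff >= local_fast:
        return "local_fast"               # subject moves hard against calm background
    return "gentle"                       # mild motion: a good interpolation case
```

Categories such as "global_fast" or "local_fast" would then be the preset motion categories whose degree of motion is too large for interpolation.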
In some disclosed embodiments, the frame interpolation condition includes both that the scene state of the two video frames is not the preset state and that their picture motion category is not a preset motion category. The detection module 42 then detects whether the condition is satisfied as follows: it first detects whether the scene state is the preset state; in response to it not being so, it detects whether the picture motion category is a preset motion category; and in response to the category not being a preset one, it determines that the two frames satisfy the frame interpolation condition.
In this scheme, performing the two detections in sequence improves the final interpolation result.
In some disclosed embodiments, the frame interpolation condition includes that the picture motion category of the two video frames is not a preset motion category, and before detecting this, the detection module 42 is further configured to query whether the two historical video frames used in the (i-1)-th interpolation satisfied the frame interpolation condition, where the (i-1)-th interpolation is the one immediately preceding the current i-th interpolation and i is a positive integer greater than 1. In response to those historical frames having satisfied the condition, the step of detecting the picture motion category of the current two frames is performed.
In this scheme, combining the historical information improves the efficiency of detecting whether the two video frames satisfy the frame interpolation condition.
The detection module 42 queries whether the two historical video frames used in the (i-1)-th interpolation satisfied the frame interpolation condition as follows: it queries whether the scene change mark is in a first preset state, and if so, determines that the historical frames did not satisfy the condition. The method further includes: in response to the historical frames not satisfying the condition, switching the scene change mark to a second preset state, which is different from the first preset state; and, likewise in response to the historical frames not satisfying the condition, acquiring the two video frames needed for the (i+1)-th interpolation.
Thus, by combining the historical information, the efficiency of detecting whether the two video frames satisfy the frame interpolation condition can be improved.
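The scene change mark bookkeeping described above can be sketched as a small state holder. The class, its method names, and the state labels are assumptions made for the example:

```python
class SceneChangeMark:
    """Holds the scene change mark across successive interpolations."""
    FIRST, SECOND = "changed", "unchanged"   # first / second preset states

    def __init__(self):
        self.state = self.SECOND

    def mark_change(self):
        # Set when a frame pair is found to fail the interpolation condition.
        self.state = self.FIRST

    def query_and_reset(self):
        """Return True if the previous pair failed the condition, then
        switch the mark back to the second preset state, as described."""
        failed = (self.state == self.FIRST)
        if failed:
            self.state = self.SECOND
        return failed
```

Reading the mark replaces re-running the full detection on the historical pair, which is the efficiency gain claimed above.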
In some disclosed embodiments, the detection module 42 is further configured to, in response to the historical frames not satisfying the frame interpolation condition, determine and save the optical flow information between the two video frames of the current i-th interpolation by using the optical flow information between the two historical frames; the saved information is then used to determine the optical flow information for the (i+1)-th interpolation.
In this scheme, saving the current optical flow information allows the optical flow information of the next interpolation to be determined by combining it with the optical flow between the two frames used this time.
In some disclosed embodiments, before performing frame interpolation on the two video frames with the intermediate frame, the frame interpolation module is further configured to acquire the image quality information of the intermediate frame, and to perform the interpolation step only in response to that quality information meeting the quality requirement.
In this scheme, interpolating only when the image quality of the intermediate frame meets the requirement improves the interpolation result.
In some disclosed embodiments, the two video frames include a first and a second video frame, and the intermediate frame between them includes a first intermediate frame derived from the first video frame and a second intermediate frame derived from the second. Acquiring the image quality information of the intermediate frame then means acquiring a first matching error between the first and second intermediate frames; judging whether the quality requirement is met means checking whether that error is less than or equal to a preset error threshold, and if it is, the image quality information meets the quality requirement.
In this scheme, the first and second intermediate frames are obtained by combining the first and second video frames, respectively, with the optical flow information; if the matching error between the two intermediate frames is small, their quality can be taken to meet the requirement.
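This quality gate can be sketched directly. The disclosure does not fix the error metric or the threshold value; mean absolute pixel difference and a threshold of 4.0 are used here purely for illustration (frames are modeled as 2-D lists of gray values):

```python
def first_match_error(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized frames."""
    total, count = 0.0, 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count

def quality_ok(frame_a, frame_b, threshold=4.0):
    """Accept the intermediate frames when their first match error is at
    most the preset error threshold."""
    return first_match_error(frame_a, frame_b) <= threshold
```

Two intermediate frames that disagree strongly indicate unreliable optical flow, so the pair is rejected rather than inserted.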
In some disclosed embodiments, the frame interpolation module obtains the first matching error between the first and second intermediate frames as follows: it divides the first intermediate frame into a third preset number of first regions and the second intermediate frame into the same number of second regions according to a preset dividing mode; it calculates a second matching error between each first region and the corresponding second region; and it determines the first matching error from the third preset number of second matching errors.
In this scheme, dividing the two intermediate frames into several regions and deriving the first matching error from the per-region matching errors improves the accuracy of the obtained first matching error.
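A sketch of the region-wise computation follows. The 2x2 grid and the aggregation of the second matching errors by their maximum are assumptions; the disclosure leaves the preset dividing mode and the aggregation rule open:

```python
def split_regions(frame, rows, cols):
    """Split a 2-D frame into rows*cols equally sized regions."""
    h, w = len(frame), len(frame[0])
    rh, rw = h // rows, w // cols
    return [[row[c * rw:(c + 1) * rw] for row in frame[r * rh:(r + 1) * rh]]
            for r in range(rows) for c in range(cols)]

def region_error(a, b):
    """Second matching error: mean absolute difference within one region pair."""
    diffs = [abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def first_error_from_regions(frame_a, frame_b, rows=2, cols=2):
    """First matching error aggregated from the per-region errors."""
    errors = [region_error(ra, rb)
              for ra, rb in zip(split_regions(frame_a, rows, cols),
                                split_regions(frame_b, rows, cols))]
    return max(errors)    # one bad region is enough to reject the pair
```

Taking the maximum makes a single badly mismatched region dominate, so local artifacts are not averaged away by otherwise identical areas.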
In some disclosed embodiments, the prediction module 43 is further configured to predict the first intermediate frame from the first video frame and the optical flow information between the two video frames, and the second intermediate frame from the second video frame and that same optical flow information. Performing frame interpolation with the intermediate frame then includes obtaining a final intermediate frame from the first and second intermediate frames, and inserting the final intermediate frame between the two video frames.
In this scheme, obtaining two intermediate frames from the optical flow information and the two video frames, and deriving the final intermediate frame from both, improves the accuracy of the final intermediate frame.
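The fusion step above can be illustrated with a per-pixel weighted average of the two predicted intermediate frames. Equal weights are an assumption; the disclosure does not fix the fusion rule:

```python
def fuse_intermediate(frame_a, frame_b, w=0.5):
    """Fuse two predicted intermediate frames (2-D lists of gray values)
    into the final intermediate frame by per-pixel weighted averaging."""
    return [[w * pa + (1 - w) * pb for pa, pb in zip(ra, rb)]
            for ra, rb in zip(frame_a, frame_b)]
```

An unequal weight could favor whichever intermediate frame came from the video frame closer in time to the insertion point.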
Referring to fig. 5, fig. 5 is a schematic structural diagram of an embodiment of an electronic device according to the present application. The electronic device 50 comprises a memory 51 and a processor 52, the processor 52 being configured to execute program instructions stored in the memory 51 so as to implement the steps of the frame interpolation method embodiments above. In one specific implementation scenario, the electronic device 50 may include, but is not limited to, medical equipment, a microcomputer, a desktop computer, or a server; the electronic device 50 may also be a mobile device such as a notebook or tablet computer, which is not limited herein.
In particular, the processor 52 is configured to control itself and the memory 51 to implement the steps of any of the frame interpolation method embodiments above. The processor 52 may also be referred to as a CPU (Central Processing Unit). The processor 52 may be an integrated circuit chip with signal processing capability, a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. In addition, the processor 52 may be implemented jointly by multiple integrated circuit chips.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an embodiment of a computer-readable storage medium according to the present application. The computer-readable storage medium 60 stores program instructions 61 executable by a processor, the program instructions 61 being used to implement the steps of the frame interpolation method embodiments above.
In some embodiments, functions of or modules included in the apparatus provided in the embodiments of the present disclosure may be used to execute the method described in the above method embodiments, and specific implementation thereof may refer to the description of the above method embodiments, and for brevity, will not be described again here.
The foregoing description of the various embodiments is intended to highlight various differences between the embodiments, and the same or similar parts may be referred to each other, and for brevity, will not be described again herein.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, a division of a module or a unit is merely one type of logical division, and an actual implementation may have another division, for example, a unit or a component may be combined or integrated with another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in the form of hardware or in the form of a software functional unit. The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the part of it that contributes over the prior art, may be embodied wholly or partly in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Claims (16)

1. A method of frame interpolation, comprising:
acquiring two continuous frames of video frame images;
detecting whether the two frames of video frame images meet frame interpolation conditions;
predicting an intermediate frame between the two frames of video frame images in response to the two frames of video frame images satisfying the frame interpolation condition;
and performing frame interpolation processing on the two frames of video frame images by using the intermediate frame.
2. The method of claim 1, wherein the frame interpolation condition comprises at least one of: that the scene state of the two frames of video frame images is not a preset state, and that the picture motion category in the two frames of video frame images is not in a preset motion category, wherein the preset state comprises a static scene state and a scene switching state, and the motion degree of the preset motion category is greater than a preset motion degree.
3. The method of claim 2,
under the condition that the frame interpolation condition includes that the scene states of the two frames of video frame images are not in the preset state, the detecting whether the two frames of video frame images meet the frame interpolation condition includes:
judging whether the scene states of the two frames of video frame images are in a preset state or not by using the parallax of the two frames of video frame images and/or the number of the matched feature point pairs;
if the scene state of the two frames of video frame images is not a preset state, determining that the two frames of video frame images meet a frame interpolation condition;
under the condition that the frame interpolation condition includes that the picture motion category in the two frames of video frame images is not in a preset motion category, the detecting whether the two frames of video frame images meet the frame interpolation condition includes:
determining a picture motion category of the two frames of video frame images based on optical flow information between the two frames of video frame images;
and determining that the two frames of video frame images meet an interpolation condition in response to the picture motion category not belonging to the preset motion category.
4. The method according to claim 3, wherein the determining whether the scene state of the two frames of video frame images is a preset state by using the disparity and/or the number of the matching feature point pairs of the two frames of video frame images comprises:
determining the scene state of the two frames of video frame images to be a static scene state in response to the number of the matched feature point pairs being greater than or equal to a first preset number;
or,
in response to the parallax being less than or equal to a first preset parallax, determining that the scene state of the two frames of video frame images is a static scene state;
or,
determining the scene state of the two frames of video frame images as a scene switching state in response to the number of the matched feature point pairs being less than or equal to a second preset number;
or,
in response to the fact that the parallax is larger than or equal to a second preset parallax, determining that the scene state of the two frames of video frame images is a scene switching state;
the first preset number is larger than the second preset number, and the first preset parallax is smaller than the second preset parallax.
5. The method of claim 3, wherein determining the picture motion class of the two frame video frame images based on optical flow information between the two frame video frame images comprises:
acquiring at least one optical flow information between the two frames of video frame images, wherein the at least one optical flow information comprises one or more of optical flow information about a foreground, optical flow information about a background, and optical flow information differences between the foreground and the background;
and determining the picture motion category of the two frames of video frame images based on the motion category corresponding to each type of optical flow information.
6. The method according to any one of claims 2 to 5, wherein the frame interpolation condition includes that the scene status of the two frames of video frame images is not a preset status, and the picture motion category in the two frames of video frame images is not in a preset motion category; the detecting whether the two frames of video frame images meet the frame interpolation condition includes:
detecting whether the scene state of the two frames of video frame images is a preset state or not;
detecting whether the picture motion category in the two frames of video frame images is in a preset motion category or not in response to the scene state not being in a preset state;
and determining that the two frames of video frame images meet the frame insertion condition in response to the picture motion category not being in the preset motion category.
7. The method according to any one of claims 2 to 6, wherein the frame interpolation condition comprises that the picture motion category in the two frame video frame images is not in a preset motion category, and before detecting that the picture motion category in the two frame video frame images is not in the preset motion category, the method further comprises:
inquiring whether two frames of historical video frame images used in the (i-1) th frame insertion meet the frame insertion condition or not, wherein the (i-1) th frame insertion is represented as the previous frame insertion of the current ith frame insertion, and i is a positive integer greater than 1;
in response to that the two frames of historical video frame images used in the (i-1) th time of interpolation frame meet the interpolation frame condition, executing the step of detecting that the picture motion category in the two frames of video frame images is not in the preset motion category.
8. The method according to claim 7, wherein said querying whether the two frames of historical video frame images used in the (i-1) th frame insertion satisfy the frame insertion condition comprises:
inquiring whether the state of the scene change mark is a first preset state or not;
if the state of the scene change mark is the first preset state, determining that the two frames of historical video frame images do not meet the frame interpolation condition;
the method further comprises the following steps:
in response to the two frames of historical video frame images not satisfying the frame interpolation condition, switching the state of the scene change mark to a second preset state, wherein the second preset state is different from the first preset state; and,
and acquiring two frames of video frame images required to be used in the (i +1) th frame insertion in response to that the two frames of historical video frame images do not meet the frame insertion condition.
9. The method of claim 7, further comprising:
and in response to that the two frames of historical video frame images do not meet the frame interpolation condition, determining and saving optical flow information between the two frames of video frame images in the current ith frame interpolation by using the optical flow information between the two frames of historical video frame images, wherein the optical flow information between the two frames of video frame images is used for determining the optical flow information between the two frames of video frame images used by the (i +1) th frame interpolation.
10. The method according to any of claims 1 to 9, wherein after said predicting an intermediate frame between said two frames of video frame images, said method further comprises:
acquiring image quality information of the intermediate frame;
judging whether the image quality information meets the quality requirement;
the frame interpolation processing of the two frames of video frame images by using the intermediate frame comprises:
and performing frame interpolation processing on the two frames of video frame images by utilizing the intermediate frame in response to the image quality information meeting the quality requirement.
11. The method according to claim 10, wherein the two video frame images comprise a first video frame image and a second video frame image, and wherein an intermediate frame between the two video frame images comprises a first intermediate frame derived based on the first video frame image and a second intermediate frame derived based on the second video frame image;
the acquiring the image quality information of the intermediate frame includes:
acquiring a first matching error between the first intermediate frame and the second intermediate frame;
the determining whether the image quality information meets the quality requirement includes:
judging whether the first matching error is smaller than or equal to a preset error threshold value or not;
and if the first matching error is smaller than or equal to the preset error threshold, determining that the image quality information meets the quality requirement.
12. The method of claim 11, wherein obtaining the first match error between the first intermediate frame and the second intermediate frame comprises:
dividing the first intermediate frame into a third preset number of first regions and dividing the second intermediate frame into a third preset number of second regions according to a preset dividing mode;
calculating a second match error between the first region and the second region;
determining the first match error based on a third preset number of the second match errors.
13. The method of claim 11, further comprising:
predicting to obtain a first intermediate frame based on optical flow information between the first video frame image and the two frames of video frame images, and predicting to obtain a second intermediate frame based on optical flow information between the second video frame image and the two frames of video frame images;
the frame interpolation processing of the two frames of video frame images by using the intermediate frame comprises the following steps:
and acquiring a final intermediate frame based on the first intermediate frame and the second intermediate frame, and inserting the final intermediate frame between the two video frame images.
14. An apparatus for frame interpolation, comprising:
the image acquisition module is used for acquiring two continuous frames of video frame images;
the detection module is used for detecting whether the two frames of video frame images meet the frame interpolation condition;
a prediction module for predicting an intermediate frame between the two frames of video frame images in response to the two frames of video frame images satisfying the frame interpolation condition;
and the frame interpolation module is used for performing frame interpolation processing on the two frames of video frame images by utilizing the intermediate frame.
15. An electronic device comprising a memory and a processor for executing program instructions stored in the memory to implement the method of any of claims 1 to 13.
16. A computer readable storage medium having stored thereon program instructions, which when executed by a processor implement the method of any of claims 1 to 13.
CN202110968308.5A 2021-08-23 2021-08-23 Frame insertion method and device, equipment and medium Withdrawn CN113691758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110968308.5A CN113691758A (en) 2021-08-23 2021-08-23 Frame insertion method and device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110968308.5A CN113691758A (en) 2021-08-23 2021-08-23 Frame insertion method and device, equipment and medium

Publications (1)

Publication Number Publication Date
CN113691758A true CN113691758A (en) 2021-11-23

Family

ID=78581470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110968308.5A Withdrawn CN113691758A (en) 2021-08-23 2021-08-23 Frame insertion method and device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113691758A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090316784A1 (en) * 2005-07-28 2009-12-24 Thomson Licensing Device for generating an interpolated frame
CN101867759A (en) * 2010-05-19 2010-10-20 西安交通大学 Self-adaptive motion compensation frame frequency promoting method based on scene detection
US8692935B1 (en) * 2011-11-02 2014-04-08 Marvell International Ltd. Video interpolation mode based on merit
CN106303546A (en) * 2016-08-31 2017-01-04 四川长虹通信科技有限公司 Conversion method and system in a kind of frame rate
US20180176574A1 (en) * 2015-05-25 2018-06-21 Peking University Shenzhen Graduate School Method and system for video frame interpolation based on optical flow method
CN109803175A (en) * 2019-03-12 2019-05-24 京东方科技集团股份有限公司 Method for processing video frequency and device, equipment, storage medium
CN111277895A (en) * 2018-12-05 2020-06-12 阿里巴巴集团控股有限公司 Video frame interpolation method and device
CN112055254A (en) * 2019-06-06 2020-12-08 Oppo广东移动通信有限公司 Video playing method, device, terminal and storage medium
CN112199140A (en) * 2020-09-09 2021-01-08 Oppo广东移动通信有限公司 Application frame insertion method and related device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114286007A (en) * 2021-12-28 2022-04-05 维沃移动通信有限公司 Image processing circuit, image processing method, electronic device, and readable storage medium
CN114554285A (en) * 2022-02-25 2022-05-27 京东方科技集团股份有限公司 Video frame insertion processing method, video frame insertion processing device and readable storage medium
WO2023160617A1 (en) * 2022-02-25 2023-08-31 京东方科技集团股份有限公司 Video frame interpolation processing method, video frame interpolation processing device, and readable storage medium
CN114554248A (en) * 2022-04-27 2022-05-27 杭州微帧信息科技有限公司 Video frame interpolation method based on neural network

Similar Documents

Publication Publication Date Title
CN113691758A (en) Frame insertion method and device, equipment and medium
US8837591B2 (en) Image block classification
US8059157B2 (en) Method and system for digital image stabilization
US20070217517A1 (en) Method and apparatus for determining motion between video images
US20060072790A1 (en) Background motion vector detection
US10026155B1 (en) Image-processing apparatus
CN102761682B (en) Image processing equipment and control method thereof
WO2005022922A1 (en) Temporal interpolation of a pixel on basis of occlusion detection
WO2005027525A1 (en) Motion vector field re-timing
US8406305B2 (en) Method and system for creating an interpolated image using up-conversion vector with uncovering-covering detection
US20050129124A1 (en) Adaptive motion compensated interpolating method and apparatus
WO2004039074A1 (en) Image processing unit with fall-back
WO2003102871A2 (en) Unit for and method of estimating a motion vector
US7961961B2 (en) Image processing apparatus and image processing program using motion information
JP2005318622A (en) Reverse film mode extrapolation method
WO2003102872A2 (en) Unit for and method of estimating a motion vector
US20050163355A1 (en) Method and unit for estimating a motion vector of a group of pixels
JP2005519498A (en) Method and apparatus for up-converting field rate
CN101616291B (en) Image processing apparatus and method and program
US9106926B1 (en) Using double confirmation of motion vectors to determine occluded regions in images
JP4523024B2 (en) Image coding apparatus and image coding method
US8264607B2 (en) Method of sampling phase calibration and device thereof
US8115865B2 (en) De-interlacing system with an adaptive edge threshold and interpolating method thereof
US10587840B1 (en) Image processing method capable of deinterlacing the interlacing fields
CN101119462A (en) Pulldown-sequence detecting device, method of detecting pulldown sequence, and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211123