CN114268797B - Method, device, storage medium and electronic equipment for time domain filtering of video

Info

Publication number: CN114268797B
Authority: CN (China)
Prior art keywords: video frame, motion compensation, layer, layer video, current
Legal status: Active (granted)
Application number: CN202111589880.7A
Other languages: Chinese (zh)
Other versions: CN114268797A
Inventors: 谷嘉文, 闻兴
Current Assignee: Beijing Dajia Internet Information Technology Co Ltd
Original Assignee: Beijing Dajia Internet Information Technology Co Ltd
Application filed by Beijing Dajia Internet Information Technology Co Ltd; priority to CN202111589880.7A

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present disclosure provides a method, apparatus, storage medium, and electronic device for temporal filtering of video. The method comprises the following steps: constructing a multi-layer video frame for the current video frame; determining, with the line as the division granularity, a motion compensation region for performing motion compensation on the current line in each layer video frame, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided; acquiring a compensation starting point of the motion compensation region of each layer video frame, and performing motion compensation on blocks in the motion compensation region of each layer video frame based on the compensation starting point to obtain a motion vector of the motion compensation region of each layer video frame; and performing temporal filtering on the current line of the current video frame based on the motion vector. This method of temporal filtering effectively solves the multithread blocking problem caused by whole-frame MCTF processing and greatly improves coding efficiency.

Description

Method, device, storage medium and electronic equipment for time domain filtering of video
Technical Field
The present disclosure relates to the field of video encoding and decoding, and more particularly, to a temporal filtering method, a temporal filtering apparatus, an electronic device, and a computer-readable storage medium for line-based parallel Motion Compensated Temporal Filtering (MCTF).
Background
Image data of video is encoded by a video encoder based on a specific data compression standard, for example, a Moving Picture Experts Group (MPEG) standard, High Efficiency Video Coding (HEVC), or Versatile Video Coding (VVC), and is then stored in a recording medium or transmitted in the form of a bitstream through a communication channel.
With the development and release of hardware capable of reproducing and storing high-resolution or high-quality image content, there is an increasing demand for codecs that efficiently encode or decode such content. Recently, methods for efficiently compressing high-resolution or high-quality image content have been implemented.
In a related art video coding apparatus/temporal filtering apparatus, Motion Compensated Temporal Filtering (MCTF) exploits the temporal correlation of video and temporally filters reference frames using inter-block reference relationships, reducing the temporal redundancy generated in the video block reference process and thereby improving overall coding efficiency.
However, this solution filters mainly at fixed frame intervals (frame numbers congruent to 0 modulo 8) for fixed frame structures, whereas in actual encoding the frame structure is dynamically adjusted according to scene changes to obtain higher compression efficiency. Second, filtering requires the QP parameter of the current frame, but in most encoders the QP decision is produced by the rate control model and may not yet be available when filtering runs. Finally, because the computation is complex and the filtering must be completed before encoding, actual encoding is blocked during the encoding process and the encoding time increases greatly.
Disclosure of Invention
The present disclosure provides a temporal filtering method, apparatus, storage medium, and electronic device for video, to solve at least the problems in the related art described above.
According to a first aspect of the present disclosure, there is provided a method for temporal filtering of video, which may comprise: constructing a multi-layer video frame for the current video frame; determining, with the line as the division granularity, a motion compensation region for performing motion compensation on the current line in each layer video frame, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided; acquiring a compensation starting point of the motion compensation region of each layer video frame, and performing motion compensation on blocks in the motion compensation region of each layer video frame based on the compensation starting point to obtain a motion vector of the motion compensation region of each layer video frame; and performing temporal filtering on the current line of the current video frame based on the motion vector.
Optionally, acquiring a compensation start point of a motion compensation region of each layer video frame, and performing motion compensation on a block in the motion compensation region of each layer video frame based on the compensation start point to obtain a motion vector of the motion compensation region of each layer video frame may include: if the current layer video frame in the video frames of each layer is not the tail layer video frame, determining a compensation starting point of a motion compensation area of a next layer video frame based on a motion vector of the motion compensation area of the current layer video frame; if a current layer video frame of the respective layer video frames is a tail layer video frame, temporal filtering is performed on the current line of the current video frame based on a motion vector of a motion compensation region of the current layer video frame.
Optionally, determining, with the line as the division granularity, the motion compensation area for performing motion compensation on the current line in each layer video frame, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided, may include: determining an area covering the range of the current line of the current video frame as the motion compensation area of the tail layer video frame among the layer video frames, and determining the motion compensation area in each layer video frame above the tail layer video frame as an area covering, within that layer video frame, a preset range around the area corresponding to the motion compensation area of the next layer video frame.
Optionally, motion compensation for all blocks within the motion compensation region of each layer video frame is performed in parallel.
Optionally, determining a motion compensation region for motion compensation for the current line in each layer video frame may include: if the current line is the first line of the current video frame, acquiring a corresponding number of lines starting from the first line of each layer video frame and taking the region corresponding to those lines as the motion compensation region of that layer video frame.
Optionally, determining a motion compensation region for motion compensation for the current line in each layer video frame may include: if the current line is a non-first line of the current video frame, acquiring, for each layer video frame, the lines of that layer video frame corresponding to the current line, and taking the region corresponding to those lines as the motion compensation region of that layer video frame.
Optionally, determining a motion compensation region for motion compensation for the current line in each layer video frame may include: determining whether the position of a line within the motion compensation region of each layer video frame exceeds the maximum number of lines of that layer video frame; and in response to determining that it does, truncating the position of the motion compensation region to the maximum number of lines of the layer video frame.
Optionally, the multi-layer video frame comprises four-layer video frames; among the video frames of each layer, the first layer video frame, the second layer video frame, and the third layer video frame are divided into blocks of a first size for motion compensation, and the fourth layer video frame is divided into blocks of a second size for motion compensation, wherein the first size is 2 times the second size.
According to a second aspect of the present disclosure, there is provided an apparatus for temporal filtering of video, which may comprise: a downsampling unit configured to construct a multi-layer video frame for a current video frame; a motion compensation region determining unit configured to determine, with the line as the division granularity, motion compensation regions for performing motion compensation for the current line in each layer video frame, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided; a motion compensation unit configured to acquire a compensation start point of the motion compensation region of each layer video frame, and perform motion compensation on blocks in the motion compensation region of each layer video frame based on the compensation start point to obtain a motion vector of the motion compensation region of each layer video frame; and a filtering unit configured to perform temporal filtering on the current line of the current video frame based on the motion vector.
Alternatively, the motion compensation unit may be configured to: if the current layer video frame of the respective layer video frames is not the tail layer video frame, determining a compensation start point of a motion compensation region of a next layer video frame based on a motion vector of the motion compensation region of the current layer video frame, wherein if the current layer video frame of the respective layer video frames is the tail layer video frame, the filtering unit may perform temporal filtering on the current line of the current video frame based on the motion vector of the motion compensation region of the current layer video frame.
Alternatively, the motion compensation region determination unit may be configured to: determine, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided, an area covering the range of the current line of the current video frame as the motion compensation area of the tail layer video frame among the layer video frames, and determine the motion compensation area in each layer video frame above the tail layer video frame as an area covering, within that layer video frame, a preset range around the area corresponding to the motion compensation area of the next layer video frame.
Optionally, motion compensation for all blocks within the motion compensation region of each layer video frame is performed in parallel.
Alternatively, the motion compensation region determination unit may be configured to: if the current line is the first line of the current video frame, acquire a corresponding number of lines starting from the first line of each layer video frame and take the region corresponding to those lines as the motion compensation region of that layer video frame.
Alternatively, the motion compensation region determination unit may be configured to: if the current line is a non-first line of the current video frame, acquire, for each layer video frame, the lines of that layer video frame corresponding to the current line, and take the region corresponding to those lines as the motion compensation region of that layer video frame.
Alternatively, the motion compensation region determination unit may be configured to: determine whether the position of a line within the motion compensation region of each layer video frame exceeds the maximum number of lines of that layer video frame; and in response to determining that it does, truncate the position of the motion compensation region to the maximum number of lines of the layer video frame.
Optionally, the multi-layer video frame comprises four-layer video frames; among the video frames of each layer, the first layer video frame, the second layer video frame, and the third layer video frame are divided into blocks of a first size for motion compensation, and the fourth layer video frame is divided into blocks of a second size for motion compensation, wherein the first size is 2 times the second size.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device comprising: at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform a method for temporal filtering of video as described above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions that, when executed by a processor of a temporal filtering apparatus/electronic device/server, enable the temporal filtering apparatus/electronic device/server to perform the method for temporal filtering of video as described above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product whose instructions are executed by at least one processor in an electronic device to perform the method for temporal filtering of video as described above.
The technical solutions provided by the embodiments of the present disclosure bring at least the following beneficial effects: by splitting MCTF into line-level parallel processing, the multithread blocking problem caused by whole-frame MCTF processing is effectively solved, and coding efficiency is greatly improved. Furthermore, the method for temporal filtering of video according to the exemplary embodiments of the present disclosure does not lose any precision during the motion vector search and does not introduce coding loss.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
FIG. 1 is a schematic diagram illustrating a process of temporal filtering based on Motion Compensated Temporal Filtering (MCTF);
FIG. 2 is a schematic diagram illustrating a process of temporal filtering of line-based parallel MCTF according to an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating the structure of a hierarchical video frame of a pyramid structure used in a method of temporal filtering of line-based parallel MCTF according to an exemplary embodiment of the present disclosure;
FIG. 4 is a flowchart illustrating a method of temporal filtering of line-based parallel MCTF according to an exemplary embodiment of the present disclosure;
FIG. 5 is a block diagram illustrating a temporal filtering apparatus for line-based parallel MCTF according to an exemplary embodiment of the present disclosure;
FIG. 6 is a block diagram illustrating an electronic device for performing a method of temporal filtering of line-based parallel MCTF according to an exemplary embodiment of the present disclosure;
FIG. 7 is a schematic diagram illustrating an electronic device for performing a method of temporal filtering of line-based parallel MCTF according to another exemplary embodiment.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The embodiments described in the examples below are not representative of all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be noted that, in this disclosure, "at least one of the items" covers three parallel cases: "any one of the items", "any combination of the items", and "all of the items". For example, "including at least one of A and B" covers three parallel cases: (1) including A; (2) including B; (3) including A and B. Likewise, "at least one of step one and step two is executed" covers three parallel cases: (1) executing step one; (2) executing step two; (3) executing step one and step two.
Before explaining embodiments of the present disclosure in detail, some terms or abbreviations that may be involved in the embodiments of the present disclosure are explained.
FIG. 1 is a schematic diagram of a process of temporal filtering based on Motion Compensated Temporal Filtering (MCTF), and FIG. 2 is a schematic diagram of a process of temporal filtering based on line-based parallel MCTF according to an exemplary embodiment of the present disclosure. In the following description, High Efficiency Video Coding (HEVC) is used as an example for applying the MCTF method of the present disclosure, but it should be understood that the MCTF method according to an exemplary embodiment of the present disclosure may also be applied to other video codec standards (e.g., the Versatile Video Coding (VVC) standard).
As shown in FIG. 1, a pre-processing module pre-analyzes a video frame to obtain the frame type and quantization parameter of the video frame, performs MCTF on the video frame after obtaining them, and then carries out the normal encoding process. The normal encoding process typically cuts the video into lines and encodes it in a line-parallel manner sequentially from top to bottom.
Here, if the current video frame needs to be temporally filtered (shown as the gray boxes in FIG. 1), a pyramid structure as shown in FIG. 3 is constructed for the current video frame and its neighboring frames before and after it (e.g., two frames on each side, five frames in total). The pyramid structure contains the video frame at its original size together with its 1/2 and 1/4 downsampled versions, denoted m_pyramidPic[idx] in FIG. 3, where idx = 0, 1, 2 denotes the original, 1/2 downsampled, and 1/4 downsampled frames, respectively.
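As a rough illustration of this pyramid, the following Python sketch builds the three levels by repeated 2×2 averaging. The function names and the averaging kernel are assumptions for illustration only; an actual encoder would use its own downsampling filter.

```python
import numpy as np

def downsample_2x(frame: np.ndarray) -> np.ndarray:
    """Halve both dimensions by averaging each 2x2 pixel block."""
    h, w = frame.shape[0] // 2 * 2, frame.shape[1] // 2 * 2
    f = frame[:h, :w].astype(np.float32)
    return (f[0::2, 0::2] + f[0::2, 1::2] + f[1::2, 0::2] + f[1::2, 1::2]) / 4.0

def build_pyramid(frame: np.ndarray) -> list:
    """m_pyramidPic[0] = original, [1] = 1/2 downsampled, [2] = 1/4 downsampled."""
    half = downsample_2x(frame)
    quarter = downsample_2x(half)
    return [frame.astype(np.float32), half, quarter]
```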
Then, motion compensation is performed between the pyramid-structured video frame and its neighboring frames to find the corresponding motion vectors (MVs). Assuming that the current frame is N and the neighboring frames are N-2, N-1, N+1, and N+2, the MVs between the current frame N and the neighboring frames N-2, N-1, N+1, and N+2 are calculated in turn. The calculation generally proceeds as follows:
a. The 1/4 downsampled video frame is cut into blocks of a predetermined size (e.g., 16×16 blocks in the case of HEVC coding), and the blocks within a predetermined range of 8 (range=8) around the co-located block in the neighboring frame are searched by means of the sum of squared errors (SSE) to find the best matching block. That is, among the blocks within the predetermined range in the 1/4 downsampled neighboring frame, the block with the smallest SSE against the corresponding pixel values of a block in the current 1/4 downsampled video frame is selected as the best matching block, and the position difference between the best matching block and the block in the current frame is taken as the motion vector MV1. This search for the best matching block is performed for all blocks in the 1/4 downsampled video frame, yielding a motion vector MV1 for each block. That is, MV1 herein may represent the set of motion vectors between each block in the 1/4 downsampled video frame and its corresponding best matching block.
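A minimal sketch of this full search, assuming grayscale frames stored as numpy arrays, 16×16 blocks, and a ±8-pixel window; block_sse and full_search are illustrative names, not encoder API:

```python
import numpy as np

def block_sse(a: np.ndarray, b: np.ndarray) -> float:
    """Sum of squared errors between two equally sized blocks."""
    d = a.astype(np.float32) - b.astype(np.float32)
    return float(np.sum(d * d))

def full_search(cur, ref, bx, by, size=16, rng=8):
    """Return the MV (dx, dy) minimizing SSE within +/-rng of the co-located block."""
    cur_blk = cur[by:by+size, bx:bx+size]
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            cost = block_sse(cur_blk, ref[y:y+size, x:x+size])
            if cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv
```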
b. The 1/2 downsampled video frame is likewise cut into blocks of the same size as those of the 1/4 downsampled video frame. Then, for each block of the 1/2 downsampled video frame, based on the MV1 calculated for the corresponding block in the 1/4 downsampled video frame, a motion vector search with filtering processing is performed within range=2 using 2×MV1 as the search start point, and on the basis of that result, the motion vector MV2 of the matching block is obtained by searching within range=5 with a step size of 1 by means of SSE.
As an example, let the position of a block be represented by coordinates (x, y), where x denotes the row coordinate of the block, y denotes the column coordinate, and x and y are integers greater than or equal to 0; thus the block in the first row and first column has coordinates (0, 0), and so on. In the above search, the block at position (x, y) in the 1/2 downsampled video frame is searched in the above manner using 2×MV1 of the block at position (floor(x/2), floor(y/2)) in the 1/4 downsampled video frame as the search start point. For example, the blocks at positions (0, 0), (1, 0), (0, 1), and (1, 1) in the 1/2 downsampled video frame use the MV of the block at position (0, 0) in the 1/4 downsampled video frame as their search start point, and the blocks at positions (0, 2), (0, 3), (1, 2), and (1, 3) use the MV of the block at position (0, 1). Likewise, MV2 herein may represent the set of motion vectors between each block in the 1/2 downsampled video frame and its corresponding best matching block.
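This start-point inheritance can be expressed compactly; the following sketch assumes MV1 is stored as a dictionary keyed by 1/4-layer block coordinates (the function name is an illustrative assumption):

```python
def search_start_point(x: int, y: int, mv1: dict) -> tuple:
    """Start point for the 1/2-layer block at (x, y): twice the MV of its 1/4-layer parent."""
    dx, dy = mv1[(x // 2, y // 2)]
    return (2 * dx, 2 * dy)
```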
c. The video frame of the original size is divided into 16×16 blocks, and then, similarly to procedure b above, the motion vector MV3 of the matching block is searched in the manner of b, using 2×MV2 as the search start point for each block in the original-size video frame, based on the MV2 of the corresponding block in the 1/2 downsampled video frame. Likewise, MV3 here may represent the set of motion vectors between each 16×16 block in the original-size video frame and its corresponding best matching block.
As an example, the block located at (x, y) in the original-size video frame divided into 16×16 blocks may use the information of the block located at (floor(x/2), floor(y/2)) in the 1/2 downsampled video frame.
d. The video frame of the original size is then divided into 8×8 blocks, and based on the MV3 calculated in c, the motion vector of the matching block is searched in the manner of b, using MV3 as the search start point for each block in the original-size video frame. Then, starting from the motion vector of the matching block, a search is performed within range=12 with a step size of 3, and finally within range=3 with a step size of 1, to obtain the final MV4. At this point, the MV of every 8×8 block of the original video frame has been acquired.
For example, the block located at (x, y) in the original-size video frame divided into 8×8 blocks may use the information of the block located at (floor(x/2), floor(y/2)) in the original-size video frame divided into 16×16 blocks.
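The coarse-to-fine refinement of step d can be sketched as follows, assuming a cost(mv) callable that returns the SSE of a candidate MV; refine and final_mv4 are illustrative names:

```python
def refine(start_mv, cost, rng, step):
    """Scan a grid of offsets within +/-rng around start_mv and keep the cheapest MV."""
    best_mv, best_cost = start_mv, cost(start_mv)
    for dy in range(-rng, rng + 1, step):
        for dx in range(-rng, rng + 1, step):
            mv = (start_mv[0] + dx, start_mv[1] + dy)
            c = cost(mv)
            if c < best_cost:
                best_cost, best_mv = c, mv
    return best_mv

def final_mv4(mv3, cost):
    mv = refine(mv3, cost, rng=12, step=3)  # coarse pass, step size 3
    return refine(mv, cost, rng=3, step=1)  # fine pass, step size 1 -> final MV4
```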
After deriving motion compensated MVs for each block of the video frame, a temporal filtering process may be performed.
The time domain filtering process mainly comprises the following steps:
1. according to MV information of blocks of the video frame, block information after filtering in adjacent frames is calculated respectively. That is, the calculated MV is used to find the position of the corresponding block in the corresponding frame, and the interpolation filtered image is obtained by interpolation filtering.
2. The interpolation-filtered results of the neighboring frames are weighted according to predetermined weights to generate the final output. That is, after the interpolation-filtered image is obtained in step 1 above, it is used as the intensity parameter in bilateral filtering to obtain the weight of each output pixel, and the pixels of all frames are weighted to obtain the final output image.
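The blending step can be sketched as a per-pixel weighted average; the simple per-frame weights below stand in for the bilateral-filter-derived weights described above and are an assumption for illustration:

```python
import numpy as np

def temporal_blend(cur: np.ndarray, compensated: list, weights: list) -> np.ndarray:
    """Weighted average of the current frame and its motion-compensated neighbors."""
    acc = cur.astype(np.float32)  # the current frame contributes with weight 1
    total = 1.0
    for frame, w in zip(compensated, weights):
        acc += w * frame.astype(np.float32)
        total += w
    return acc / total
```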
Since MCTF operates only on some frame types rather than all of them, and since the encoder has an inter-frame parallelism mechanism in which some frames must wait for the encoding of line 0 of a preceding frame before starting, the MCTF operation in FIG. 1 may block the actual encoding of line 0 of the current frame, thereby delaying the time at which subsequent frames can begin encoding.
Thus, the MCTF method according to an exemplary embodiment of the present disclosure splits the MCTF process into line-level parallel processing. Specifically, the flow of MCTF is split to line-level parallelism, i.e., MCTF processing of the current line is performed just before that line is encoded. The processing framework is shown in FIG. 2, where the gray parts of the illustration represent the MCTF processing for the current line.
A method of temporal filtering of line-based parallel MCTF according to an exemplary embodiment of the present disclosure will be described with reference to FIG. 4. In the following description, it is assumed that the resolution of a video frame is W×H, the resolution of the 1/2 downsampled video frame is W/2×H/2, and the resolution of the 1/4 downsampled video frame is W/4×H/4; a Coding Tree Unit (CTU) of size 64×64 in HEVC is used as an example of a coding unit; and the search block granularity is 16×16 in the 1/4 and 1/2 downsampled video frames, except that the final motion vector search on the full-resolution video frame is performed at a search block granularity of 8×8. It should be appreciated that the above parameters are examples only and may vary depending on the coding standard. For example, in an encoder of the VVC standard, the MCTF method according to an exemplary embodiment of the present disclosure may be performed for CTUs of size 128×128.
First, in step S410, a multi-layer video frame is constructed for a current video frame.
As an example, a four-layer video frame may be constructed from the current video frame, where the first layer video frame is a 1/4 downsampled video frame of the current video frame, the second layer video frame is a 1/2 downsampled video frame of the current video frame, and the third and fourth layer video frames are full-resolution video frames of the current video frame. For example, if the resolution of the current video frame is 640×640, the first layer (1/4 downsampled) video frame has a resolution of 160×160, the second layer has a resolution of 320×320, and the third and fourth layers are at the original resolution of 640×640. However, the above manner of constructing a multi-layer video frame is merely exemplary, and the present disclosure is not limited thereto.
Next, in step S420, motion compensation regions for motion compensation of the current line in each layer video frame are determined with the line as the division granularity, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided. The current line is the line currently being encoded.
As an example, an area covering the range of the current line of the current video frame may be determined as the motion compensation area of the tail layer video frame among the layer video frames, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided; the motion compensation area in each layer video frame above the tail layer video frame is then determined as an area covering, within that layer video frame, a preset range around the area corresponding to the motion compensation area of the next layer video frame.
Taking the above construction of four-layer video frames as an example, the motion compensation area of each layer video frame satisfies the following conditions: the motion compensation area of the fourth layer video frame covers the range of the current line of the current video frame, and the motion compensation area in each layer video frame above the fourth layer covers, in that layer, the area corresponding to the motion compensation area of the next layer video frame plus a range of 2 lines. The reason is that, when searching for a motion vector during motion compensation, the lower layer video frame in the MCTF method according to an exemplary embodiment of the present disclosure uses the motion vector of the upper layer video frame as its search start point, and the search may extend up to 2 lines around that start point. The motion compensation region of the fourth layer video frame must correspond to the region of the coding unit before downsampling.
Then, in step S430, a compensation start point of a motion compensation region of each layer video frame is acquired, and motion compensation is performed on blocks in the motion compensation region of each layer video frame based on the compensation start point to obtain motion vectors of the motion compensation region of each layer video frame.
In particular, if a current layer video frame of the respective layer video frames is not a tail layer video frame, a compensation start point of a motion compensation region of a next layer video frame may be determined based on a motion vector of the motion compensation region of the current layer video frame. If the current layer video frame of the respective layer video frames is a tail layer video frame, temporal filtering may be performed on the current line of the current video frame based on a motion vector of a motion compensation region of the current layer video frame.
Taking the above construction of four-layer video frames as an example, motion compensation is performed on blocks in the motion compensation regions in each layer video frame in order from the first layer video frame to obtain motion vectors of the motion compensation regions of each layer video frame, wherein the motion vectors of the motion compensation regions of the previous layer video frame are used to determine a starting point position at which motion compensation is performed on the motion compensation regions of the next layer video frame.
Motion compensation for blocks in the motion compensation regions of the first layer video frame through the fourth layer video frame may be performed according to the MCTF process described with reference to FIG. 1. According to an exemplary embodiment of the present disclosure, a motion compensation region requiring motion search may be specified in each layer video frame, and motion compensation is then performed on the blocks in that region to obtain the motion vectors of the motion compensation region of each layer video frame. In particular, motion compensation may be performed on blocks within a first motion compensation region of the first layer video frame to obtain a first motion vector MV1 for the blocks of the first motion compensation region; motion compensation is performed on blocks in a second motion compensation region of the second layer video frame with 2×MV1 as the search start point to obtain a second motion vector MV2; motion compensation is performed on blocks in a third motion compensation region of the third layer video frame with 2×MV2 as the search start point to obtain a third motion vector MV3; and the fourth layer video frame is divided into blocks half the size of those of the third layer video frame, and motion compensation is performed on blocks within a fourth motion compensation region of the fourth layer video frame with MV3 as the search start point to obtain the motion vector MV4. Here, the motion compensation regions of the video frames of the respective layers may include different numbers of lines of coding units.
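The four-stage chaining can be summarized as below, assuming a motion_search(layer, region, start) helper in the spirit of the earlier sketches that returns a dictionary of per-block MVs; all names are illustrative:

```python
def scale(mvs: dict, k: int) -> dict:
    """Multiply every motion vector by k (used when moving to a finer layer)."""
    return {blk: (k * dx, k * dy) for blk, (dx, dy) in mvs.items()}

def compensate_layers(layers, regions, motion_search):
    mv1 = motion_search(layers[0], regions[0], start=None)           # 1/4 frame, 16x16
    mv2 = motion_search(layers[1], regions[1], start=scale(mv1, 2))  # 1/2 frame, 16x16
    mv3 = motion_search(layers[2], regions[2], start=scale(mv2, 2))  # full frame, 16x16
    mv4 = motion_search(layers[3], regions[3], start=mv3)            # full frame, 8x8
    return mv4  # used for temporal filtering of the current line
```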
The process of obtaining the motion vector in each layer by motion compensation based on the motion vector of the corresponding block of the layer above has been described in detail with reference to FIG. 1. For example, in an encoder for the HEVC standard, when motion compensation is performed on blocks within the first motion compensation region of the first layer video frame, the first layer video frame is cut into 16×16 blocks, the blocks within a predetermined range (e.g., range=8) around the co-located blocks in the neighboring frames are searched by means of SSE, and the best matching block is acquired to obtain the first motion vector MV1 relative to the best matching block.
When motion compensation is performed on blocks in the second motion compensation region of the second layer video frame, the second layer video frame is divided into 16×16 blocks; based on the MV1 of the matching block corresponding to each block in the first layer video frame, a motion vector search with filtering processing is performed within range=2 using 2×MV1 as the search start point, and based on that result, the motion vector MV2 of the matching block is obtained by searching within range=5 with a step size of 1 using SSE.
When motion compensation is performed on blocks within the third motion compensation region of the third layer video frame, the third layer video frame is divided into 16×16 blocks, 2×MV2 is used as the search start point, and the motion vector MV3 of the matching block is searched in the same manner as for the second layer video frame.
The fourth layer video frame is divided into 8×8 blocks, and the motion vector of the matching block is searched in the same manner as for the second layer video frame, with MV3 as the search start point. Then, starting from the motion vector of the matching block, a search is performed within range=12 with a step size of 3, followed by a search within range=3 with a step size of 1, to obtain the final MV4. At this point, the MV of each 8×8 block in the current line can be acquired.
According to an exemplary embodiment of the present disclosure, motion compensation for all blocks/lines within the motion compensation area of each layer video frame is performed in parallel. The waiting time delay between frames can be reduced by a line parallel processing mode, and the coding efficiency is improved.
Furthermore, in order for the search process not to lose any precision, the number of initial lines of the downsampled pixel image search may be increased in the case where the currently encoded line is the first line of the current video frame. That is, the number of search lines included in the motion compensation areas of the first layer video frame and the second layer video frame in the case where the currently encoded line is the first line of the current video frame is greater than the number of search lines included in the motion compensation areas of the first layer video frame and the second layer video frame in the case where the currently encoded line is the non-first line of the current video frame.
If the current line is the first line of the current video frame, a corresponding number of lines starting from the first line of each layer video frame can be acquired, and the region corresponding to those lines is used as the motion compensation region of that layer video frame. If the current line is not the first line of the current video frame, the lines of each layer video frame corresponding to the current line can be obtained, and the region corresponding to those lines is used as the motion compensation region of that layer video frame.
As an example, if the current line is the first line of the current video frame, the first motion compensation region includes the blocks in lines 0 to 4 of the first layer video frame, the second motion compensation region includes the blocks in lines 0 to 5 of the second layer video frame, and the third and fourth motion compensation regions include the blocks in lines 0 to 7 of the third and fourth layer video frames, respectively. If the current line is a non-first line of the current video frame, the first motion compensation region includes the blocks in the one line of the first layer video frame corresponding to the current line, the second motion compensation region includes the blocks in the two lines of the second layer video frame corresponding to the current line, the third motion compensation region includes the blocks in the four lines of the third layer video frame corresponding to the current line, and the fourth motion compensation region includes the blocks in the eight lines of the fourth layer video frame corresponding to the current line.
For example, for the i-th line currently being encoded, the motion compensation splitting procedure is as follows:
1. if i=0, a motion search is first performed on a 16×16 block in lines 0 to 4 on a 1/4 downsampled image (i.e., a first layer video frame); then, on the 1/2 downsampled image (i.e. the second layer video frame), performing motion search on the 16×16 blocks in the 0 to 5 lines; then on the non-downsampled image (i.e., the third layer video frame), a motion search is performed on the 16 x 16 block in lines 0 through 7; finally, a motion search is performed on 8 x 8 blocks in lines 0 to 7 of the non-downsampled image (i.e., the fourth layer video frame).
2. If i>0, a motion search is first performed on the 16×16 blocks in line i+4 of the 1/4 downsampled image; then, on the 1/2 downsampled image, a motion search is performed on the 16×16 blocks in lines 2×i+4 and 2×i+5; then, on the non-downsampled image, a motion search is performed on the 16×16 blocks in lines 4×i+4 to 4×i+7; finally, a motion search is performed on the 8×8 blocks in lines 8×i to 8×i+7 of the non-downsampled image. The two cases can be summarized as sketched below.
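A hedged sketch of this schedule, returning the inclusive block-line range searched in each layer for encoded line i (the function name and return format are assumptions):

```python
def search_lines(i: int) -> dict:
    """{layer: (first_block_line, last_block_line)} to search when encoding line i."""
    if i == 0:
        return {1: (0, 4),   # 1/4 frame, 16x16 blocks
                2: (0, 5),   # 1/2 frame, 16x16 blocks
                3: (0, 7),   # full frame, 16x16 blocks
                4: (0, 7)}   # full frame, 8x8 blocks
    return {1: (i + 4, i + 4),
            2: (2 * i + 4, 2 * i + 5),
            3: (4 * i + 4, 4 * i + 7),
            4: (8 * i, 8 * i + 7)}
```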
By increasing the number of initial lines of the downsampled pixel image search, the search process does not lose any precision; that is, the parallel scheme introduces no coding loss.
According to embodiments of the present disclosure, it may be determined whether the position of a line within the motion compensation region of each layer video frame exceeds the maximum number of lines of that layer video frame, and in response to determining that it does, the position of the motion compensation region is truncated to the maximum number of lines of the layer video frame. That is, the position of the lines within the motion compensation area of each layer video frame is truncated by the maximum number of lines of that layer video frame. Assuming that the height of the current video frame is H, the maximum numbers of lines of the first, second, third, and fourth layer video frames are floor((H+63)/64), floor((H+31)/32), floor((H+15)/16), and floor((H+7)/8), respectively, where floor denotes rounding down. When a line in the motion compensation area exceeds this limit, the calculation of the MV at the current resolution can be skipped; lines exceeding the maximum number of lines are treated as out of range and are not processed.
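The truncation rule can be sketched as follows, using the per-layer maxima above; clamp_lines and its None convention for fully out-of-range spans are illustrative assumptions:

```python
def max_lines(H: int) -> dict:
    """Maximum block-line count per layer for a frame of height H."""
    return {1: (H + 63) // 64, 2: (H + 31) // 32,
            3: (H + 15) // 16, 4: (H + 7) // 8}

def clamp_lines(first: int, last: int, layer: int, H: int):
    """Truncate a search range by the layer's maximum line; None if fully out of range."""
    limit = max_lines(H)[layer] - 1  # last valid block-line index
    if first > limit:
        return None  # the whole range exceeds the frame: skip the MV calculation
    return (first, min(last, limit))
```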
Finally, in step S440, temporal filtering is performed on the current line of the current video frame based on the determined motion vector.
For example, temporal filtering may be performed on the current line of the current video frame based on the motion vector of the motion compensation region of the fourth layer video frame. After the MV of each coding unit in the current line is obtained, the temporal filtering operation may be applied only to the coding units in that line.
The temporal filtering process is the same as that of the MCTF described with reference to FIG. 1: the filtered block information in the neighboring frames is calculated from the MV information of each coding unit in the current line of the video frame, and the interpolation-filtered results of the neighboring frames are weighted according to predetermined weights to generate the final output.
As described above, the method of temporal filtering according to the exemplary embodiments of the present disclosure is applicable to normal encoding and to encoding with Wavefront Parallel Processing (WPP) enabled, and is suitable both for the case where the quantization parameter QP is not determined initially and for the case where the parallelism requirement is high. By splitting MCTF into line-level parallel processing, the multithread blocking problem caused by whole-frame MCTF processing is effectively solved, and coding efficiency is greatly improved. In addition, the method of temporal filtering according to the exemplary embodiments of the present disclosure does not lose any precision in the motion vector search process and does not introduce coding loss.
FIG. 5 is a block diagram illustrating a temporal filtering apparatus for line-based parallel MCTF according to an exemplary embodiment of the present disclosure. It should be understood that the apparatus shown in FIG. 5 may be implemented in software, hardware, or a combination of software and hardware.
The temporal filtering apparatus 500 may include a downsampling unit 510, a motion compensation region determination unit 520, a motion compensation unit 530, and a filtering unit 540. Each module in the temporal filtering apparatus 500 may be implemented by one or more modules, and the name of a module may vary according to its type. In various embodiments, some modules in the temporal filtering apparatus 500 may be omitted, or additional modules may be included. Furthermore, modules/units according to various embodiments of the present disclosure may be combined into a single entity that equivalently performs the functions of the respective modules/units prior to combination.
The downsampling unit 510 may construct a multi-layer video frame for the current video frame.
The downsampling unit 510 may construct four layers of video frames from the current video frame, wherein the first layer of video frames are 1/4 downsampled video frames of the current video frame, the second layer of video frames are 1/2 downsampled video frames of the current video frame, and the third and fourth layers of video frames are full resolution video frames of the current video frame.
The motion compensation region determination unit 520 may determine, with the line as the division granularity, the motion compensation region for motion compensation of the current line in each layer video frame, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided.
The motion compensation unit 530 may acquire a compensation start point of a motion compensation region of each layer video frame and perform motion compensation on blocks in the motion compensation region of each layer video frame based on the compensation start point to obtain motion vectors of the motion compensation region of each layer video frame.
The motion compensation unit 530 may sequentially perform motion compensation on blocks in the motion compensation regions in the respective layers of video frames starting from the first layer of video frame to obtain motion vectors of the motion compensation regions of the respective layers of video frames, wherein the motion vectors of the motion compensation regions of the previous layer of video frame are used to determine a starting point position at which motion compensation is performed on the motion compensation regions of the next layer of video frame.
The filtering unit 540 may perform temporal filtering on a current line of a current video frame based on the obtained motion vector. For example, the filtering unit 540 may perform temporal filtering on a current line of a current video frame based on a motion vector of a motion compensation region of a fourth layer video frame.
According to an example of the present disclosure, if a current layer video frame among the respective layer video frames is not a tail layer video frame, the motion compensation unit 530 may determine a compensation start point of a motion compensation region of a next layer video frame based on a motion vector of the motion compensation region of the current layer video frame.
If the current layer video frame of the respective layer video frames is the tail layer video frame, the filtering unit 540 may perform temporal filtering on the current line of the current video frame based on the motion vector of the motion compensation region of the current layer video frame.
According to an example of the present disclosure, the motion compensation region determination unit 520 may determine, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided, an area covering the range of the current line of the current video frame as the motion compensation region of the tail layer video frame among the layer video frames, and determine the motion compensation region in each layer video frame above the tail layer video frame as an area covering, within that layer video frame, a preset range around the area corresponding to the motion compensation region of the next layer video frame.
If the current line is the first line of the current video frame, the motion compensation region determination unit 520 may acquire, for each layer video frame, a corresponding number of lines starting from the first line of that layer video frame and take the region corresponding to those lines as the motion compensation region of that layer video frame.
If the current line is a non-first line of the current video frame, the motion compensation region determination unit 520 may acquire, for each layer video frame, the lines of that layer video frame corresponding to the current line, and take the region corresponding to those lines as the motion compensation region of that layer video frame.
Alternatively, the motion compensation region of the fourth layer video frame may cover the range of the current line of the current video frame, and the motion compensation region in each layer video frame above the fourth layer may cover, in that layer video frame, the area corresponding to the motion compensation region of the next layer video frame plus a range of 2 lines.
Alternatively, the motion compensation region determination unit 520 may determine whether the position of a line within the motion compensation region of each layer video frame exceeds the maximum number of lines of that layer video frame, and in response to determining that it does, truncate the position of the motion compensation region to the maximum number of lines of the layer video frame.
The motion compensation unit 530 may perform motion compensation on blocks within the first motion compensation region of the first layer video frame to obtain a first motion vector MV1 of the blocks of the first motion compensation region; perform motion compensation on blocks in the second motion compensation region of the second layer video frame with 2×MV1 as the search start point to obtain a second motion vector MV2 of the blocks in the second motion compensation region; perform motion compensation on blocks in the third motion compensation region of the third layer video frame with 2×MV2 as the search start point to obtain a third motion vector MV3 of the blocks in the third motion compensation region; and divide the fourth layer video frame into blocks half the size of those of the third layer video frame and perform motion compensation on blocks within the fourth motion compensation region of the fourth layer video frame with MV3 as the search start point to obtain the motion vector MV4 of the blocks within the fourth motion compensation region.
Alternatively, the motion compensation unit 530 may perform motion compensation on all blocks within the motion compensation region of each layer video frame in parallel.
Alternatively, the number of search lines included in the motion compensation areas of the first layer video frame and the second layer video frame in the case where the current encoded line is the first line of the current video frame may be greater than the number of search lines included in the motion compensation areas of the first layer video frame and the second layer video frame in the case where the current line is the non-first line of the current video frame.
Alternatively, if the current line is the first line of the current video frame, the first motion compensation region may include the blocks in lines 0 to 4 of the first layer video frame, the second motion compensation region may include the blocks in lines 0 to 5 of the second layer video frame, and the third and fourth motion compensation regions may include the blocks in lines 0 to 7 of the third and fourth layer video frames, respectively.
Alternatively, if the current line is a non-first line of the current video frame, the first motion compensation region may include the blocks in the one line of the first layer video frame corresponding to the current line, the second motion compensation region may include the blocks in the two lines of the second layer video frame corresponding to the current line, the third motion compensation region may include the blocks in the four lines of the third layer video frame corresponding to the current line, and the fourth motion compensation region may include the blocks in the eight lines of the fourth layer video frame corresponding to the current line.
Alternatively, the position of the line within the motion compensation area of each layer video frame may be truncated by the maximum line number of the layer video frame.
Alternatively, the constructed multi-layer video frame may include four layers, wherein the first layer video frame, the second layer video frame, and the third layer video frame may be divided into blocks of a first size for motion compensation, and the fourth layer video frame is divided into blocks of a second size for motion compensation. The first size may be 2 times the second size; for example, the first size is 16×16 and the second size is 8×8.
The operation and functions of the respective modules of the filtering apparatus 500 have been described in detail above with reference to FIG. 4 and will not be repeated here.
FIG. 6 is a block diagram illustrating the structure of an electronic device for temporal filtering according to an exemplary embodiment of the present disclosure. The electronic device 600 may be, for example, a smartphone, a tablet computer, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The electronic device 600 may also be referred to by other names such as user device, portable terminal, laptop terminal, or desktop terminal.
In general, the electronic device 600 includes: a processor 601 and a memory 602.
Processor 601 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 601 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor; the main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 601 may integrate a GPU (Graphics Processing Unit) responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 601 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 602 is used to store at least one instruction to be executed by the processor 601 to implement the method of time domain filtering provided by the method embodiment of the present disclosure as shown in Fig. 3.
In some embodiments, the electronic device 600 may optionally further include a peripheral interface 603 and at least one peripheral. The processor 601, the memory 602, and the peripheral interface 603 may be connected by buses or signal lines. Each peripheral may be connected to the peripheral interface 603 via a bus, a signal line, or a circuit board. Specifically, the peripherals include at least one of: a radio frequency circuit 604, a touch display 605, a camera assembly 606, an audio circuit 607, a positioning component 608, and a power supply 609.
The peripheral interface 603 may be used to connect at least one Input/Output (I/O) related peripheral to the processor 601 and the memory 602. In some embodiments, the processor 601, the memory 602, and the peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 604 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 604 communicates with communication networks and other communication devices via electromagnetic signals, converting electrical signals into electromagnetic signals for transmission and converting received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 604 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 604 may communicate with other terminals via at least one wireless communication protocol, including but not limited to metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may also include NFC (Near Field Communication) related circuitry, which is not limited by the present disclosure.
The display screen 605 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display 605 is a touch display, it also has the ability to collect touch signals at or above its surface. The touch signal may be input to the processor 601 as a control signal for processing. In this case, the display 605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 605, disposed on the front panel of the electronic device 600; in other embodiments, there may be at least two displays 605, respectively disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display 605 may be a flexible display disposed on a curved or folded surface of the terminal 600. The display 605 may even be arranged in a non-rectangular irregular pattern, i.e., an irregularly shaped screen. The display 605 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 606 is used to capture images or video. Optionally, the camera assembly 606 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and Virtual Reality (VR) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 606 may also include a flash, which may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 607 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment and convert them into electrical signals, which are input to the processor 601 for processing or to the radio frequency circuit 604 for voice communication. For stereo acquisition or noise reduction, multiple microphones may be disposed at different portions of the terminal 600. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 607 may also include a headphone jack.
The positioning component 608 is used to locate the current geographic location of the electronic device 600 to enable navigation or LBS (Location Based Service). The positioning component 608 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 609 is used to power the various components in the electronic device 600. The power supply 609 may be an alternating current source, a direct current source, a disposable battery, or a rechargeable battery. When the power supply 609 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging, and may also support fast-charging technology.
In some embodiments, the electronic device 600 further includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: acceleration sensor 611, gyroscope sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.
The acceleration sensor 611 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 600. For example, the acceleration sensor 611 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 601 may control the touch display 605 to display the user interface in a landscape or portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used to collect motion data for games or users.
The gyro sensor 612 may detect the body direction and rotation angle of the terminal 600, and may cooperate with the acceleration sensor 611 to collect the user's 3D motions on the terminal 600. Based on the data collected by the gyro sensor 612, the processor 601 may implement functions such as motion sensing (e.g., changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 613 may be disposed at a side frame of the terminal 600 and/or at a lower layer of the touch display 605. When the pressure sensor 613 is disposed at a side frame of the terminal 600, it can detect the user's grip signal on the terminal 600, and the processor 601 can perform left/right hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed at the lower layer of the touch display 605, the processor 601 controls the operability controls on the UI according to the user's pressure operations on the touch display 605. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 614 is used to collect the user's fingerprint, and either the processor 601 identifies the user according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the user according to the collected fingerprint. Upon recognizing the user's identity as trusted, the processor 601 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 614 may be provided on the front, back, or side of the electronic device 600. When a physical key or vendor logo is provided on the electronic device 600, the fingerprint sensor 614 may be integrated with the physical key or vendor logo.
The optical sensor 615 is used to collect ambient light intensity. In one embodiment, processor 601 may control the display brightness of touch display 605 based on the intensity of ambient light collected by optical sensor 615. Specifically, when the intensity of the ambient light is high, the display brightness of the touch display screen 605 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 605 is turned down. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 based on the ambient light intensity collected by the optical sensor 615.
A proximity sensor 616, also referred to as a distance sensor, is typically provided on the front panel of the electronic device 600. The proximity sensor 616 is used to capture the distance between the user and the front of the electronic device 600. In one embodiment, when the proximity sensor 616 detects a gradual decrease in the distance between the user and the front face of the terminal 600, the processor 601 controls the touch display 605 to switch from the bright screen state to the off screen state; when the proximity sensor 616 detects that the distance between the user and the front of the electronic device 600 gradually increases, the processor 601 controls the touch display 605 to switch from the off-screen state to the on-screen state.
Those skilled in the art will appreciate that the structure shown in Fig. 6 does not limit the electronic device 600, which may include more or fewer components than shown, combine certain components, or employ a different arrangement of components.
Fig. 7 is a block diagram illustrating another electronic device 700. For example, the electronic device 700 may be provided as a server. Referring to Fig. 7, the electronic device 700 includes one or more processors 710 and a memory 720. The memory 720 may store one or more programs for performing the above methods of time domain filtering. The electronic device 700 may also include a power supply component 730 configured to perform power management of the electronic device 700, a wired or wireless network interface 740 configured to connect the electronic device 700 to a network, and an input/output (I/O) interface 750. The electronic device 700 may operate based on an operating system stored in the memory 720, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions, wherein the instructions, when executed by at least one processor, cause the at least one processor to perform the method of time domain filtering according to the present disclosure. Examples of the computer-readable storage medium here include: read-only memory (ROM), random-access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disk storage, hard disk drives (HDD), solid-state drives (SSD), card memory (such as multimedia cards, Secure Digital (SD) cards, or eXtreme Digital (XD) cards), magnetic tape, floppy disks, magneto-optical data storage devices, hard disks, solid-state disks, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide the computer program and any associated data, data files, and data structures to a processor or computer so that the processor or computer can execute the program. The computer program in the computer-readable storage medium described above can run in an environment deployed in a computer device such as a client, a host, a proxy device, or a server; furthermore, in one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems such that they are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
In accordance with embodiments of the present disclosure, a computer program product may also be provided, instructions in which are executable by a processor of a computer device to perform the method of time-domain filtering described above.
According to the method and apparatus for time domain filtering, the electronic device, and the computer-readable storage medium of the present disclosure, the line-level parallel splitting of MCTF can effectively solve the multithread blocking problem caused by whole-frame MCTF processing and greatly improve coding efficiency. In addition, the method of temporal filtering according to the exemplary embodiments of the present disclosure loses no precision in the motion vector search process and introduces no coding loss.
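As a non-authoritative illustration of this line-level split, the following sketch walks one encoded line through the layers: each layer's region is motion-compensated (its blocks being mutually independent, and hence parallelizable), the resulting vectors seed the compensation start points of the next layer, and the tail layer's vectors drive the temporal filter. The helpers `mc_region_rows` (sketched earlier), `motion_compensate`, and `temporal_filter` are hypothetical placeholders, not the disclosure's actual implementation.

def motion_compensate(layer_pixels, rows, start_points):
    """Placeholder block search: a real encoder would search each block in
    `rows`, starting from `start_points` (the coarser layer's vectors)."""
    return {row: (0, 0) for row in rows}   # zero vectors as a stub

def temporal_filter(current_line, motion_vectors):
    """Placeholder for the actual temporal filtering of the current line."""
    pass

def filter_current_line(layers, current_line):
    """`layers`: (pixels, block_size) pairs, first (coarsest) layer first."""
    start_points = None         # the first layer starts without seed vectors
    vectors = None
    for i, (pixels, block) in enumerate(layers):
        rows = mc_region_rows(i, current_line, pixels.shape[0] // block)
        # blocks (and lines) inside one layer's region are independent of
        # each other, so this call is the natural unit of parallelism
        vectors = motion_compensate(pixels, rows, start_points)
        start_points = vectors  # compensation start points of the next layer
    temporal_filter(current_line, vectors)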
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (16)

1. A method for temporal filtering of video, comprising:
constructing a multi-layer video frame for the current video frame;
based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit and the size of the segmented blocks of each layer of video frame, respectively determining a motion compensation area for performing motion compensation on the current line in each layer of video frame by taking the line as a division granularity;
acquiring a compensation starting point of a motion compensation area of each layer of video frame, and performing motion compensation on blocks in the motion compensation area of each layer of video frame based on the compensation starting point to obtain a motion vector of the motion compensation area of each layer of video frame;
performing temporal filtering on the current line of the current video frame based on the motion vector,
wherein determining a motion compensation region for motion compensation for the current line in each layer of video frames comprises:
determining, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided, a region covering the range of the current line of the current video frame as the motion compensation region of the tail layer video frame among the respective layer video frames, and determining the motion compensation region in each layer video frame above the tail layer video frame as a region covering a preset range of the motion compensation region of the next layer video frame at the corresponding region in that layer video frame,
wherein motion compensation for all lines within the motion compensation area of each layer video frame is performed in parallel.
2. The method of claim 1, wherein obtaining a compensation start point of the motion compensation region of each layer video frame and performing motion compensation on blocks in the motion compensation region of each layer video frame based on the compensation start point to obtain motion vectors of the motion compensation region of each layer video frame, comprises:
if the current layer video frame of the respective layer video frames is not the tail layer video frame, determining a compensation starting point of a motion compensation area of a next layer video frame based on a motion vector of the motion compensation area of the current layer video frame;
if a current layer video frame of the respective layer video frames is the tail layer video frame, performing temporal filtering on the current line of the current video frame based on a motion vector of a motion compensation region of the current layer video frame.
3. The method of claim 1, wherein motion compensation for all blocks within the motion compensation region of each layer video frame is performed in parallel.
4. The method of claim 1, wherein determining a motion compensation region for motion compensation for the current line in each layer video frame comprises:
if the current line is the first line of the current video frame, acquiring a preset number of lines starting from the first line of each layer video frame, and taking the region corresponding to the preset number of lines as the motion compensation region of that layer video frame.
5. The method of claim 1, wherein determining a motion compensation region for motion compensation for the current line in each layer video frame comprises:
if the current line is a non-first line of the current video frame, acquiring, for each layer video frame, the number of lines of that layer video frame corresponding to the current line, and taking the region corresponding to the number of lines as the motion compensation region of that layer video frame.
6. The method of claim 1, wherein determining a motion compensation region for motion compensation for the current line in each layer video frame comprises:
determining whether a position of a line within a motion compensation region of each layer video frame exceeds a maximum number of lines of the layer video frame;
in response to determining that the position of a line within the motion compensation region of the layer video frame exceeds the maximum number of lines of the layer video frame, truncating the position of the motion compensation region at the maximum number of lines of the layer video frame.
7. The method of claim 1, wherein the multi-layer video frame comprises a four-layer video frame; among the video frames of each layer, the first layer video frame, the second layer video frame, and the third layer video frame are divided into blocks of a first size for motion compensation, and the fourth layer video frame is divided into blocks of a second size for motion compensation, wherein the first size is 2 times the second size.
8. An apparatus for temporal filtering of video, comprising:
a downsampling unit configured to construct a multi-layered video frame for a current video frame;
a motion compensation region determination unit configured to determine motion compensation regions for performing motion compensation for a current line in each layer video frame with a line as a division granularity, based on a position of the current line where a coding unit of the current video frame is located and a size of the coding unit and a size of a block in which each layer video frame is divided;
a motion compensation unit configured to acquire a compensation start point of a motion compensation region of each layer video frame, and perform motion compensation on blocks in the motion compensation region of each layer video frame based on the compensation start point to obtain a motion vector of the motion compensation region of each layer video frame;
a filtering unit configured to perform temporal filtering on the current line of a current video frame based on the motion vector,
wherein the motion compensation region determination unit is configured to: determine, based on the position of the current line where the coding unit of the current video frame is located, the size of the coding unit, and the size of the blocks into which each layer video frame is divided, a region covering the range of the current line of the current video frame as the motion compensation region of the tail layer video frame among the respective layer video frames, and determine the motion compensation region in each layer video frame above the tail layer video frame as a region covering a preset range of the motion compensation region of the next layer video frame at the corresponding region in that layer video frame,
wherein motion compensation for all lines within the motion compensation area of each layer video frame is performed in parallel.
9. The apparatus of claim 8, wherein the motion compensation unit is configured to:
if the current layer video frame of the video frames of each layer is not the tail layer video frame, determining a compensation start point of a motion compensation region of a next layer video frame based on a motion vector of the motion compensation region of the current layer video frame,
wherein if a current layer video frame of the respective layer video frames is a tail layer video frame, the filtering unit performs temporal filtering on the current line of the current video frame based on a motion vector of a motion compensation region of the current layer video frame.
10. The apparatus of claim 8, wherein motion compensation for all blocks within a motion compensation region of each layer video frame is performed in parallel.
11. The apparatus of claim 8, wherein the motion compensation region determination unit is configured to:
if the current line is the first line of the current video frame, acquire a preset number of lines starting from the first line of each layer video frame, and take the region corresponding to the preset number of lines as the motion compensation region of that layer video frame.
12. The apparatus of claim 8, wherein the motion compensation region determination unit is configured to:
if the current line is a non-first line of the current video frame, acquire, for each layer video frame, the number of lines of that layer video frame corresponding to the current line, and take the region corresponding to the number of lines as the motion compensation region of that layer video frame.
13. The apparatus of claim 8, wherein the motion compensation region determination unit is configured to:
determine whether a position of a line within a motion compensation region of each layer video frame exceeds a maximum number of lines of the layer video frame;
in response to determining that the position of a line within the motion compensation region of the layer video frame exceeds the maximum number of lines of the layer video frame, truncate the position of the motion compensation region at the maximum number of lines of the layer video frame.
14. The apparatus of claim 8, wherein the multi-layer video frame comprises a four-layer video frame; among the video frames of each layer, the first layer video frame, the second layer video frame, and the third layer video frame are divided into blocks of a first size for motion compensation, and the fourth layer video frame is divided into blocks of a second size for motion compensation, wherein the first size is 2 times the second size.
15. An electronic device, comprising:
at least one processor;
at least one memory storing computer-executable instructions,
wherein the computer executable instructions, when executed by the at least one processor, cause the at least one processor to perform the method for temporal filtering of video of any one of claims 1 to 7.
16. A computer-readable storage medium storing instructions that, when executed by a processor of a temporal filtering apparatus/electronic device/server, enable the temporal filtering apparatus/electronic device/server to perform the method for temporal filtering of video of any one of claims 1 to 7.
CN202111589880.7A 2021-12-23 2021-12-23 Method, device, storage medium and electronic equipment for time domain filtering of video Active CN114268797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111589880.7A CN114268797B (en) 2021-12-23 2021-12-23 Method, device, storage medium and electronic equipment for time domain filtering of video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111589880.7A CN114268797B (en) 2021-12-23 2021-12-23 Method, device, storage medium and electronic equipment for time domain filtering of video

Publications (2)

Publication Number Publication Date
CN114268797A CN114268797A (en) 2022-04-01
CN114268797B true CN114268797B (en) 2024-02-06

Family

ID=80829179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111589880.7A Active CN114268797B (en) 2021-12-23 2021-12-23 Method, device, storage medium and electronic equipment for time domain filtering of video

Country Status (1)

Country Link
CN (1) CN114268797B (en)


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1768417A4 (en) * 2004-06-11 2011-04-06 Nec Corp Moving image encoder and moving image decoder, and its method and program
KR100703760B1 (en) * 2005-03-18 2007-04-06 삼성전자주식회사 Video encoding/decoding method using motion prediction between temporal levels and apparatus thereof
KR100703778B1 (en) * 2005-04-29 2007-04-06 삼성전자주식회사 Method and apparatus for coding video supporting fast FGS
EP2769549A1 (en) * 2011-10-21 2014-08-27 Dolby Laboratories Licensing Corporation Hierarchical motion estimation for video compression and motion analysis

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006109052A (en) * 2004-10-05 2006-04-20 Victor Co Of Japan Ltd Coding apparatus and coding program
WO2006085725A1 (en) * 2005-02-14 2006-08-17 Samsung Electronics Co., Ltd. Video coding and decoding methods with hierarchical temporal filtering structure, and apparatus for the same

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
P. Wennersten. [AHG10] GOP-based temporal filter improvements, JVET-V0056. Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29, 22nd Meeting, 2021, full text. *
Zhao Zhijie; Lin Maoliu; Zhu Bin; Lin Ya; Huang Liang; Gao Xin. An effective frame rate control method for scalable video coding. Journal of Image and Graphics, 2006, (No. 08), full text. *

Also Published As

Publication number Publication date
CN114268797A (en) 2022-04-01

Similar Documents

Publication Publication Date Title
US11388403B2 (en) Video encoding method and apparatus, storage medium, and device
WO2020182154A1 (en) Encoding method and apparatus, and decoding method and apparatus
CN113347433B (en) Method and device for decoding and encoding prediction mode
CN110933334B (en) Video noise reduction method, device, terminal and storage medium
WO2023087637A1 (en) Video coding method and apparatus, and electronic device and computer-readable storage medium
CN114302137B (en) Time domain filtering method and device for video, storage medium and electronic equipment
CN110572679B (en) Method, device and equipment for coding intra-frame prediction and readable storage medium
CN110177275B (en) Video encoding method and apparatus, and storage medium
CN110049326B (en) Video coding method and device and storage medium
CN114268797B (en) Method, device, storage medium and electronic equipment for time domain filtering of video
CN113095163B (en) Video processing method, device, electronic equipment and storage medium
CN111770339B (en) Video encoding method, device, equipment and storage medium
CN114332709A (en) Video processing method, video processing device, storage medium and electronic equipment
CN110460856B (en) Video encoding method, video encoding device, video encoding apparatus, and computer-readable storage medium
CN109040753B (en) Prediction mode selection method, device and storage medium
CN113709479A (en) Decoding and encoding method based on adaptive intra-frame refreshing mechanism and related equipment
CN114422782B (en) Video encoding method, video encoding device, storage medium and electronic equipment
CN113079372B (en) Method, device and equipment for coding inter-frame prediction and readable storage medium
CN112218071B (en) Video encoding method, video encoding device, storage medium and electronic equipment
CN113938689B (en) Quantization parameter determination method and device
CN113038124B (en) Video encoding method, video encoding device, storage medium and electronic equipment
CN113891090A (en) Video encoding method, video encoding device, storage medium and electronic equipment
CN113658283B (en) Image processing method, device, electronic equipment and storage medium
CN116957933A (en) Image processing method, apparatus and computer readable storage medium
CN117834881A (en) Video frame filtering method, video frame encoding method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant