WO2019184639A1 - Bidirectional inter prediction method and apparatus - Google Patents
Bidirectional inter prediction method and apparatus
- Publication number
- WO2019184639A1 (PCT/CN2019/076086)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- current image
- image block
- motion
- block
- motion vector
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
Definitions
- the embodiments of the present invention relate to the field of video coding and decoding technologies, and in particular, to a bidirectional inter prediction method and apparatus.
- Video compression mainly uses block-based hybrid video coding: a video image is divided into multiple blocks, and compression is performed block by block through steps such as intra prediction, inter prediction, transform, quantization, entropy encoding, and in-loop filtering (mainly de-blocking filtering).
- the inter prediction may also be referred to as motion compensation prediction (MCP), that is, the motion information of the block is obtained first, and then the predicted pixel value of the block is determined according to the motion information.
- the process of calculating the motion information of a block is called motion estimation (ME), and the process of determining the predicted pixel value of the block from the motion information is called motion compensation (MC).
- inter prediction includes forward prediction, backward prediction, and bidirectional prediction.
- In bidirectional prediction, a forward prediction block of the current image block is obtained by forward prediction according to the motion information, and a backward prediction block of the current image block is obtained by backward prediction according to the motion information.
- The prediction block of the current image block is then obtained in one of two ways: a weighted prediction technique based on bidirectional prediction weights the pixel values at the same pixel position in the forward prediction block and the backward prediction block, or a bi-directional optical flow (BIO) technique determines the prediction block from the forward prediction block and the backward prediction block.
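The weighted prediction step can be illustrated with a minimal sketch. It assumes blocks are stored as lists of pixel rows and that both weights default to 0.5 (a plain average); the function name `weighted_bi_prediction` and the rounding rule are illustrative assumptions, not taken from the application.

```python
def weighted_bi_prediction(fwd_block, bwd_block, w_fwd=0.5, w_bwd=0.5):
    """Blend co-located pixels of the forward and backward prediction blocks.

    The default weights (a plain average) and round-half-up rounding are
    assumptions made for illustration only.
    """
    if len(fwd_block) != len(bwd_block):
        raise ValueError("blocks must have the same number of rows")
    pred = []
    for row_f, row_b in zip(fwd_block, bwd_block):
        if len(row_f) != len(row_b):
            raise ValueError("blocks must have the same number of columns")
        # Weight the two pixels at the same position and round to an integer.
        pred.append([int(w_fwd * f + w_bwd * b + 0.5) for f, b in zip(row_f, row_b)])
    return pred


forward = [[100, 102], [98, 96]]
backward = [[104, 100], [100, 90]]
prediction = weighted_bi_prediction(forward, backward)
```

The per-pixel cost is one multiply-add per block, which is why the text describes this technique as computationally simple.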
- The advantage of the weighted prediction technique is that the calculation is simple. However, when it is applied to block-based motion compensation, the prediction of images with complex texture is poor and the compression efficiency is not high.
- BIO can improve the compression ratio through pixel-level motion refinement, but it has high computational complexity, which greatly affects encoding and decoding speed. Moreover, in some cases the weighted prediction technique can match or exceed the compression effect of BIO.
- Therefore, how to choose the motion compensation technique for bidirectional inter prediction so as to achieve the best trade-off between compression ratio and computational complexity is an urgent problem to be solved.
- The embodiments of the present application provide a bidirectional inter prediction method and apparatus, which address the problem of how to select the motion compensation technique for bidirectional inter prediction to achieve the best trade-off between compression ratio and computational complexity.
- A first aspect of the embodiments of the present application provides a bidirectional inter prediction method, including: acquiring motion information of a current image block; acquiring an initial prediction block of the current image block according to the motion information; then determining a motion compensation mode of the current image block according to the attribute information of the initial prediction block, or according to the motion information and the attribute information of the initial prediction block, or according to the motion information and the attribute information of the current image block; and finally performing motion compensation on the current image block according to the determined motion compensation mode and the initial prediction block.
- the current image block is an image block to be encoded or an image block to be decoded.
- The motion compensation mode is a weighted prediction technique based on bidirectional prediction or an optical flow technique based on bidirectional prediction.
- When performing motion compensation on the current image block, the bidirectional inter prediction method provided in the embodiments of the present application determines a suitable motion compensation mode according to the features of the current image block and the features of its initial prediction block. It thereby combines a high compression ratio with low coding and decoding complexity, effectively achieving the best balance between compression ratio and complexity.
- the motion information described in this embodiment of the present application may include a first reference frame index, a second reference frame index, a first motion vector, and a second motion vector.
- Acquiring the initial prediction block of the current image block according to the motion information includes: determining a first initial prediction block of the current image block according to the first reference frame index and the first motion vector, and determining a second initial prediction block of the current image block according to the second reference frame index and the second motion vector.
- The first reference frame index indicates the index of the frame in which the forward reference block of the current image block is located, the first motion vector represents the motion displacement of the current image block relative to the forward reference block, and the attribute information of the first initial prediction block includes the pixel values of M*N pixel points.
- The second reference frame index indicates the index of the frame in which the backward reference block of the current image block is located, the second motion vector represents the motion displacement of the current image block relative to the backward reference block, and the attribute information of the second initial prediction block includes the pixel values of M*N pixel points, where N is an integer greater than or equal to 1 and M is an integer greater than or equal to 1.
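As a rough illustration of how a reference frame index and a motion vector locate an initial prediction block, the sketch below copies a block from a reference frame at the displaced position. It assumes whole-pixel motion vectors, frames stored as lists of pixel rows, and border clamping for out-of-frame references; the name `fetch_prediction_block` and these conventions are hypothetical, and a real codec would also interpolate sub-pixel positions.

```python
def fetch_prediction_block(ref_frames, ref_idx, mv, x0, y0, rows, cols):
    """Return the rows x cols block that mv points to in ref_frames[ref_idx].

    ref_frames: list of frames, each a list of pixel rows (assumed layout).
    mv: (mv_x, mv_y) displacement in whole pixels (assumption: no sub-pixel
        interpolation is modelled here).
    (x0, y0): top-left corner of the current image block.
    """
    frame = ref_frames[ref_idx]
    height, width = len(frame), len(frame[0])
    mv_x, mv_y = mv
    block = []
    for dy in range(rows):
        # Clamp so that references outside the frame reuse border pixels.
        y = min(max(y0 + mv_y + dy, 0), height - 1)
        row = []
        for dx in range(cols):
            x = min(max(x0 + mv_x + dx, 0), width - 1)
            row.append(frame[y][x])
        block.append(row)
    return block


# Toy 4x4 reference frame whose pixel value encodes its position.
frame = [[10 * r + c for c in range(4)] for r in range(4)]
first_initial_block = fetch_prediction_block([frame], 0, (1, 0), 1, 1, 2, 2)
```

Calling the function once with the first reference frame index and first motion vector, and once with the second pair, would yield the two initial prediction blocks.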
- Determining the motion compensation mode of the current image block according to the attribute information of the initial prediction block includes: first obtaining M*N pixel difference values from the pixel values of the M*N pixel points of the first initial prediction block and the pixel values of the M*N pixel points of the second initial prediction block; then determining the texture complexity of the current image block according to the M*N pixel difference values; and then determining the motion compensation mode according to the texture complexity of the current image block.
- In one design, determining the texture complexity of the current image block according to the M*N pixel difference values includes: calculating the sum of the absolute values of the M*N pixel difference values, and taking that sum as the texture complexity of the current image block.
- In another design, determining the texture complexity of the current image block according to the M*N pixel difference values includes: calculating the average value of the M*N pixel difference values, and taking that average as the texture complexity of the current image block.
- In yet another design, determining the texture complexity of the current image block according to the M*N pixel difference values includes: calculating the standard deviation of the M*N pixel difference values, and taking that standard deviation as the texture complexity of the current image block.
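The three texture-complexity measures can be sketched as follows. The sketch assumes blocks are lists of pixel rows, that each difference is taken as first block minus second block, and that the standard deviation is the population form; the function names are illustrative. The average follows the wording literally (a signed mean), although an implementation might instead average absolute differences.

```python
import math


def pixel_differences(block0, block1):
    """Flatten two equally sized blocks into their M*N co-located differences."""
    return [p0 - p1 for r0, r1 in zip(block0, block1) for p0, p1 in zip(r0, r1)]


def complexity_sad(diffs):
    """Sum of the absolute values of the pixel differences."""
    return sum(abs(d) for d in diffs)


def complexity_mean(diffs):
    """Average of the pixel differences (signed, following the text literally)."""
    return sum(diffs) / len(diffs)


def complexity_std(diffs):
    """Population standard deviation of the pixel differences (assumed form)."""
    mean = sum(diffs) / len(diffs)
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))


first_block = [[10, 12], [8, 9]]
second_block = [[9, 14], [8, 5]]
diffs = pixel_differences(first_block, second_block)  # [1, -2, 0, 4]
sad = complexity_sad(diffs)                           # 7
mean = complexity_mean(diffs)                         # 0.75
std = complexity_std(diffs)
```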
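- Determining the motion compensation mode according to the texture complexity of the current image block specifically includes: determining whether the texture complexity of the current image block is less than a first threshold, where the first threshold is any real number greater than 0; if the texture complexity is less than the first threshold, determining that the motion compensation mode is the weighted prediction technique based on bidirectional prediction; if the texture complexity is greater than or equal to the first threshold, determining that the motion compensation mode is the optical flow technique based on bidirectional prediction.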
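A minimal sketch of the threshold test, assuming the rule stated for the determining unit in the fourth aspect: a texture complexity below the first threshold selects weighted prediction, and anything at or above it selects optical flow. The function name and the string labels are hypothetical.

```python
def select_mode_by_texture(texture_complexity, first_threshold):
    """Pick a motion compensation mode from the texture complexity.

    Assumes: complexity < threshold selects weighted prediction, otherwise
    optical flow. The first threshold must be a real number greater than 0.
    """
    if first_threshold <= 0:
        raise ValueError("first threshold must be greater than 0")
    if texture_complexity < first_threshold:
        return "weighted_prediction"
    return "optical_flow"


mode_smooth = select_mode_by_texture(5, 10)     # low complexity
mode_textured = select_mode_by_texture(10, 10)  # at the threshold
```

The intent is that simple blocks take the cheap path, while only blocks with complex texture pay for pixel-level refinement.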
- The motion amplitude of the current image block is determined by the motion information. Determining the motion compensation mode according to the motion information and the attribute information of the initial prediction block includes: determining a first motion amplitude of the current image block according to the first motion vector, and determining a second motion amplitude of the current image block according to the second motion vector; and determining the motion compensation mode according to the first motion amplitude, the second motion amplitude, and the attribute information of the initial prediction block.
- The attribute information of the initial prediction block may be the pixel values of pixel points. In that case, determining the motion compensation mode according to the first motion amplitude, the second motion amplitude, and the attribute information of the initial prediction block includes: obtaining M*N pixel difference values from the pixel values of the M*N pixel points of the first initial prediction block and the pixel values of the M*N pixel points of the second initial prediction block; determining the texture complexity of the current image block according to the M*N pixel difference values; determining a selection probability according to the texture complexity of the current image block, the first motion amplitude, the second motion amplitude, and a first mathematical model, or querying a first mapping table according to the texture complexity of the current image block, the first motion amplitude, and the second motion amplitude to determine the selection probability, where the first mapping table includes the correspondence between the selection probability and the texture complexity of the current image block, the first motion amplitude, and the second motion amplitude; and determining the motion compensation mode according to the selection probability.
- The motion information includes a first motion vector and a second motion vector, where the first motion vector includes a horizontal component and a vertical component, and the second motion vector includes a horizontal component and a vertical component. Determining the motion compensation mode of the current image block according to the motion information and the attribute information of the current image block includes: determining a selection probability according to the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, the vertical component of the second motion vector, and a second mathematical model; or querying a second mapping table according to the size of the current image block and the same four motion vector components to determine the selection probability, where the second mapping table includes the correspondence between the selection probability and the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, and the vertical component of the second motion vector.
- Determining the motion compensation mode according to the selection probability includes: determining whether the selection probability is greater than a second threshold, where the second threshold is any real number greater than or equal to 0 and less than or equal to 1; if the selection probability is greater than the second threshold, determining that the motion compensation mode is the optical flow technique based on bidirectional prediction; if the selection probability is less than or equal to the second threshold, determining that the motion compensation mode is the weighted prediction technique based on bidirectional prediction.
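The probability-based decision can be sketched as below. The text here does not specify the mathematical model, so `selection_probability` uses a logistic function of a weighted feature sum purely as a hypothetical stand-in; the weights and bias are caller-supplied and carry no values from the application. Only the comparison against the second threshold follows the text directly.

```python
import math


def selection_probability(features, weights, bias):
    """Hypothetical stand-in for the 'mathematical model'.

    Maps features (for example texture complexity and motion amplitudes, or
    block size and motion vector components) to a value in (0, 1) with a
    logistic function. The model form is an assumption, not from the source.
    """
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def select_mode_by_probability(probability, second_threshold):
    """Optical flow if probability > threshold, else weighted prediction."""
    if not 0.0 <= second_threshold <= 1.0:
        raise ValueError("second threshold must lie in [0, 1]")
    if probability > second_threshold:
        return "optical_flow"
    return "weighted_prediction"


p = selection_probability([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], 0.0)  # 0.5
mode = select_mode_by_probability(p, 0.5)
```

A mapping table, the alternative named in the text, could replace `selection_probability` with a dictionary lookup keyed on quantized feature values.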
- A second aspect of the embodiments of the present application provides an encoding method, in which the bidirectional inter prediction method of any of the above aspects is used in the encoding process, and the current image block is an image block to be encoded.
- A third aspect of the embodiments of the present application provides a decoding method, in which the bidirectional inter prediction method of any of the above aspects is used in the decoding process, and the current image block is an image block to be decoded.
- a fourth aspect of the embodiments of the present application provides a bidirectional inter prediction apparatus, including: a motion estimation unit, a determining unit, and a motion compensation unit.
- The motion estimation unit is configured to acquire motion information of a current image block, where the current image block is an image block to be encoded or an image block to be decoded. The determining unit is configured to acquire an initial prediction block of the current image block according to the motion information. The determining unit is further configured to determine a motion compensation mode of the current image block according to the attribute information of the initial prediction block, or according to the motion information and the attribute information of the initial prediction block, or according to the motion information and the attribute information of the current image block, where the motion compensation mode is a weighted prediction technique based on bidirectional prediction or an optical flow technique based on bidirectional prediction. The motion compensation unit is configured to perform motion compensation on the current image block according to the determined motion compensation mode and the initial prediction block.
- When performing motion compensation on the current image block, the bidirectional inter prediction apparatus determines a suitable motion compensation mode according to the features of the current image block and the features of its initial prediction block. It thereby combines a high compression ratio with low coding and decoding complexity, effectively achieving the best balance between compression ratio and complexity.
- the motion information described in this embodiment of the present application includes a first reference frame index, a second reference frame index, a first motion vector, and a second motion vector.
- The determining unit is specifically configured to: determine a first initial prediction block of the current image block according to the first reference frame index and the first motion vector, where the first reference frame index indicates the index of the frame in which the forward reference block of the current image block is located, the first motion vector represents the motion displacement of the current image block relative to the forward reference block, and the attribute information of the first initial prediction block includes the pixel values of M*N pixel points, N being an integer greater than or equal to 1 and M being an integer greater than or equal to 1; and determine a second initial prediction block of the current image block according to the second reference frame index and the second motion vector, where the second reference frame index indicates the index of the frame in which the backward reference block of the current image block is located, the second motion vector represents the motion displacement of the current image block relative to the backward reference block, and the attribute information of the second initial prediction block includes the pixel values of M*N pixel points.
- The determining unit is specifically configured to: obtain M*N pixel difference values from the pixel values of the M*N pixel points of the first initial prediction block and the pixel values of the M*N pixel points of the second initial prediction block; determine the texture complexity of the current image block according to the M*N pixel difference values; and determine the motion compensation mode according to the texture complexity of the current image block.
- The determining unit is specifically configured to: calculate the sum of the absolute values of the M*N pixel difference values, and take that sum as the texture complexity of the current image block.
- The determining unit is specifically configured to: calculate the average value of the M*N pixel difference values, and take that average as the texture complexity of the current image block.
- The determining unit is specifically configured to: calculate the standard deviation of the M*N pixel difference values, and take that standard deviation as the texture complexity of the current image block.
- The determining unit is specifically configured to: determine whether the texture complexity of the current image block is less than a first threshold, where the first threshold is any real number greater than 0; if the texture complexity of the current image block is less than the first threshold, determine that the motion compensation mode is the weighted prediction technique based on bidirectional prediction; if the texture complexity of the current image block is greater than or equal to the first threshold, determine that the motion compensation mode is the optical flow technique based on bidirectional prediction.
- The motion amplitude of the current image block is determined by the motion information, and the determining unit is specifically configured to: determine a first motion amplitude of the current image block according to the first motion vector, and determine a second motion amplitude of the current image block according to the second motion vector; and determine the motion compensation mode according to the first motion amplitude, the second motion amplitude, and the attribute information of the initial prediction block.
- The determining unit is specifically configured to: obtain M*N pixel difference values from the pixel values of the M*N pixel points of the first initial prediction block and the pixel values of the M*N pixel points of the second initial prediction block; determine the texture complexity of the current image block according to the M*N pixel difference values; determine a selection probability according to the texture complexity of the current image block, the first motion amplitude, the second motion amplitude, and a first mathematical model, or query a first mapping table according to the texture complexity of the current image block, the first motion amplitude, and the second motion amplitude to determine the selection probability, where the first mapping table includes the correspondence between the selection probability and the texture complexity of the current image block, the first motion amplitude, and the second motion amplitude; and determine the motion compensation mode according to the selection probability.
- The motion information includes a first motion vector and a second motion vector, where the first motion vector includes a horizontal component and a vertical component, and the second motion vector includes a horizontal component and a vertical component. The determining unit is specifically configured to: determine a selection probability according to the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, the vertical component of the second motion vector, and a second mathematical model; or query a second mapping table according to the size of the current image block and the same four motion vector components to determine the selection probability, where the second mapping table includes the correspondence between the selection probability and the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, and the vertical component of the second motion vector.
- The determining unit is specifically configured to: determine whether the selection probability is greater than a second threshold, where the second threshold is any real number greater than or equal to 0 and less than or equal to 1; if the selection probability is greater than the second threshold, determine that the motion compensation mode is the optical flow technique based on bidirectional prediction; if the selection probability is less than or equal to the second threshold, determine that the motion compensation mode is the weighted prediction technique based on bidirectional prediction.
- a fifth aspect of the embodiments of the present application provides a terminal, where the terminal includes: one or more processors, a memory, and a communication interface; the memory and the communication interface are connected to one or more processors; and the terminal communicates with other devices through the communication interface.
- the memory is for storing computer program code, the computer program code comprising instructions for performing a bidirectional inter prediction method of any of the above aspects when the one or more processors execute the instructions.
- A sixth aspect of the embodiments of the present application provides a computer program product comprising instructions that, when the computer program product is run on a computer, cause the computer to perform the bidirectional inter prediction method of any of the above aspects.
- a seventh aspect of the embodiments of the present application provides a computer readable storage medium, comprising instructions for causing a terminal to perform a bidirectional inter prediction method of any of the above aspects when the instruction is run on the terminal.
- An eighth aspect of the embodiments of the present application provides a video encoder including a nonvolatile storage medium and a central processing unit, where the nonvolatile storage medium stores an executable program and the central processing unit is connected to the nonvolatile storage medium. When the central processing unit executes the executable program, the video encoder performs the bidirectional inter prediction method of any of the above aspects.
- A ninth aspect of the embodiments of the present application provides a video decoder including a nonvolatile storage medium and a central processing unit, where the nonvolatile storage medium stores an executable program and the central processing unit is connected to the nonvolatile storage medium. When the central processing unit executes the executable program, the video decoder performs the bidirectional inter prediction method of any of the above aspects.
- The names of the bidirectional inter prediction apparatus and the terminal do not limit the devices themselves. In actual implementation, these devices may appear under other names. As long as the functions of the respective devices are similar to those in the embodiments of the present application, they fall within the scope of the claims and their equivalents.
- FIG. 1 is a simplified schematic diagram of a video transmission system architecture according to an embodiment of the present disclosure
- FIG. 2 is a simplified schematic diagram of a video encoder according to an embodiment of the present application.
- FIG. 3 is a simplified schematic diagram of a video decoder according to an embodiment of the present application.
- FIG. 4 is a flowchart of a bidirectional inter prediction method according to an embodiment of the present application.
- FIG. 5 is a schematic diagram of motion of a current image block according to an embodiment of the present application.
- FIG. 6 is a flowchart of another bidirectional inter prediction method according to an embodiment of the present disclosure.
- FIG. 7 is a flowchart of still another bidirectional inter prediction method according to an embodiment of the present application.
- FIG. 8 is a schematic diagram of obtaining M*N pixel difference values according to an embodiment of the present disclosure.
- FIG. 9 is a flowchart of still another bidirectional inter prediction method according to an embodiment of the present application.
- FIG. 10 is a flowchart of still another bidirectional inter prediction method according to an embodiment of the present application.
- FIG. 11 is a schematic structural diagram of a bidirectional inter-frame prediction apparatus according to an embodiment of the present disclosure.
- FIG. 12 is a schematic structural diagram of another bidirectional inter prediction apparatus according to an embodiment of the present disclosure.
- In the embodiments of the present application, the words “exemplary” or “such as” are used to mean an example, instance, or illustration. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of the words “exemplary” or “such as” is intended to present the concepts in a concrete manner.
- Video encoding: the process of compressing a video (image sequence) into a code stream.
- Video decoding: the process of restoring a code stream into reconstructed images according to specific syntax rules and processing methods.
- Video: in most coding frameworks, a video consists of a series of images, each of which is called a frame. An image is divided into at least one slice, and each slice is divided into image blocks. Video encoding or video decoding is performed in units of image blocks. For example, encoding or decoding may proceed row by row, from left to right and top to bottom, starting from the upper left corner of the image.
- the image block may be a macro block (MB) in the video codec standard H.264, or may be a coding unit (CU) in a high efficiency video coding (HEVC) standard. This embodiment of the present application does not specifically limit this.
- An image block on which encoding or decoding is being performed is referred to as the current image block, and the image in which the current image block is located is referred to as the current frame (current image).
- The current frame can be classified as an I frame, a P frame, or a B frame according to the prediction type of the current image block.
- An I frame is a frame encoded as a separate still image, providing random access points in the video stream.
- A P frame is a frame predicted from an adjacent previous I frame or P frame, and can be used as a reference frame for a subsequent P frame or B frame.
- A B frame is a frame obtained by bidirectional prediction using the two nearest frames (which may be I frames or P frames) as reference frames.
- In the embodiments of the present application, the current frame refers to a bidirectional prediction frame (B frame).
- Video is mainly encoded using motion-compensated inter prediction to improve the compression ratio.
- Inter prediction refers to prediction that exploits the correlation between the current frame and its reference frames, in units of image blocks to be encoded or decoded; the current frame may have one or more reference frames. Specifically, the prediction block of the current image block is generated from the pixels in the reference frame of the current image block.
- When encoding the current image block in the current frame, the encoding end first selects one or more reference frames from the already encoded frames of the video, obtains the prediction block corresponding to the current image block from the reference frame, then calculates the residual value between the prediction block and the current image block, and performs quantization on the residual value. When decoding the current image block in the current frame, the decoding end first acquires the prediction block corresponding to the current image block, then obtains the residual value between the prediction block and the current image block from the received code stream, and reconstructs the current image block according to the residual value and the prediction block.
- The temporal correlation between the current frame and other frames in the video is reflected not only in the correlation between the current frame and the frames encoded before it, but also in the correlation between the current frame and the frames encoded after it. Based on this, bidirectional inter prediction can be considered when performing video coding to obtain a better coding effect.
- The prediction block of the current image block may be generated from only one reference block, or from two reference blocks.
- Generating the prediction block of the current image block from one reference block is referred to as unidirectional inter prediction, and generating the prediction block of the current image block from two reference blocks is referred to as bidirectional inter prediction.
- Two reference image blocks in bi-directional inter prediction may be from the same reference frame or different reference frames.
- Bidirectional inter prediction may refer to inter prediction that uses both the correlation between the current video frame and a video frame that is encoded before it and played before it, and the correlation between the current video frame and a video frame that is encoded before it and played after it.
- Forward inter prediction refers to inter prediction using the correlation between the current video frame and a video frame that is encoded before it and played before it.
- Backward inter prediction refers to inter prediction using the correlation between the current video frame and a video frame that is encoded before it and played after it.
- Motion compensation is a method of describing the difference between adjacent frames (adjacent here means adjacent in coding order; the two frames are not necessarily adjacent in playback order). It is the process of finding the reference block of the current image block according to the motion information and processing that reference block to obtain the prediction block of the current image block, and it is one stage of the inter prediction process.
- The weighted prediction technique based on bidirectional prediction performs weighted prediction on the pixel values at the same pixel position in the forward prediction block and the backward prediction block of the current image block to obtain the prediction block of the current image block.
- The optical flow technique based on bidirectional prediction determines the prediction block of the current image block according to the forward prediction block and the backward prediction block of the current image block.
- The weighted prediction technique based on bidirectional prediction is simple to compute but has low compression efficiency; the optical flow technique based on bidirectional prediction has high compression efficiency but high computational complexity. Therefore, how to choose the motion compensation technique in bidirectional prediction so as to achieve the best trade-off between compression ratio and computational complexity is an urgent problem to be solved.
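- The weighted prediction described above can be illustrated with a minimal Python sketch. It assumes equal weights for the forward and backward prediction blocks and round-half-up integer rounding; neither choice is specified in the text, and the function name `weighted_bi_prediction` is hypothetical.

```python
def weighted_bi_prediction(forward_block, backward_block):
    """Average two equally sized prediction blocks pixel by pixel.

    Assumptions of this sketch (not taken from the text): equal weights
    for both blocks and round-half-up rounding of the integer mean.
    """
    if len(forward_block) != len(backward_block):
        raise ValueError("blocks must have the same number of rows")
    prediction = []
    for fwd_row, bwd_row in zip(forward_block, backward_block):
        if len(fwd_row) != len(bwd_row):
            raise ValueError("blocks must have the same number of columns")
        # (a + b + 1) >> 1 is the rounded mean of two non-negative integers.
        prediction.append([(a + b + 1) >> 1 for a, b in zip(fwd_row, bwd_row)])
    return prediction


forward = [[100, 50], [20, 30]]
backward = [[110, 51], [20, 33]]
predicted = weighted_bi_prediction(forward, backward)
```

- Each output pixel depends only on the two co-located input pixels, which is why this technique is cheap compared with an optical flow refinement.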
- The embodiments of the present application provide a bidirectional inter prediction method.
- The basic principle is as follows: after the motion information of the current image block is acquired, an initial prediction block of the current image block is first acquired according to the motion information; a motion compensation mode is then determined according to the initial prediction block, and motion compensation is performed on the current image block in that mode.
- The current image block is an image block to be encoded or an image block to be decoded.
- The motion compensation mode is the weighted prediction technique based on bidirectional prediction or the optical flow technique based on bidirectional prediction.
- In this way, an appropriate motion compensation mode is determined according to the characteristics of the current image block and of its initial prediction block, which takes into account both a high compression ratio and low encoding and decoding complexity, thereby effectively achieving the best balance between compression ratio and complexity.
- The bidirectional inter prediction method provided by the embodiments of the present application is applicable to a video transmission system.
- FIG. 1 is a simplified schematic diagram of the architecture of a video transmission system 100 to which embodiments of the present application may be applied. As shown in FIG. 1, the video transmission system includes a source device and a destination device.
- The source device includes a video source 101, a video encoder 102, and an output interface 103.
- Video source 101 can include a video capture device (e.g., a video camera), a video archive containing previously captured video data, a video input interface for receiving video data from a video content provider, and/or a computer graphics system for generating video data, or a combination of the above video data sources.
- Video source 101 is configured to acquire video data, perform pre-encoding processing on the captured video data, convert the optical signal into a digitized image sequence, and transmit the digitized image sequence to video encoder 102.
- Video encoder 102 is used to encode the image sequence from video source 101 to obtain a code stream.
- Output interface 103 can include a modulator/demodulator (modem) and/or a transmitter.
- Output interface 103 is configured to transmit the code stream encoded by video encoder 102.
- The source device transmits the encoded code stream directly to the destination device via output interface 103.
- The encoded code stream can also be stored on a storage medium or file server, for example storage device 107, for later access by the destination device for decoding and/or playback.
- The destination device includes an input interface 104, a video decoder 105, and a display device 106.
- Input interface 104 includes a receiver and/or a modem.
- Input interface 104 can receive, via network 108, the code stream transmitted by output interface 103 and pass the code stream to video decoder 105.
- Network 108 can be an IP network, including routers and switches.
- Video decoder 105 is used to decode the code stream received by input interface 104 to reconstruct the image sequence.
- Video encoder 102 and video decoder 105 may operate in accordance with a video compression standard (e.g., the High Efficiency Video Coding H.265 standard).
- Display device 106 can be integrated with the destination device or can be external to the destination device. In general, display device 106 displays the decoded video data. Display device 106 can be any of a variety of display devices, such as a liquid crystal display, a plasma display, an organic light emitting diode display, or another type of display device.
- The destination device may further include a rendering module for rendering the reconstructed image sequence decoded by video decoder 105 to improve the display effect of the video.
- The bidirectional inter prediction method described in the embodiments of the present application may be performed by video encoder 102 and video decoder 105 in the video transmission system shown in FIG. 1.
- A video encoder and a video decoder are briefly described below with reference to FIG. 2 and FIG. 3.
- Video encoder 200 includes an inter predictor 201, an intra predictor 202, a summer 203, a transformer 204, a quantizer 205, and an entropy encoder 206.
- Video encoder 200 also includes an inverse quantizer 207, an inverse transformer 208, a summer 209, and a filter unit 210.
- The inter predictor 201 includes a motion estimation unit and a motion compensation unit.
- The intra predictor 202 includes a selection intra prediction unit and an intra prediction unit.
- Filter unit 210 is intended to represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter.
- Although filter unit 210 is illustrated as an in-loop filter in FIG. 2, in other implementations filter unit 210 can be implemented as a post-loop filter.
- Video encoder 200 may also include a video data memory and a segmentation unit (not shown).
- The video data memory can store video data to be encoded by the components of video encoder 200.
- The video data stored in the video data memory can be obtained from a video source.
- The decoded picture buffer (DPB) 107 can be a reference image memory that stores reference video data used by video encoder 200 to encode video data in intra or inter coding modes.
- The video data memory and DPB 107 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM) including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices.
- The video data memory and DPB 107 may be provided by the same memory device or by separate memory devices.
- The video data memory can be on-chip with the other components of video encoder 200, or off-chip relative to those components.
- Video encoder 200 receives the video data and stores it in the video data memory.
- The segmentation unit divides the video data into a plurality of image blocks, and these image blocks may be further divided into smaller blocks, for example by image block segmentation based on a quadtree structure or a binary tree structure. This segmentation may also include segmentation into slices, tiles, or other larger units.
- Video encoder 200 generally illustrates the components that encode image blocks within a video slice to be encoded. A slice can be divided into multiple image blocks (and possibly into sets of image blocks called tiles).
- The current image block may be inter predicted by inter predictor 201.
- Inter prediction refers to finding, in a reconstructed image, a matching reference block for the current image block in the current image, thereby obtaining the motion information of the current image block, and then calculating, according to the motion information, prediction information (a prediction block) for the pixel values of the pixels in the current image block.
- The process of calculating the motion information is called motion estimation.
- The motion estimation process needs to try multiple reference blocks in the reference image for the current image block; which reference block or blocks are ultimately used for prediction is determined by rate-distortion optimization (RDO) or other methods.
- The process of calculating the prediction block of the current image block is called motion compensation.
- The bidirectional inter prediction method described in the embodiments of the present application may be performed by inter predictor 201.
- The current image block may also be intra predicted by intra predictor 202.
- Intra prediction refers to predicting the pixel values of the pixels in the current image block by using the pixel values of pixels in reconstructed image blocks of the image in which the current image block is located.
- Video encoder 200 forms a residual image block by subtracting the prediction block from the current image block to be encoded.
- Summer 203 represents one or more components that perform this subtraction.
- The residual video data in the residual block may be included in one or more transform units (TUs) and applied to transformer 204.
- Transformer 204 transforms the residual video data into residual transform coefficients using transforms such as discrete cosine transforms or conceptually similar transforms.
- Transformer 204 can convert residual video data from a pixel value domain to a transform domain, such as a frequency domain.
- Transformer 204 can send the resulting transform coefficients to quantizer 205.
- Quantizer 205 quantizes the transform coefficients to further reduce the bit rate.
- Quantizer 205 can then perform a scan of the matrix containing the quantized transform coefficients.
- Alternatively, entropy encoder 206 may perform the scan.
- After quantization, entropy encoder 206 entropy encodes the quantized transform coefficients. For example, entropy encoder 206 may perform context adaptive variable length coding (CAVLC), context adaptive binary arithmetic coding (CABAC), syntax-based context adaptive binary arithmetic coding (SBAC), probability interval partitioning entropy (PIPE) coding, or another entropy encoding method or technique.
- The encoded code stream may be transmitted to video decoder 300, or archived for later transmission or for retrieval by video decoder 300.
- Entropy encoder 206 may also entropy encode the syntax elements of the current image block to be encoded.
- Inverse quantizer 207 and inverse transformer 208 apply inverse quantization and inverse transform, respectively, to reconstruct the residual block in the pixel domain, for example for later use as a reference block of a reference image.
- Summer 209 adds the reconstructed residual block to the prediction block generated by inter predictor 201 or intra predictor 202 to produce a reconstructed image block.
- Filter unit 210 may be applied to the reconstructed image block to reduce distortion such as block artifacts.
- The reconstructed image block is then stored as a reference block in the decoded picture buffer and can be used by inter predictor 201 as a reference block for inter prediction of blocks in subsequent video frames or images.
- Video encoder 200 may directly quantize the residual signal without processing by transformer 204, and accordingly without processing by inverse transformer 208. Alternatively, for some image blocks or image frames, video encoder 200 does not generate residual data, and accordingly no processing by transformer 204, quantizer 205, inverse quantizer 207, and inverse transformer 208 is needed. Alternatively, video encoder 200 can store the reconstructed image block directly as a reference block without processing by filter unit 210. Alternatively, quantizer 205 and inverse quantizer 207 in video encoder 200 may be combined.
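- The residual path described above, in the variant where the transform is skipped and the residual is quantized directly, can be sketched in Python as follows. The uniform scalar quantizer with step size `step` and the function names are assumptions made for illustration; the text does not specify the quantizer design.

```python
def quantize_residual(current, prediction, step):
    """Subtract the prediction block and quantize the residual (no transform).

    A uniform scalar quantizer with step size `step` is assumed here.
    """
    levels = []
    for cur_row, pred_row in zip(current, prediction):
        levels.append([round((c - p) / step) for c, p in zip(cur_row, pred_row)])
    return levels


def reconstruct_block(levels, prediction, step):
    """Inverse-quantize the levels and add the prediction block back."""
    return [
        [p + level * step for level, p in zip(level_row, pred_row)]
        for level_row, pred_row in zip(levels, prediction)
    ]


current = [[100, 104], [97, 99]]
prediction = [[96, 100], [100, 99]]
levels = quantize_residual(current, prediction, step=4)
reconstructed = reconstruct_block(levels, prediction, step=4)
```

- In this sketch the reconstruction differs from the original block by at most half a quantization step per pixel, which is the distortion that the decoder also sees because it repeats the same inverse quantization and addition.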
- Video encoder 200 outputs the encoded video to post-processing entity 211.
- Post-processing entity 211 represents an example of a video entity that can process encoded video data from video encoder 200, such as a media-aware network element (MANE) or a splicing/editing device.
- Post-processing entity 211 can be an instance of a network entity.
- Post-processing entity 211 and video encoder 200 may be parts of separate devices, while in other cases the functionality described with respect to post-processing entity 211 may be performed by the same device that includes video encoder 200.
- Post-processing entity 211 is an example of storage device 107 of FIG. 1.
- FIG. 3 is a simplified schematic diagram of a video decoder 300 in accordance with an embodiment of the present application.
- the video decoder 300 includes an entropy decoder 301, an inverse quantizer 302, an inverse transformer 303, a summer 304, a filter unit 305, an inter predictor 306, and an intra predictor 307.
- Video decoder 300 may perform a decoding process that is substantially reciprocal to the encoding process described with respect to video encoder 200 of FIG. 2. First, the residual information is obtained by entropy decoder 301, inverse quantizer 302, and inverse transformer 303, and the code stream is decoded to determine whether the current image block uses intra prediction or inter prediction.
- If intra prediction is used, intra predictor 307 constructs the prediction information according to the intra prediction method in use, from the pixel values of pixels in the surrounding reconstructed region. If inter prediction is used, inter predictor 306 parses out the motion information, uses the parsed motion information to determine the reference block in the reconstructed image, and uses the pixel values of the pixels in that block as the prediction information. The reconstruction information is obtained by adding the prediction information to the residual information and filtering the result.
- The bidirectional inter prediction method described in the embodiments of the present application is applicable not only to wireless application scenarios but also to video coding and decoding in support of a variety of multimedia applications, such as over-the-air television broadcasting, cable television transmission, satellite television transmission, streaming video transmission (e.g., via the Internet), encoding of video data to be stored on a data storage medium, decoding of video data stored on a data storage medium, or other applications.
- A video codec system can be configured to support one-way or two-way video transmission for applications such as video streaming, video playback, video broadcasting, and/or video telephony.
- The bidirectional inter prediction method provided by the embodiments of the present application may be performed by a bidirectional inter prediction apparatus, by a video codec apparatus, by a video codec, or by another device having a video coding and decoding function.
- The embodiments of the present application do not specifically limit this.
- The bidirectional inter prediction method is described below taking a bidirectional inter prediction apparatus as the executing entity.
- FIG. 4 is a schematic flowchart diagram of a bidirectional inter-frame prediction method according to an embodiment of the present application.
- the bi-directional inter-frame prediction method shown in FIG. 4 can occur both in the encoding process and in the decoding process.
- the bi-directional inter-frame prediction method shown in FIG. 4 can occur during the inter-frame prediction process at the time of encoding and decoding.
- the bidirectional inter prediction method includes:
- The bidirectional inter prediction apparatus acquires the motion information of the current image block.
- The current image block is an image block to be encoded or an image block to be decoded. If the current image block is an image block to be encoded, its motion information can be obtained by motion estimation. If the current image block is an image block to be decoded, its motion information can be obtained by decoding the code stream.
- The motion information mainly includes the prediction direction information of the current image block, the reference frame index of the current image block, and the motion vector of the current image block.
- The prediction direction information of the current image block indicates forward prediction, backward prediction, or bidirectional prediction.
- The reference frame index of the current image block indicates the index of the frame in which the reference block of the current image block is located.
- Depending on the prediction direction, the reference frame index of the current image block includes a forward reference frame index of the current image block and a backward reference frame index of the current image block.
- The motion vector of the current image block represents the motion displacement of the current image block relative to the reference block.
- The motion vector includes a horizontal component (denoted MVx) and a vertical component (denoted MVy).
- The horizontal component represents the motion displacement of the current image block in the horizontal direction relative to the reference block.
- The vertical component represents the motion displacement of the current image block in the vertical direction relative to the reference block.
- The motion information for bidirectional prediction includes a first reference frame index, a second reference frame index, a first motion vector, and a second motion vector.
- The first reference frame index indicates the index of the frame in which the forward reference block of the current image block is located.
- The first motion vector represents the motion displacement of the current image block relative to the forward reference block.
- The second reference frame index indicates the index of the frame in which the backward reference block of the current image block is located.
- The second motion vector represents the motion displacement of the current image block relative to the backward reference block.
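- The motion information fields listed above could be grouped as in the following Python sketch. The class name `MotionInfo`, the field names, and the string labels for the prediction direction are hypothetical; integer motion vector components are also an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MotionVector = Tuple[int, int]  # (horizontal component, vertical component)


@dataclass(frozen=True)
class MotionInfo:
    """Illustrative container for the motion information of one image block."""

    direction: str  # "forward", "backward" or "bidirectional"
    first_ref_idx: Optional[int] = None   # index of the forward reference frame
    first_mv: Optional[MotionVector] = None
    second_ref_idx: Optional[int] = None  # index of the backward reference frame
    second_mv: Optional[MotionVector] = None

    def __post_init__(self):
        if self.direction not in ("forward", "backward", "bidirectional"):
            raise ValueError("unknown prediction direction")
        needs_first = self.direction in ("forward", "bidirectional")
        needs_second = self.direction in ("backward", "bidirectional")
        if needs_first and (self.first_ref_idx is None or self.first_mv is None):
            raise ValueError("forward reference index and motion vector required")
        if needs_second and (self.second_ref_idx is None or self.second_mv is None):
            raise ValueError("backward reference index and motion vector required")


info = MotionInfo("bidirectional", first_ref_idx=0, first_mv=(3, -1),
                  second_ref_idx=1, second_mv=(-2, 1))
```

- Validating the fields at construction time makes it explicit that bidirectional prediction needs both reference frame indices and both motion vectors, while unidirectional prediction needs only one pair.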
- B represents the current image block.
- The frame in which the current image block is located is the current frame.
- A denotes the forward reference block.
- The frame in which the forward reference block is located is the forward reference frame.
- C denotes the backward reference block.
- The frame in which the backward reference block is located is the backward reference frame.
- 0 means forward and 1 means backward.
- MV0 represents the forward motion vector, MV0 = (MV0x, MV0y), where MV0x represents the horizontal component of the forward motion vector and MV0y represents the vertical component of the forward motion vector.
- MV1 represents the backward motion vector, MV1 = (MV1x, MV1y), where MV1x represents the horizontal component of the backward motion vector and MV1y represents the vertical component of the backward motion vector.
- The broken line indicates the motion trajectory of the current image block B.
- S402. The bidirectional inter prediction apparatus acquires an initial prediction block of the current image block according to the motion information.
- For the process of acquiring the initial prediction block of the current image block according to the motion information, reference may be made to the prior art. The initial prediction block of the current image block includes a forward prediction block and a backward prediction block.
- S402 can be implemented by the following detailed steps.
- S601. The bidirectional inter prediction apparatus determines a first initial prediction block of the current image block according to the first reference frame index and the first motion vector.
- The bidirectional inter prediction apparatus may determine, according to the first reference frame index, the first reference frame in which the first reference block of the current image block is located, and then determine, in the first reference frame, the first reference block of the current image block according to the first motion vector.
- The first reference block is subjected to sub-pixel interpolation to obtain the first initial prediction block.
- The first initial prediction block may refer to the forward prediction block of the current image block.
- The first reference frame index is a forward reference frame index.
- For example, the forward reference frame in which forward reference block A of current image block B is located is first determined according to the forward reference frame index, and the forward reference block is then found in the forward reference frame according to the coordinates (i, j) of the current image block and the forward motion vector.
- (i, j) represents the coordinates of the point in the upper left corner of the current image block B in the current frame.
- the coordinate origin of the current frame is the point of the upper left corner of the current frame in which the current image block B is located.
- (i', j') represents the coordinates of the point in the upper left corner of the block B' in the forward reference frame.
- the coordinate origin of the forward reference frame is the point at the upper left corner of the forward reference frame where block B' is located.
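- The lookup of a reference block from block coordinates and a motion vector can be sketched in Python as follows. This is only an illustration under stated assumptions: the motion vector is restricted to whole pixels (so the sub-pixel interpolation mentioned above is omitted), the reference position is taken as the current position plus the motion vector, and blocks that would fall outside the frame are rejected instead of being padded. The function name `fetch_reference_block` is hypothetical.

```python
def fetch_reference_block(ref_frame, x, y, mv, width, height):
    """Copy a width x height block from ref_frame at (x, y) displaced by mv.

    ref_frame is a list of rows; (x, y) is the upper-left corner of the
    current block, with x horizontal and y vertical, measured from the
    upper-left corner of the frame. mv = (mv_x, mv_y) is an integer-pel
    displacement (an assumption of this sketch).
    """
    mv_x, mv_y = mv
    left, top = x + mv_x, y + mv_y
    frame_height = len(ref_frame)
    frame_width = len(ref_frame[0])
    if left < 0 or top < 0 or left + width > frame_width or top + height > frame_height:
        raise ValueError("displaced block falls outside the reference frame")
    return [row[left:left + width] for row in ref_frame[top:top + height]]


# A 4x4 reference frame whose pixel value encodes its position (10*row + column).
ref = [[10 * r + c for c in range(4)] for r in range(4)]
block = fetch_reference_block(ref, x=0, y=1, mv=(2, 1), width=2, height=2)
```

- A practical codec would additionally support fractional motion vectors through interpolation filters and would pad the frame borders; both are left out to keep the coordinate arithmetic visible.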
- The bidirectional inter prediction apparatus determines a second initial prediction block of the current image block according to the second reference frame index and the second motion vector.
- The bidirectional inter prediction apparatus may determine, according to the second reference frame index, the second reference frame in which the second reference block of the current image block is located, and then determine, in the second reference frame, the second reference block of the current image block according to the second motion vector.
- The second reference block is subjected to sub-pixel interpolation to obtain the second initial prediction block.
- The second initial prediction block may refer to the backward prediction block of the current image block.
- The process of determining the backward prediction block of the current image block is the same as the process of determining the forward prediction block, except that the reference direction is different; for the specific method, refer to the description in S601. If the current image block is not bidirectionally predicted, the forward prediction block or the backward prediction block obtained at this point is the prediction block of the current image block.
- S403a. The bidirectional inter prediction apparatus determines a motion compensation mode of the current image block according to the attribute information of the initial prediction block.
- The attribute information of the initial prediction block includes the size of the initial prediction block, the number of pixels included in the initial prediction block, and the pixel values of the pixels included in the initial prediction block.
- The initial prediction block here includes the first initial prediction block and the second initial prediction block.
- For the manner of obtaining the first initial prediction block and the second initial prediction block, refer to the description of S402.
- Taking the pixel values of the pixels included in the initial prediction block as an example, the embodiments of the present application describe here how to determine the motion compensation mode of the current image block according to the attribute information of the initial prediction block.
- the current image block includes M*N pixels
- the first initial prediction block includes M*N pixels
- the second initial prediction block includes M*N pixels.
- N is an integer greater than or equal to 1
- M is an integer greater than or equal to 1
- M and N may or may not be equal.
- S403a can be implemented by the following detailed steps.
- S701. The bidirectional inter prediction apparatus obtains M*N pixel difference values according to the pixel values of the M*N pixels of the first initial prediction block and the pixel values of the M*N pixels of the second initial prediction block.
- The bidirectional inter prediction apparatus may obtain the M*N pixel difference values from the differences between the pixel values of the M*N pixels of the first initial prediction block and the pixel values of the M*N pixels of the second initial prediction block.
- The M*N pixel difference values are obtained by taking, for each pixel included in the first initial prediction block in turn, the difference between its pixel value and the pixel value at the corresponding position in the second initial prediction block.
- The corresponding position described here refers to the position of the same coordinate point in the same coordinate system.
- The M*N pixel difference values can also be regarded as forming an intermediate prediction block.
- For example, the current image block includes 4*4 pixels, i.e., $b_{0,0}, b_{0,1}, b_{0,2}, b_{0,3}, \ldots, b_{3,0}, b_{3,1}, b_{3,2}, b_{3,3}$.
- The first initial prediction block includes 4*4 pixels, i.e., $a_{0,0}, a_{0,1}, a_{0,2}, a_{0,3}, \ldots, a_{3,0}, a_{3,1}, a_{3,2}, a_{3,3}$.
- The second initial prediction block includes 4*4 pixels, i.e., $c_{0,0}, c_{0,1}, c_{0,2}, c_{0,3}, \ldots, c_{3,0}, c_{3,1}, c_{3,2}, c_{3,3}$.
- A two-dimensional Cartesian coordinate system is established with i as the abscissa and j as the ordinate.
- The pixel $a_{0,0}$ in the first initial prediction block corresponds to the pixel $c_{0,0}$ at the same coordinate point (0, 0) in the second initial prediction block, and the pixel difference value at (0, 0) is obtained from these two pixel values. In general, $D(i, j) = \mathrm{abs}(A(i, j) - B(i, j))$.
- D(i, j) represents the pixel difference value of the pixel at coordinate (i, j), that is, the pixel difference value of the pixel in the i-th row and the j-th column.
- A(i, j) represents the pixel value of the pixel at coordinate (i, j) in the first initial prediction block.
- B(i, j) represents the pixel value of the pixel at coordinate (i, j) in the second initial prediction block.
- abs() indicates the absolute value operation.
- i is an integer ranging from 0 to M-1.
- j is an integer ranging from 0 to N-1.
- The 4*4 pixel difference values can form an intermediate prediction block that includes 4*4 pixels, i.e., $d_{0,0}, d_{0,1}, d_{0,2}, d_{0,3}, \ldots, d_{3,0}, d_{3,1}, d_{3,2}, d_{3,3}$.
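- The computation of the pixel difference values D(i, j) from the two initial prediction blocks can be sketched in Python as follows. Blocks are represented as lists of rows, and the function name `pixel_differences` is hypothetical.

```python
def pixel_differences(first_block, second_block):
    """Return the block of absolute differences between co-located pixels."""
    if len(first_block) != len(second_block):
        raise ValueError("blocks must have the same number of rows")
    differences = []
    for row_a, row_b in zip(first_block, second_block):
        if len(row_a) != len(row_b):
            raise ValueError("blocks must have the same number of columns")
        # D(i, j) = abs(A(i, j) - B(i, j)) for each pair of co-located pixels.
        differences.append([abs(a - b) for a, b in zip(row_a, row_b)])
    return differences


first_initial = [[10, 12], [7, 9]]
second_initial = [[8, 15], [7, 4]]
intermediate = pixel_differences(first_initial, second_initial)
```

- The result has the same dimensions as the two inputs, which is why it can be treated as an intermediate prediction block.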
- S702. The bidirectional inter prediction apparatus determines the texture complexity of the current image block according to the M*N pixel difference values.
- That is, the bidirectional inter prediction apparatus may first obtain the M*N pixel difference values from the pixel values of the M*N pixels of the first initial prediction block and of the M*N pixels of the second initial prediction block, and then determine the texture complexity of the current image block according to the M*N pixel difference values.
- The texture complexity of the current image block may be determined according to the sum of the M*N pixel difference values. It should be understood that the sum of the M*N pixel difference values here may also refer to the sum of the absolute values of the M*N pixel difference values.
- For example, the texture complexity of the current image block is the sum of the M*N pixel difference values, expressed as $\mathrm{SAD} = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \mathrm{abs}(D(i, j))$.
- Here the sum of absolute differences (SAD) represents the sum of the absolute values of the M*N pixel difference values and is used as the texture complexity.
- The texture complexity of the current image block may be determined according to the average of the M*N pixel difference values.
- For example, the texture complexity of the current image block is the average of the M*N pixel difference values.
- The texture complexity is then expressed as $\mu = \frac{1}{M \cdot N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} D(i, j)$, where $\mu$ represents the average of the M*N pixel difference values and M*N is the number of pixels.
- The texture complexity of the current image block may be determined according to the standard deviation of the M*N pixel difference values.
- For example, the texture complexity of the current image block is the standard deviation of the M*N pixel difference values.
- The texture complexity is then expressed as $\sigma = \sqrt{\frac{1}{M \cdot N} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left(D(i, j) - \mu\right)^2}$, where $\sigma$ represents the standard deviation of the M*N pixel difference values and $\mu$ is their average.
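- The three texture complexity measures described above (sum, average, and standard deviation of the pixel difference values) can be computed together, as in the following Python sketch. The function name `texture_measures` is hypothetical, and the standard deviation is taken as the population form that divides by M*N, which is an assumption of this sketch.

```python
import math


def texture_measures(diff_block):
    """Return (sum, mean, standard deviation) of the pixel difference values.

    Absolute values are taken first, so the sum is a sum of absolute
    differences even if the input still carries signed differences.
    """
    values = [abs(d) for row in diff_block for d in row]
    if not values:
        raise ValueError("difference block must contain at least one pixel")
    count = len(values)              # M*N
    sad = sum(values)                # sum of absolute differences
    mean = sad / count               # average difference
    variance = sum((v - mean) ** 2 for v in values) / count
    return sad, mean, math.sqrt(variance)


sad, mean, std = texture_measures([[2, 3], [0, 5]])
```

- The sum grows with the block size while the mean and the standard deviation do not, so a threshold chosen for one measure is not interchangeable with a threshold chosen for another.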
- the bidirectional inter prediction apparatus determines a motion compensation manner according to a texture complexity of the current image block.
- the bidirectional inter prediction apparatus may determine the motion compensation mode by comparing the texture complexity of the current image block with a preset threshold. For example, it determines whether the texture complexity of the current image block is less than a first threshold: if the texture complexity is less than the first threshold, the motion compensation mode is determined to be the weighted prediction technique based on bidirectional prediction; if the texture complexity is greater than or equal to the first threshold, the motion compensation mode is determined to be the optical flow technique based on bidirectional prediction.
- the first threshold is any real number greater than 0, such as 150 or 200. In practical applications, the first threshold may be adjusted according to the codec parameters, the specific codec, and the target encoding and decoding time.
- the value of the first threshold may be set in advance or set in a high level syntax.
- the high-level syntax can be specified in a parameter set such as a sequence parameter set (SPS), a picture parameter set (PPS), or a slice header.
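The threshold comparison described above can be sketched as below. The constant and function names are hypothetical, and the default of 150 is simply one of the example values given in the text.

```python
WEIGHTED_PREDICTION = "weighted_prediction"
OPTICAL_FLOW = "optical_flow"


def mode_from_texture(texture_complexity, first_threshold=150.0):
    """Pick the motion compensation mode from the texture complexity.

    Below the first threshold: weighted prediction; otherwise: optical flow.
    """
    if first_threshold <= 0:
        raise ValueError("the first threshold must be a real number greater than 0")
    if texture_complexity < first_threshold:
        return WEIGHTED_PREDICTION
    return OPTICAL_FLOW


print(mode_from_texture(120.0))
print(mode_from_texture(150.0))
```

A texture complexity exactly equal to the threshold selects the optical flow technique, matching the "greater than or equal to" wording.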
- the bidirectional inter prediction apparatus determines a motion compensation mode of the current image block according to the motion information and the attribute information of the initial prediction block.
- the motion compensation mode may be determined according to the motion amplitude of the current image block together with the attribute information of the initial prediction block.
- the magnitude of the motion of the current image block can be determined by the motion information.
- the attribute information of the initial prediction block may be obtained according to the foregoing S701 and S702, and details are not described herein again.
- the embodiment of the present application may further include the following detailed steps.
- the bidirectional inter prediction apparatus determines a first motion amplitude of the current image block according to the first motion vector, and determines a second motion amplitude of the current image block according to the second motion vector.
- the first motion amplitude is expressed by the formula $dist0 = \sqrt{MV0_x^2 + MV0_y^2}$, where $MV0_x$ represents the horizontal component of the first motion vector (forward motion vector) and $MV0_y$ represents the vertical component of the first motion vector (forward motion vector). The second motion amplitude is expressed by the formula $dist1 = \sqrt{MV1_x^2 + MV1_y^2}$, where $MV1_x$ represents the horizontal component of the second motion vector (backward motion vector) and $MV1_y$ represents the vertical component of the second motion vector (backward motion vector).
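As a small sketch of the motion amplitude computation, the function below assumes that the amplitude is the Euclidean length of the motion vector. The source formula did not survive extraction, so this form and the function name are assumptions.

```python
import math


def motion_amplitude(mv_x, mv_y):
    """Length of a motion vector from its horizontal and vertical components.

    Assumption: the amplitude is the Euclidean norm sqrt(mv_x^2 + mv_y^2).
    """
    return math.hypot(mv_x, mv_y)


dist0 = motion_amplitude(3, 4)    # first (forward) motion vector
dist1 = motion_amplitude(-6, 8)   # second (backward) motion vector
print(dist0, dist1)
```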
- the sequence of the steps of the bidirectional inter prediction method provided in this embodiment of the present application may be adjusted as appropriate, and steps may be added or removed according to the situation.
- for example, the order of S901, S701, and S702 may be interchanged: S901 may be executed first, and then S701 and S702. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application, and details are therefore not described again.
- the bidirectional inter prediction apparatus determines a selection probability according to a texture complexity of the current image block, a first motion amplitude, a second motion amplitude, and a first mathematical model.
- the first mathematical model can be a first logistic regression model.
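- the first logistic regression model is as follows: $y = \dfrac{1}{1 + e^{-(\omega_0 + \omega_1 \cdot SAD + \omega_2 \cdot dist0 + \omega_3 \cdot dist1)}}$, where $SAD$ is the texture complexity of the current image block, $dist0$ is the first motion amplitude, and $dist1$ is the second motion amplitude.
- $\omega_0$, $\omega_1$, $\omega_2$ and $\omega_3$ are parameters of the first logistic regression model.
- a typical value for $\omega_0$ is 2.06079643.
- a typical value for $\omega_1$ is -0.01175306.
- a typical value for $\omega_2$ is -0.00122516.
- a typical value for $\omega_3$ is -0.0008786.
- Substituting the texture complexity, dist0, and dist1 into the first logistic regression model yields the selection probability y.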
- the parameters of the first logistic regression model may be set in advance or set in a high level syntax.
- the high-level syntax can be specified in SPS, PPS, slice header and other parameter sets.
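The evaluation of the first logistic regression model can be sketched as follows. Two things are assumptions here, because the model's formula did not survive extraction: that the model has the standard logistic form, and that the four parameters pair with the intercept, the texture complexity, the first motion amplitude, and the second motion amplitude in that order. The function and constant names are hypothetical; the parameter values are the typical values listed above.

```python
import math

# Typical parameter values from the text; the pairing with the inputs is assumed.
FIRST_MODEL_PARAMS = (2.06079643, -0.01175306, -0.00122516, -0.0008786)


def logistic(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def selection_probability(texture, dist0, dist1, params=FIRST_MODEL_PARAMS):
    """Selection probability y from texture complexity and motion amplitudes."""
    w0, w1, w2, w3 = params
    return logistic(w0 + w1 * texture + w2 * dist0 + w3 * dist1)


print(selection_probability(200.0, 5.0, 10.0))
```

The stable form of `logistic` avoids an overflow in `math.exp` when the texture complexity is very large and the linear term becomes strongly negative.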
- a first mapping table may be predefined at the time of encoding.
- the first mapping table saves each of the possible values of the texture complexity, the first motion amplitude, and the second motion amplitude of the current image block, and the corresponding selection probability y.
- the value of the selection probability y can be obtained by looking up the table.
- the bidirectional inter prediction apparatus determines a motion compensation mode according to the selection probability.
- the motion compensation mode can be determined by comparing the selection probability with a preset threshold. For example, it is determined whether the selection probability is greater than a second threshold: if the selection probability is greater than the second threshold, the motion compensation mode is determined to be the optical flow technique based on bidirectional prediction; if the selection probability is less than or equal to the second threshold, the motion compensation mode is determined to be the weighted prediction technique based on bidirectional prediction.
- the second threshold is any real number greater than or equal to 0 and less than or equal to 1. For example, the second threshold may have a value of 0.7.
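The comparison of the selection probability with the second threshold can be sketched as below. The names are hypothetical, and 0.7 is only the example value mentioned above.

```python
WEIGHTED_PREDICTION = "weighted_prediction"
OPTICAL_FLOW = "optical_flow"


def mode_from_probability(selection_probability, second_threshold=0.7):
    """Optical flow if y exceeds the second threshold, else weighted prediction."""
    if not 0.0 <= second_threshold <= 1.0:
        raise ValueError("the second threshold must lie in [0, 1]")
    if selection_probability > second_threshold:
        return OPTICAL_FLOW
    return WEIGHTED_PREDICTION


print(mode_from_probability(0.85))
print(mode_from_probability(0.70))
```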
- the bidirectional inter prediction apparatus determines the motion compensation mode of the current image block according to the motion information and the attribute information of the current image block.
- the attribute information of the current image block includes the size of the current image block, the number of pixel points included in the current image block, and the pixel value of the pixel point included in the current image block.
- the following describes, with reference to the drawings and taking the size of the current image block as an example, how the bidirectional inter prediction apparatus determines the motion compensation mode based on the motion information and the attribute information of the current image block. Since the current image block is composed of an array of pixel points, the bidirectional inter prediction apparatus can obtain the size of the current image block according to the pixel points. It can be understood that the size of the current image block is the width and height of the current image block. As shown in FIG. 10, S403c can be implemented by the following detailed steps.
- the bidirectional inter prediction apparatus determines the selection probability according to the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, the vertical component of the second motion vector, and a second mathematical model.
- the second mathematical model can be a second logistic regression model.
- the second logistic regression model is as follows: $y = \dfrac{1}{1 + e^{-(\omega_0 + \omega_1 \cdot W + \omega_2 \cdot H + \omega_3 \cdot MV0_x + \omega_4 \cdot MV0_y + \omega_5 \cdot MV1_x + \omega_6 \cdot MV1_y)}}$
- $\omega_0$, $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$, $\omega_5$ and $\omega_6$ are parameters of the second logistic regression model.
- a typical value for $\omega_0$ is -0.18929861.
- a typical value for $\omega_1$ is 4.81815386e-03.
- a typical value for $\omega_2$ is 4.66279123e-03.
- a typical value for $\omega_3$ is -7.664996930e-05.
- a typical value for $\omega_4$ is 1.23565538e-04.
- a typical value for $\omega_5$ is -4.25855176e-05.
- a typical value for $\omega_6$ is 1.44069088e-04.
- $W$ represents the width of the prediction block of the current image block.
- $H$ represents the height of the prediction block of the current image block.
- $MV0_x$ represents the horizontal component of the first motion vector (forward motion vector).
- $MV0_y$ represents the vertical component of the first motion vector (forward motion vector).
- $MV1_x$ represents the horizontal component of the second motion vector (backward motion vector).
- $MV1_y$ represents the vertical component of the second motion vector (backward motion vector).
- the selection probability y is obtained by substituting the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, and the vertical component of the second motion vector into the second logistic regression model.
- the parameters of the second logistic regression model may be set in advance or set in a high level syntax.
- the high-level syntax can be specified in SPS, PPS, slice header and other parameter sets.
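The second logistic regression model can be sketched in the same way. As before, the standard logistic form and the pairing of the seven parameters with the intercept, W, H, MV0x, MV0y, MV1x, and MV1y (in the order the symbols are listed) are assumptions, since the formula itself is missing from the extracted text. The names are hypothetical and the values are the typical values listed above.

```python
import math

# Typical parameter values from the text; the pairing with the inputs is assumed.
SECOND_MODEL_PARAMS = (
    -0.18929861,        # intercept
    4.81815386e-03,     # assumed weight of the block width W
    4.66279123e-03,     # assumed weight of the block height H
    -7.664996930e-05,   # assumed weight of MV0x
    1.23565538e-04,     # assumed weight of MV0y
    -4.25855176e-05,    # assumed weight of MV1x
    1.44069088e-04,     # assumed weight of MV1y
)


def selection_probability_from_block(width, height, mv0, mv1, params=SECOND_MODEL_PARAMS):
    """Selection probability y from block size and two motion vectors.

    mv0 and mv1 are (horizontal, vertical) component pairs.
    """
    features = (1.0, width, height, mv0[0], mv0[1], mv1[0], mv1[1])
    z = sum(w * x for w, x in zip(params, features))
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


print(selection_probability_from_block(16, 16, (4, -2), (-4, 2)))
```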
- a second mapping table may be predefined at the time of encoding.
- the second mapping table saves each possible value of the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, and the vertical component of the second motion vector, together with the value of the corresponding selection probability y.
- the value of the selection probability y can be obtained by looking up the table.
- the bidirectional inter prediction apparatus determines a motion compensation mode according to the selection probability.
- the bidirectional inter prediction apparatus performs motion compensation on the current image block according to the determined motion compensation mode and the initial prediction block.
- the initial prediction block includes a first initial prediction block and a second initial prediction block.
- for how motion compensation is performed on the current image block using the weighted prediction technique based on bidirectional prediction and the initial prediction block, and using the optical flow technique based on bidirectional prediction and the initial prediction block, reference may be made to specific implementations in the prior art. Details are not described again in the embodiments of the present application.
- the selected motion compensation mode may be written into a syntax element of the current image block. The decision then does not need to be repeated during decoding; the motion compensation mode is selected directly according to the syntax element.
- a syntax element (Bio_flag) is assigned to the current image block, which occupies 1 bit in the code stream.
- when the value of Bio_flag is 0, it indicates that the motion compensation mode is the weighted prediction technique based on bidirectional prediction; when the value of Bio_flag is 1, it indicates that the motion compensation mode is the optical flow technique based on bidirectional prediction.
- the initial value of Bio_flag is 0.
- after the decoding end parses the code stream, the value of the syntax element (Bio_flag) of the current decoded block is obtained.
- the motion compensation method used for bidirectional motion compensation is determined according to the value of Bio_flag. If the Bio_flag value is 0, the motion compensation mode is a weighted prediction technique based on bidirectional prediction; if the Bio_flag value is 1, the motion compensation mode is an optical flow technology based on bidirectional prediction.
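The mapping between Bio_flag and the motion compensation mode can be sketched as a pair of lookups. This only illustrates the 0/1 semantics stated above; the function names are hypothetical, and actual bitstream writing and parsing (entropy coding and so on) are not modeled.

```python
WEIGHTED_PREDICTION = "weighted_prediction"
OPTICAL_FLOW = "optical_flow"

# Bio_flag semantics as stated in the text: 0 = weighted prediction, 1 = optical flow.
MODE_BY_BIO_FLAG = {0: WEIGHTED_PREDICTION, 1: OPTICAL_FLOW}
BIO_FLAG_BY_MODE = {mode: flag for flag, mode in MODE_BY_BIO_FLAG.items()}


def bio_flag_for_mode(mode):
    """Encoder side: the 1-bit flag value to signal for the chosen mode."""
    return BIO_FLAG_BY_MODE[mode]


def mode_for_bio_flag(bio_flag):
    """Decoder side: recover the mode from the parsed flag, with no re-decision."""
    if bio_flag not in MODE_BY_BIO_FLAG:
        raise ValueError("Bio_flag is a 1-bit syntax element and must be 0 or 1")
    return MODE_BY_BIO_FLAG[bio_flag]


print(mode_for_bio_flag(bio_flag_for_mode(OPTICAL_FLOW)))
```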
- the decision method used by the bidirectional inter prediction apparatus to determine the motion compensation mode may also be specified by setting a syntax element in a high-level syntax.
- the decision method is the first decision method, the second decision method, or the third decision method.
- the first decision method determines the motion compensation mode of the current image block according to the attribute information of the initial prediction block.
- the second decision method determines the motion compensation mode of the current image block according to the motion information and the attribute information of the initial prediction block.
- the third decision method determines the motion compensation mode of the current image block based on the motion information and the attribute information of the current image block.
- Syntax elements can be set in SPS, PPS, slice header and other parameter sets.
- the syntax element can be a select mode (select_mode) that occupies 2 bits in the code stream.
- the initial value of the syntax element select_mode is 0.
- the values of select_mode and the decision methods they indicate are as follows:
  - select_mode = 0: first decision method
  - select_mode = 1: second decision method
  - select_mode = 2: third decision method
- the motion compensation mode is determined according to the specified decision method. If the specified decision method is the first decision method, the bidirectional inter prediction apparatus performs bidirectional inter prediction according to the first decision method. If the specified decision method is the second decision method, the bidirectional inter prediction apparatus performs bidirectional inter prediction according to the second decision method. If the specified decision method is the third decision method, the bidirectional inter prediction apparatus performs bidirectional inter prediction according to the third decision method.
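Dispatching on select_mode can be sketched as below. The three decision callables are hypothetical placeholders standing in for the first, second, and third decision methods, and the rejection of the value 3 is an assumption based on the table assigning only 0, 1, and 2.

```python
def choose_decision_method(select_mode, first, second, third):
    """Return the decision callable indicated by the 2-bit select_mode value."""
    methods = {0: first, 1: second, 2: third}
    if select_mode not in methods:
        # The table in the text assigns only the values 0, 1 and 2.
        raise ValueError(f"select_mode {select_mode} has no assigned decision method")
    return methods[select_mode]


# Hypothetical stand-ins for the three decision methods.
def first_method():
    return "decide from the attribute information of the initial prediction block"


def second_method():
    return "decide from the motion information and the initial prediction block"


def third_method():
    return "decide from the motion information and the current image block"


decide = choose_decision_method(1, first_method, second_method, third_method)
print(decide())
```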
- when the bidirectional inter prediction method in the embodiment of the present application performs motion compensation on the current image block, it determines a suitable motion compensation mode according to the features of the current image block and the features of the prediction block of the current image block. This takes into account both a high compression ratio and low encoding and decoding complexity, thus effectively achieving the best balance between compression ratio and complexity.
- it can be understood that, in order to implement the above functions, each network element, such as the bidirectional inter prediction apparatus, includes hardware structures and/or software modules corresponding to the execution of the respective functions.
- in combination with the algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is implemented in hardware or by computer software driving hardware depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present application.
- the embodiment of the present application may divide the bidirectional inter prediction apparatus into function modules according to the foregoing method examples.
- each function module may be divided according to each function, or two or more functions may be integrated into one processing module.
- the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of the module in the embodiment of the present application is schematic, and is only a logical function division, and the actual implementation may have another division manner.
- FIG. 11 is a schematic diagram of a possible composition of the bidirectional inter prediction apparatus involved in the foregoing embodiments. As shown in FIG. 11, the bidirectional inter prediction apparatus may include the motion estimation unit 1101, the determining unit 1102, and the motion compensation unit 1103.
- the motion estimation unit 1101 is configured to support the bidirectional inter prediction apparatus in executing S401 in the bidirectional inter prediction methods shown in FIG. 4, FIG. 6, FIG. 7, FIG. 9, and FIG. 10.
- the determining unit 1102 is configured to support the bidirectional inter prediction apparatus in executing S402, S403a, S403b, and S403c in the bidirectional inter prediction method shown in FIG. 4; S601, S602, S403a, S403b, and S403c in the method shown in FIG. 6; S601, S602, S701-S703, S403b, and S403c in the method shown in FIG. 7; S601, S602, S701-S703, S901-S903, and S403c in the method shown in FIG. 9; and S601, S602, S403a, S403b, S1001, and S1002 in the method shown in FIG. 10.
- the motion compensation unit 1103 is configured to support the bidirectional inter prediction apparatus in executing S404 in the bidirectional inter prediction methods shown in FIG. 4, FIG. 6, FIG. 7, FIG. 9, and FIG. 10.
- the bidirectional inter prediction apparatus provided in the embodiment of the present application is configured to perform the bidirectional inter prediction method, and thus the same effect as the bidirectional inter prediction method described above can be achieved.
- FIG. 12 shows another possible composition diagram of the bidirectional inter prediction apparatus involved in the above embodiment.
- the bidirectional inter prediction apparatus includes a processing module 1201 and a communication module 1202.
- the processing module 1201 is configured to control and manage the action of the bidirectional inter prediction apparatus.
- the processing module 1201 is configured to support the bidirectional inter prediction apparatus in executing S402, S403a, S403b, and S403c in the bidirectional inter prediction method shown in FIG. 4; S601, S602, S403a, S403b, and S403c in the method shown in FIG. 6; S601, S602, S701-S703, S403b, and S403c in the method shown in FIG. 7, FIG.
- Communication module 1202 is for supporting communication between the bi-directional inter prediction device and other network entities, such as with the functional modules or network entities shown in FIG. 1 or 3.
- the bidirectional inter prediction apparatus may further include a storage module 1203 for storing program codes and data of the bidirectional inter prediction apparatus.
- the processing module 1201 can be a processor or a controller. It is possible to implement or carry out the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
- the processor can also be a combination that implements computing functions, for example, a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and the like.
- the communication module 1202 may be a transceiver circuit or a communication interface or the like.
- the storage module 1203 may be a memory.
- the bidirectional inter prediction apparatus 11 and the bidirectional inter prediction apparatus 12 can perform the bidirectional inter prediction method shown in any of the above-mentioned FIG. 4, FIG. 6, FIG. 7, FIG. 9, and FIG. 10.
- the bidirectional inter prediction apparatus 11 and the bidirectional inter prediction apparatus 12 may specifically be a video encoding device, a video decoding device, or another device having a video codec function.
- the bidirectional inter prediction apparatus 11 and the bidirectional inter prediction apparatus 12 can be used for motion compensation in the encoding process as well as motion compensation in the decoding process.
- the application also provides a terminal, the terminal comprising: one or more processors, a memory, and a communication interface.
- the memory and the communication interface are coupled to the one or more processors; the memory is configured to store computer program code, the computer program code includes instructions, and when the one or more processors execute the instructions, the terminal performs the bidirectional inter prediction method of the embodiments of the present application.
- the terminals here can be video display devices, smart phones, laptops, and other devices that can process video or play video.
- the present application also provides a video encoder, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program; the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the bidirectional inter prediction method of the embodiment of the present application.
- the present application also provides a video decoder, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program; the central processing unit is connected to the non-volatile storage medium and executes the executable program to implement the bidirectional inter prediction method of the embodiment of the present application.
- Another embodiment of the present application also provides a computer readable storage medium including one or more pieces of program code, the one or more programs including instructions. When a processor in a terminal executes the program code, the terminal performs the bidirectional inter prediction method shown in any of the above-described FIGS. 4, 6, 7, 9, and 10.
- a computer program product is further provided, comprising computer-executable instructions stored in a computer readable storage medium. At least one processor of the terminal can read the computer-executable instructions from the computer readable storage medium, and the at least one processor executes the computer-executable instructions to cause the terminal to perform the steps of the bidirectional inter prediction apparatus in the bidirectional inter prediction method shown in any of the above-mentioned FIG. 4, FIG. 6, FIG. 7, FIG. 9, and FIG. 10.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions can be stored in a computer readable storage medium or transferred from one computer readable storage medium to another. For example, the computer instructions can be transferred from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wirelessly (e.g., infrared, radio, microwave).
- the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
- the usable medium can be a magnetic medium (eg, a floppy disk, a hard disk, a magnetic tape), an optical medium (eg, a DVD), or a semiconductor medium (such as a solid state disk (SSD)).
- the disclosed apparatus and method may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the modules or units is only a logical function division.
- in actual implementation there may be another division manner; for example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not performed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
- the units described as separate components may or may not be physically separated, and the components displayed as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed across multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
- the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a readable storage medium.
- the technical solution of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes a number of instructions to cause a device (which may be a microcontroller, a chip, etc.) or a processor to perform all or part of the steps of the methods described in the various embodiments of the present application.
- the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
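Embodiments of the present application disclose a bidirectional inter prediction method and apparatus, relate to the field of video coding and decoding technologies, and address the problem of how to select a bidirectional-prediction motion compensation technique for bidirectional inter prediction so as to achieve the best trade-off between compression ratio and computational complexity. The specific solution is as follows: first, motion information of a current image block is obtained, and an initial prediction block of the current image block is obtained according to the motion information; then, a motion compensation mode of the current image block is determined according to attribute information of the initial prediction block, or according to the motion information and the attribute information of the initial prediction block, or according to the motion information and attribute information of the current image block; finally, motion compensation is performed on the current image block according to the determined motion compensation mode and the initial prediction block. The motion compensation mode is a weighted prediction technique based on bidirectional prediction or an optical flow technique based on bidirectional prediction. The embodiments of the present application are used in a bidirectional inter prediction process.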
Description
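This application claims priority to Chinese Patent Application No. 201810276300.0, filed with the Chinese Patent Office on March 30, 2018 and entitled "Bidirectional inter prediction method and apparatus", which is incorporated herein by reference in its entirety.
Embodiments of the present application relate to the field of video coding and decoding technologies, and in particular, to a bidirectional inter prediction method and apparatus.
Video coding and compression technology mainly uses block-based hybrid video coding: a frame of video image is divided into multiple blocks, and, block by block, video coding and compression is achieved through steps such as intra prediction, inter prediction, transform, quantization, entropy encoding, and in-loop filtering (mainly de-blocking filtering). Inter prediction may also be referred to as motion compensation prediction (MCP): motion information of a block is obtained first, and the predicted pixel values of the block are then determined according to the motion information. The process of computing the motion information of a block is called motion estimation (ME), and the process of determining the predicted pixel values of the block according to the motion information is called motion compensation (MC). Depending on the prediction direction, inter prediction includes forward prediction, backward prediction, and bidirectional prediction.
For bidirectional prediction, a forward prediction block of the current image block is first obtained by forward prediction according to the motion information, and a backward prediction block of the current image block is obtained by backward prediction according to the motion information. Then, either the weighted prediction technique based on bidirectional prediction obtains the prediction block of the current image block by weighted prediction of the pixel values at the same pixel positions in the forward prediction block and the backward prediction block, or the bi-directional optical flow (BIO) technique based on bidirectional prediction determines the prediction block of the current image block according to the forward prediction block and the backward prediction block.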
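As a minimal sketch of weighted prediction over co-located pixels, the function below averages the forward and backward prediction blocks with equal weights and rounding. Equal weights are an assumption chosen for illustration (the text does not fix the weights), and the function name is hypothetical.

```python
def weighted_bi_prediction(forward_block, backward_block):
    """Combine co-located pixels of two prediction blocks with equal weights.

    Assumption: weights of 1/2 each, with rounding to the nearest integer.
    """
    return [
        [(f + b + 1) >> 1 for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(forward_block, backward_block)
    ]


forward = [[10, 20], [30, 41]]
backward = [[11, 21], [30, 40]]
print(weighted_bi_prediction(forward, backward))
```

Each output pixel needs only an addition and a shift, which is why this technique is described as computationally simple compared with pixel-level optical flow refinement.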
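The advantage of the weighted prediction technique is that it is computationally simple. However, when it is applied to block-level motion compensation, it predicts images with complex textures poorly and its compression efficiency is low. The BIO technique can raise the compression ratio through pixel-level motion refinement, but its computational complexity is high, which greatly affects encoding and decoding speed; moreover, in some cases the weighted prediction technique can match or even exceed the compression performance of the BIO technique. Therefore, how to select the motion compensation technique for bidirectional prediction in bidirectional inter prediction, so as to achieve the best trade-off between compression ratio and computational complexity, is an urgent problem to be solved.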
发明内容
本申请实施例提供一种双向帧间预测方法及装置,解决了对于双向帧间预测如何选择双向预测运动补偿技术,来达到压缩比与计算复杂度的最佳权衡的问题。
为达到上述目的,本申请实施例采用如下技术方案:
本申请实施例的第一方面,提供一种双向帧间预测方法,包括:在获取到当前图像块的运动信息之后,先根据运动信息获取当前图像块的初始预测块,然后根据初始预测块的属性信息确定当前图像块的运动补偿方式,或者根据运动信息和初始预测块的属性信息确定当前图像块的运动补偿方式,或者根据运动信息和当前图像块的属性 信息确定当前图像块的运动补偿方式,最后根据确定的运动补偿方式以及初始预测块对当前图像块进行运动补偿。其中,当前图像块为待编码图像块或待解码图像块。运动补偿方式为基于双向预测的加权预测技术或基于双向预测的光流技术。
本申请实施例提供的双向帧间预测方法,在对当前图像块进行运动补偿,根据当前图像块的特征和当前图像块的初始预测块的特征确定合适的运动补偿方式,既兼顾了压缩比高的特点,又兼顾了编解码复杂度低的特点,从而,有效地达到了压缩比和复杂度的最佳平衡。
本申请实施例所述的运动信息可以包括第一参考帧索引、第二参考帧索引、第一运动矢量和第二运动矢量。结合第一方面,在一种可能的实现方式中,根据运动信息获取当前图像块的初始预测块,具体包括:根据第一参考帧索引和第一运动矢量确定当前图像块的第一初始预测块,并根据第二参考帧索引和第二运动矢量确定当前图像块的第二初始预测块,其中,第一参考帧索引用于表示当前图像块的前向参考块所在的帧的索引,第一运动矢量用于表示当前图像块相对前向参考块的运动位移,第一初始预测块的属性信息包括M*N个像素点的像素值,第二参考帧索引用于表示当前图像块的后向参考块所在的帧的索引,第二运动矢量用于表示当前图像块相对后向参考块的运动位移,第二初始预测块的属性信息包括M*N个像素点的像素值,N为大于等于1的整数,M为大于等于1的整数。
结合上述可能的实现方式,在一种可能的实现方式中,本申请实施例所述的根据初始预测块的属性信息确定当前图像块的运动补偿方式,具体包括:先根据第一初始预测块的M*N个像素点的像素值与第二初始预测块的M*N个像素点的像素值得到M*N个像素差值,然后,根据M*N个像素差值确定当前图像块的纹理复杂度,再根据当前图像块的纹理复杂度确定运动补偿方式。
可选的,在本申请的另一种可能的实现方式中,上述根据M*N个像素差值确定当前图像块的纹理复杂度,包括:计算M*N个像素差值的绝对值之和;将M*N个像素差值的绝对值之和确定为当前图像块的纹理复杂度。
可选的,在本申请的另一种可能的实现方式中,上述根据M*N个像素差值确定当前图像块的纹理复杂度,包括:计算M*N个像素差值的平均值;将M*N个像素差值的平均值确定为当前图像块的纹理复杂度。
可选的,在本申请的另一种可能的实现方式中,上述根据M*N个像素差值确定当前图像块的纹理复杂度,包括:计算M*N个像素差值的标准差;将M*N个像素差值的标准差确定为当前图像块的纹理复杂度。
可选的,在本申请的另一种可能的实现方式中,上述根据当前图像块的纹理复杂度确定运动补偿方式,具体包括:判断当前图像块的纹理复杂度是否小于第一阈值,第一阈值为大于0的任意实数;若当前图像块的纹理复杂度小于第一阈值,确定运动补偿方式为基于双向预测的加权预测技术;若当前图像块的纹理复杂度大于或等于第一阈值,确定运动补偿方式为基于双向预测的光流技术。
结合上述可能的实现方式,在一种可能的实现方式中,本申请实施例所述的当前图像块的运动幅度由运动信息确定,根据运动信息和初始预测块的属性信息确定运动补偿方式,具体包括:根据第一运动矢量确定当前图像块的第一运动幅度,并根据第 二运动矢量确定当前图像块的第二运动幅度;根据第一运动幅度、第二运动幅度、初始预测块的属性信息确定运动补偿方式。
可选的,在本申请的另一种可能的实现方式中,上述根据第一运动幅度、第二运动幅度、初始预测块的属性信息确定运动补偿方式,初始预测块的属性信息可以是像素点的像素值。获取初始预测块的属性信息的方式可以参考上述可能的实现方式。确定运动补偿方式方法包括:根据第一初始预测块的M*N个像素点的像素值与第二初始预测块的M*N个像素点的像素值得到M*N个像素差值;根据M*N个像素差值确定当前图像块的纹理复杂度;根据当前图像块的纹理复杂度、第一运动幅度、第二运动幅度和第一数学模型确定选择概率;或者,根据当前图像块的纹理复杂度、第一运动幅度和第二运动幅度查询第一映射表确定选择概率,第一映射表包括选择概率与当前图像块的纹理复杂度、第一运动幅度和第二运动幅度的对应关系;根据选择概率确定运动补偿方式。
结合第一方面,在一种可能的实现方式中,运动信息包括第一运动矢量和第二运动矢量,根据运动信息和当前图像块的属性信息确定当前图像块的运动补偿方式,包括:根据当前图像块的尺寸、第一运动矢量的水平分量、第一运动矢量的垂直分量、第二运动矢量的水平分量、第二运动矢量的垂直分量和第二数学模型确定选择概率,第一运动矢量包括第一运动矢量的水平分量和第一运动矢量的垂直分量,第二运动矢量包括第二运动矢量的水平分量和第二运动矢量的垂直分量;或者,根据当前图像块的尺寸、第一运动矢量的水平分量、第一运动矢量的垂直分量、第二运动矢量的水平分量和第二运动矢量的垂直分量查询第二映射表确定选择概率,第二映射表包括选择数值与当前图像块的尺寸、第一运动矢量的水平分量、第一运动矢量的垂直分量、第二运动矢量的水平分量和第二运动矢量的垂直分量的对应关系;根据选择概率确定运动补偿方式。
可选的,在本申请的另一种可能的实现方式中,上述根据选择概率确定运动补偿方式,具体包括:判断选择概率是否大于第二阈值,第二阈值为大于等于0且小于等于1的任意实数;若选择概率大于第二阈值,确定运动补偿方式为基于双向预测的光流技术;若选择概率小于或等于第二阈值,确定运动补偿方式为基于双向预测的加权预测技术。
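A second aspect of the embodiments of the present application provides an encoding method, including: using the bidirectional inter prediction method according to any of the foregoing aspects in an encoding process, where the current image block is an image block to be encoded.
A third aspect of the embodiments of the present application provides a decoding method, including: using the bidirectional inter prediction method according to any of the foregoing aspects in a decoding process, where the current image block is an image block to be decoded.
A fourth aspect of the embodiments of the present application provides a bidirectional inter prediction apparatus, including a motion estimation unit, a determining unit, and a motion compensation unit.
Specifically, the motion estimation unit is configured to obtain motion information of a current image block, where the current image block is an image block to be encoded or an image block to be decoded; the determining unit is configured to obtain an initial prediction block of the current image block according to the motion information; the determining unit is further configured to determine a motion compensation mode of the current image block according to attribute information of the initial prediction block, or according to the motion information and the attribute information of the initial prediction block, or according to the motion information and attribute information of the current image block, where the motion compensation mode is a weighted prediction technique based on bidirectional prediction or an optical flow technique based on bidirectional prediction; and the motion compensation unit is configured to perform motion compensation on the current image block according to the determined motion compensation mode and the initial prediction block.
In the bidirectional inter prediction method provided in the embodiments of the present application, when motion compensation is performed on the current image block, a suitable motion compensation mode is determined according to the features of the current image block and the features of the initial prediction block of the current image block. This takes into account both a high compression ratio and low encoding and decoding complexity, and thereby effectively achieves the best balance between compression ratio and complexity.
The motion information described in the embodiments of the present application includes a first reference frame index, a second reference frame index, a first motion vector, and a second motion vector. With reference to the fourth aspect, in a possible implementation, the determining unit is specifically configured to: determine a first initial prediction block of the current image block according to the first reference frame index and the first motion vector, where the first reference frame index indicates the index of the frame in which the forward reference block of the current image block is located, the first motion vector indicates the motion displacement of the current image block relative to the forward reference block, the attribute information of the first initial prediction block includes the pixel values of M*N pixel points, N is an integer greater than or equal to 1, and M is an integer greater than or equal to 1; and determine a second initial prediction block of the current image block according to the second reference frame index and the second motion vector, where the second reference frame index indicates the index of the frame in which the backward reference block of the current image block is located, the second motion vector indicates the motion displacement of the current image block relative to the backward reference block, and the attribute information of the second initial prediction block includes the pixel values of M*N pixel points.
With reference to the foregoing possible implementation, in a possible implementation, the determining unit is specifically configured to: obtain M*N pixel difference values according to the pixel values of the M*N pixel points of the first initial prediction block and the pixel values of the M*N pixel points of the second initial prediction block; determine the texture complexity of the current image block according to the M*N pixel difference values; and determine the motion compensation mode according to the texture complexity of the current image block.
Optionally, in another possible implementation of the present application, the determining unit is specifically configured to: calculate the sum of the absolute values of the M*N pixel difference values; and determine the sum of the absolute values of the M*N pixel difference values as the texture complexity of the current image block.
Optionally, in another possible implementation of the present application, the determining unit is specifically configured to: calculate the average of the M*N pixel difference values; and determine the average of the M*N pixel difference values as the texture complexity of the current image block.
Optionally, in another possible implementation of the present application, the determining unit is specifically configured to: calculate the standard deviation of the M*N pixel difference values; and determine the standard deviation of the M*N pixel difference values as the texture complexity of the current image block.
Optionally, in another possible implementation of the present application, the determining unit is specifically configured to: determine whether the texture complexity of the current image block is less than a first threshold, where the first threshold is any real number greater than 0; if the texture complexity of the current image block is less than the first threshold, determine that the motion compensation mode is the weighted prediction technique based on bidirectional prediction; and if the texture complexity of the current image block is greater than or equal to the first threshold, determine that the motion compensation mode is the optical flow technique based on bidirectional prediction.
With reference to the foregoing possible implementations, in a possible implementation, the motion amplitude of the current image block described in the embodiments of the present application is determined by the motion information, and the determining unit is specifically configured to: determine a first motion amplitude of the current image block according to the first motion vector, and determine a second motion amplitude of the current image block according to the second motion vector; and determine the motion compensation mode according to the first motion amplitude, the second motion amplitude, and the attribute information of the initial prediction block.
Optionally, in another possible implementation of the present application, the determining unit is specifically configured to: obtain M*N pixel difference values according to the pixel values of the M*N pixel points of the first initial prediction block and the pixel values of the M*N pixel points of the second initial prediction block; determine the texture complexity of the current image block according to the M*N pixel difference values; determine a selection probability according to the texture complexity of the current image block, the first motion amplitude, the second motion amplitude, and a first mathematical model, or query a first mapping table according to the texture complexity of the current image block, the first motion amplitude, and the second motion amplitude to determine the selection probability, where the first mapping table includes the correspondence between the selection probability and the texture complexity of the current image block, the first motion amplitude, and the second motion amplitude; and determine the motion compensation mode according to the selection probability.
With reference to the fourth aspect, in a possible implementation, the motion information includes a first motion vector and a second motion vector, and the determining unit is specifically configured to: determine a selection probability according to the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, the vertical component of the second motion vector, and a second mathematical model, where the first motion vector includes the horizontal component of the first motion vector and the vertical component of the first motion vector, and the second motion vector includes the horizontal component of the second motion vector and the vertical component of the second motion vector; or query a second mapping table according to the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, and the vertical component of the second motion vector to determine the selection probability, where the second mapping table includes the correspondence between the selection probability and the size of the current image block, the horizontal component of the first motion vector, the vertical component of the first motion vector, the horizontal component of the second motion vector, and the vertical component of the second motion vector; and determine the motion compensation mode according to the selection probability.
Optionally, in another possible implementation of the present application, the determining unit is specifically configured to: determine whether the selection probability is greater than a second threshold, where the second threshold is any real number greater than or equal to 0 and less than or equal to 1; if the selection probability is greater than the second threshold, determine that the motion compensation mode is the optical flow technique based on bidirectional prediction; and if the selection probability is less than or equal to the second threshold, determine that the motion compensation mode is the weighted prediction technique based on bidirectional prediction.
A fifth aspect of the embodiments of the present application provides a terminal, including one or more processors, a memory, and a communication interface. The memory and the communication interface are connected to the one or more processors; the terminal communicates with other devices through the communication interface; the memory is configured to store computer program code, and the computer program code includes instructions. When the one or more processors execute the instructions, the terminal performs the bidirectional inter prediction method of any of the foregoing aspects.
A sixth aspect of the embodiments of the present application provides a computer program product containing instructions. When the computer program product runs on a computer, the computer is caused to perform the bidirectional inter prediction method of any of the foregoing aspects.
A seventh aspect of the embodiments of the present application provides a computer readable storage medium including instructions. When the instructions run on a terminal, the terminal is caused to perform the bidirectional inter prediction method of any of the foregoing aspects.
An eighth aspect of the embodiments of the present application provides a video encoder, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium. When the central processing unit executes the executable program, the video encoder performs the bidirectional inter prediction method of any of the foregoing aspects.
A ninth aspect of the embodiments of the present application provides a video decoder, including a non-volatile storage medium and a central processing unit. The non-volatile storage medium stores an executable program, and the central processing unit is connected to the non-volatile storage medium. When the central processing unit executes the executable program, the video decoder performs the bidirectional inter prediction method of any of the foregoing aspects.
In addition, for the technical effects of the designs of any of the foregoing aspects, refer to the technical effects of the different designs in the first aspect; details are not described here again.
In the embodiments of the present application, the names of the bidirectional inter prediction apparatus and the terminal do not limit the devices themselves; in actual implementations, these devices may appear under other names. As long as the functions of each device are similar to those in the embodiments of the present application, they fall within the scope of the claims of the present application and their equivalent technologies.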
图1为本申请实施例提供的一种视频传输系统架构的简化示意图;
图2为本申请实施例的视频编码器的简化示意图;
图3为本申请实施例的视频解码器的简化示意图;
图4为本申请实施例提供的一种双向帧间预测方法的流程图;
图5为本申请实施例提供的一种当前图像块的运动示意图;
图6为本申请实施例提供的另一种双向帧间预测方法的流程图;
图7为本申请实施例提供的又一种双向帧间预测方法的流程图;
图8为本申请实施例提供的一种获取M*N个像素差值的示意图;
图9为本申请实施例提供的再一种双向帧间预测方法的流程图;
图10为本申请实施例提供的再一种双向帧间预测方法的流程图;
图11为本申请实施例提供的一种双向帧间预测装置的组成示意图;
图12为本申请实施例提供的另一种双向帧间预测装置的组成示意图。
本申请的说明书和权利要求书中的术语“第一”和“第二”等是用于区别不同对象,而不是用于限定特定顺序。
在本申请实施例中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本申请实施例中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例如”等词旨在以具体方式呈现相关概念。
为了方便理解本申请实施例,首先在此介绍本申请实施例涉及到的相关要素。
视频编码(video encoding):将视频(图像序列)压缩成码流的处理过程。
视频解码(video decoding):将码流按照特定的语法规则和处理方法恢复成重建图像的处理过程。
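In most coding frameworks, a video comprises a series of pictures, and one picture is called a frame. A picture is divided into at least one slice, and each slice is further divided into image blocks. Video encoding or video decoding is performed block by block; for example, processing may start from the top-left corner of the picture and proceed from left to right and from top to bottom, row by row. Here, an image block may be a macroblock (MB) in the H.264 video coding standard, or a coding unit (CU) in the High Efficiency Video Coding (HEVC) standard; the embodiments of this application do not specifically limit this.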
本申请实施例中,正在进行编码处理或解码处理的图像块称为当前图像块(current block),当前图像块所在的图像称为当前帧(当前图像)。
在视频编码时,根据当前图像块的预测类型,当前帧可分为I帧、P帧和B帧。I帧是作为独立静态图像编码的帧,在视频流中提供随机存取点。P帧是由与其相邻的前一个I帧或者P帧预测而得到的帧,能作为接下来的P帧或B帧的参考帧。B帧是采用最邻近的前后两帧(可以是I帧或P帧)作为参考帧,进行双向预测得到的帧。本申请实施例中,当前帧指双向预测帧(B帧)。
由于视频中连续的若干帧图像之间存在较强时间相关性,也就是说,相邻帧之间 包含了很多冗余,所以在进行视频编码时,常利用各个帧之间的时间相关性来减少帧间的冗余,达到压缩数据的目的。目前,主要采用运动补偿的帧间预测技术对视频进行编码,来提高压缩比。
帧间预测指以编码图像块或解码图像块为单位,利用当前帧与其参考帧之间的相关性完成的预测,当前帧可以存在一个或多个参考帧。具体的,根据当前图像块的参考帧中的像素,生成当前图像块的预测块。
具体的,编码端在对当前帧中的当前图像块进行编码时,首先从视频图像已编码的帧中任意选取一个以上参考帧,并从参考帧中获取当前图像块对应的预测块,然后计算预测块与当前图像块之间的残差值,对该残差值进行量化编码;解码端在对当前帧中的当前图像块进行解码时,首先获取当前图像块对应的预测图像块,然后从接收到的码流中获取预测图像块与当前图像块的残差值,根据该残差值和预测块解码重构当前图像块。
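The encoder/decoder relationship described above can be illustrated with a minimal Python sketch. It assumes blocks are lists of rows of integer pixel values and leaves out quantization and entropy coding, so reconstruction here is lossless; the function names `residual_block` and `reconstruct_block` are illustrative and do not come from the application.

```python
def residual_block(current, prediction):
    """Encoder side: residual = current block minus prediction block, pixel by pixel."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


def reconstruct_block(prediction, residual):
    """Decoder side: reconstructed block = prediction block plus residual."""
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(prediction, residual)]


current = [[10, 12], [8, 9]]
prediction = [[9, 12], [10, 7]]
residual = residual_block(current, prediction)
reconstructed = reconstruct_block(prediction, residual)
```

In a real codec the residual is transformed and quantized before transmission, so the reconstructed block only approximates the original.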
视频中当前帧与其他帧之间的时间相关性不仅表现在当前帧与在其之前编码的帧之间存在时间相关性,也表现在当前帧与在其之后编码的帧之间存在时间相关性。基于此,在进行视频编码时,可以考虑双向帧间预测,以得到较佳的编码效果。
一般的,对于当前图像块而言,可以仅根据一个参考块生成当前图像块的预测块,也可以根据两个参考块生成当前图像块的预测块。上述根据一个参考块生成当前图像块的预测块称为单向帧间预测,上述根据两个参考块生成当前图像块的预测块称为双向帧间预测。双向帧间预测中的两个参考图像块可来自于同一个参考帧或者不同的参考帧。
可选的,双向帧间预测可以是指利用当前视频帧与在其之前编码且在其之前播放的视频帧之间的相关性,和当前视频帧与在其之前编码且在其之后播放的视频帧之间的相关性进行的帧间预测。
可以看出,上述双向帧间预测涉及两个方向的帧间预测,一般称为:前向帧间预测和后向帧间预测。前向帧间预测是指利用当前视频帧与在其之前编码且在其之前播放的视频帧之间的相关性进行的帧间预测。后向帧间预测是指利用当前视频帧与在其之前编码且在其之后播放的视频帧之间的相关性进行的帧间预测。
运动补偿是一种描述相邻帧(相邻在这里表示在编码关系上相邻,在播放顺序上两帧未必相邻)差别的方法,是根据运动信息找到当前图像块的参考块,对当前图像块的参考块经过处理得到当前图像块的预测块的过程,属于帧间预测过程中的一环。
对于双向帧间预测,需要基于双向预测的加权预测技术将当前图像块的前向预测块和当前图像块的后向预测块中相同像素位置的像素值经过加权预测才能得到当前图像块的预测块,或者,基于双向预测的光流技术根据当前图像块的前向预测块和当前图像块的后向预测块才能确定当前图像块的预测块。但是,基于双向预测的加权预测技术计算简单,压缩效率不高;基于双向预测的光流技术压缩效率高,计算复杂度也高。因此,如何选择双向预测时的运动补偿技术,来达到压缩比与计算复杂度的最佳权衡是一个亟待解决的问题。
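As a rough illustration of the weighted prediction option, the sketch below combines the forward and backward prediction blocks pixel by pixel. The application refers to the prior art for the actual weighting, so the integer weights, the default equal weights, and the round-to-nearest rule are all assumptions made for this example; the optical flow option is not sketched here.

```python
def weighted_bi_prediction(forward, backward, w0=1, w1=1):
    """Weighted average of co-located pixels of two prediction blocks.

    forward, backward: lists of rows of integer pixel values, same shape.
    w0, w1: hypothetical non-negative integer weights (equal by default).
    """
    total = w0 + w1
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Adding total // 2 before the integer division rounds to nearest.
    return [[(w0 * a + w1 * b + total // 2) // total
             for a, b in zip(f_row, b_row)]
            for f_row, b_row in zip(forward, backward)]


predicted = weighted_bi_prediction([[100, 50]], [[102, 53]])
```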
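To address the above problem, the embodiments of this application provide a bidirectional inter prediction method. Its basic principle is as follows: after the motion information of the current image block is obtained, the initial prediction block of the current image block is first obtained according to the motion information; the motion compensation mode of the current image block is then determined according to the attribute information of the initial prediction block, or according to the motion information and the attribute information of the initial prediction block, or according to the motion information and the attribute information of the current image block; finally, motion compensation is performed on the current image block according to the determined motion compensation mode and the initial prediction block. The current image block is an image block to be encoded or an image block to be decoded. The motion compensation mode is either the weighted prediction technique based on bidirectional prediction or the optical flow technique based on bidirectional prediction. In this way, when motion compensation is performed on the current image block, a suitable motion compensation mode is determined from the features of the current image block and of its initial prediction block, so that both a high compression ratio and low codec complexity are taken into account and a balance between compression ratio and complexity is reached.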
下面将结合附图对本申请实施例的实施方式进行详细描述。
本申请实施例提供的双向帧间预测方法适用于视频传输系统。图1示出的是可以应用本申请实施例的视频传输系统100架构的简化示意图。如图1所示,该视频传输系统包括源装置和目的装置。
源装置包含视频源101、视频编码器102及输出接口103。
在一些实例中,视频源101可包含视频俘获装置(例如,视频相机)、含有先前俘获的视频数据的视频存档、用以从视频内容提供者接收视频数据的视频输入接口,及/或用于产生视频数据的计算机图形系统,或上述视频数据源的组合。视频源101用于采集视频数据,并对采集到的视频数据进行编码前的处理,将光信号转化为数字化的图像序列,并将数字化的图像序列传输至视频编码器102。
视频编码器102用于编码来自视频源101的图像序列,得到码流。
输出接口103可包含调制器/解调器(调制解调器)及/或发射器。输出接口103用于将视频编码器102编码得到的码流发送出去。
在一些实例中,源装置经由输出接口103将编码得到的码流直接发射到目的装置。编码得到的码流还可存储于存储媒体或文件服务器上以供目的装置稍后存取以用于解码及/或播放。例如,存储装置107。
目的装置包含输入接口104、视频解码器105及显示装置106。
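In some instances, the input interface 104 includes a receiver and/or a modem. The input interface 104 may receive the bitstream sent by the output interface 103 over the network 108 and pass it to the video decoder 105. The network 108 may be an IP network, including routers, switches, and the like.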
视频解码器105用于对输入接口104接收到的码流进行解码,重建图像序列。视频编码器102及视频解码器105可根据视频压缩标准(例如,高效率视频编解码H.265标准)而操作。
显示装置106可与目的装置整合或可在目的装置外部。一般来说,显示装置106显示解码后的视频数据。显示装置106可包括多种显示装置,例如液晶显示器、等离子体显示器、有机发光二极管显示器或其它类型的显示装置。
目的装置还可以包括渲染模块用于对视频解码器105解码得到的重建图像序列进行渲染,以提高视频的显示效果。
具体的,本申请实施例所述的双向帧间预测方法可以由图1所示的视频传输系统中的视频编码器102及视频解码器105执行。
下面结合图2和图3对视频编码器及视频解码器进行简单的介绍。
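FIG. 2 is a simplified schematic diagram of a video encoder 200 according to an embodiment of this application. The video encoder 200 includes an inter predictor 201, an intra predictor 202, a summer 203, a transformer 204, a quantizer 205, and an entropy encoder 206. For image block reconstruction, the video encoder 200 further includes an inverse quantizer 207, an inverse transformer 208, a summer 209, and a filter unit 210. The inter predictor 201 includes a motion estimation unit and a motion compensation unit. The intra predictor 202 includes an intra prediction selection unit and an intra prediction unit. The filter unit 210 is intended to represent one or more loop filters, such as a deblocking filter, an adaptive loop filter (ALF), and a sample adaptive offset (SAO) filter. Although the filter unit 210 is shown in FIG. 2 as an in-loop filter, in other implementations it may be implemented as a post-loop filter. In one example, the video encoder 200 may further include a video data memory and a partitioning unit (not shown in the figure). The video data memory may store video data to be encoded by the components of the video encoder 200; this video data may be obtained from a video source. The DPB 107 may be a reference picture memory that stores reference video data used by the video encoder 200 to encode video data in intra and inter coding modes. The video data memory and the DPB 107 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. The video data memory and the DPB 107 may be provided by the same memory device or by separate memory devices. In various examples, the video data memory may be on-chip with the other components of the video encoder 200, or off-chip relative to those components.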
视频编码器200接收视频数据,并将视频数据存储在视频数据存储器中。分割单元将所述视频数据分割成若干图像块,而且这些图像块可以被进一步分割为更小的块,例如基于四叉树结构或者二叉树结构的图像块分割。此分割还可包含分割成条带(slice)、片(tile)或其它较大单元。视频编码器200通常说明编码待编码的视频条带内的图像块的组件。条带可分成多个图像块(并且可能分成被称作片的图像块集合)。
在对视频数据进行分割得到当前图像块之后,可以通过帧间预测器201对当前图像块进行帧间预测。帧间预测是指在已重建的图像中,为当前图像中的当前图像块寻找匹配的参考块,从而得到当前图像块的运动信息,然后根据运动信息计算出当前图像块中像素点的像素值的预测信息(预测块)。其中,计算运动信息的过程称为运动估计。运动估计过程需要为当前图像块在参考图像中尝试多个参考块,最终使用哪一个或者哪几个参考块用作预测则使用率失真优化(rate-distortion optimization,RDO)或者其他方法确定。计算出当前图像块的预测块的过程称为运动补偿。具体的,本申请实施例所述的双向帧间预测方法可以由帧间预测器201执行。
在对视频数据进行分割得到当前图像块之后,还可以通过帧内预测器202对当前图像块进行帧内预测。帧内预测是指利用当前图像块所在的图像内已重建图像块内的像素点的像素值对当前图像块内像素点的像素值进行预测。
在视频数据经帧间预测器201和帧内预测器202产生当前图像块的预测块之后,视频编码器200通过从待编码的当前图像块减去预测块来形成残差图像块。求和器203表示执行此减法运算的一或多个组件。残差块中的残差视频数据可包含在一或多个变换单元(transform unit,TU)中,并应用于变换器204。变换器204使用例如离散余弦变换或概念上类似的变换等变换将残差视频数据变换成残差变换系数。变换器204可将残差视频数据从像素值域转换到变换域,例如频域。
变换器204可将所得变换系数发送到量化器205。量化器205量化变换系数以进一步减小位速率。在一些实例中,量化器205可接着执行对包含经量化的变换系数的 矩阵的扫描。或者,熵编码器206可执行扫描。
在量化之后,熵编码器206对经量化变换系数进行熵编码。举例来说,熵编码器206可执行上下文自适应可变长度编码(CAVLC)、上下文自适应二进制算术编码(CABAC)、基于语法的上下文自适应二进制算术编码(SBAC)、概率区间分割熵(PIPE)编码或另一熵编码方法或技术。在由熵编码器206熵编码之后,可将经编码码流发射到视频解码器300,或经存档以供稍后发射或由视频解码器300检索。熵编码器206还可对待编码的当前图像块的语法元素进行熵编码。
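The inverse quantizer 207 and the inverse transformer 208 apply inverse quantization and inverse transformation, respectively, to reconstruct the residual block in the pixel domain, for example for later use as a reference block of a reference picture. The summer 209 adds the reconstructed residual block to the prediction block produced by the inter predictor 201 or the intra predictor 202 to produce a reconstructed image block. The filter unit 210 may be applied to the reconstructed image block to reduce distortion such as block artifacts. The reconstructed image block is then stored as a reference block in the decoded picture buffer, where the inter predictor 201 can use it as a reference block for inter prediction of blocks in subsequent video frames or pictures.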
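It should be understood that other structural variations of the video encoder 200 may be used to encode a video stream. For example, for some image blocks or image frames, the video encoder 200 may quantize the residual signal directly without processing by the transformer 204 and, correspondingly, without processing by the inverse transformer 208. Alternatively, for some image blocks or image frames, the video encoder 200 produces no residual data, and correspondingly no processing by the transformer 204, the quantizer 205, the inverse quantizer 207, or the inverse transformer 208 is needed. Alternatively, the video encoder 200 may store the reconstructed image block directly as a reference block without processing by the filter unit 210. Alternatively, the quantizer 205 and the inverse quantizer 207 in the video encoder 200 may be combined.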
视频编码器200用于将视频输出到后处理实体211。后处理实体211表示可处理来自视频编码器200的经编码视频数据的视频实体的实例,例如媒体感知网络元件(MANE)或拼接/编辑装置。在一些情况下,后处理实体211可为网络实体的实例。在一些视频编码系统中,后处理实体211和视频编码器200可为单独装置的若干部分,而在其它情况下,相对于后处理实体211所描述的功能性可由包括视频编码器200的相同装置执行。在某一实例中,后处理实体211是图1的存储装置107的实例。
图3为本申请实施例的视频解码器300的简化示意图。视频解码器300包括熵解码器301、反量化器302、反变换器303、求和器304、滤波器单元305、帧间预测器306和帧内预测器307。视频解码器300可执行大体上与相对于来自图2的视频编码器200描述的编码过程互逆的解码过程。首先,利用熵解码器301、反量化器302和反变换器303得到残差信息,解码码流确定当前图像块使用的是帧内预测还是帧间预测。如果是帧内预测,则帧内预测器307利用周围已重建区域内像素点的像素值按照所使用的帧内预测方法构建预测信息。如果是帧间预测,则帧间预测器306需要解析出运动信息,并使用所解析出的运动信息在已重建的图像中确定参考块,并将块内像素点的像素值作为预测信息,使用预测信息加上残差信息经过滤波操作便可以得到重建信息。
本申请实施例所述的双向帧间预测方法不仅适用于无线应用场景,还可以应用于支持以下应用等多种多媒体应用的视频编解码:空中电视广播、有线电视发射、卫星电视发射、流式传输视频发射(例如,经由因特网)、存储于数据存储媒体上的视频数 据的编码、存储于数据存储媒体上的视频数据的解码,或其它应用。在一些实例中,视频编解码系统可经配置以支持单向或双向视频发射,以支持例如视频流式传输、视频播放、视频广播及/或视频电话等应用。
本申请实施例提供的双向帧间预测方法可以由双向帧间预测装置执行,也可以由视频编解码装置执行,还可以由视频编解码器执行,还可以由其它具有视频编解码功能的设备执行,本申请实施例对此不作具体限定。
为了便于说明,下面以双向帧间预测装置为执行主体为例对双向帧间预测方法进行说明。
图4为本申请实施例提供的双向帧间预测方法的流程示意图。图4所示的双向帧间预测方法既可以发生在编码过程,也可以发生在解码过程。例如,图4所示的双向帧间预测方法可以发生在编解码时的帧间预测过程。如图4所示,该双向帧间预测方法包括:
S401、双向帧间预测装置获取当前图像块的运动信息。
当前图像块为待编码图像块或待解码图像块。若当前图像块为待编码图像块,当前图像块的运动信息可以根据运动估计获得。若当前图像块为待解码图像块,当前图像块的运动信息可以根据码流解码获得。
运动信息主要包括当前图像块的预测方向信息、当前图像块的参考帧索引和当前图像块的运动矢量。当前图像块的预测方向信息包括前向预测、后向预测和双向预测。当前图像块的参考帧索引指示当前图像块的参考块所在的帧的索引。根据预测方向的不同,当前图像块的参考帧索引包括当前图像块的前向参考帧索引和当前图像块的后向参考帧索引。当前图像块的运动矢量表示当前图像块相对于参考块的运动位移。
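A motion vector includes a horizontal component (denoted $MV_x$) and a vertical component (denoted $MV_y$). The horizontal component represents the displacement of the current image block relative to the reference block in the horizontal direction, and the vertical component represents the displacement in the vertical direction. If the prediction direction information indicates forward prediction or backward prediction, there is only one motion vector; if it indicates bidirectional prediction, there are two. For example, the motion information for bidirectional prediction includes a first reference frame index, a second reference frame index, a first motion vector, and a second motion vector. The first reference frame index indicates the index of the frame containing the forward reference block of the current image block. The first motion vector represents the displacement of the current image block relative to the forward reference block. The second reference frame index indicates the index of the frame containing the backward reference block of the current image block. The second motion vector represents the displacement of the current image block relative to the backward reference block.
For example, as shown in FIG. 5, B denotes the current image block, and the frame containing it is the current frame. A denotes the forward reference block, and the frame containing it is the forward reference frame. C denotes the backward reference block, and the frame containing it is the backward reference frame. The index 0 denotes forward and 1 denotes backward. $MV0$ denotes the forward motion vector, $MV0 = (MV0_x, MV0_y)$, where $MV0_x$ is the horizontal component and $MV0_y$ is the vertical component of the forward motion vector. $MV1$ denotes the backward motion vector, $MV1 = (MV1_x, MV1_y)$, where $MV1_x$ is the horizontal component and $MV1_y$ is the vertical component of the backward motion vector. The dashed line represents the motion trajectory of the current image block B.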
S402、双向帧间预测装置根据运动信息获取当前图像块的初始预测块。
根据运动信息获取当前图像块的初始预测块的过程具体的可以参考现有技术,当 前图像块的初始预测块包括前向预测块和后向预测块。示例的,如图6所示,S402可以由以下详细步骤实现。
S601、双向帧间预测装置根据第一参考帧索引和第一运动矢量确定当前图像块的第一初始预测块。
首先,双向帧间预测装置根据第一参考帧索引可以确定当前图像块的第一参考块所在的第一参考帧,然后,在第一参考帧中根据第一运动矢量确定当前图像块的第一参考块,第一参考块经过亚像素插值得到第一初始预测块。第一初始预测块可以指当前图像块的前向预测块。
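Assume that the first reference frame index is the forward reference frame index. For example, as shown in FIG. 5, the forward reference frame containing the forward reference block A of the current image block B is first determined according to the forward reference frame index. Then the point (i′, j′) with the same coordinates as the current image block's coordinates (i, j) is located in the forward reference frame, the block B’ in the forward reference frame is determined according to the width and height of the current image block B, and the block B’ is moved to the forward reference block A according to the forward motion vector $MV0 = (MV0_x, MV0_y)$ of the current image block B. The forward reference block A is then sub-pixel interpolated to obtain the forward prediction block of the current image block B. Here (i, j) are the coordinates, in the current frame, of the top-left point of the current image block B, and the coordinate origin of the current frame is the top-left point of the current frame containing the current image block B. (i′, j′) are the coordinates, in the forward reference frame, of the top-left point of the block B’, and the coordinate origin of the forward reference frame is the top-left point of the forward reference frame containing the block B’.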
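The lookup of the forward reference block can be sketched as follows. This is a simplified illustration that assumes integer-pel motion vectors, so the sub-pixel interpolation step is skipped, and it assumes the frame is stored as a list of rows with `x` as the column and `y` as the row of the block's top-left pixel. The function name `fetch_reference_block` and its parameters are hypothetical.

```python
def fetch_reference_block(ref_frame, x, y, width, height, mv_x, mv_y):
    """Return the reference block displaced by (mv_x, mv_y) from position (x, y).

    ref_frame: list of rows of pixel values (the reference frame).
    x, y: column and row of the current block's top-left pixel (assumed convention).
    Only integer motion vectors are handled; no sub-pixel interpolation.
    """
    left = x + mv_x
    top = y + mv_y
    if (left < 0 or top < 0
            or top + height > len(ref_frame)
            or left + width > len(ref_frame[0])):
        raise ValueError("reference block falls outside the reference frame")
    return [row[left:left + width] for row in ref_frame[top:top + height]]


# A 4x4 reference frame whose pixel value encodes its position: row * 4 + column.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]
block = fetch_reference_block(frame, x=1, y=1, width=2, height=2, mv_x=1, mv_y=-1)
```

Real codecs pad the reference frame at its borders instead of rejecting out-of-range blocks; the bounds check here only keeps the example simple.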
S602、双向帧间预测装置根据第二参考帧索引和第二运动矢量确定当前图像块的第二初始预测块。
首先,双向帧间预测装置根据第二参考帧索引可以确定当前图像块的第二参考块所在的第二参考帧,然后,在第二参考帧中根据第二运动矢量确定当前图像块的第二参考块,第二参考块经过亚像素插值得到第二初始预测块。第二初始预测块可以指当前图像块的后向预测块。
需要说明的是,确定当前图像块的后向预测块的过程与确定当前图像块的前向预测块的过程相同,只是参考方向不同,具体的方法可以参考S601中的阐述。若当前图像块不是双向预测时,此时得到的前向预测块或后向预测块就是当前图像块的预测块。
S403a、双向帧间预测装置根据初始预测块的属性信息确定当前图像块的运动补偿方式。
初始预测块的属性信息包括初始预测块的尺寸、初始预测块包括的像素点的个数和初始预测块包括的像素点的像素值。另外,由于本申请实施例所述的方法是针对双向帧间预测的,因此,这里的初始预测块包括第一初始预测块和第二初始预测块。第一初始预测块和第二初始预测块的获取方式可以参考S402的阐述。本申请实施例在此以初始预测块包括的像素点的像素值为例对如何根据初始预测块的属性信息确定当前图像块的运动补偿方式进行说明。
示例的,假设当前图像块包括M*N个像素点,第一初始预测块包括M*N个像素点,第二初始预测块包括M*N个像素点。N为大于等于1的整数,M为大于等于1的整数,M与N可以相等,也可以不相等。如图7所示,S403a可以由以下详细步骤实现。
S701、双向帧间预测装置根据第一初始预测块的M*N个像素点的像素值与第二初始预测块的M*N个像素点的像素值得到M*N个像素差值。
双向帧间预测装置可以根据第一初始预测块的M*N个像素点的像素值与第二初始预测块的M*N个像素点的像素值做差,得到M*N个像素差值。应当理解的是,M*N个像素差值是第一初始预测块包括的各个像素点的像素值依次减去第二初始预测块中对应位置上的像素值得到的。这里所述的对应位置是指相对同样的坐标系中相同坐标点的位置。M*N个像素差值也相当于组成一个中间预测块。
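For example, as shown in FIG. 8, assume M=4 and N=4. The current image block includes 4*4 pixels, namely $b_{0,0}, b_{0,1}, b_{0,2}, b_{0,3}, \ldots, b_{3,0}, b_{3,1}, b_{3,2}, b_{3,3}$. The first initial prediction block includes 4*4 pixels, namely $a_{0,0}, a_{0,1}, a_{0,2}, a_{0,3}, \ldots, a_{3,0}, a_{3,1}, a_{3,2}, a_{3,3}$. The second initial prediction block includes 4*4 pixels, namely $c_{0,0}, c_{0,1}, c_{0,2}, c_{0,3}, \ldots, c_{3,0}, c_{3,1}, c_{3,2}, c_{3,3}$. A two-dimensional rectangular coordinate system is established in each block with $a_{0,0}$, $b_{0,0}$ and $c_{0,0}$ as the respective origins, i as the horizontal coordinate and j as the vertical coordinate. For example, the pixel of the second initial prediction block that corresponds to pixel $a_{0,0}$ of the first initial prediction block is the pixel $c_{0,0}$ at the same coordinate (0,0); subtracting $c_{0,0}$ from $a_{0,0}$ gives the pixel difference at coordinate (0,0). Taking the difference between the pixel values of the 4*4 pixels of the first initial prediction block and those of the 4*4 pixels of the second initial prediction block gives 4*4 pixel differences. The pixel difference is expressed by the formula $D(i,j) = \mathrm{abs}(A(i,j) - B(i,j))$, where (i,j) are the coordinates of a pixel within the block; $D(i,j)$ is the pixel difference of the pixel at coordinates (i,j), that is, the pixel in row i and column j; $A(i,j)$ is the pixel value at coordinates (i,j) in the first initial prediction block; $B(i,j)$ is the pixel value at coordinates (i,j) in the second initial prediction block; and abs() denotes the absolute value. i is an integer from 0 to M-1, and j is an integer from 0 to N-1. The 4*4 pixels corresponding to the 4*4 pixel differences can form an intermediate prediction block, which includes 4*4 pixels, namely $d_{0,0}, d_{0,1}, d_{0,2}, d_{0,3}, \ldots, d_{3,0}, d_{3,1}, d_{3,2}, d_{3,3}$.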
S702、双向帧间预测装置根据M*N个像素差值确定当前图像块的纹理复杂度。
双向帧间预测装置在根据第一初始预测块的M*N个像素点的像素值与第二初始预测块的M*N个像素点的像素值得到M*N个像素差值之后,可以再根据M*N个像素差值确定当前图像块的纹理复杂度。
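In one possible implementation, the texture complexity of the current image block may be determined from the sum of the M*N pixel differences. It should be understood that the sum of the M*N pixel differences here may also refer to the sum of the absolute values of the M*N pixel differences. The texture complexity of the current image block is then the sum of the M*N pixel differences, expressed as

$$\mathrm{SAD} = \sum_{i=0}^{M-1}\sum_{j=0}^{N-1} D(i,j)$$

where the sum of absolute differences (SAD) is the sum of the absolute values of the M*N pixel differences and is taken as the texture complexity.
In another possible implementation, the texture complexity of the current image block may be determined from the mean of the M*N pixel differences. The texture complexity of the current image block is then the mean of the M*N pixel differences, expressed as

$$\mu = \frac{1}{M \cdot N}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1} D(i,j)$$

where μ is the mean of the M*N pixel differences and M*N is the number of pixels.
In a third possible implementation, the texture complexity of the current image block may be determined from the standard deviation of the M*N pixel differences. The texture complexity of the current image block is then the standard deviation of the M*N pixel differences, expressed as

$$\sigma = \sqrt{\frac{1}{M \cdot N}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1} \bigl(D(i,j) - \mu\bigr)^2}$$

where σ is the standard deviation of the M*N pixel differences.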
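The three texture complexity measures of S701 and S702 can be sketched in Python as follows. The sketch assumes both initial prediction blocks are lists of rows of integer pixel values of identical shape and uses the population form of the standard deviation, which is an assumption; the function names are illustrative only.

```python
import math


def pixel_differences(first, second):
    """D(i, j) = abs(A(i, j) - B(i, j)) for co-located pixels of the two blocks."""
    return [[abs(a - b) for a, b in zip(a_row, b_row)]
            for a_row, b_row in zip(first, second)]


def texture_sad(diffs):
    """Sum of the absolute pixel differences."""
    return sum(sum(row) for row in diffs)


def texture_mean(diffs):
    """Mean of the pixel differences over all M*N pixels."""
    count = sum(len(row) for row in diffs)
    return texture_sad(diffs) / count


def texture_std(diffs):
    """Population standard deviation of the pixel differences (assumed form)."""
    count = sum(len(row) for row in diffs)
    mu = texture_mean(diffs)
    variance = sum((d - mu) ** 2 for row in diffs for d in row) / count
    return math.sqrt(variance)


diffs = pixel_differences([[10, 20], [30, 40]], [[12, 17], [30, 44]])
```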
S703、双向帧间预测装置根据当前图像块的纹理复杂度确定运动补偿方式。
双向帧间预测装置可以根据当前图像块的纹理复杂度与预设阈值进行比较,来确定运动补偿方式。示例的,判断当前图像块的纹理复杂度是否小于第一阈值,若当前图像块的纹理复杂度小于第一阈值,确定运动补偿方式为基于双向预测的加权预测技术;若当前图像块的纹理复杂度大于或等于第一阈值,确定运动补偿方式为基于双向预测的光流技术。第一阈值为大于0的任意实数,如150或200等。在实际应用中,需根据编解码参数、具体的编解码器及目标编解码时间可以对第一阈值做相应调整。需要说明的是,第一阈值的取值可以通过预先设置或者在高层语法中设置。在高层语法可以是序列参数集(sequence parameter set,SPS)、图像参数集(picture parameter set,PPS)或片头(slice header)等参数集中指定。
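The threshold comparison of S703 reduces to a single branch, sketched below. The default threshold of 150 is one of the example values mentioned in the text; the constant names and the function name are hypothetical.

```python
WEIGHTED_PREDICTION = "weighted_prediction"  # weighted prediction based on bidirectional prediction
OPTICAL_FLOW = "bio"                         # optical flow based on bidirectional prediction


def select_by_texture(texture_complexity, first_threshold=150):
    """Pick the motion compensation mode from the texture complexity (S703)."""
    if first_threshold <= 0:
        raise ValueError("the first threshold must be greater than 0")
    if texture_complexity < first_threshold:
        return WEIGHTED_PREDICTION
    return OPTICAL_FLOW
```

Because the threshold may also be carried in high-level syntax, a decoder-side version would read `first_threshold` from the parsed parameter set instead of using a default.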
S403b、双向帧间预测装置根据运动信息和初始预测块的属性信息确定当前图像块的运动补偿方式。
在双向帧间预测装置根据初始预测块的属性信息确定当前图像块的运动补偿方式时,还可以根据当前图像块的运动幅度来与初始预测块的属性信息共同确定运动补偿方式。当前图像块的运动幅度可以由运动信息确定。初始预测块的属性信息可以根据上述S701和S702得到,本申请实施例在此不再赘述。
示例的,如图9所示,在双向帧间预测装置根据M*N个像素差值确定当前图像块的纹理复杂度,即S702之后,本申请实施例还可以包括以下详细步骤。
S901、双向帧间预测装置根据第一运动矢量确定当前图像块的第一运动幅度,并根据第二运动矢量确定当前图像块的第二运动幅度。
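For example, the first motion magnitude, denoted dist0, is computed from $MV0_x$ and $MV0_y$, where $MV0_x$ is the horizontal component of the first motion vector (the forward motion vector) and $MV0_y$ is its vertical component. The second motion magnitude, denoted dist1, is computed from $MV1_x$ and $MV1_y$, where $MV1_x$ is the horizontal component of the second motion vector (the backward motion vector) and $MV1_y$ is its vertical component.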
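One plausible way to compute the motion magnitudes of S901 is the Euclidean length of each motion vector. The exact formula is not available in this text, so the Euclidean norm used below is an assumption, as is the function name `motion_magnitude`.

```python
import math


def motion_magnitude(mv_x, mv_y):
    """Euclidean length of a motion vector (assumed definition of the motion magnitude)."""
    return math.hypot(mv_x, mv_y)


dist0 = motion_magnitude(3, 4)    # first (forward) motion vector
dist1 = motion_magnitude(-6, 8)   # second (backward) motion vector
```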
需要说明的是,本申请实施例提供的双向帧间预测方法步骤的先后顺序可以进行适当调整,步骤也可以根据情况进行相应增减,示例的,如S901、S701和S702之间的前后顺序可以互换,即可先执行S901,再执行S701和S702,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化的方法,都应涵盖在本申请的保护范围之内,因此不再赘述。
S902、双向帧间预测装置根据当前图像块的纹理复杂度、第一运动幅度、第二运动幅度和第一数学模型确定选择概率。
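For example, the first mathematical model may be a first logistic regression model whose parameters are $\omega_0$, $\omega_1$, $\omega_2$ and $\omega_3$. A typical value of $\omega_0$ is 2.06079643, of $\omega_1$ is -0.01175306, of $\omega_2$ is -0.00122516, and of $\omega_3$ is -0.0008786. Substituting the texture complexity of the current image block together with dist0 and dist1 into the first logistic regression model yields the selection probability y. Note that the parameters of the first logistic regression model may be preset or set in high-level syntax, for example specified in a parameter set such as the SPS, the PPS, or the slice header.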
可选的,除了通过逻辑回归模型计算得到选择概率y以外,还可以在编码时预定义一个第一映射表。第一映射表中保存当前图像块的纹理复杂度、第一运动幅度和第二运动幅度的各个可能的取值,以及对应的选择概率y的取值。编码时可以通过查表的方式得到选择概率y的值。
S903、双向帧间预测装置根据选择概率确定运动补偿方式。
可以根据选择概率与预设阈值进行比较,来确定运动补偿方式。示例的,判断选择概率是否大于第二阈值,若选择概率大于第二阈值,确定运动补偿方式为基于双向预测的光流技术;若选择概率小于或等于第二阈值,确定运动补偿方式为基于双向预测的加权预测技术。第二阈值为大于等于0且小于等于1的任意实数。例如,第二阈值的取值可以为0.7。
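Steps S902 and S903 can be sketched together as below. The pairing of each parameter with an input (ω1 with the texture complexity, ω2 with the first motion magnitude, ω3 with the second) is an assumption made by analogy with the usual logistic regression form, not something this text confirms. The parameter values and the example threshold 0.7 are the ones quoted in the text; all names are hypothetical.

```python
import math

# Typical parameter values quoted for the first logistic regression model.
OMEGA = (2.06079643, -0.01175306, -0.00122516, -0.0008786)


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def selection_probability(texture, dist0, dist1, omega=OMEGA):
    """Selection probability y; the term-to-parameter pairing is assumed."""
    z = omega[0] + omega[1] * texture + omega[2] * dist0 + omega[3] * dist1
    return sigmoid(z)


def select_by_probability(y, second_threshold=0.7):
    """S903: optical flow if y exceeds the second threshold, else weighted prediction."""
    if not 0.0 <= second_threshold <= 1.0:
        raise ValueError("the second threshold must lie in [0, 1]")
    return "bio" if y > second_threshold else "weighted_prediction"
```

The two-branch `sigmoid` avoids an overflow in `math.exp` when the texture complexity is very large and the linear term becomes strongly negative.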
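S403c. The bidirectional inter prediction apparatus determines the motion compensation mode of the current image block according to the motion information and the attribute information of the current image block.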
当前图像块的属性信息包括当前图像块的尺寸、当前图像块包括的像素点的个数和当前图像块包括的像素点的像素值。下面以当前图像块的尺寸为例结合附图详细解释双向帧间预测装置根据运动信息和当前图像块的属性信息确定运动补偿方式。由于当前图像块由像素点组成的像素点阵列,双向帧间预测装置便可以根据像素点得到当前图像块的尺寸。可理解的,当前图像块的尺寸为当前图像块的宽和高。如图10所示,S403c可以由以下详细步骤实现。
S1001、双向帧间预测装置根据当前图像块的尺寸、第一运动矢量的水平分量、第一运动矢量的垂直分量、第二运动矢量的水平分量、第二运动矢量的垂直分量和第二数学模型确定选择概率。
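For example, the second mathematical model may be a second logistic regression model, given by

$$y = \frac{1}{1 + \exp\bigl(-1 \times (\omega_0 + \omega_1 \cdot H + \omega_2 \cdot W + \omega_3 \cdot MV0_x + \omega_4 \cdot MV0_y + \omega_5 \cdot MV1_x + \omega_6 \cdot MV1_y)\bigr)}$$

where $\omega_0$, $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$, $\omega_5$ and $\omega_6$ are the parameters of the second logistic regression model. A typical value of $\omega_0$ is -0.18929861, of $\omega_1$ is 4.81715386e-03, of $\omega_2$ is 4.66279123e-03, of $\omega_3$ is -7.46496930e-05, of $\omega_4$ is 1.23565538e-04, of $\omega_5$ is -4.25855176e-05, and of $\omega_6$ is 1.44069088e-04. W is the width and H is the height of the prediction block of the current image block. $MV0_x$ and $MV0_y$ are the horizontal and vertical components of the first motion vector (the forward motion vector). $MV1_x$ and $MV1_y$ are the horizontal and vertical components of the second motion vector (the backward motion vector). Substituting the size of the current image block and the horizontal and vertical components of the first and second motion vectors into the second logistic regression model yields the selection probability y. Note that the parameters of the second logistic regression model may be preset or set in high-level syntax, for example specified in a parameter set such as the SPS, the PPS, or the slice header.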
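The second logistic regression model of S1001 can be evaluated as in the sketch below, which uses the typical parameter values quoted in the text. The function and variable names are illustrative, and motion vectors are assumed to be passed as `(horizontal, vertical)` pairs.

```python
import math

# Typical parameter values quoted for the second logistic regression model.
OMEGA2 = (-0.18929861, 4.81715386e-03, 4.66279123e-03,
          -7.46496930e-05, 1.23565538e-04, -4.25855176e-05, 1.44069088e-04)


def selection_probability_from_size(height, width, mv0, mv1, omega=OMEGA2):
    """y = 1 / (1 + exp(-(w0 + w1*H + w2*W + w3*MV0x + w4*MV0y + w5*MV1x + w6*MV1y)))."""
    z = (omega[0] + omega[1] * height + omega[2] * width
         + omega[3] * mv0[0] + omega[4] * mv0[1]
         + omega[5] * mv1[0] + omega[6] * mv1[1])
    # Numerically stable evaluation of the logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


y = selection_probability_from_size(64, 64, mv0=(0, 0), mv1=(0, 0))
```

With these parameter values, larger blocks push the probability up, so a 64*64 block with zero motion already lands above 0.5, while a 0*0 input would fall just below it.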
可选的,除了通过第二逻辑回归模型计算得到选择概率y以外,还可以在编码时预定义一个第二映射表。第二映射表中保存当前图像块的尺寸、第一运动矢量的水平分量、第一运动矢量的垂直分量、第二运动矢量的水平分量、第二运动矢量的垂直分量的各个可能的取值,以及对应的选择概率y的取值。编码时可以通过查表的方式得 到选择概率y的值。
S1002、双向帧间预测装置根据选择概率确定运动补偿方式。
对于S1002的详细解释可以参考S903中的阐述,本申请实施例在此不再赘述。
S404、双向帧间预测装置根据确定的运动补偿方式以及初始预测块对当前图像块进行运动补偿。
初始预测块包括第一初始预测块和第二初始预测块。基于双向预测的加权预测技术以及初始预测块对当前图像块进行运动补偿,以及基于双向预测的光流技术以及初始预测块对当前图像块进行运动补偿都可以参考现有技术的具体实现方式,本申请实施例在此不再赘述。
进一步的,在双向帧间预测装置通过上述实施例所述的双向帧间预测方法确定双向运动补偿所使用的运动补偿方式后,可以将所选的运动补偿方式写入当前图像块的语法元素中。解码时无需重复进行判决动作,只需根据语法元素直接选择运动补偿方式。
示例的,为当前图像块分配一个语法元素(Bio_flag),该语法元素在码流中占1比特。当Bio_flag取值为0时,表示运动补偿方式为基于双向预测的加权预测技术;当Bio_flag取值为1时,表示运动补偿方式为基于双向预测的光流技术。Bio_flag的初始值为0。当解码端解析码流后,得到当前解码块的语法元素(Bio_flag)的值。根据Bio_flag的值确定双向运动补偿所使用的运动补偿方式。若Bio_flag取值为0,运动补偿方式为基于双向预测的加权预测技术;若Bio_flag取值为1,运动补偿方式为基于双向预测的光流技术。
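Signalling the chosen mode with the one-bit Bio_flag can be sketched as follows. The bitstream is modelled as a plain Python list of bits, which is a simplification, since a real codec would entropy-code the flag; the helper names `write_bio_flag` and `read_bio_flag` are hypothetical.

```python
WEIGHTED_PREDICTION = 0  # Bio_flag == 0: weighted prediction based on bidirectional prediction
OPTICAL_FLOW = 1         # Bio_flag == 1: optical flow based on bidirectional prediction


def write_bio_flag(bits, mode):
    """Encoder side: append the 1-bit Bio_flag for the current block."""
    if mode not in (WEIGHTED_PREDICTION, OPTICAL_FLOW):
        raise ValueError("Bio_flag must be 0 or 1")
    bits.append(mode)


def read_bio_flag(bits, position):
    """Decoder side: read Bio_flag at `position`; return (mode, next position)."""
    return bits[position], position + 1


bits = []
write_bio_flag(bits, OPTICAL_FLOW)
write_bio_flag(bits, WEIGHTED_PREDICTION)
mode, next_position = read_bio_flag(bits, 0)
```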
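Optionally, a syntax element may also be set in high-level syntax to specify the decision method that the bidirectional inter prediction apparatus uses to determine the motion compensation mode. For example, the decision method is a first, second, or third decision method. The first decision method determines the motion compensation mode of the current image block according to the attribute information of the initial prediction block. The second decision method determines it according to the motion information and the attribute information of the initial prediction block. The third decision method determines it according to the motion information and the attribute information of the current image block. For the specific implementation of the three decision methods, refer to the detailed description in the foregoing embodiments; details are not repeated here. The syntax element may be set in a parameter set such as the SPS, the PPS, or the slice header.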
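For example, the syntax element may be a selection mode (select_mode), which occupies 2 bits in the bitstream. The initial value of select_mode is 0. Table 1 lists the values of select_mode and the decision method each value indicates.
Table 1
select_mode value | Decision method |
---|---|
0 | First decision method |
1 | Second decision method |
2 | Third decision method |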
在双向帧间预测装置获取到当前图像块的运动信息之后,根据指定的判决方法确定运动补偿方式。如果确定的判决方法为第一种判决方法,双向帧间预测装置按照第一种判决方法进行双向帧间预测。如果确定的判决方法为第二种判决方法,双向帧间 预测装置按照第二种判决方法进行双向帧间预测。如果确定的判决方法为第三种判决方法,双向帧间预测装置按照第三种判决方法进行双向帧间预测。
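The mapping from select_mode to a decision method can be sketched as a small lookup. The text assigns only the values 0, 1 and 2, so the sketch rejects the remaining 2-bit value 3, which is a choice made for this example; the string labels and the function name are hypothetical.

```python
# Labels for the three decision methods; the strings are illustrative only.
DECISION_METHODS = {
    0: "initial_prediction_block_attributes",             # first decision method
    1: "motion_info_and_initial_prediction_attributes",   # second decision method
    2: "motion_info_and_current_block_attributes",        # third decision method
}


def decision_method(select_mode):
    """Map the 2-bit select_mode syntax element to a decision method label."""
    if select_mode not in DECISION_METHODS:
        raise ValueError(f"select_mode {select_mode} has no assigned decision method")
    return DECISION_METHODS[select_mode]
```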
本申请实施例所述的双向帧间预测方法,在对当前图像块进行运动补偿,根据当前图像块的特征和当前图像块的预测块的特征确定合适的运动补偿方式,既兼顾了压缩比高的特点,又兼顾了编解码复杂度低的特点,从而,有效地达到了压缩比和复杂度的最佳平衡。
上述主要从各个网元之间交互的角度对本申请实施例提供的方案进行了介绍。可以理解的是,各个网元,例如双向帧间预测装置为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对双向帧间预测装置进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
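When each functional module is divided according to its corresponding function, FIG. 11 shows one possible composition of the bidirectional inter prediction apparatus involved in the foregoing embodiments. As shown in FIG. 11, the bidirectional inter prediction apparatus may include a motion estimation unit 1101, a determining unit 1102, and a motion compensation unit 1103.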
其中,运动估计单元1101,用于支持双向帧间预测装置执行图4所示的双向帧间预测方法中的S401,图6所示的双向帧间预测方法中的S401,图7所示的双向帧间预测方法中的S401,图9所示的双向帧间预测方法中的S401,图10所示的双向帧间预测方法中的S401。
确定单元1102,用于支持双向帧间预测装置执行图4所示的双向帧间预测方法中的S402、S403a、S403b和S403c,图6所示的双向帧间预测方法中的S601、S602、S403a、S403b和S403c,图7所示的双向帧间预测方法中的S601、S602、S701-S703、S403b和S403c,图9所示的双向帧间预测方法中的S601、S602、S701-S703、S901-S903和S403c,图10所示的双向帧间预测方法中的S601、S602、S403a、S403b、S1001和S1002。
运动补偿单元1103,用于支持双向帧间预测装置执行图4所示的双向帧间预测方法中的S404,图6所示的双向帧间预测方法中的S404,图7所示的双向帧间预测方法中的S404,图9所示的双向帧间预测方法中的S404,图10所示的双向帧间预测方法中的S404。
需要说明的是,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
本申请实施例提供的双向帧间预测装置,用于执行上述双向帧间预测方法,因此可以达到与上述双向帧间预测方法相同的效果。
在采用集成的单元的情况下,图12示出了上述实施例中所涉及的双向帧间预测装置的另一种可能的组成示意图。如图12所示,该双向帧间预测装置包括:处理模块1201和通信模块1202。
处理模块1201用于对双向帧间预测装置的动作进行控制管理,例如,处理模块1201用于支持双向帧间预测装置执行图4所示的双向帧间预测方法中的S402、S403a、S403b和S403c,图6所示的双向帧间预测方法中的S601、S602、S403a、S403b和S403c,图7所示的双向帧间预测方法中的S601、S602、S701-S703、S403b和S403c,图9所示的双向帧间预测方法中的S601、S602、S701-S703、S901-S903和S403c,图10所示的双向帧间预测方法中的S601、S602、S403a、S403b、S1001和S1002、和/或用于本文所描述的技术的其它过程。通信模块1202用于支持双向帧间预测装置与其他网络实体的通信,例如与图1或图3中示出的功能模块或网络实体之间的通信。双向帧间预测装置还可以包括存储模块1203,用于存储双向帧间预测装置的程序代码和数据。
其中,处理模块1201可以是处理器或控制器。其可以实现或执行结合本申请公开内容所描述的各种示例性的逻辑方框,模块和电路。处理器也可以是实现计算功能的组合,例如包含一个或多个微处理器组合,DSP和微处理器的组合等等。通信模块1202可以是收发电路或通信接口等。存储模块1203可以是存储器。
其中,上述方法实施例涉及的各场景的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
上述双向帧间预测装置11和双向帧间预测装置12均可执行上述图4、图6、图7、图9和图10中任一附图所示的双向帧间预测方法,双向帧间预测装置11和双向帧间预测装置12具体可以是视频编码装置、视频解码装置或者其他具有视频编解码功能的设备。双向帧间预测装置11和双向帧间预测装置12既可以用于在编码过程中进行运动补偿,也可以用于在解码过程中进行运动补偿。
本申请还提供一种终端,该终端包括:一个或多个处理器、存储器、通信接口。该存储器、通信接口与一个或多个处理器耦合;存储器用于存储计算机程序代码,计算机程序代码包括指令,当一个或多个处理器执行指令时,终端执行本申请实施例的双向帧间预测方法。
这里的终端可以是视频显示设备,智能手机,便携式电脑以及其它可以处理视频或者播放视频的设备。
本申请还提供一种视频编码器,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现本申请实施例的双向帧间预测方法。
本申请还提供一种视频解码器,包括非易失性存储介质,以及中央处理器,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,并执行所述可执行程序以实现本申请实施例的双向帧间预测方法。
本申请另一实施例还提供一种计算机可读存储介质,该计算机可读存储介质包括一个或多个程序代码,该一个或多个程序包括指令,当终端中的处理器在执行该程序代码时,该终端执行上述图4、图6、图7、图9和图10中任一附图所示的双向帧间预测方法。
在本申请的另一实施例中,还提供一种计算机程序产品,该计算机程序产品包括计算机执行指令,该计算机执行指令存储在计算机可读存储介质中;终端的至少一个处理器可以从计算机可读存储介质读取该计算机执行指令,至少一个处理器执行该计算机执行指令使得终端实施执行上述图4、图6、图7、图9和图10中任一附图所示的双向帧间预测方法中的双向帧间预测装置的步骤。
所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。该可用介质可以是磁性介质,(例如,软盘,硬盘、磁带)、光介质(例如,DVD)或者半导体介质(例如固态硬盘solid state disk(SSD))等。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个装置,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是一个物理单元或多个物理单元,即可以位于一个地方,或者也可以分布到多个不同地方。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该软件产品存储在一个存储介质中,包括若干指令用以使得一个设备(可以是单片机,芯片等)或处理器(processor)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。
Claims (29)
- 一种双向帧间预测方法,其特征在于,包括:获取当前图像块的运动信息,所述当前图像块为待编码图像块或待解码图像块;根据所述运动信息获取所述当前图像块的初始预测块;根据所述初始预测块的属性信息、或者根据所述运动信息和所述初始预测块的属性信息、或者根据所述运动信息和所述当前图像块的属性信息确定所述当前图像块的运动补偿方式,所述运动补偿方式为基于双向预测的加权预测技术或基于双向预测的光流技术BIO;根据确定的所述运动补偿方式以及所述初始预测块对所述当前图像块进行运动补偿。
- 根据权利要求1所述的方法,其特征在于,所述运动信息包括第一参考帧索引、第二参考帧索引、第一运动矢量和第二运动矢量;所述根据所述运动信息获取所述当前图像块的初始预测块,包括:根据所述第一参考帧索引和所述第一运动矢量确定所述当前图像块的第一初始预测块,所述第一参考帧索引用于表示所述当前图像块的前向参考块所在的帧的索引,所述第一运动矢量用于表示所述当前图像块相对所述前向参考块的运动位移,所述第一初始预测块的属性信息包括M*N个像素点的像素值,N为大于等于1的整数,M为大于等于1的整数;根据所述第二参考帧索引和所述第二运动矢量确定所述当前图像块的第二初始预测块,所述第二参考帧索引用于表示所述当前图像块的后向参考块所在的帧的索引,所述第二运动矢量用于表示所述当前图像块相对所述后向参考块的运动位移,所述第二初始预测块的属性信息包括M*N个像素点的像素值。
- 根据权利要求2所述的方法,其特征在于,所述根据所述初始预测块的属性信息确定所述当前图像块的运动补偿方式,包括:根据所述第一初始预测块的M*N个像素点的像素值与所述第二初始预测块的M*N个像素点的像素值得到M*N个像素差值;根据所述M*N个像素差值确定所述当前图像块的纹理复杂度;根据所述当前图像块的纹理复杂度确定所述运动补偿方式。
- 根据权利要求3所述的方法,其特征在于,所述根据所述M*N个像素差值确定所述当前图像块的纹理复杂度,包括:计算所述M*N个像素差值的绝对值之和;将所述M*N个像素差值的绝对值之和确定为所述当前图像块的纹理复杂度。
- 根据权利要求3所述的方法,其特征在于,所述根据所述M*N个像素差值确定所述当前图像块的纹理复杂度,包括:计算所述M*N个像素差值的平均值;将所述M*N个像素差值的平均值确定为所述当前图像块的纹理复杂度。
- 根据权利要求3所述的方法,其特征在于,所述根据所述M*N个像素差值确定所述当前图像块的纹理复杂度,包括:计算所述M*N个像素差值的标准差;将所述M*N个像素差值的标准差确定为所述当前图像块的纹理复杂度。
- 根据权利要求3-6中任一项所述的方法,其特征在于,所述根据所述当前图像块的纹理复杂度确定所述运动补偿方式,包括:判断所述当前图像块的纹理复杂度是否小于第一阈值,所述第一阈值为大于0的任意实数;若所述当前图像块的纹理复杂度小于所述第一阈值,确定所述运动补偿方式为基于双向预测的加权预测技术;若所述当前图像块的纹理复杂度大于或等于所述第一阈值,确定所述运动补偿方式为基于双向预测的光流技术。
- 根据权利要求2所述的方法,其特征在于,所述当前图像块的运动幅度由所述运动信息确定,所述根据所述运动信息和所述初始预测块的属性信息确定所述运动补偿方式,包括:根据所述第一运动矢量确定所述当前图像块的第一运动幅度,并根据所述第二运动矢量确定所述当前图像块的第二运动幅度;根据所述第一运动幅度、所述第二运动幅度、所述初始预测块的属性信息确定所述运动补偿方式。
- 根据权利要求8所述的方法,其特征在于,所述根据所述第一运动幅度、所述第二运动幅度、所述初始预测块的属性信息确定所述运动补偿方式,包括:根据所述第一初始预测块的M*N个像素点的像素值与所述第二初始预测块的M*N个像素点的像素值得到M*N个像素差值;根据所述M*N个像素差值确定所述当前图像块的纹理复杂度;根据所述当前图像块的纹理复杂度、所述第一运动幅度、所述第二运动幅度和第一数学模型确定选择概率;或者,根据所述当前图像块的纹理复杂度、所述第一运动幅度和所述第二运动幅度查询第一映射表确定选择概率,所述第一映射表包括选择概率与所述当前图像块的纹理复杂度、所述第一运动幅度和所述第二运动幅度的对应关系;根据所述选择概率确定所述运动补偿方式。
- 根据权利要求1所述的方法,其特征在于,所述运动信息包括第一运动矢量和第二运动矢量,所述根据所述运动信息和所述当前图像块的属性信息确定所述当前图像块的运动补偿方式,包括:根据所述当前图像块的尺寸、所述第一运动矢量的水平分量、所述第一运动矢量的垂直分量、所述第二运动矢量的水平分量、所述第二运动矢量的垂直分量和第二数学模型确定选择概率,所述第一运动矢量包括所述第一运动矢量的水平分量和所述第一运动矢量的垂直分量,所述第二运动矢量包括所述第二运动矢量的水平分量和所述第二运动矢量的垂直分量;或者,根据所述当前图像块的尺寸、所述第一运动矢量的水平分量、所述第一运动矢量的垂直分量、所述第二运动矢量的水平分量和所述第二运动矢量的垂直分量查询第二映射表确定选择概率,所述第二映射表包括选择数值与所述当前图像块的尺寸、所述第一运动矢量的水平分量、所述第一运动矢量的垂直分量、所述第二运动矢量的水平分量和所述第二运动矢量的垂直分量的对应关系;根据所述选择概率确定所述运动补偿方式。
- 根据权利要求9或10所述的方法,其特征在于,所述根据所述选择概率确定所述运动补偿方式,包括:判断所述选择概率是否大于第二阈值,所述第二阈值为大于等于0且小于等于1的任意实数;若所述选择概率大于所述第二阈值,确定所述运动补偿方式为基于双向预测的光流技术;若所述选择概率小于或等于所述第二阈值,确定所述运动补偿方式为基于双向预测的加权预测技术。
- 一种编码方法,其特征在于,包括:所述权利要求1-11中任一项所述的双向帧间预测方法用于编码过程中,所述当前图像块为待编码图像块。
- 一种解码方法,其特征在于,包括:所述权利要求1-11中任一项所述的双向帧间预测方法用于解码过程中,所述当前图像块为待解码图像块。
- 一种双向帧间预测装置,其特征在于,包括:运动估计单元,用于获取当前图像块的运动信息,所述当前图像块为待编码图像块或待解码图像块;确定单元,用于根据所述运动信息获取所述当前图像块的初始预测块;所述确定单元,还用于根据所述初始预测块的属性信息、或者根据所述运动信息和所述初始预测块的属性信息、或者根据所述运动信息和所述当前图像块的属性信息确定所述当前图像块的运动补偿方式,所述运动补偿方式为基于双向预测的加权预测技术或基于双向预测的光流技术BIO;运动补偿单元,用于根据确定的所述运动补偿方式以及所述初始预测块对所述当前图像块进行运动补偿。
- 根据权利要求14所述的装置,其特征在于,所述运动信息包括第一参考帧索引、第二参考帧索引、第一运动矢量和第二运动矢量;所述确定单元,具体用于:根据所述第一参考帧索引和所述第一运动矢量确定所述当前图像块的第一初始预测块,所述第一参考帧索引用于表示所述当前图像块的前向参考块所在的帧的索引,所述第一运动矢量用于表示所述当前图像块相对所述前向参考块的运动位移,所述第一初始预测块的属性信息包括M*N个像素点的像素值,N为大于等于1的整数,M为大于等于1的整数;根据所述第二参考帧索引和所述第二运动矢量确定所述当前图像块的第二初始预测块,所述第二参考帧索引用于表示所述当前图像块的后向参考块所在的帧的索引,所述第二运动矢量用于表示所述当前图像块相对所述后向参考块的运动位移,所述第二初始预测块的属性信息包括M*N个像素点的像素值。
- 根据权利要求15所述的装置,其特征在于,所述确定单元,具体用于:根据所述第一初始预测块的M*N个像素点的像素值与所述第二初始预测块的 M*N个像素点的像素值得到M*N个像素差值;根据所述M*N个像素差值确定所述当前图像块的纹理复杂度;根据所述当前图像块的纹理复杂度确定所述运动补偿方式。
- 根据权利要求16所述的装置,其特征在于,所述确定单元,具体用于:计算所述M*N个像素差值的绝对值之和;将所述M*N个像素差值的绝对值之和确定为所述当前图像块的纹理复杂度。
- 根据权利要求16所述的装置,其特征在于,所述确定单元,具体用于:计算所述M*N个像素差值的平均值;将所述M*N个像素差值的平均值确定为所述当前图像块的纹理复杂度。
- 根据权利要求16所述的装置,其特征在于,所述确定单元,具体用于:计算所述M*N个像素差值的标准差;将所述M*N个像素差值的标准差确定为所述当前图像块的纹理复杂度。
- 根据权利要求16-19中任一项所述的装置,其特征在于,所述确定单元,具体用于:判断所述当前图像块的纹理复杂度是否小于第一阈值,所述第一阈值为大于0的任意实数;若所述当前图像块的纹理复杂度小于所述第一阈值,确定所述运动补偿方式为基于双向预测的加权预测技术;若所述当前图像块的纹理复杂度大于或等于所述第一阈值,确定所述运动补偿方式为基于双向预测的光流技术。
- 根据权利要求15所述的装置,其特征在于,所述当前图像块的运动幅度由所述运动信息确定,所述确定单元,具体用于:根据所述第一运动矢量确定所述当前图像块的第一运动幅度,并根据所述第二运动矢量确定所述当前图像块的第二运动幅度;根据所述第一运动幅度、所述第二运动幅度、所述初始预测块的属性信息确定所述运动补偿方式。
- 根据权利要求21所述的装置,其特征在于,所述确定单元,具体用于:根据所述第一初始预测块的M*N个像素点的像素值与所述第二初始预测块的M*N个像素点的像素值得到M*N个像素差值;根据所述M*N个像素差值确定所述当前图像块的纹理复杂度;根据所述当前图像块的纹理复杂度、所述第一运动幅度、所述第二运动幅度和第一数学模型确定选择概率;或者,根据所述当前图像块的纹理复杂度、所述第一运动幅度和所述第二运动幅度查询第一映射表确定选择概率,所述第一映射表包括选择概率与所述当前图像块的纹理复杂度、所述第一运动幅度和所述第二运动幅度的对应关系;根据所述选择概率确定所述运动补偿方式。
- 根据权利要求14所述的装置,其特征在于,所述运动信息包括第一运动矢量和第二运动矢量,所述确定单元,具体用于:根据所述当前图像块的尺寸、所述第一运动矢量的水平分量、所述第一运动矢量 的垂直分量、所述第二运动矢量的水平分量、所述第二运动矢量的垂直分量和第二数学模型确定选择概率,所述第一运动矢量包括所述第一运动矢量的水平分量和所述第一运动矢量的垂直分量,所述第二运动矢量包括所述第二运动矢量的水平分量和所述第二运动矢量的垂直分量;或者,根据所述当前图像块的尺寸、所述第一运动矢量的水平分量、所述第一运动矢量的垂直分量、所述第二运动矢量的水平分量和所述第二运动矢量的垂直分量查询第二映射表确定选择概率,所述第二映射表包括选择数值与所述当前图像块的尺寸、所述第一运动矢量的水平分量、所述第一运动矢量的垂直分量、所述第二运动矢量的水平分量和所述第二运动矢量的垂直分量的对应关系;根据所述选择概率确定所述运动补偿方式。
- 根据权利要求22或23所述的装置,其特征在于,所述确定单元,具体用于:判断所述选择概率是否大于第二阈值,所述第二阈值为大于等于0且小于等于1的任意实数;若所述选择概率大于所述第二阈值,确定所述运动补偿方式为基于双向预测的光流技术;若所述选择概率小于或等于所述第二阈值,确定所述运动补偿方式为基于双向预测的加权预测技术。
- 一种终端,其特征在于,所述终端包括:一个或多个处理器、存储器和通信接口;所述存储器、所述通信接口与所述一个或多个处理器连接;所述终端通过所述通信接口与其他设备通信,所述存储器用于存储计算机程序代码,所述计算机程序代码包括指令,当所述一个或多个处理器执行所述指令时,所述终端执行如权利要求1-11中任意一项所述的双向帧间预测方法。
- 一种包含指令的计算机程序产品,其特征在于,当所述计算机程序产品在终端上运行时,使得所述终端执行如权利要求1-11中任意一项所述的双向帧间预测方法。
- 一种计算机可读存储介质,包括指令,其特征在于,当所述指令在终端上运行时,使得所述终端执行如权利要求1-11中任意一项所述的双向帧间预测方法。
- 一种视频编码器,包括非易失性存储介质以及中央处理器,其特征在于,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,当所述中央处理器执行所述可执行程序时,所述视频编码器执行如权利要求1-11中任意一项所述的双向帧间预测方法。
- 一种视频解码器,包括非易失性存储介质以及中央处理器,其特征在于,所述非易失性存储介质存储有可执行程序,所述中央处理器与所述非易失性存储介质连接,当所述中央处理器执行所述可执行程序时,所述视频解码器执行如权利要求1-11中任意一项所述的双向帧间预测方法。
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810276300.0 | 2018-03-30 | ||
CN201810276300.0A CN110324623B (zh) | 2018-03-30 | 2018-03-30 | 一种双向帧间预测方法及装置 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019184639A1 true WO2019184639A1 (zh) | 2019-10-03 |
Family
ID=68060915
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/076086 WO2019184639A1 (zh) | 2018-03-30 | 2019-02-25 | 一种双向帧间预测方法及装置 |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN110324623B (zh) |
WO (1) | WO2019184639A1 (zh) |
-
2018
- 2018-03-30 CN CN201810276300.0A patent/CN110324623B/zh active Active
- 2018-03-30 CN CN202111040982.3A patent/CN113923455B/zh active Active
-
2019
- 2019-02-25 WO PCT/CN2019/076086 patent/WO2019184639A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110324623A (zh) | 2019-10-11 |
CN110324623B (zh) | 2021-09-07 |
CN113923455A (zh) | 2022-01-11 |
CN113923455B (zh) | 2023-07-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19777466 Country of ref document: EP Kind code of ref document: A1 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19777466 Country of ref document: EP Kind code of ref document: A1 |