WO2022078150A1 - Method, apparatus, electronic device, and storage medium for determining a candidate motion information list - Google Patents
- Publication number
- WO2022078150A1 (application PCT/CN2021/118839)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- motion information
- information list
- candidate
- historical
- displacement vector
- Prior art date
Classifications
- H04N19/513—Processing of motion vectors
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
Definitions
- the present application relates to video coding technology, and in particular, to a method, apparatus, electronic device, storage medium and computer program product for determining a candidate motion information list.
- VVC Versatile Video Coding
- AVS3 Audio Video Coding Standard 3
- the motion information list cannot provide an effective predicted displacement vector, which degrades the compression performance of the video, prevents the size of the compressed video from being reduced effectively, and harms the user experience.
- the embodiments of the present application provide a method, apparatus, electronic device, storage medium, and computer program product for determining a candidate motion information list, which can acquire at least one piece of motion information based on a historical motion information table and fill the candidate motion information list based on the at least one piece of motion information, so as to obtain a better displacement vector prediction effect and improve video compression performance.
- An embodiment of the present application provides a method for determining a candidate motion information list, including:
- the historical motion information table is used for at least one of inter-frame prediction, intra-frame prediction, intra-frame block copy (IBC, Intra Block Copy) prediction, and intra-frame string copy (ISC, Intra String Copy) prediction;
- IBC intra-frame block copy
- ISC Intra String Copy
- when the target motion information list is not full, acquiring at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information;
- the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
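The filling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `fill_candidate_list`, the `max_len` parameter, the newest-first scan of the history table, and the duplicate check are all hypothetical choices made for the example, and the target list and candidate list are treated as the same list.

```python
from typing import List, Tuple

# A displacement vector as (dx, dy); this representation is an assumption.
Vector = Tuple[int, int]


def fill_candidate_list(candidates: List[Vector],
                        history: List[Vector],
                        max_len: int) -> List[Vector]:
    """Return a copy of `candidates` topped up from the history table.

    Assumptions (not from the source): the history table is scanned from
    its most recent entry backwards, and entries already present in the
    list are skipped.
    """
    filled = list(candidates)
    for vec in reversed(history):
        if len(filled) >= max_len:
            break  # the list is full, stop filling
        if vec not in filled:
            filled.append(vec)
    return filled
```

If the history table is empty or too short, the returned list simply stays below `max_len`, which is the situation the second method addresses by consulting an additional source.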
- the embodiment of the present application also provides a method for determining a candidate motion information list, including:
- when the target motion information list is not full, at least one piece of motion information is acquired based on at least one of the historical motion information table and the spatial motion information list, and the candidate motion information list is filled based on the at least one piece of motion information;
- the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
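A hedged sketch of this second variant, which draws on both the historical motion information table and the spatial motion information list. The source refers to a positional relationship between the two sources but does not spell it out here, so the example interprets it as the order in which they are consulted; that interpretation, the name `fill_from_sources`, the `history_first` flag, and the duplicate check are assumptions.

```python
from typing import Iterable, List, Tuple

# A displacement vector as (dx, dy); this representation is an assumption.
Vector = Tuple[int, int]


def fill_from_sources(candidates: List[Vector],
                      history: Iterable[Vector],
                      spatial: Iterable[Vector],
                      max_len: int,
                      history_first: bool = True) -> List[Vector]:
    """Top up `candidates` from two sources in a configurable order.

    `history_first` stands in for the positional relationship between the
    historical table and the spatial list (an assumed interpretation).
    """
    ordered = (history, spatial) if history_first else (spatial, history)
    filled = list(candidates)
    for source in ordered:
        for vec in source:
            if len(filled) >= max_len:
                return filled  # list is full
            if vec not in filled:
                filled.append(vec)
    return filled
```

Swapping `history_first` changes only which source gets priority when the list has fewer free slots than the two sources can supply.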
- the embodiment of the present application also provides an apparatus for determining a candidate motion information list, including:
- the first information processing module is configured to determine a target motion information list and a historical motion information table, wherein the historical motion information table is used for at least one of inter-frame prediction, intra-frame prediction, intra-frame block copy prediction, and intra-frame string copy prediction;
- a first information filling module configured to acquire at least one piece of motion information based on the historical motion information table when the target motion information list is not full;
- the first information filling module configured to fill the candidate motion information list based on the at least one motion information
- the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
- the embodiment of the present application also provides an apparatus for determining a candidate motion information list, including:
- the second information processing module is configured to determine the positional relationship between the historical motion information table and the spatial motion information list
- the second information filling module is configured to, when the target motion information list is not full, acquire at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and fill the candidate motion information list based on the at least one piece of motion information;
- the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
- the embodiment of the present application also provides an electronic device, the electronic device includes:
- a memory configured to store executable instructions
- the processor is configured to implement the method for determining the candidate motion information list provided by the embodiment of the present application when executing the executable instructions stored in the memory.
- Embodiments of the present application further provide a computer-readable storage medium storing executable instructions, which, when executed by a processor, implement the method for determining a candidate motion information list provided by the embodiments of the present application.
- the number of displacement vectors in the target motion information list and the historical motion information table is determined; when the target motion information list is not full, at least one piece of motion information is acquired based on the historical motion information table, and the candidate motion information list is filled based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block. In this way, more and more effective displacement vectors can be provided in the candidate motion information list, achieving a better displacement vector prediction effect, improving video compression performance, and improving the user experience.
- FIG. 1 is a schematic diagram of a usage scenario of a method for determining a candidate motion information list provided by an embodiment of the present application.
- FIG. 2 is a schematic diagram of the composition and structure of an electronic device provided by an embodiment of the present application.
- FIG. 3 is a schematic flowchart of a video encoding process provided by an embodiment of the present application.
- FIG. 4 is a schematic diagram of an inter-frame prediction mode provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of a candidate motion vector provided by an embodiment of the present application.
- FIG. 6 is a schematic diagram of an intra-block copy mode provided by an embodiment of the present application.
- FIG. 7 is a schematic diagram of an intra-frame string copy mode provided by an embodiment of the present application.
- FIG. 8 is a schematic flowchart of a method for determining a candidate motion information list according to an embodiment of the present application
- FIG. 9 is a schematic flowchart of a method for determining a candidate motion information list provided by an embodiment of the present application.
- FIG. 10 is a schematic flowchart of a method for determining a candidate motion information list provided by an embodiment of the present application
- FIG. 11 is a schematic diagram of spatially adjacent blocks provided by an embodiment of the present application.
- FIG. 12 is a schematic diagram of a usage scenario of the method for determining a candidate motion information list provided by an embodiment of the present application
- FIG. 13 is a schematic diagram of video compression provided by an embodiment of the present application.
- FIG. 14 is a schematic diagram of a compressed video presentation provided by an embodiment of the present application.
- FIG. 15 is a schematic flowchart of a method for determining a candidate motion information list provided by an embodiment of the present application.
- API: short for Application Programming Interface; a set of predefined functions, or a convention for connecting the different components of a software system. Its purpose is to give applications and developers access to a set of routines based on a piece of software or hardware without having to access the source code or understand the details of the inner workings.
- SDK: short for Software Development Kit; a collection of development tools, together with related documentation and examples, used to build application software for a specific software package, software framework, hardware platform, operating system, etc.
- P frame: an inter-frame predicted frame, which can use intra-frame prediction and inter-frame prediction, and uses forward reference prediction.
- B frame: an inter-frame predicted frame, which can use intra-frame prediction and inter-frame prediction, and can use forward, backward, and bidirectional reference prediction.
- I frame: an intra-frame predicted frame, which uses only intra-frame information for prediction.
- Video codec standard: an agreed set of rules for decoding a video code stream.
- Video transcoding: converting a compressed and encoded video stream into another video stream to adapt to different network bandwidths, different terminal processing capabilities, and different user needs.
- the carrier that implements specific functions in the terminal; for example, the mobile client (APP) is the carrier of specific functions in the mobile terminal, such as performing online live broadcast (video streaming) or playing online video.
- APP mobile client
- one or more of the executed operations may be real-time or may have a set delay; unless otherwise specified, there is no restriction on the order in which multiple operations are executed.
- FIG. 1 is a schematic diagram of a usage scenario of the method for determining the candidate motion information list provided by the embodiment of the present application.
- the terminals (including the terminal 10-1 and the terminal 10-2) are provided with clients capable of executing different functions, through which the terminals obtain videos from the corresponding server 200 through the network 300 using different business processes.
- the terminal connects to the server 200 through the network 300.
- the network 300 may be a wide area network, a local area network, or a combination of the two, and uses wireless links for data transmission.
- the types of video that the terminals (including the terminal 10-1 and the terminal 10-2) obtain from the corresponding servers through the network 300 are not necessarily the same; for example, a terminal can obtain a video from the corresponding server 200 through the network 300 (that is, the video carries the video information or the corresponding video link), or can obtain from the corresponding server 400 through the network 300 videos of different types only (e.g. short videos or long videos) to browse.
- Different types of videos may be stored in the server 200 and the server 400.
- the processes of different types of videos stored in the server 200 may be written in software codes of different programming languages, and the code objects may be different types of code entities.
- in C software code, a code object can be a function.
- in Java software code, a code object can be a class; in Objective-C on iOS, it can be a piece of object code.
- in C++ software code, a code object can be a class or a function.
- the compilation environment of different types of videos is no longer distinguished.
- in prior-art video compression processes such as VVC (Versatile Video Coding) and AVS3 (Audio Video coding Standard 3), the video codec usually needs to build a motion information list in order to derive the predicted displacement vector.
- the motion information list cannot provide a valid predicted displacement vector.
- IntraHMVP table: used to build a list of candidate motion information and derive the predicted block displacement vector (BVP, Block Vector Predictor) or predicted string vector (SVP, String Vector Predictor).
- BVP Block Vector Predictor
- SVP String Vector Predictor
- the maximum length of the IntraHMVP is 12, and the maximum length of the candidate motion information list is 7.
- when the IntraHMVP is too short or empty, the candidate motion information list cannot be filled, leaving insufficient motion information for displacement vector prediction. As a result, the compression performance of the video is affected and the video compression rate is reduced, which harms the user experience of video compression.
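The shortfall can be made concrete with a small calculation. The two maximum lengths come from the text; the helper name `unfilled_slots` is hypothetical, and the sketch assumes that each IntraHMVP entry contributes at most one candidate and that no other source is consulted.

```python
INTRA_HMVP_MAX_LEN = 12      # maximum IntraHMVP length stated in the text
CANDIDATE_LIST_MAX_LEN = 7   # maximum candidate list length stated in the text


def unfilled_slots(hmvp_len: int) -> int:
    """Number of candidate-list slots left empty for a given IntraHMVP length.

    Assumes one candidate per IntraHMVP entry and no other filling source.
    """
    if not 0 <= hmvp_len <= INTRA_HMVP_MAX_LEN:
        raise ValueError("IntraHMVP length must be between 0 and 12")
    return max(0, CANDIDATE_LIST_MAX_LEN - hmvp_len)
```

Under these assumptions, an empty IntraHMVP leaves all seven slots unfilled, while any IntraHMVP holding seven or more entries can fill the list completely.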
- when the server 200 sends different types of videos to, or receives them from, the terminal (terminal 10-1 and/or terminal 10-2) through the network 300, the video information needs to be compressed because it occupies a large storage space.
- the server 200 is configured to determine a target motion information list and a historical motion information table, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy prediction, and intra string copy prediction.
- the number of displacement vectors in the historical motion information table can also be determined; when the target motion information list is not full, at least one piece of motion information is acquired based on the historical motion information table, and the candidate motion information list is filled based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
- the server 200 may also be configured to determine the positional relationship between the historical motion information table and the spatial motion information list; when the target motion information list is not full, acquire at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and fill the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
- the server 200 can flexibly adjust the process of determining the candidate motion information list according to different usage environments or user settings.
- other types of historical motion information tables may also be selected to fill the candidate motion information lists in the process of inter prediction, intra block copy prediction and intra frame string copy prediction.
- the server 200 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN, Content Delivery Network), and big data and artificial intelligence platforms.
- FIG. 2 is a schematic diagram of the composition and structure of an electronic device provided by an embodiment of the present application. It can be understood that FIG. 2 only shows an exemplary structure of a server but not the entire structure, and part or all of the structure shown in FIG. 2 may be implemented as required.
- the server provided in this embodiment of the present application includes: at least one processor 201, a memory 202, a user interface 203, and at least one network interface 204.
- the various components in the electronic device are coupled together by a bus system 205.
- the bus system 205 is configured to enable connection communication between these components.
- the bus system 205 also includes a power bus, a control bus and a status signal bus.
- the various buses are labeled as bus system 205 in FIG. 2.
- the user interface 203 may include a display, a keyboard, a mouse, a trackball, a click wheel, keys, buttons, a touch pad or a touch screen, and the like.
- the memory 202 may be either volatile memory or non-volatile memory, and may include both volatile and non-volatile memory.
- the memory 202 in this embodiment of the present application can store data to support the operation of the terminal (eg 10-1). Examples of such data include: any computer program used to operate on the terminal (eg 10-1), such as operating systems and applications.
- the operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks.
- Applications can contain various applications.
- the device for determining the candidate motion information list provided by the embodiments of the present application may be implemented by a combination of software and hardware.
- the device for determining the candidate motion information list provided by the embodiments of the present application may be a processor in the form of a hardware decoding processor, which is programmed to execute the method for determining the candidate motion information list provided by the embodiments of the present application.
- a processor in the form of a hardware decoding processor may adopt one or more Application Specific Integrated Circuits (ASIC, Application Specific Integrated Circuit), DSP, Programmable Logic Device (PLD, Programmable Logic Device), Complex Programmable Logic Device (CPLD, Complex Programmable Logic Device), Field Programmable Gate Array (FPGA, Field-Programmable Gate Array) or other electronic components.
- ASIC Application Specific Integrated Circuit
- DSP Digital Signal processor
- PLD Programmable Logic Device
- CPLD Complex Programmable Logic Device
- FPGA Field-Programmable Gate Array
- the device for determining the candidate motion information list provided by the embodiment of the present application can be directly embodied as a combination of software modules executed by the processor 201. The software modules can be located in a storage medium, and the storage medium is located in the memory 202; the processor 201 reads the executable instructions included in the software modules in the memory 202 and, in combination with the necessary hardware (for example, the processor 201 and other components connected to the bus system 205), completes the method for determining the candidate motion information list.
- the processor 201 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc., where a general-purpose processor may be a microprocessor or any conventional processor, or the like.
- the apparatus for determining the candidate motion information list may be directly executed by the processor 201 in the form of a hardware decoding processor, for example, by one or more Application Specific Integrated Circuit (ASIC, Application Specific Integrated Circuit), DSP, Programmable Logic Device (PLD, Programmable Logic Device), Complex Programmable Logic Device (CPLD, Complex Programmable Logic Device), Field Programmable Gate Array (FPGA, Field-Programmable Gate Array) or other electronic components to implement the method for determining the candidate motion information list provided by the embodiments of the present application.
- the memory 202 in the embodiment of the present application is configured to store various types of data to support the operation of the electronic device. Examples of such data include any executable instructions for operating on the electronic device; the program implementing the method for determining the candidate motion information list according to the embodiment of the present application may be included in the executable instructions.
- the apparatus for determining the candidate motion information list may be implemented in software.
- FIG. 2 shows the apparatus 2020 for determining the candidate motion information list stored in the memory 202, which may be software in the form of a program, a plug-in, or the like, and includes the following software modules: a first information processing module 2081, a first information filling module 2082, a second information processing module 2083, and a second information filling module 2084.
- the first information processing module 2081 configured to determine the target motion information list and the historical motion information table
- the first information filling module 2082 is configured to obtain at least one piece of motion information based on the historical motion information table when the target motion information list is not full;
- the first information filling module 2082 configured to fill the candidate motion information list based on the at least one motion information
- the second information processing module 2083 is configured to determine the positional relationship between the historical motion information table and the spatial motion information list
- the second information filling module 2084 is configured to, when the target motion information list is not full, obtain at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and fill the candidate motion information list based on the at least one piece of motion information;
- the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
- the present application further provides a computer program product or computer program comprising computer instructions, the computer instructions being stored in a computer-readable storage medium.
- the processor of the computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device performs the method for determining the candidate motion information list provided in the various embodiments and combinations of embodiments described above.
- FIG. 3 is a schematic flowchart of the video encoding process in the embodiment of the present application.
- the video signal refers to an image sequence including multiple frames.
- a frame is a representation of the spatial information of a video signal. Taking YUV mode as an example, a frame includes one luma sample matrix (Y) and two chroma sample matrices (Cb and Cr). In terms of how it is acquired, a video signal can be either camera-captured or computer-generated. Because the statistical characteristics of the two differ, the corresponding compression coding methods may also differ.
- H.265/HEVC High Efficiency Video Coding
- H.266/VVC Versatile Video Coding
- AVS Audio Video coding Standard
- Block Partition Structure: the input image is divided into several non-overlapping processing units, and each processing unit undergoes similar compression operations.
- This processing unit is called a Coding Tree Unit (CTU), or the Largest Coding Unit (LCU). Below the CTU level, finer divisions can be made to obtain one or more basic coding units, which are called Coding Units (CU).
- CTU Coding Tree Unit
- LCU Largest Coding Unit
- CU Coding Unit
- Each CU is the most basic element in a coding session. Described below are various encoding methods that may be used for each CU.
- Predictive Coding: this includes intra-frame prediction and inter-frame prediction. After the original video signal is predicted from the selected reconstructed video signal, a residual video signal is obtained. The encoder needs to decide among many possible predictive coding modes for the current CU, select the most suitable one, and inform the decoder. Intra-frame prediction means that the predicted signal comes from an area of the same image that has already been coded and reconstructed. Inter-frame prediction means that the predicted signal comes from other, already encoded pictures (called reference pictures) that are different from the current picture.
- Transform & Quantization: the residual video signal undergoes a transform operation such as the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT), which converts the signal into the transform domain; the resulting values are called transform coefficients.
- DFT Discrete Fourier Transform
- DCT Discrete Cosine Transform
- the signal in the transform domain is subjected to a lossy quantization operation, which loses certain information, so that the quantized signal is beneficial to the compressed expression.
- the encoder also needs to select one of the transformations for the current CU and inform the decoder.
- the fineness of quantization is usually determined by the quantization parameter (Quantization Parameter, QP).
- A larger QP value means that coefficients within a larger value range are quantized to the same output, which usually brings greater distortion and a lower code rate; conversely, a smaller QP value means that coefficients within a smaller value range are quantized to the same output, which usually brings less distortion and corresponds to a higher code rate.
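- The trade-off described above can be illustrated with a minimal sketch of uniform scalar quantization. The step sizes 2.0 and 16.0 and the function names are hypothetical; the source does not give the mapping from QP to step size, so a larger step simply stands in for a larger QP here.

```python
import math

def quantize(coeff: float, step: float) -> int:
    """Map a transform coefficient to an integer level (round half away from zero)."""
    sign = -1 if coeff < 0 else 1
    return sign * math.floor(abs(coeff) / step + 0.5)

def dequantize(level: int, step: float) -> float:
    """Reconstruct an approximate coefficient from its level."""
    return level * step

coeffs = [3.0, 5.0, 9.0, 14.0]

# Small step (stands in for a small QP): many distinct levels, small error.
fine_levels = [quantize(c, 2.0) for c in coeffs]
fine_error = max(abs(c - dequantize(q, 2.0)) for c, q in zip(coeffs, fine_levels))

# Large step (stands in for a large QP): few distinct levels, large error.
coarse_levels = [quantize(c, 16.0) for c in coeffs]
coarse_error = max(abs(c - dequantize(q, 16.0)) for c, q in zip(coeffs, coarse_levels))
```

- With the larger step, more coefficients collapse onto the same level, so fewer symbols need to be coded but the reconstruction error grows.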
- Entropy Coding or Statistical Coding: the quantized transform-domain signal undergoes statistical compression coding according to the frequency of occurrence of each value, and a binarized (0 or 1) compressed code stream is finally output. Other information generated by encoding, such as the selected mode and the motion vector, also requires entropy coding to reduce the bit rate.
- Statistical coding is a lossless coding method that can effectively reduce the code rate required to express the same signal. Common statistical coding methods include Variable Length Coding (VLC) and Context-based Adaptive Binary Arithmetic Coding (CABAC).
- Loop Filtering: for an already encoded image, the reconstructed decoded image can be obtained by performing inverse quantization, inverse transformation and prediction compensation (the inverse of the corresponding operations described above). Compared with the original image, the reconstructed image differs in some information because of quantization, which results in distortion. Filtering the reconstructed image, for example with deblocking, Sample Adaptive Offset (SAO) or the Adaptive Loop Filter (ALF), can effectively reduce the distortion produced by quantization. Since these filtered reconstructed images are used as references for subsequently encoded images to predict future signals, the above filtering operations are also called in-loop filtering, that is, filtering operations within the encoding loop.
- SAO Sample Adaptive Offset
- ALF Adaptive Loop Filter
- the decoder first performs entropy decoding to obtain various mode information and quantized transform coefficients. Each coefficient is inversely quantized and inversely transformed to obtain a residual signal.
- the prediction signal corresponding to the CU can be obtained, and after adding the two, the reconstructed signal can be obtained.
- the reconstructed value of the decoded image needs to undergo a loop filtering operation to generate the final output signal.
- In related video coding standards, such as HEVC, VVC and AVS3, a block-based hybrid coding framework is adopted. These standards divide the original video data into a series of coding blocks and combine video coding methods such as prediction, transform and entropy coding to compress the video data.
- motion compensation is a type of prediction method commonly used in video coding. Based on the redundancy characteristics of video content in the temporal or spatial domain, motion compensation derives the prediction value of the current coding block from the coded region.
- Such prediction methods include: inter prediction, intra block copy prediction, intra string copy prediction, etc., which may be used alone or in combination in coding implementations.
- the displacement vector may have different names in different prediction modes. This document uses the following terms: 1) the displacement vector in the inter prediction mode is called a motion vector (Motion Vector, MV); 2) the displacement vector in the Intra Block Copy (IBC) prediction mode is called a block vector (Block Vector, BV); 3) the displacement vector in the Intra String Copy (ISC) prediction mode is called a string vector (String Vector, SV). Intra string copy is also referred to as "string prediction" or "string matching", among others.
- MV refers to the displacement vector used in the inter prediction mode, pointing from the current image to the reference image, and its value is the coordinate offset between the current block and the reference block, wherein the current block and the reference block are in two different images.
- motion vector prediction can be introduced. By predicting the motion vector of the current block, the predicted motion vector corresponding to the current block is obtained, and the difference between the predicted motion vector and the actual motion vector is encoded and transmitted. Compared with directly encoding and transmitting the actual motion vector corresponding to the current block, this helps save bit overhead.
- the predicted motion vector refers to the predicted value of the motion vector of the current block obtained through the motion vector prediction technology.
- BV refers to the displacement vector used for the IBC prediction mode, and its value is the coordinate offset between the current block and the reference block, wherein both the current block and the reference block are in the current image.
- block vector prediction can be introduced. By predicting the block vector of the current block, the predicted block vector corresponding to the current block is obtained, and the difference between the predicted block vector corresponding to the current block and the actual block vector is encoded. Compared with directly encoding and transmitting the actual block vector corresponding to the current block, it is beneficial to save bit overhead.
- the predicted block vector refers to the predicted value of the block vector of the current block obtained through the block vector prediction technology.
- SV refers to the displacement vector used for the ISC prediction mode, and its value is the coordinate offset between the current string and the reference string, wherein both the current string and the reference string are in the current image.
- string vector prediction can be introduced. By predicting the string vector of the current string, the predicted string vector corresponding to the current string is obtained, and the difference between the predicted string vector corresponding to the current string and the actual string vector is encoded. Compared with directly encoding and transmitting the actual string vector corresponding to the current string, it is beneficial to save bit overhead.
- the predicted string vector refers to the predicted value of the string vector of the current string obtained through the string vector prediction technology.
- FIG. 4 is a schematic diagram of an inter-frame prediction mode in an embodiment of the present application. Inter-frame prediction exploits the temporal correlation of video and uses pixels of adjacent encoded images to predict the pixels of the current image, thereby effectively removing temporal redundancy in the video and saving the bits needed to code residual data.
- P is the current frame
- Pr is the reference frame
- B is the current block to be encoded
- Br is the reference block of B.
- the coordinates of B' and B in the image are the same, the coordinates of Br are (xr, yr), and the coordinates of B' are (x, y).
- the displacement between the current block to be coded and its reference block is called a motion vector (MV), that is:
- MV = (xr - x, yr - y).
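- As a small worked example of this definition, the following sketch computes the MV from the coordinates of the current block and of its reference block. The function name and the sample coordinates are illustrative only.

```python
def motion_vector(cur_xy, ref_xy):
    """Return MV = (xr - x, yr - y) for a current block at (x, y)
    and its reference block at (xr, yr)."""
    x, y = cur_xy
    xr, yr = ref_xy
    return (xr - x, yr - y)

# Current block B at (64, 32); reference block Br at (60, 35) in the reference frame.
mv = motion_vector((64, 32), (60, 35))
```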
- inter-frame prediction includes two MV prediction technologies, Merge and Advanced Motion Vector Prediction (AMVP).
- AMVP Advanced Motion Vector Prediction
- Merge mode builds an MV candidate list for the current Prediction Unit (PU), which contains 5 candidate MVs (and their corresponding reference images). These five candidate MVs are traversed, and the one with the smallest rate-distortion cost is selected as the optimal MV. If the encoder and the decoder build the candidate list in the same way, the encoder only needs to transmit the index of the optimal MV in the candidate list.
- the MV prediction technology of HEVC also has a skip mode, which is a special case of the Merge mode. After the optimal MV is found in the Merge mode, if the current block is basically the same as the reference block, then there is no need to transmit residual data, only the index of the MV and a skip flag need to be transmitted.
- FIG. 5 is a schematic diagram of a candidate motion vector provided by an embodiment of the present application.
- the MV candidate list established by the Merge mode includes both the spatial domain and the time domain.
- B Slice B frame image
- the spatial domain provides at most 4 candidate MVs, established as shown in part (a) of Fig. 5.
- the spatial-domain list is established in the order A1 → B1 → B0 → A0 → B2, where B2 is a substitute: the motion information of B2 is used only when one or more of A1, B1, B0 and A0 do not exist. The temporal domain provides at most one candidate MV, established as shown in part (b) of Fig. 5, which is obtained by scaling the MV of the co-located PU as follows:
- curMV = td * colMV / tb
- curMV represents the MV of the current PU
- colMV represents the MV of the co-located PU
- td represents the distance between the current image and the reference image
- tb represents the distance between the co-located image and the reference image.
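- The scaling can be sketched as follows, using td and tb exactly as defined above. The function name is hypothetical, and rounding to the nearest integer with exact fractions is an assumption made for readability; real codecs use fixed-point arithmetic that is not described in this text.

```python
from fractions import Fraction

def scale_temporal_mv(col_mv, td, tb):
    """Scale the co-located PU's MV: curMV = td * colMV / tb.

    td: distance between the current image and its reference image.
    tb: distance between the co-located image and its reference image.
    """
    if tb == 0:
        raise ValueError("tb must be non-zero")
    factor = Fraction(td, tb)
    return tuple(round(component * factor) for component in col_mv)

# The co-located PU moved (8, -4) over a distance of 2 pictures;
# the current PU references a picture at distance 1.
cur_mv = scale_temporal_mv((8, -4), td=1, tb=2)
```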
- the AMVP mode utilizes the MV correlation of adjacent blocks in the spatial and temporal domains to build an MV candidate list for the current PU.
- MVD Motion Vector Difference
- the MV candidate list of the AMVP mode also includes two cases, the spatial domain and the time domain, the difference is that the length of the MV candidate list of the AMVP mode is only 2.
- HMVP History based Motion Vector Prediction
- HMVP is a new MV prediction technology adopted in H.266/VVC.
- HMVP is a motion vector prediction method based on historical information. The motion information of the historical coding blocks is stored in the HMVP list and used as the MVP of the current CU.
- H.266/VVC adds HMVP to the candidate list for Merge mode, after the spatial and temporal MVPs.
- HMVP technology stores motion information of previously encoded blocks in a first-in, first-out queue (First Input First Output, FIFO).
- FIFO First Input First Output
- If the motion information of the current coding unit is the same as a candidate already in the FIFO, the duplicate candidate motion information is removed first, all subsequent HMVP candidates are moved forward, and the motion information of the current coding unit is added at the end of the FIFO. If the motion information of the current coding unit is different from every candidate in the FIFO, the latest motion information is simply added to the end of the FIFO.
- the HMVP list is reset (emptied) when a new Coding Tree Unit (CTU) row is encountered.
- CTU Coding Tree Unit
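- The FIFO update with duplicate removal described above can be sketched as follows. The class name and the capacity `max_len` are hypothetical, and dropping the oldest entry when the queue is full is an assumption based on the first-in, first-out behaviour; the text does not state the capacity.

```python
class HmvpFifo:
    """History of motion information, oldest entry first, newest entry last."""

    def __init__(self, max_len):
        self.max_len = max_len
        self.entries = []

    def push(self, motion):
        if motion in self.entries:
            # Duplicate: remove it so later entries move forward.
            self.entries.remove(motion)
        elif len(self.entries) == self.max_len:
            # Assumed behaviour when full: drop the oldest entry.
            self.entries.pop(0)
        # The current unit's motion information always goes to the end.
        self.entries.append(motion)

    def reset(self):
        """Empty the list, e.g. at the start of a new CTU row."""
        self.entries.clear()

fifo = HmvpFifo(max_len=3)
for mv in [(1, 0), (2, 0), (1, 0)]:
    fifo.push(mv)
after_duplicate = list(fifo.entries)

for mv in [(3, 0), (4, 0)]:
    fifo.push(mv)
after_overflow = list(fifo.entries)
```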
- IBC is an intra-frame coding tool adopted in the HEVC Screen Content Coding (SCC) extension, which significantly improves the coding efficiency of screen content.
- SCC Screen Content Coding
- IBC technology is also adopted to improve the performance of screen content encoding.
- IBC utilizes the spatial correlation of screen content and video, and uses the encoded image pixels on the current image to predict the pixels of the current block to be encoded, which can effectively save the bits required for encoding pixels.
- the displacement between the current block and its reference block in IBC is called BV.
- H.266/VVC uses a BV prediction technique similar to inter-frame prediction to save the bits needed to encode BV.
- ISC technology divides a coding block into a series of pixel strings or unmatched pixels according to a certain scan order (such as raster scan, round-trip scan or Zig-Zag scan). Similar to IBC, each string looks for a reference string of the same shape in the coded area of the current image and derives the predicted value of the current string from it. Encoding the residual between the pixel values of the current string and the predicted values, instead of encoding the pixel values directly, can effectively save bits.
- Figure 7 shows a schematic diagram of intra-frame string replication, the dark gray area is the encoded area, the 28 white pixels are string 1, the light gray 35 pixels are string 2, and the black 1 pixel represents an unmatched pixel.
- the displacement between string 1 and its reference string is the string vector 1 in FIG. 6 ; the displacement between string 2 and its reference string is the string vector 2 in FIG. 6 .
- the intra-frame string replication technology needs to encode the SV corresponding to each string in the current coding block, the string length, and the flag of whether there is a matching string.
- SV represents the displacement of the to-be-coded string to its reference string.
- String length indicates the number of pixels contained in the string. In different implementations, there are many ways to encode the length of the string.
- IBC and ISC are two screen content coding tools in AVS3. They both use the current image as a reference and derive the predicted value of the coding unit through motion compensation. Considering that IBC and ISC have similar reference regions, BV and SV have a high correlation, which can improve coding efficiency by allowing prediction between the two.
- AVS3 uses an intra-frame prediction history motion information table (IntraHMVP), similar to HMVP, to record the displacement vector information, position information, size information and repetition times of these two types of coding blocks, and derives the predicted block vector (Block Vector Predictor, BVP) and the predicted string vector (String Vector Predictor, SVP) from IntraHMVP.
- IntraHMVP intra-frame prediction history motion information table
- BVP is the predicted value of the block vector
- SVP is the predicted value of the string vector.
- Class based Block Vector Prediction (CBVP) is adopted in AVS3. Similar to HMVP, this method first uses an HBVP (History based Block Vector Prediction) list to store information on historical IBC-coded blocks.
- HBVP History based Block Vector Prediction, history-based block vector prediction
- In addition to recording the BV information of each historical coding block, the list also records information such as the position and size of the historical coding block.
- the candidate BVs in HBVP are classified according to the following conditions:
- Category 0: the area of the historical coding block is greater than or equal to 64 pixels;
- Category 1: the frequency of the BV is greater than or equal to 2;
- Category 2: the upper left corner of the historical coding block is located to the left of the upper left corner of the current block;
- Category 3: the upper left corner of the historical coding block is located above the upper left corner of the current block;
- Category 4: the upper left corner of the historical coding block is located at the upper left of the upper left corner of the current block;
- Category 5: the upper left corner of the historical coding block is located at the upper right of the upper left corner of the current block;
- Category 6: the upper left corner of the historical coding block is located at the lower left of the upper left corner of the current block.
- the instances in each category are arranged in the reverse of the coding order (the closer a block is to the current block in coding order, the higher its ranking), and the BV of the first historical coding block is the candidate BV of that category.
- the candidate BV of each category is then added to the CBVP list in the order of category 0 to category 6.
- the encoding end selects the best candidate BV in the CBVP list as the BVP and encodes an index in the code stream, indicating the index of the category corresponding to the best candidate BV in the CBVP list.
- the decoding end derives the BVP from the CBVP list according to the decoded index.
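- The classification and list construction above can be sketched as follows. The exact geometric tests for "left", "above" and the diagonal categories are not spelled out in this text, so the comparisons against the current block's position and size below are assumptions, as are all names. Skipping categories that have no member is also an assumption.

```python
from dataclasses import dataclass

@dataclass
class HistBlock:
    bv: tuple      # block vector (horizontal, vertical)
    x: int         # upper-left corner, horizontal coordinate
    y: int         # upper-left corner, vertical coordinate
    width: int
    height: int
    count: int     # how often this BV has occurred

def categories(h, cx, cy, cw, ch):
    """Return the category numbers that historical block h falls into,
    relative to a current block at (cx, cy) of size cw x ch."""
    cats = []
    if h.width * h.height >= 64:
        cats.append(0)
    if h.count >= 2:
        cats.append(1)
    if h.x < cx and cy <= h.y < cy + ch:   # assumed test for "left"
        cats.append(2)
    if h.y < cy and cx <= h.x < cx + cw:   # assumed test for "above"
        cats.append(3)
    if h.x < cx and h.y < cy:              # assumed test for "upper left"
        cats.append(4)
    if h.x >= cx + cw and h.y < cy:        # assumed test for "upper right"
        cats.append(5)
    if h.x < cx and h.y >= cy + ch:        # assumed test for "lower left"
        cats.append(6)
    return cats

def build_cbvp(history, cx, cy, cw, ch):
    """history is in coding order (oldest first). For each category 0..6,
    take the BV of the most recently coded member, if any."""
    cbvp = []
    for cat in range(7):
        for h in reversed(history):
            if cat in categories(h, cx, cy, cw, ch):
                cbvp.append((cat, h.bv))
                break
    return cbvp

history = [
    HistBlock(bv=(-8, 0), x=0, y=16, width=8, height=8, count=1),
    HistBlock(bv=(0, -8), x=16, y=0, width=4, height=4, count=2),
    HistBlock(bv=(-4, -4), x=8, y=8, width=4, height=4, count=1),
]
cbvp_list = build_cbvp(history, cx=16, cy=16, cw=8, ch=8)
```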
- the block copy intra prediction motion information of the current prediction block includes displacement vector information, position information, size information and repetition times, wherein the displacement vector information of the block copy intra prediction block is a block vector; the position information includes the abscissa and the ordinate of the upper left corner of the current prediction block; the size information is the product of width and height; and the repetition times of the current prediction block is initialized to 0.
- AVS3 encodes an index for each string in the ISC coding block, indicating the position of the SVP of that string in IntraHMVP. Similar to the skip mode in inter prediction, the SV of the current string is equal to the SVP, and there is no need to encode the residual between the SV and the SVP.
- the string copy intra prediction motion information of the current prediction block includes displacement vector information, position information, size information and repetition times, wherein the displacement vector information of the current string is a string vector; the position information includes the abscissa and the ordinate of the first pixel sample of the string, that is, (xi, yi); the size information is the string length of this part, that is, StrLen[i]; and the repetition times is initialized to 0.
- the intra-frame prediction motion information includes displacement vector information, position information, size information, and the number of repetitions.
- If the prediction type of the current prediction unit is block copy intra prediction or string copy intra prediction, and NumOfIntraHmvpCand is greater than 0, the intra prediction historical motion information table IntraHmvpCandidateList is updated according to the intra prediction motion information of the current prediction block. The displacement vector information, position information, size information and repetition times of IntraHmvpCandidateList[X] are denoted intraMvCandX, posCandX, sizeCandX and cntCandX respectively; otherwise, the operations defined here are not performed.
- IntraHmvpCandidateList[CntIntraHmvp] is the intra prediction motion information of the current prediction unit, and CntIntraHmvp is incremented by 1.
- If X is less than CntIntraHmvp, go to step c); otherwise, go to step e).
- cntCur is set equal to the value of cntCandX plus 1. If sizeCur is less than sizeCandX, sizeCur is set equal to sizeCandX.
- IntraHmvpCandidateList[CntIntraHmvp-1] is equal to the intra prediction motion information of the current prediction unit.
- i is from 0 to CntIntraHmvp-1, so that IntraHmvpCandidateList[i] is equal to IntraHmvpCandidateList[i+1];
- IntraHmvpCandidateList[CntIntraHmvp-1] is equal to the intra prediction motion information of the current prediction unit.
- IntraHmvpCandidateList[CntIntraHmvp] is equal to the intra prediction motion information of the current prediction unit, CntIntraHmvp plus 1.
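- The update fragments above can be read as the following minimal sketch. Several of the step conditions did not survive in this text, so the control flow here is an assumption: a match is detected by comparing displacement vectors, a matched entry is moved to the end with its repetition count incremented and the larger size kept, and the oldest entry is dropped when the table is full. The dictionary keys and the parameter `max_len` (standing in for NumOfIntraHmvpCand) are hypothetical.

```python
def update_intra_hmvp(table, cur, max_len):
    """Update the history table in place with the current unit's motion info.

    Each record is a dict with keys "mv", "pos", "size" and "cnt".
    """
    cur = dict(cur)  # do not modify the caller's record
    for x, cand in enumerate(table):
        if cand["mv"] == cur["mv"]:
            # Same displacement vector: cntCur = cntCandX + 1, keep the larger size.
            cur["cnt"] = cand["cnt"] + 1
            cur["size"] = max(cur["size"], cand["size"])
            del table[x]  # later entries shift forward
            break
    else:
        # No match found: make room if the table is already full (assumed).
        if len(table) == max_len:
            del table[0]
    table.append(cur)  # the current unit becomes the last entry

table = []
update_intra_hmvp(table, {"mv": (-4, 0), "pos": (0, 0), "size": 16, "cnt": 0}, max_len=2)
update_intra_hmvp(table, {"mv": (-8, 0), "pos": (8, 0), "size": 32, "cnt": 0}, max_len=2)
update_intra_hmvp(table, {"mv": (-4, 0), "pos": (16, 0), "size": 8, "cnt": 0}, max_len=2)
order_after_match = [entry["mv"] for entry in table]
matched_entry = dict(table[-1])
update_intra_hmvp(table, {"mv": (0, -4), "pos": (24, 0), "size": 4, "cnt": 0}, max_len=2)
order_after_overflow = [entry["mv"] for entry in table]
```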
- the candidate motion information list is constructed only from the intra prediction historical motion information (IntraHMVP) table, from which the predicted block vector (BVP) or predicted string vector (SVP) is derived.
- the maximum length of IntraHMVP is 12, and the maximum length of the candidate motion information list is 7.
- When the length of IntraHMVP is insufficient or the table is empty, the candidate motion information list cannot be filled, so not enough motion information is available for predicting displacement vectors.
- FIG. 8 is an optional schematic flowchart of the method for determining the candidate motion information list provided by the embodiment of the present application. It can be understood that the steps shown in FIG. 8 may be executed by various devices running the apparatus for determining the candidate motion information list, such as a dedicated terminal, a server or a server cluster with the function of determining the candidate motion information list. The steps shown in FIG. 8 are described below.
- Step 801 The device for determining a candidate motion information list determines a target motion information list and a historical motion information table.
- the historical motion information table is used for at least one of inter-frame prediction, intra-frame prediction, intra-frame block copy prediction, and intra-frame string copy prediction.
- the method for determining the candidate motion information list provided in the embodiment of the present application can be applied to the three-dimensional view
- the three-dimensional coding technology uses adjacent reconstructed pixels to perform intra-frame prediction on the current block, and selects the predicted motion vector from the motion vectors of the adjacent blocks to construct a motion vector list for inter-frame prediction of motion compensation.
- three concepts, the coding unit (Coding Unit, CU), the prediction unit (Prediction Unit, PU) and the transform unit (Transform Unit, TU), are used to describe the entire coding process.
- the CU is a macroblock or a sub-macroblock, and each CU is a 2N*2N pixel block (N is a power of 2).
- Each CU implements the prediction process through a PU.
- the size of the PU is limited by the CU, which can be a square (such as 2N*2N, N*N) or a rectangle (2N*N, N*2N).
- the size of the ABCDE block is the minimum block size (4*4) defined by the system.
- this can also be adjusted flexibly according to the environment in which the method for determining the candidate motion information list is used.
- Step 802 The device for determining the candidate motion information list determines whether the target motion information list is filled, and if so, executes step 803, otherwise executes step 804.
- Step 803 Continue the decoding process.
- Step 804 The device for determining the candidate motion information list acquires at least one piece of motion information based on the historical motion information table when the target motion information list is not full.
- the case in which the target motion information list is not full includes:
- the target motion information list is still not full after being filled based on the spatial motion information list; or, the target motion information list is not full when the spatial motion information list has not been filled.
- the filling manner of the candidate motion information list includes at least one of the following:
- the motion information list is filled.
- by using the different filling processes individually or in combination, different video compression usage environments can be accommodated, which improves the adaptability of the method for determining the candidate motion information list provided by the present application.
- the process of determining the candidate motion information list from the historical motion information table, the process of determining it from the spatial motion information list, and the process of determining it jointly from the spatial motion information list and the historical motion information table are introduced in turn in the following examples.
- Step 805 The device for determining a candidate motion information list fills the candidate motion information list based on the at least one motion information.
- the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
- acquiring at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information, includes any one of the following:
- determining the corresponding displacement vector based on the historical motion information table, and filling the candidate motion information list based on the displacement vector; or
- determining the corresponding displacement vector based on the historical motion information table, and performing dynamic filling based on the parity of the index of the filling position in the candidate motion information list; or
- determining the corresponding displacement vector based on the historical motion information table, and filling the displacement vector in the prediction mode based on the displacement vector, so as to fill the candidate motion information list; or
- determining the corresponding displacement vector based on the historical motion information table, and performing dynamic filling based on the number of displacement vectors and the index of the filling position in the candidate motion information list.
- filling the candidate motion information list based on the displacement vector can be performed in any one of the following modes 1) to 3):
- Mode 1) based on the historical motion information table, determine the first displacement vector and the last displacement vector in the historical motion information table, and fill the candidate motion information list based on the mean value of the first displacement vector and the last displacement vector;
- Mode 2) based on the historical motion information table, determine the first displacement vector and the last displacement vector in the historical motion information table, and fill the candidate motion information list based on the weighted average of the first displacement vector and the last displacement vector;
- Mode 3) based on the historical motion information table, determine the first displacement vector in the historical motion information table, and fill the candidate motion information list based on the first displacement vector.
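- The three modes can be sketched as follows. The function names, the weight `w_first` and the use of unrounded floating-point results are assumptions; the text specifies neither the weights of Mode 2 nor any rounding rule. The limit of 7 in the example is the maximum candidate list length mentioned earlier.

```python
def fill_vector(history, mode, w_first=0.5):
    """Derive one filling vector from the history table (a list of (h, v) vectors)."""
    first, last = history[0], history[-1]
    if mode == 1:
        # Mean of the first and last displacement vectors.
        return ((first[0] + last[0]) / 2, (first[1] + last[1]) / 2)
    if mode == 2:
        # Weighted average; the weights are hypothetical.
        w_last = 1 - w_first
        return (w_first * first[0] + w_last * last[0],
                w_first * first[1] + w_last * last[1])
    if mode == 3:
        # First displacement vector as is.
        return first
    raise ValueError("mode must be 1, 2 or 3")

def pad_candidates(candidates, history, max_len, mode):
    """Append the filling vector until the candidate list reaches max_len."""
    out = list(candidates)
    if not history:
        return out  # nothing to derive a filling vector from
    vector = fill_vector(history, mode)
    while len(out) < max_len:
        out.append(vector)
    return out

history = [(-8, 0), (-4, -4), (0, -8)]
mean_vec = fill_vector(history, mode=1)
weighted_vec = fill_vector(history, mode=2, w_first=0.75)
first_vec = fill_vector(history, mode=3)
padded = pad_candidates([(1, 1)], history, max_len=7, mode=3)
```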
- dynamic filling may be performed based on the parity of the index of the filling position in the candidate motion information list.
- when the index of the filling position in the candidate motion information list is an odd number: the first displacement vector and the last displacement vector in the historical motion information table are determined, and the candidate motion information list is filled based on the mean value of the first displacement vector and the last displacement vector; or, based on the historical motion information table, the first displacement vector in the historical motion information table is determined, and the candidate motion information list is filled based on the first displacement vector.
- when the index of the filling position in the candidate motion information list is an even number: the first displacement vector and the last displacement vector in the historical motion information table are determined, and the candidate motion information list is filled based on the mean value of the first displacement vector and the last displacement vector; or, based on the historical motion information table, the first displacement vector in the historical motion information table is determined, and the candidate motion information list is filled based on the first displacement vector.
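- One possible pairing of the options above is sketched below: odd filling positions receive the mean of the first and last displacement vectors, and even positions receive the first displacement vector. This pairing is only one of the combinations the text permits and is chosen here as an assumption; the function name and zero-based indexing are also illustrative.

```python
def fill_by_parity(candidates, history, max_len):
    """Fill remaining positions according to the parity of the position index."""
    first, last = history[0], history[-1]
    mean = ((first[0] + last[0]) / 2, (first[1] + last[1]) / 2)
    out = list(candidates)
    while len(out) < max_len:
        index = len(out)  # zero-based index of the position being filled
        out.append(mean if index % 2 == 1 else first)
    return out

filled = fill_by_parity([(1, 1)], history=[(-8, 0), (0, -8)], max_len=4)
```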
- determining the corresponding displacement vector based on the historical motion information table, and filling the displacement vector in the prediction mode based on the displacement vector, may be implemented in the following manner:
- based on the historical motion information table, the vertical component of the first displacement vector and the horizontal component of the last displacement vector in the historical motion information table are determined; the vertical component of the displacement vector in the prediction mode is filled with the vertical component of the first displacement vector, and the horizontal component of the displacement vector in the prediction mode is filled with the horizontal component of the last displacement vector.
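- A minimal sketch of this component-wise filling follows, assuming each displacement vector is stored as a (horizontal, vertical) pair; the function name is hypothetical.

```python
def compose_vector(history):
    """Build a filling vector from two history entries:
    horizontal component from the last vector, vertical from the first."""
    first, last = history[0], history[-1]
    horizontal = last[0]
    vertical = first[1]
    return (horizontal, vertical)

composed = compose_vector([(-8, -2), (-4, 0), (3, 5)])
```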
- the position, in the historical motion information table, of the displacement vector to be filled can be determined, and the corresponding displacement vector is dynamically filled based on that position in the intra-frame prediction historical motion information table, wherein the position of the displacement vector to be filled in the historical motion information table is determined based on the number of candidate displacement vectors and the index value of the filling position in the candidate motion information list.
- the method for determining the candidate motion information list may be performed to fill the candidate motion information list.
- FIG. 9 is an optional schematic flowchart of the method for determining the candidate motion information list provided by the embodiment of the present application. It can be understood that, The steps shown in FIG. 9 may be executed by various servers running the apparatus for determining the candidate motion information list, such as a dedicated terminal, server or server cluster with the function of determining the candidate motion information list. The steps shown in FIG. 9 will be described below.
- Step 901: Use a historical motion information list to record the motion information of historical prediction units (e.g., decoded blocks or decoded strings) during decoding in a first-in-first-out (FIFO, First Input First Output) manner.
- Step 902: When decoding the displacement vector of the target prediction unit (e.g., the current block or current string), determine a candidate motion information list based on the historical motion information list and other motion information.
- Step 903: Obtain from the code stream the position (or index) of the prediction displacement vector of the current prediction unit in the candidate motion information list, and determine the prediction displacement vector of the current prediction unit.
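- Steps 901 to 903 can be sketched as follows. This is a minimal illustration, not the method as claimed: the function names are hypothetical, the ordering of history entries (most recent first) ahead of the other motion information is an assumption, and the default sizes 12 and 7 are the IntraHMVP and candidate-list maxima quoted for AVS3 in the description.

```python
from collections import deque

HISTORY_SIZE = 12    # maximum IntraHMVP length quoted for AVS3
CANDIDATE_SIZE = 7   # maximum candidate list length quoted for AVS3


def record(history, vector):
    """Step 901: FIFO recording; a deque with maxlen drops the oldest entry."""
    history.append(vector)


def build_candidates(history, other_vectors, size=CANDIDATE_SIZE):
    """Step 902: history entries (assumed most recent first), then other motion info."""
    return (list(reversed(history)) + list(other_vectors))[:size]


def predicted_vector(candidates, index):
    """Step 903: the index parsed from the code stream selects the predictor."""
    return candidates[index]


history = deque(maxlen=HISTORY_SIZE)
for v in [(-8, 0), (-4, -4), (0, -16)]:
    record(history, v)
candidates = build_candidates(history, other_vectors=[(0, 0)])
print(predicted_vector(candidates, 1))
```

Using `deque(maxlen=...)` keeps the FIFO eviction implicit; a codec implementation would more likely use a fixed array and an explicit counter.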
- FIG. 10 is an optional schematic flowchart of a method for determining a candidate motion information list provided by an embodiment of the present application.
- in some embodiments of the present application, determining the candidate motion information list by combining the historical motion information list with other motion information includes the following steps:
- Step 1001: Determine the target parsing order, and determine the corresponding ISC coding blocks by traversing multiple adjacent block positions of the current block according to the target parsing order.
- FIG. 11 is a schematic diagram of adjacent blocks in the spatial domain provided by an embodiment of the present application; the decoded ISC blocks can be searched for in the order {A, B, C, D, E}, and the SV information of each spatially adjacent block can be recorded. The recorded SV information may include the first or last SV of the coding block as determined by the target parsing order.
- the spatially neighboring blocks of the current codec block refer to codec blocks that belong to the same image as the current codec block and are spatially close to the current codec block.
- the so-called "neighboring" here may mean that the distance to the current codec block is less than a threshold, and the threshold may be set according to the actual situation.
- the method of calculating the distance between two codec blocks can also be set flexibly. For example, it can be the distance between the coordinates of the upper left corners of the two codec blocks, the distance between the abscissas (or the ordinates) of the upper left corners of the two codec blocks, the distance between the center positions of the two codec blocks, or the shortest distance between the two codec blocks. This is not limited in the embodiments of the present application.
- the coordinates of the upper left corner of the current codec block are (x0, y0), and the coordinates of the upper left corner of another codec block are (x1, y1), if the condition
- the values of a and b can be equal or unequal.
- Spatially neighboring blocks include spatially adjacent blocks and spatially non-adjacent blocks.
- the spatially adjacent blocks of the current codec block refer to codec blocks that belong to the same image as the current codec block and are both neighboring and adjacent to the current codec block in spatial position.
- the so-called "neighboring and adjacent" may mean sharing an edge or a vertex with the current codec block.
- the spatial adjacent blocks of the current codec block F may include codec blocks such as A, B, C, D, and E.
- Spatially non-adjacent blocks of the current codec block refer to codec blocks that belong to the same image as the current codec block and are neighboring but not adjacent to the current codec block in spatial position.
- the so-called "neighboring but not adjacent" may mean that the distance to the current codec block is less than a threshold, but no edge or vertex is shared with the current codec block.
- the spatial non-adjacent blocks of the current codec block F may include codec blocks such as A', B', C', D', and E'.
- the spatially neighboring blocks include ISC blocks; the so-called "ISC block" refers to a codec block that uses the ISC prediction mode to perform motion vector prediction.
- the spatially neighboring blocks include at least one of the following: spatially adjacent ISC blocks, and spatially non-adjacent ISC blocks.
- the spatially adjacent ISC blocks of the current codec block are those spatially adjacent blocks of the current codec block that are ISC blocks.
- the spatially non-adjacent ISC blocks of the current codec block are those spatially non-adjacent blocks of the current codec block that are ISC blocks.
- the spatially neighboring blocks further include IBC blocks; the so-called "IBC block" refers to a codec block that uses the IBC prediction mode to perform motion vector prediction.
- the spatially neighboring blocks further include at least one of the following: spatially adjacent IBC blocks, and spatially non-adjacent IBC blocks.
- the spatially adjacent IBC blocks of the current codec block are those spatially adjacent blocks of the current codec block that are IBC blocks.
- the spatially non-adjacent IBC blocks of the current codec block are those spatially non-adjacent blocks of the current codec block that are IBC blocks.
- Step 1002: Determine whether the candidate motion information list is completely filled; if it is, go to step 1003, otherwise go to step 1004.
- Step 1003: Execute the decoding process to determine the prediction displacement vector of the current prediction unit.
- Step 1004: Determine the spatial candidate motion information list, and fill the candidate motion information list based on the corresponding filling information through the first filling process.
- the first filling process may be used for filling based on the spatial motion information list.
- the padding information may be motion information of an ISC block, including at least one of the following:
- the SV and/or BV information of the non-adjacent blocks in the spatial domain may also be determined first, and then the SV information of the adjacent blocks in the spatial domain may be determined.
- An optional fourth target parsing order may be {A'->A, B'->B, C'->C, D'->D, E'->E}.
- the candidate motion information list is filled based on the motion information of already decoded spatially non-adjacent IBC or ISC blocks.
- for the locations of the spatially non-adjacent IBC or ISC blocks, refer to Figure 11 and Table 1; the spatially non-adjacent blocks can be selected based on the second target parsing order {A', B', C', D', E'}.
- the BV of a spatially non-adjacent IBC block, or the first or last SV in scanning order of a spatially non-adjacent ISC block, is used as the filling information; filling with this information follows the same filling process as shown in step 1004.
- neig_x_pos and neig_y_pos respectively represent the abscissa and ordinate of the sample in the upper left corner of the spatially non-adjacent block
- cur_x_pos and cur_y_pos respectively represent the abscissa and ordinate of the sample in the upper left corner of the current block F
- cu_width and cu_height respectively represent the width and height of the current block F.
- the SV specified therein may be put into the candidate list, or all stored SVs may be put into the candidate list in order.
- the candidate motion information list may also be filled by (0,0).
- Step 1005 Fill the candidate motion information list based on the corresponding filling information through the second filling process.
- Step 1006 Determine the spatial candidate motion information list, and fill the candidate motion information list based on the corresponding filling information through the first filling process.
- the padding information pads the candidate motion information list.
- step 1004, step 1005 and step 1006 may be executed alternatively.
- the filling position in the candidate motion information list may be determined first, and then the corresponding filling information may be determined.
- determining the padding position in the candidate motion information list may include at least one of the following:
- the spatial candidate motion information is placed before the historical motion information; or, the spatial candidate motion information is placed after the historical motion information; or, the motion information of at least one spatially adjacent or non-adjacent block is acquired for filling.
- a fixed position of the candidate motion information list may be filled with a corresponding spatial vector.
- the spatial vectors may be filled, in order, into the positions of the corresponding type in the candidate motion information list.
- the corresponding position can be filled with a spatial vector of the matching type (for example, upper-right or lower-left).
- when the spatial vector of the corresponding position does not exist, the candidate motion information list may be filled with (0,0), or may be filled through the second filling process.
- FIG. 12 is a schematic diagram of a usage scenario of the method for determining the candidate motion information list provided by the embodiment of the present application, wherein the video that needs to be compressed and transmitted is a short video, and the terminal (including the terminal 10-1 and the terminal 10-2) is provided with a client of software capable of displaying the corresponding short video, such as a client or plug-in for short video playback.
- the client can obtain the target video and display it; the terminal is connected to the short video server 200 through the network 300, and the network 300 can be a wide area network or a local area network, or a combination of the two, using a wireless link to realize data transmission.
- the terminal needs to compress the video to be uploaded, or the server compresses the stored video, to save transmission time, reduce the storage space occupied, and improve video compression efficiency.
- FIG. 13 is a schematic diagram of an optional video compression in the embodiment of the present application, wherein the user compresses the target video to be processed through the compression plug-in of the instant messaging client, and the method for determining the candidate motion information list provided by the present application can be stored on a corresponding storage medium in the plug-in of the instant messaging client for users to call.
- FIG. 14 is a schematic diagram of an optional compressed video presentation in an embodiment of the present application, wherein, when the user obtains the video information saved by the server through the video client, the server can use the method for determining the candidate motion information list provided by the present application to compress the target video to be transmitted, saving the transmission time of the target video and reducing the storage space it occupies.
- FIG. 15 is an optional schematic flowchart of a method for determining a candidate motion information list provided by an embodiment of the present application.
- the first filling process or the second filling process may be used alone, or the two may be used jointly, to compress the target video.
- the historical motion information table is the intra-frame prediction historical motion information table, which serves the intra-prediction process; filling the candidate motion information list through the intra-frame prediction historical motion information table can include the following steps:
- Step 1501: When the number of candidate displacement vectors in the intra-frame prediction historical motion information table is greater than or equal to two, trigger displacement vector prediction to derive the predicted displacement vector.
- Step 1502: Fill based on the displacement vectors in the historical motion information table through the second filling process.
- the mean of the first and last candidate BVs in the historical motion information table IntraHMVP can be filled; a weighted mean of the first and last candidate BVs in IntraHMVP can also be filled; or, the first BV in IntraHMVP can be filled.
- Step 1503: Perform dynamic filling based on the parity of the index of the filling position in the candidate motion information list.
- in one option, when the index cbvp_index of the filling position in the candidate motion information list is an odd number, the mean of the first and last candidate BVs in the historical motion information table IntraHMVP is filled; otherwise, the first BV in IntraHMVP is filled.
- in another option, when the index cbvp_index of the filling position in the candidate motion information list is an even number, the mean of the first and last candidate BVs in IntraHMVP is filled; otherwise, the first BV in IntraHMVP is filled.
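- The parity rule of step 1503 can be sketched as below. The function names and the `mean_on_even` switch (which selects between the two options) are hypothetical, and the source does not specify how the mean is rounded; this sketch assumes floor division.

```python
def mean_vector(a, b):
    """Component-wise mean of two vectors (floor division is an assumption)."""
    return ((a[0] + b[0]) // 2, (a[1] + b[1]) // 2)


def fill_by_parity(history, cbvp_index, mean_on_even=True):
    """Return the BV to place at cbvp_index.

    history is the IntraHMVP table as a list of (x, y) vectors and must hold
    at least two entries, matching the trigger condition of step 1501.
    """
    if len(history) < 2:
        raise ValueError("at least two candidate vectors are required")
    first, last = history[0], history[-1]
    index_is_even = cbvp_index % 2 == 0
    if index_is_even == mean_on_even:
        return mean_vector(first, last)
    return first


table = [(-8, 0), (-2, -6), (-4, -4)]
print([fill_by_parity(table, i) for i in range(4)])
```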
- Step 1504: Perform dynamic filling based on the component values of the displacement vector.
- the horizontal component of the block vector may be filled with the horizontal component of the first BV in IntraHMVP, and the vertical component of the block vector may be filled with the vertical component of the last BV in IntraHMVP; or, the horizontal component of the block vector may be filled with the horizontal component of the last BV in IntraHMVP, and the vertical component of the block vector may be filled with the vertical component of the first BV in IntraHMVP.
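- A minimal sketch of the component-wise filling in step 1504, assuming vectors are stored as (horizontal, vertical) tuples; the function name and the `horizontal_from_first` flag are hypothetical.

```python
def fill_by_components(history, horizontal_from_first=True):
    """Combine one component from the first BV and the other from the last BV."""
    first, last = history[0], history[-1]
    if horizontal_from_first:
        # horizontal from the first BV, vertical from the last BV
        return (first[0], last[1])
    # horizontal from the last BV, vertical from the first BV
    return (last[0], first[1])


table = [(-8, 0), (-2, -6), (-4, -4)]
print(fill_by_components(table), fill_by_components(table, False))
```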
- Step 1505: When the number of candidate BVs in the IntraHMVP is greater than the index of the filling position in the candidate motion information list, fill the (cnt_hbvp_cands % (cbvp_index + 1))-th BV in the IntraHMVP.
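- The index arithmetic of step 1505 can be checked with a short sketch. Whether the resulting position is counted from zero or from one is not stated in the source; zero-based indexing is assumed here, and returning `None` when the condition fails is also an assumption.

```python
def fill_by_modulo(history, cbvp_index):
    """Pick the BV at position cnt_hbvp_cands % (cbvp_index + 1), zero-based."""
    cnt_hbvp_cands = len(history)
    if cnt_hbvp_cands <= cbvp_index:
        return None  # condition of step 1505 not met; another rule must apply
    return history[cnt_hbvp_cands % (cbvp_index + 1)]


table = [(-8, 0), (-2, -6), (-4, -4), (0, -16), (-16, 0)]
print([fill_by_modulo(table, i) for i in range(6)])
```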
- the target motion information list and the number of displacement vectors in the intra-frame prediction historical motion information table are determined; when the target motion information list is not full, at least one piece of motion information is obtained based on the intra-frame prediction historical motion information table, and the candidate motion information list is filled based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block. In this way, more, and more effective, displacement vectors can be provided in the candidate motion information list, achieving better displacement vector prediction, improving video compression performance, and improving the user experience.
Abstract
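The present application provides a method for determining a candidate motion information list, including: determining a target motion information list and a historical motion information table; when the target motion information list is not full, obtaining at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block. The present application also provides an apparatus for determining a candidate motion information list, an electronic device, a storage medium, and a computer program product.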
Description
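Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202011114059.5 filed on October 18, 2020, the entire contents of which are incorporated herein by reference.
The present application relates to video coding technology, and in particular to a method and apparatus for determining a candidate motion information list, an electronic device, a storage medium, and a computer program product.
In the related art, during video compression, for example in Versatile Video Coding (VVC) and Audio Video coding Standard 3 (AVS3), a video codec usually needs to construct a motion information list in order to derive a prediction displacement vector.
However, when the motion information list contains too few displacement vectors, it cannot provide an effective prediction displacement vector. This degrades compression performance, so the size of the compressed video cannot be effectively reduced, which harms the user experience.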
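Summary
In view of this, embodiments of the present application provide a method and apparatus for determining a candidate motion information list, an electronic device, a storage medium, and a computer program product, which can obtain at least one piece of motion information based on a historical motion information table and fill the candidate motion information list based on the at least one piece of motion information, achieving better displacement vector prediction and improving video compression performance.
An embodiment of the present application provides a method for determining a candidate motion information list, including:
determining a target motion information list and a historical motion information table, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy (IBC, Intra Block Copy) prediction, and intra string copy (ISC, Intra String Copy) prediction;
when the target motion information list is not full, obtaining at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information;
wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
An embodiment of the present application further provides a method for determining a candidate motion information list, including:
determining the positional relationship between a historical motion information table and a spatial motion information list;
when the target motion information list is not full, obtaining at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and filling the candidate motion information list based on the at least one piece of motion information;
wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
An embodiment of the present application further provides an apparatus for determining a candidate motion information list, including:
a first information processing module, configured to determine a target motion information list and a historical motion information table, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy prediction, and intra string copy prediction;
a first information filling module, configured to obtain at least one piece of motion information based on the historical motion information table when the target motion information list is not full;
the first information filling module being further configured to fill the candidate motion information list based on the at least one piece of motion information;
wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
An embodiment of the present application further provides an apparatus for determining a candidate motion information list, including:
a second information processing module, configured to determine the positional relationship between a historical motion information table and a spatial motion information list;
a second information filling module, configured to, when the target motion information list is not full, obtain at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and fill the candidate motion information list based on the at least one piece of motion information;
wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
An embodiment of the present application further provides an electronic device, the electronic device including:
a memory, configured to store executable instructions;
a processor, configured to implement the method for determining a candidate motion information list provided by the embodiments of the present application when running the executable instructions stored in the memory.
An embodiment of the present application further provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the method for determining a candidate motion information list provided by the embodiments of the present application.
The embodiments of the present application determine the target motion information list and the number of displacement vectors in the historical motion information table; when the target motion information list is not full, at least one piece of motion information is obtained based on the historical motion information table, and the candidate motion information list is filled based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block. In this way, more, and more effective, displacement vectors can be provided in the candidate motion information list, achieving better displacement vector prediction, improving video compression performance, and improving the user experience.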
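FIG. 1 is a schematic diagram of a usage scenario of the list determination method provided by an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of a video encoding process provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of the inter prediction mode provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of candidate motion vectors provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the intra block copy mode provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the intra string copy mode provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of the method for determining a candidate motion information list provided by an embodiment of the present application;
FIG. 9 is a schematic flowchart of the method for determining a candidate motion information list provided by an embodiment of the present application;
FIG. 10 is a schematic flowchart of the method for determining a candidate motion information list provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of spatially neighboring blocks provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a usage scenario of the method for determining a candidate motion information list provided by an embodiment of the present application;
FIG. 13 is a schematic diagram of video compression provided by an embodiment of the present application;
FIG. 14 is a schematic diagram of compressed video presentation provided by an embodiment of the present application;
FIG. 15 is a schematic flowchart of the method for determining a candidate motion information list provided by an embodiment of the present application.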
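To make the objectives, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be regarded as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
The following description refers to "some embodiments", which describe a subset of all possible embodiments. It can be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with one another where no conflict arises.
Before the embodiments of the present application are described in detail, the nouns and terms involved in them are explained; the following explanations apply to those nouns and terms.
1) API: Application Programming Interface, a set of predefined functions, or a convention for connecting different components of a software system. Its purpose is to give applications and developers access to a set of routines based on certain software or hardware, without having to access the source code or understand the details of the internal working mechanism.
2) SDK: Software Development Kit, a collection of development tools used to build application software for a specific software package, software framework, hardware platform, operating system, and so on. In a broad sense it includes the related documents, examples, and tools that assist in developing a certain class of software.
3) P frame: an inter-predicted frame, which may use intra prediction and inter prediction, with forward reference prediction.
4) B frame: an inter-predicted frame, which may use intra prediction and inter prediction, with forward, backward, or bidirectional reference prediction.
5) I frame: an intra-predicted frame, predicted using information within the frame.
6) Video codec standard: an agreed set of rules for decoding a video code stream.
7) Video transcoding: converting an already compressed and encoded video code stream into another video code stream, to suit different network bandwidths, different terminal processing capabilities, and different user requirements.
8) Client: the carrier that implements a specific function in a terminal. For example, a mobile client (APP) is the carrier of a specific function in a mobile terminal, such as live streaming (video push) or online video playback.
9) In response to: indicates the condition or state on which a performed operation depends. When the condition or state is satisfied, the one or more operations performed may be real-time or may have a set delay; unless otherwise specified, there is no restriction on the order in which multiple operations are performed.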
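The usage environment of the method for determining a candidate motion information list provided by the present application is described below. FIG. 1 is a schematic diagram of a usage scenario of the list determination method provided by an embodiment of the present application. Referring to FIG. 1, the terminals (including the terminal 10-1 and the terminal 10-2) are provided with clients capable of performing different functions. Through these clients, the terminals use different service processes to obtain different video information from the corresponding server 200 through the network 300 for browsing. The terminals are connected to the server 200 through the network 300, which may be a wide area network, a local area network, or a combination of the two, and uses wireless links for data transmission.
The types of video that the terminals (including the terminal 10-1 and the terminal 10-2) obtain from the corresponding server 200 through the network 300 may differ. For example, a terminal may obtain a video (that is, the video carries video information or a corresponding video link) from the corresponding server 200 through the network 300, or may obtain from the corresponding server 400 through the network 300 only videos of a particular type (for example, short videos or long videos) for browsing. The server 200 and the server 400 may store different types of video. In some embodiments of the present application, the processes for the different types of video stored in the server 200 may be written in software code of different programming languages, and the code objects may be different types of code entities. For example, in C software code a code object may be a function; in Java software code a code object may be a class; in Objective-C on iOS a code object may be a piece of object code; in C++ software code a code object may be a class or a function. The present application does not further distinguish the compilation environments of the different types of video. In this process, however, during video compression in the related art, for example in VVC (Versatile Video Coding) and AVS3 (Audio Video coding Standard 3), a video codec usually needs to construct a motion information list in order to derive a prediction displacement vector. When the motion information list contains too few displacement vectors, it cannot provide an effective prediction displacement vector. For example, during displacement vector prediction, the candidate motion information list is constructed, and the block vector predictor (BVP, Block Vector Predictor) or string vector predictor (SVP, String Vector Predictor) is derived, only from the intra-prediction historical motion information (IntraHMVP) table. The maximum length of IntraHMVP is 12, and the maximum length of the candidate motion information list is 7. When IntraHMVP is too short or empty, the candidate motion information list cannot be filled, so too little motion information is available for displacement vector prediction. This degrades video compression performance, lowers the compression ratio, and harms the user's experience of video compression.
In some embodiments, when the server 200 sends different types of video to, or receives them from, a terminal (the terminal 10-1 and/or the terminal 10-2) through the network 300, the video information needs to be compressed because it occupies a large amount of storage space. As an example, the server 200 is configured to determine a target motion information list and a historical motion information table, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy prediction, and intra string copy prediction, and may also determine the number of displacement vectors in the historical motion information table; when the target motion information list is not full, to obtain at least one piece of motion information based on the historical motion information table, and fill the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
Of course, in some embodiments of the present application, the server 200 may also be configured to determine the positional relationship between the historical motion information table and the spatial motion information list; when the target motion information list is not full, to obtain at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and fill the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block. For example, during video compression the server 200 may flexibly adjust the process of determining the candidate motion information list according to the usage environment or user settings: it may use the intra-prediction historical motion information table to fill the candidate motion information list, or it may select other types of historical motion information table to fill the candidate motion information list in inter prediction, intra block copy prediction, and intra string copy prediction.
In practical applications, the server 200 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDN, Content Delivery Network), and big data and artificial intelligence platforms.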
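The structure of the server in the embodiments of the present application is described in detail below. The server may be implemented in various forms, such as a dedicated terminal with the function of determining a candidate motion information list (for example, a gateway), or a server with that function, such as the server 200 in FIG. 1. FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. It can be understood that FIG. 2 shows only an exemplary structure of the server rather than the entire structure, and part or all of the structure shown in FIG. 2 may be implemented as required.
The server provided by the embodiments of the present application includes at least one processor 201, a memory 202, a user interface 203, and at least one network interface 204. The components of the electronic device are coupled together through a bus system 205. It can be understood that the bus system 205 is configured to implement connection and communication between these components. In addition to a data bus, the bus system 205 includes a power bus, a control bus, and a status signal bus. For clarity, however, the various buses are all labeled as the bus system 205 in FIG. 2.
The user interface 203 may include a display, a keyboard, a mouse, a trackball, a click wheel, keys, buttons, a touch pad, a touch screen, or the like.
It can be understood that the memory 202 may be a volatile memory or a non-volatile memory, and may also include both. The memory 202 in the embodiments of the present application can store data to support the operation of a terminal (for example, 10-1). Examples of such data include any computer program for operating on the terminal (for example, 10-1), such as an operating system and application programs. The operating system contains various system programs, such as a framework layer, a core library layer, and a driver layer, used to implement various basic services and to process hardware-based tasks. The application programs may include various applications.
In some embodiments, the apparatus for determining a candidate motion information list provided by the embodiments of the present application may be implemented in a combination of software and hardware. As an example, the apparatus may be a processor in the form of a hardware decoding processor that is programmed to execute the method for determining a candidate motion information list provided by the embodiments of the present application. For example, the processor in the form of a hardware decoding processor may use one or more application specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.
As an example of implementing the apparatus in a combination of software and hardware, the apparatus provided by the embodiments of the present application may be directly embodied as a combination of software modules executed by the processor 201. The software modules may be located in a storage medium, and the storage medium is located in the memory 202. The processor 201 reads the executable instructions included in the software modules in the memory 202 and, in combination with the necessary hardware (for example, the processor 201 and other components connected to the bus 205), completes the method for determining a candidate motion information list provided by the embodiments of the present application.
As an example, the processor 201 may be an integrated circuit chip with signal processing capability, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, where the general-purpose processor may be a microprocessor or any conventional processor.
As an example of implementing the apparatus in hardware, the apparatus provided by the embodiments of the present application may be executed directly by the processor 201 in the form of a hardware decoding processor, for example by one or more application specific integrated circuits (ASIC, Application Specific Integrated Circuit), DSPs, programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components, to implement the method for determining a candidate motion information list provided by the embodiments of the present application.
The memory 202 in the embodiments of the present application is configured to store various types of data to support the operation of the electronic device. Examples of such data include any executable instructions for operating on the electronic device; a program implementing the method for determining a candidate motion information list of the embodiments of the present application may be contained in the executable instructions.
In other embodiments, the apparatus for determining a candidate motion information list provided by the embodiments of the present application may be implemented in software. FIG. 2 shows the apparatus 2020 for determining a candidate motion information list stored in the memory 202, which may be software in the form of a program, a plug-in, or the like, and includes a series of modules. As an example of a program stored in the memory 202, the apparatus 2020 includes the following software modules: a first information processing module 2081, a first information filling module 2082, a second information processing module 2083, and a second information filling module 2084. When the software modules in the apparatus 2020 are read into random access memory (RAM, Random Access Memory) by the processor 201 and executed, the method for determining a candidate motion information list provided by the embodiments of the present application is implemented. The functions of the software modules in the apparatus 2020 are introduced below:
the first information processing module 2081 is configured to determine a target motion information list and a historical motion information table;
the first information filling module 2082 is configured to obtain at least one piece of motion information based on the historical motion information table when the target motion information list is not full;
the first information filling module 2082 is further configured to fill the candidate motion information list based on the at least one piece of motion information;
the second information processing module 2083 is configured to determine the positional relationship between the historical motion information table and the spatial motion information list;
the second information filling module 2084 is configured to, when the target motion information list is not full, obtain at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and fill the candidate motion information list based on the at least one piece of motion information;
wherein the candidate motion information list is used to provide candidate prediction displacement vectors for the current codec block.
Based on the electronic device shown in FIG. 2, in one aspect of the present application, a computer program product or computer program is further provided. The computer program product or computer program includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device executes the different embodiments, and combinations of embodiments, provided in the various implementations of the above method for determining a candidate motion information list.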
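Before the method for determining a candidate motion information list provided by the present application is introduced, the video encoding process in the related art is described first. FIG. 3 is a schematic flowchart of the video encoding process in an embodiment of the present application. A video signal is an image sequence comprising multiple frames. A frame is a representation of the spatial information of the video signal. Taking the YUV format as an example, a frame includes one luma sample matrix (Y) and two chroma sample matrices (Cb and Cr). In terms of how the video signal is acquired, it may be captured by a camera or generated by a computer. Because the statistical characteristics differ, the corresponding compression coding methods may also differ.
In related video coding technologies, such as High Efficiency Video Coding (H.265/HEVC), Versatile Video Coding (H.266/VVC), and the Audio Video coding Standard (AVS, for example AVS3), a hybrid coding framework is adopted, and the following series of operations and processing is performed on the input original video signal:
1. Block partition structure: the input image is divided into a number of non-overlapping processing units, and a similar compression operation is performed on each. This processing unit is called a coding tree unit (Coding Tree Unit, CTU) or largest coding unit (Largest Coding Unit, LCU). Below the CTU, finer partitioning can continue, yielding one or more basic coding units, called coding units (Coding Unit, CU). Each CU is the most basic element in a coding pass. The various coding modes that may be used for each CU are described below.
2. Predictive coding: this includes intra prediction, inter prediction, and other modes. The original video signal is predicted from a selected, already reconstructed video signal to obtain a residual video signal. The encoder needs to choose the most suitable of the many possible predictive coding modes for the current CU and inform the decoder. Intra prediction means that the predicted signal comes from an already encoded and reconstructed region of the same image. Inter prediction means that the predicted signal comes from an already encoded image other than the current image (called a reference image).
3. Transform and quantization: the residual video signal undergoes a transform operation such as the discrete Fourier transform (Discrete Fourier Transform, DFT) or discrete cosine transform (Discrete Cosine Transform, DCT), which converts the signal into the transform domain, where it is called transform coefficients. The signal in the transform domain then undergoes a lossy quantization operation that discards some information, so that the quantized signal is easier to represent compactly. Some video coding standards offer more than one transform, so the encoder also needs to select one of them for the current CU and inform the decoder. The fineness of quantization is usually determined by the quantization parameter (Quantization Parameter, QP). A larger QP means that coefficients over a wider range of values are quantized to the same output, which usually brings greater distortion and a lower bit rate; conversely, a smaller QP means that coefficients over a narrower range of values are quantized to the same output, which usually brings less distortion and a higher bit rate.
4. Entropy coding or statistical coding: the quantized transform-domain signal is statistically compressed according to the frequency of occurrence of each value, and a binarized (0 or 1) compressed code stream is finally output. Other information produced by encoding, such as the selected modes and motion vectors, also needs entropy coding to reduce the bit rate. Statistical coding is a lossless coding method that can effectively reduce the bit rate required to express the same signal. Common statistical coding methods include variable length coding (Variable Length Coding, VLC) and context-based adaptive binary arithmetic coding (Context-based Adaptive Binary Arithmetic Coding, CABAC).
5. Loop filtering: an already encoded image is inverse quantized, inverse transformed, and prediction compensated (the inverse of operations 2 to 4 above) to obtain a reconstructed decoded image. Because of quantization, some information in the reconstructed image differs from the original image, producing distortion. Filtering the reconstructed image, for example with deblocking, sample adaptive offset (Sample Adaptive Offset, SAO), or an adaptive loop filter (Adaptive Loop Filter, ALF), can effectively reduce the distortion produced by quantization. Because these filtered reconstructed images serve as references for subsequently encoded images and are used to predict future signals, the above filtering operations are called loop filtering, that is, filtering operations within the coding loop.
From the encoding process above it can be seen that at the decoder, for each CU, the decoder first performs entropy decoding on the compressed code stream to obtain the various mode information and the quantized transform coefficients. The coefficients are inverse quantized and inverse transformed to obtain the residual signal. Meanwhile, the prediction signal corresponding to the CU is obtained from the known coding mode information, and adding the two gives the reconstructed signal. Finally, the reconstructed values of the decoded image undergo loop filtering to produce the final output signal.
Related video coding standards such as HEVC, VVC, and AVS3 all adopt a block-based hybrid coding framework. They divide the original video data into a series of coding blocks and compress the video data by combining video coding methods such as prediction, transform, and entropy coding. Motion compensation is a class of prediction methods commonly used in video coding; it exploits the redundancy of video content in the temporal or spatial domain to derive the prediction of the current coding block from an already encoded region. Such prediction methods include inter prediction, intra block copy prediction, and intra string copy prediction, which may be used alone or in combination in an encoder implementation. For coding blocks that use these prediction methods, one or more two-dimensional displacement vectors usually need to be coded in the code stream, explicitly or implicitly, to indicate the displacement of the current block (or a co-located block of the current block) relative to one or more of its reference blocks.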
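Note that displacement vectors may have different names under different prediction modes and in different implementations. This document uniformly describes them as follows: 1) the displacement vector in inter prediction mode is called a motion vector (Motion Vector, MV); 2) the displacement vector in intra block copy (Intra Block Copy, IBC) prediction mode is called a block vector (Block Vector, BV); 3) the displacement vector in intra string copy (Intra String Copy, ISC) prediction mode is called a string vector (String Vector, SV). Intra string copy is also called "string prediction" or "string matching".
An MV is the displacement vector used in inter prediction mode. It points from the current image to the reference image, and its value is the coordinate offset between the current block and the reference block, where the current block and the reference block are in two different images. In inter prediction mode, motion vector prediction can be introduced: the motion vector of the current block is predicted to obtain a predicted motion vector, and the difference between the predicted motion vector and the actual motion vector of the current block is encoded and transmitted. Compared with directly encoding and transmitting the actual motion vector of the current block, this saves bits. In the embodiments of the present application, the predicted motion vector is the prediction of the current block's motion vector obtained through motion vector prediction.
A BV is the displacement vector used in IBC prediction mode. Its value is the coordinate offset between the current block and the reference block, where both the current block and the reference block are in the current image. In IBC prediction mode, block vector prediction can be introduced: the block vector of the current block is predicted to obtain a predicted block vector, and the difference between the predicted block vector and the actual block vector of the current block is encoded and transmitted. Compared with directly encoding and transmitting the actual block vector of the current block, this saves bits. In the embodiments of the present application, the predicted block vector is the prediction of the current block's block vector obtained through block vector prediction.
An SV is the displacement vector used in ISC prediction mode. Its value is the coordinate offset between the current string and the reference string, where both the current string and the reference string are in the current image. In ISC prediction mode, string vector prediction can be introduced: the string vector of the current string is predicted to obtain a predicted string vector, and the difference between the predicted string vector and the actual string vector of the current string is encoded and transmitted. Compared with directly encoding and transmitting the actual string vector of the current string, this saves bits. In the embodiments of the present application, the predicted string vector is the prediction of the current string's string vector obtained through string vector prediction.
Several different prediction modes are introduced below: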
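I. Inter prediction mode
FIG. 4 is a schematic diagram of the inter prediction mode in an embodiment of the present application. Inter prediction exploits the temporal correlation of video, using pixels of neighboring already encoded images to predict pixels of the current image. This effectively removes temporal redundancy and saves the bits needed to encode residual data. Here P is the current frame, Pr is the reference frame, B is the current block to be encoded, and Br is the reference block of B. B' has the same coordinate position in the image as B; the coordinates of Br are (xr, yr), and the coordinates of B' are (x, y). The displacement between the current block to be encoded and its reference block is called the motion vector (MV), that is:
MV = (xr - x, yr - y).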
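As a small worked example of the definition above (the function name and the sample coordinates are illustrative only):

```python
def motion_vector(cur_pos, ref_pos):
    """MV = (xr - x, yr - y): offset from the current block to its reference block."""
    x, y = cur_pos
    xr, yr = ref_pos
    return (xr - x, yr - y)


# Current block at (16, 8); reference block at (12, 10) in the reference frame.
print(motion_vector((16, 8), (12, 10)))
```

The same subtraction gives a BV or an SV when the reference lies in the current image instead of a reference frame.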
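Because temporally or spatially neighboring blocks are strongly correlated, MV prediction can be used to reduce the bits needed to encode an MV. In H.265/HEVC, inter prediction includes two MV prediction techniques: Merge and advanced motion vector prediction (Advanced Motion Vector Prediction, AMVP).
Merge mode builds an MV candidate list for the current prediction unit (Prediction Unit, PU), containing 5 candidate MVs (and their corresponding reference images). These 5 candidate MVs are traversed, and the one with the lowest rate-distortion cost is selected as the optimal MV. If the encoder and decoder build the candidate list in the same way, the encoder only needs to transmit the index of the optimal MV in the candidate list. Note that HEVC MV prediction also has a skip mode, which is a special case of Merge mode. After the optimal MV is found in Merge mode, if the current block and the reference block are essentially the same, no residual data needs to be transmitted; only the index of the MV and a skip flag are sent.
FIG. 5 is a schematic diagram of candidate motion vectors provided by an embodiment of the present application. The MV candidate list built in Merge mode covers both the spatial and the temporal case, and for a B slice (B-frame image) it also includes a combined list. The spatial domain provides at most 4 candidate MVs, constructed as shown in part (a) of FIG. 5. The spatial list is built in the order A1→B1→B0→A0→B2, where B2 is a substitute: when one or more of A1, B1, B0, and A0 do not exist, the motion information of B2 is used. The temporal domain provides at most 1 candidate MV, constructed as shown in part (b) of FIG. 5 and obtained by scaling the MV of the co-located PU as follows:
curMV = td * colMV / tb;
where curMV is the MV of the current PU, colMV is the MV of the co-located PU, td is the distance between the current image and its reference image, and tb is the distance between the co-located image and its reference image. If the PU at position D0 of the co-located block is unavailable, the co-located PU at position D1 is used instead. A PU in a B slice has two MVs, so its MV candidate list also needs to provide two motion vector predictors (Motion Vector Predictor, MVP). HEVC produces the combined list for B slices by combining the first 4 candidate MVs in the MV candidate list in pairs.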
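The scaling formula can be tried out with the sketch below. It applies the formula exactly as written in the text, with td and tb as defined there; rounding to the nearest integer is an assumption, and real codecs use a fixed-point approximation instead of floating-point division.

```python
def scale_temporal_mv(col_mv, td, tb):
    """curMV = td * colMV / tb, applied to each component."""
    if tb == 0:
        raise ValueError("tb must be non-zero")
    # round() is an assumption; the text does not say how fractions are handled
    return tuple(round(td * c / tb) for c in col_mv)


# The co-located PU moved (8, -4) over a distance of 4 pictures;
# the current PU references a picture 2 pictures away.
print(scale_temporal_mv((8, -4), td=2, tb=4))
```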
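Similarly, AMVP mode uses the MV correlation of spatially and temporally neighboring blocks to build an MV candidate list for the current PU. Unlike Merge mode, AMVP mode selects the optimal predicted MV from the MV candidate list and differentially encodes it against the optimal MV found by motion search for the current block to be encoded, that is, it encodes MVD = MV - MVP, where MVD is the motion vector difference (Motion Vector Difference). By building the same list, the decoder needs only the MVD and the index of the MVP in the list to compute the MV of the current decoded block. The MV candidate list in AMVP mode also covers the spatial and temporal cases; the difference is that its length is only 2.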
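The differential coding MVD = MV - MVP can be illustrated with a round trip. The function names are hypothetical, and choosing the predictor by the smallest absolute difference is a stand-in assumption; an actual encoder would compare rate-distortion costs.

```python
def encode_amvp(mv, candidates):
    """Pick an MVP from the candidate list and return (index, MVD)."""
    costs = [abs(mv[0] - c[0]) + abs(mv[1] - c[1]) for c in candidates]
    idx = costs.index(min(costs))  # stand-in for a rate-distortion decision
    mvp = candidates[idx]
    return idx, (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_amvp(idx, mvd, candidates):
    """Rebuild MV = MVP + MVD from the same candidate list."""
    mvp = candidates[idx]
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


candidates = [(4, 4), (10, -2)]          # an AMVP list holds two candidates
idx, mvd = encode_amvp((9, -1), candidates)
print(idx, mvd, decode_amvp(idx, mvd, candidates))
```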
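History-based motion vector prediction (History based Motion Vector Prediction, HMVP) is an MV prediction technique newly adopted in H.266/VVC. HMVP is a motion vector prediction method based on historical information: the motion information of previously coded blocks is stored in an HMVP list and used as MVPs for the current CU. H.266/VVC adds HMVP to the candidate list of Merge mode, after the spatial and temporal MVPs. HMVP stores the motion information of previously coded blocks in a first-in-first-out (First Input First Output, FIFO) queue. If a stored candidate is identical to the motion information that has just been coded, the duplicate candidate is first removed, all later HMVP candidates are moved forward, and the motion information of the current coding unit is added at the tail of the FIFO. If the motion information of the current coding unit differs from every candidate in the FIFO, the latest motion information is added at the tail of the FIFO. When new motion information is added to the HMVP list and the list has already reached its maximum length, the first candidate in the FIFO is removed and the latest motion information is then added at the tail. The HMVP list is reset (emptied) when a new coding tree unit (Coding Tree Unit, CTU) row is encountered. In H.266/VVC, the HMVP table size S is set to 6. To reduce the number of redundancy check operations, the following simplifications are introduced:
1. The number of HMVP candidates used for Merge list generation is set to (N<=4)?M:(8-N), where N is the number of existing candidates in the Merge list and M is the number of HMVP candidates available for the Merge list.
2. Once the length of the available Merge list reaches the maximum allowed length minus 1, the construction of the merge candidate list from HMVP terminates.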
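The FIFO update with duplicate removal, and the candidate-count rule in simplification 1, can be sketched as follows. The function names are hypothetical; the default table size of 6 is the value given in the text.

```python
def hmvp_update(table, motion, max_size=6):
    """Insert motion at the tail, removing a duplicate or the oldest entry first."""
    if motion in table:
        table.remove(motion)   # later candidates shift forward
    elif len(table) == max_size:
        table.pop(0)           # table full: drop the first (oldest) candidate
    table.append(motion)
    return table


def hmvp_candidates_for_merge(n, m):
    """(N <= 4) ? M : (8 - N), as stated in simplification 1."""
    return m if n <= 4 else 8 - n


table = []
for mv in [(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (3, 0)]:
    hmvp_update(table, mv)
print(table)
```

After the eight insertions, (1, 0) has been evicted as the oldest entry and (3, 0) has moved from the middle of the table to the tail.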
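II. IBC prediction mode
IBC is an intra coding tool adopted in the HEVC screen content coding (Screen Content Coding, SCC) extension, and it significantly improves the coding efficiency of screen content. AVS3 and VVC have also adopted IBC to improve screen content coding performance. IBC exploits the spatial correlation of screen content video, using already encoded pixels of the current image to predict the pixels of the current block to be encoded, which effectively saves the bits needed to encode pixels. As shown in FIG. 6, the displacement between the current block and its reference block in IBC is called the BV. H.266/VVC uses a BV prediction technique similar to inter prediction to save the bits needed to encode the BV.
III. ISC prediction mode
ISC divides a coding block, in some scanning order (such as raster scan, traverse scan, or zig-zag scan), into a series of pixel strings or unmatched pixels. As in IBC, each string searches the already encoded region of the current image for a reference string of the same shape and derives the prediction of the current string from it. Encoding the residual between the pixel values of the current string and the prediction, instead of encoding the pixel values directly, effectively saves bits. FIG. 7 is a schematic diagram of intra string copy: the dark gray region is the already encoded region, the 28 white pixels are string 1, the 35 light gray pixels are string 2, and the 1 black pixel is an unmatched pixel. The displacement between string 1 and its reference string is string vector 1 in FIG. 7; the displacement between string 2 and its reference string is string vector 2 in FIG. 7.
Intra string copy needs to encode, for each string in the current coding block, the SV, the string length, and a flag indicating whether there is a matching string. The SV is the displacement from the string to be encoded to its reference string. The string length is the number of pixels the string contains. Different implementations encode the string length in different ways; some examples follow (some may be used in combination): 1) encode the length of the string directly in the code stream; 2) encode in the code stream the number of pixels still to be processed after this string, so that the decoder computes the length of the current string as L = N - N1 - N2 from the size N of the current block, the number N1 of pixels already processed, and the decoded number N2 of pixels still to be processed; 3) encode in the code stream a flag indicating whether this string is the last one, and if it is, compute the length of the current string as L = N - N1 from the size N of the current block and the number N1 of pixels already processed. If a pixel finds no corresponding reference in the referenceable region, the pixel value of the unmatched pixel is encoded directly.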
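Length derivations 2) and 3) can be checked against the FIG. 7 example, where a 64-pixel block holds a 28-pixel string, a 35-pixel string, and 1 unmatched pixel. The function names below are hypothetical.

```python
def length_from_remaining(n, n1, n2):
    """Option 2: L = N - N1 - N2, with N2 pixels still to process after this string."""
    return n - n1 - n2


def length_of_last_string(n, n1):
    """Option 3: the last string takes every pixel not yet processed, L = N - N1."""
    return n - n1


# String 2 in the FIG. 7 example: 28 pixels already processed, 1 pixel left afterwards.
print(length_from_remaining(64, 28, 1))
```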
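IV. Intra-prediction motion vector prediction in AVS3
IBC and ISC are two screen content coding tools in AVS3. Both use the current image as the reference and derive the prediction of a coding unit through motion compensation. Because IBC and ISC have similar reference regions, BVs and SVs are highly correlated, and coding efficiency can be improved by allowing prediction between the two. AVS3 uses an intra-prediction historical motion information table (IntraHMVP), similar to HMVP, to record the displacement vector information, position information, size information, and repetition count of these two kinds of coding blocks, and derives the block vector predictor (Block Vector Predictor, BVP) and string vector predictor (String Vector Predictor, SVP) from IntraHMVP. The BVP is the prediction of the block vector, and the SVP is the prediction of the string vector. To support parallel coding, if the current largest coding unit is the first largest coding unit of the current row in the slice, the value of CntIntraHmvp in the intra-prediction historical motion information table is initialized to 0.
1. Derivation of the prediction block vector
AVS3 adopts class-based block vector prediction (Class based Block Vector Prediction, CBVP). Similar to HMVP, this method first uses an HBVP (History based Block Vector Prediction) list to store information about historical IBC coding blocks; besides the BV of each historical coding block, it also records the position, size, and other information of the block. For the current coding block, the candidate BVs in HBVP are classified according to the following conditions:
Class 0: the area of the historical coding block is greater than or equal to 64 pixels;
Class 1: the frequency of the BV is greater than or equal to 2;
Class 2: the upper left corner of the historical coding block is to the left of the upper left corner of the current block;
Class 3: the upper left corner of the historical coding block is above the upper left corner of the current block;
Class 4: the upper left corner of the historical coding block is to the upper left of the upper left corner of the current block;
Class 5: the upper left corner of the historical coding block is to the upper right of the upper left corner of the current block;
Class 6: the upper left corner of the historical coding block is to the lower left of the upper left corner of the current block;
The instances in each class are arranged in reverse coding order (the closer a block is to the current block in coding order, the earlier it is ranked), and the BV of the first historical coding block is the candidate BV of that class. The candidate BV of each class is then added to the CBVP list in order from class 0 to class 6. When a new BV is added to the CBVP list, it is necessary to check whether the CBVP list already contains the same BV; the BV is added only when no duplicate exists. The encoder selects the best candidate BV in the CBVP list as the BVP and encodes an index in the code stream, indicating the index in the CBVP list of the class corresponding to the best candidate BV. The decoder decodes the BVP from the CBVP list according to this index.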
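The class-based construction can be sketched as below. The text only names the directions for classes 2 to 6; the exact coordinate comparisons used here, and the `HistEntry` type and function name, are assumptions made for illustration and are not taken from the AVS3 specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistEntry:
    bv: tuple    # (bv_x, bv_y)
    x: int       # upper left x of the historical block
    y: int       # upper left y of the historical block
    size: int    # area in pixels
    count: int   # repetition count (frequency) of the BV


def build_cbvp(history, cur_x, cur_y, cur_w, cur_h):
    """history is in coding order (oldest first); returns the CBVP list of BVs."""
    classes = [
        lambda e: e.size >= 64,                                      # class 0
        lambda e: e.count >= 2,                                      # class 1
        lambda e: e.x < cur_x and cur_y <= e.y < cur_y + cur_h,      # class 2: left (assumed test)
        lambda e: e.y < cur_y and cur_x <= e.x < cur_x + cur_w,      # class 3: above (assumed test)
        lambda e: e.x < cur_x and e.y < cur_y,                       # class 4: upper left
        lambda e: e.x >= cur_x + cur_w and e.y < cur_y,              # class 5: upper right (assumed test)
        lambda e: e.x < cur_x and e.y >= cur_y + cur_h,              # class 6: lower left (assumed test)
    ]
    cbvp = []
    for in_class in classes:
        for entry in reversed(history):      # most recently coded first
            if in_class(entry):
                if entry.bv not in cbvp:     # add only if not a duplicate
                    cbvp.append(entry.bv)
                break                        # only the first instance represents the class
    return cbvp


history = [
    HistEntry((-8, 0), 0, 16, 64, 0),
    HistEntry((-4, -4), 8, 8, 16, 2),
    HistEntry((-16, 0), 0, 16, 16, 0),
]
print(build_cbvp(history, cur_x=16, cur_y=16, cur_w=8, cur_h=8))
```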
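After decoding of the current prediction unit is complete, if the prediction type of the current prediction unit is block copy intra prediction (that is, IBC) and NumOfIntraHmvpCand is greater than 0, IntraHMVP is updated with the block copy intra-prediction motion information of the current prediction block in the manner described below. The intra-prediction motion information of the current prediction block includes displacement vector information, position information, size information, and a repetition count. For a block copy intra-prediction block, the displacement vector information is the block vector; the position information includes the abscissa and ordinate of the upper left corner of the current prediction block; the size information is the product of the width and height; and the repetition count of the current prediction block is initialized to 0.
2. Derivation of the prediction string vector
AVS3 encodes an index for each string in an ISC coding block, indicating the position of that string's SVP in IntraHMVP. As with skip mode in inter prediction, the SV of the current string equals the SVP, and no residual between the SV and the SVP needs to be encoded.
After decoding of the current prediction unit is complete, if the prediction type of the current prediction unit is string copy intra prediction (that is, ISC) and NumOfIntraHmvpCand is greater than 0, IntraHMVP is updated with the string copy intra-prediction motion information of the current prediction block in the manner described below. The string copy intra-prediction motion information of the current prediction block includes displacement vector information, position information, size information, and a repetition count. The displacement vector information of the current string is the string vector; the position information includes the abscissa and ordinate of the first pixel sample of the string, that is, (xi, yi); the size information is the string length of that part, that is, StrLen[i]; and the repetition count is initialized to 0.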
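3. Updating the intra-prediction historical motion information table
Intra-prediction motion information includes displacement vector information, position information, size information, and a repetition count. After the current prediction unit has been decoded, if its prediction type is intra block copy prediction or intra string copy prediction and NumOfIntraHmvpCand is greater than 0, the intra-prediction historical motion information table IntraHmvpCandidateList is updated according to the intra-prediction motion information of the current prediction block; otherwise, the operations defined in this clause are not performed. The displacement vector information, position information, size information, and repetition count of IntraHmvpCandidateList[X] are denoted intraMvCandX, posCandX, sizeCandX, and cntCandX, respectively; the corresponding quantities of the current prediction block carry the suffix Cur (for example intraMvCur, sizeCur, and cntCur).
a) Initialize X to 0 and cntCur to 0.
b) If CntIntraHmvp equals 0, set IntraHmvpCandidateList[CntIntraHmvp] to the intra-prediction motion information of the current prediction unit, and increment CntIntraHmvp by 1.
c) Otherwise, determine whether the intra-prediction motion information of the current prediction block is the same as IntraHmvpCandidateList[X] by checking whether intraMvCur equals intraMvCandX:
1) If intraMvCur and intraMvCandX are the same, go to step d); otherwise, increment X by 1.
2) If X is less than CntIntraHmvp, go to step c); otherwise, go to step e).
d) Set cntCur to cntCandX plus 1. If sizeCur is less than sizeCandX, set sizeCur to sizeCandX.
e) If X is less than CntIntraHmvp:
1) For i from X to CntIntraHmvp-1, set IntraHmvpCandidateList[i] to IntraHmvpCandidateList[i+1];
2) Set IntraHmvpCandidateList[CntIntraHmvp-1] to the intra-prediction motion information of the current prediction unit.
f) Otherwise, if X equals CntIntraHmvp and CntIntraHmvp equals NumOfIntraHmvpCand:
1) For i from 0 to CntIntraHmvp-1, set IntraHmvpCandidateList[i] to IntraHmvpCandidateList[i+1];
2) Set IntraHmvpCandidateList[CntIntraHmvp-1] to the intra-prediction motion information of the current prediction unit.
g) Otherwise, if X equals CntIntraHmvp and CntIntraHmvp is less than NumOfIntraHmvpCand, set IntraHmvpCandidateList[CntIntraHmvp] to the intra-prediction motion information of the current prediction unit, and increment CntIntraHmvp by 1.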
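Steps a) to g) amount to "move a repeated vector to the newest position, otherwise append and evict the oldest entry when full". The sketch below is one possible Python reading, not the normative procedure: `update_intra_hmvp` and the dictionary keys are hypothetical names, and the element-by-element shifting of steps e) and f) is expressed as deleting one entry and appending the current one.

```python
def update_intra_hmvp(table, cur, max_cands):
    """Update the history table in place.

    table: list of dicts ordered oldest -> newest, with keys
           "mv", "pos", "size", "cnt" (hypothetical field names).
    cur:   motion information of the current prediction unit.
    max_cands: plays the role of NumOfIntraHmvpCand.
    """
    if max_cands <= 0:
        return
    cur = dict(cur, cnt=0)                       # step a): cntCur starts at 0
    for x, cand in enumerate(table):             # step c): compare vectors only
        if cand["mv"] == cur["mv"]:
            cur["cnt"] = cand["cnt"] + 1         # step d): inherit and bump the count
            cur["size"] = max(cur["size"], cand["size"])
            del table[x]                         # step e): later entries move up
            break
    else:
        if len(table) == max_cands:              # step f): table full, drop the oldest
            del table[0]
    table.append(cur)                            # steps b), e), f), g): newest goes last
```

For example, pushing the vectors (1,0), (2,0), (1,0), (3,0) into a table with `max_cands=2` leaves (1,0) with a repetition count of 1 followed by (3,0).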
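In the current AVS3 standard, displacement vector prediction builds the candidate motion information list, and derives the predicted block vector (BVP) or predicted string vector (SVP), solely from the intra-prediction historical motion information (IntraHMVP) table. IntraHMVP holds at most 12 entries, and the candidate motion information list holds at most 7. When IntraHMVP is too short or empty, the candidate motion information list cannot be filled, so not enough motion information is available for displacement vector prediction.
To overcome this drawback, and with continued reference to FIG. 8, the candidate motion information list determination method provided in this application can determine the candidate motion information list from spatial-domain information and combine it with the historical motion information list. This improves video compression performance and, in turn, the experience of users of video compression.
The candidate motion information list determination method provided by the embodiments of this application is described further with reference to the preceding embodiments. FIG. 8 is an optional schematic flowchart of the candidate motion information list determination method provided by an embodiment of this application. It can be understood that the steps shown in FIG. 8 may be performed by various devices that run the candidate motion information list determination apparatus, for example a dedicated terminal, a server, or a server cluster with the candidate motion information list determination function. The steps shown in FIG. 8 are described below.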
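Step 801: The candidate motion information list determination apparatus determines the target motion information list and the historical motion information table.
The historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy prediction, and intra string copy prediction.
The candidate motion information list determination method provided by the embodiments of this application may be applied to three-dimensional video coding technologies, such as HEVC (High Efficiency Video Coding). Such coding technologies perform intra prediction of the current block from neighboring reconstructed pixels, and perform motion-compensated inter prediction by selecting predicted motion vectors from the motion vectors of neighboring blocks to build a motion vector list. They describe the whole coding process with three concepts: the coding unit (Coding Unit, CU), the prediction unit (Prediction Unit, PU), and the transform unit (Transform Unit, TU). A CU is a macroblock or sub-macroblock, and each CU is a 2N*2N pixel block (N being a power of 2). Each CU is predicted through PUs. The PU size is bounded by the CU and may be square (such as 2N*2N or N*N) or rectangular (2N*N or N*2N). In some embodiments of this application, as shown in FIG. 11 below, the blocks A, B, C, D, and E have the minimum block size defined by the system (4*4); in practice, this size can be adjusted flexibly according to the environment in which the method is used.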
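Step 802: The candidate motion information list determination apparatus determines whether the target motion information list has been completely filled. If so, step 803 is performed; otherwise, step 804 is performed.
Step 803: Continue the decoding process.
Step 804: When the target motion information list is not full, the candidate motion information list determination apparatus obtains at least one piece of motion information based on the historical motion information table.
The target motion information list not being full includes:
the target motion information list is not full after being filled based on the spatial motion information list; or the target motion information list is not full without having been filled based on the spatial motion information list.
In some embodiments, when the target motion information list is not full, the candidate motion information list is filled in at least one of the following ways:
filling the candidate motion information list based on the historical motion information table; or filling the candidate motion information list based on the spatial motion information list; or filling the candidate motion information list based on both the historical motion information table and the spatial motion information list. Using the different filling processes separately or in combination allows the method to adapt to different video compression environments, which improves its adaptability. The process of determining the candidate motion information list from the historical motion information table, the process of determining it from the spatial motion information list, and the process of determining it from both together are described in turn in the subsequent embodiments.
Step 805: The candidate motion information list determination apparatus fills the candidate motion information list based on the at least one piece of motion information.
The candidate motion information list is used to provide candidate predicted displacement vectors for the current codec block.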
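In some embodiments of this application, obtaining at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information, includes:
determining a corresponding displacement vector based on the historical motion information table, and filling the candidate motion information list based on that displacement vector; or determining a corresponding displacement vector based on the historical motion information table, and filling dynamically according to the parity of the index of the filling position in the candidate motion information list; or determining a corresponding displacement vector based on the historical motion information table, and filling the displacement vector of the prediction mode based on that displacement vector, so as to fill the candidate motion information list; or determining a corresponding displacement vector based on the historical motion information table, and filling dynamically according to the number of displacement vectors and the index of the filling position in the candidate motion information list. Filling the candidate motion information list based on a displacement vector may follow any one of ways 1) to 3) below:
Way 1) Based on the historical motion information table, determine the first and last displacement vectors in the table, and fill the candidate motion information list with the mean of the first and last displacement vectors. Way 2) Based on the historical motion information table, determine the first and last displacement vectors in the table, and fill the candidate motion information list with the weighted average of the first and last displacement vectors. Way 3) Based on the historical motion information table, determine the first displacement vector in the table, and fill the candidate motion information list with that first displacement vector.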
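Ways 1) to 3) can be illustrated with a short Python sketch. The function name `fill_vector`, the weight parameters, and the use of floor division for rounding are assumptions made for illustration; the text does not specify weights or a rounding rule, nor whether "first" means the oldest or the newest entry (index 0 is used here).

```python
def fill_vector(hist, mode="mean", w_first=1, w_last=1):
    """Derive one filler vector from a non-empty history table.

    hist: list of (x, y) displacement vectors.
    mode: "mean" (way 1), "weighted" (way 2) or "first" (way 3).
    """
    first, last = hist[0], hist[-1]
    if mode == "first":
        return first
    if mode == "mean":
        w_first = w_last = 1          # plain mean is the equal-weight case
    total = w_first + w_last
    # Integer vectors are assumed; floor division is an arbitrary rounding choice.
    return tuple((w_first * a + w_last * b) // total
                 for a, b in zip(first, last))
```

With the history `[(-8, 0), (-4, -2), (-16, -4)]`, the plain mean gives `(-12, -2)` and a 3:1 weighting in favor of the first entry gives `(-10, -1)`.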
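In some embodiments of this application, because the index of the filling position in the candidate motion information list may be odd or even, the list can be filled dynamically according to that parity. For example, when the index of the filling position in the candidate motion information list is odd, the first and last displacement vectors in the historical motion information table are determined and the candidate motion information list is filled with their mean; or the first displacement vector in the historical motion information table is determined and the candidate motion information list is filled with it. Likewise, when the index of the filling position in the candidate motion information list is even, the first and last displacement vectors in the historical motion information table are determined and the candidate motion information list is filled with their mean; or the first displacement vector in the historical motion information table is determined and the candidate motion information list is filled with it.
In some embodiments of this application, determining a corresponding displacement vector based on the historical motion information table, and filling the displacement vector of the prediction mode based on that displacement vector, may be implemented as follows:
Based on the historical motion information table, the horizontal value of the first displacement vector and the vertical value of the last displacement vector in the table are determined; the horizontal component of the displacement vector of the prediction mode is filled with the horizontal value of the first displacement vector, and its vertical component is filled with the vertical value of the last displacement vector. Because filling takes place in many different video compression environments, the roles may also be swapped to suit the environment: the vertical value of the first displacement vector and the horizontal value of the last displacement vector in the historical motion information table are determined; the vertical component of the displacement vector of the prediction mode is filled with the vertical value of the first displacement vector, and its horizontal component is filled with the horizontal value of the last displacement vector.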
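As a minimal sketch of this component-wise filling (the helper name `mix_components` and the `swap` flag are hypothetical, and vectors are assumed to be `(horizontal, vertical)` tuples):

```python
def mix_components(hist, swap=False):
    """Build a filler vector from components of the first and last history entries.

    swap=False: horizontal from the first vector, vertical from the last.
    swap=True:  horizontal from the last vector, vertical from the first.
    """
    first, last = hist[0], hist[-1]
    if swap:
        return (last[0], first[1])
    return (first[0], last[1])
```

For the history `[(-8, 0), (-16, -4)]` this yields `(-8, -4)`, or `(-16, 0)` with `swap=True`.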
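In some embodiments of this application, the number of candidate displacement vectors in the historical motion information table changes dynamically with the usage environment or the degree of compression. Therefore, when the number of candidate displacement vectors in the historical motion information table is greater than the index of the filling position in the candidate motion information list, the position of the displacement vector to be filled can be determined in the historical motion information table, and the displacement vector at that position in the intra-prediction historical motion information table is used for dynamic filling. The position is determined from the number of candidate displacement vectors and the index of the filling position in the candidate motion information list. In practice, when the number of candidate BVs, cnt_hbvp_cands, is greater than the index of the filling position in the candidate motion information list, cbvp_index, the (cnt_hbvp_cands%(cbvp_index+1))-th BV in IntraHMVP is filled. In some embodiments, when the number of candidate displacement vectors in the historical motion information table is less than or equal to the index of the filling position in the candidate motion information list, the candidate motion information list determination methods provided in the preceding embodiments may be performed to fill the candidate motion information list.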
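The index arithmetic can be checked with a small Python sketch. It assumes zero-based positions in the history table (the text does not say whether "the n-th BV" is zero-based or one-based), and `pick_by_modulo` and `fallback` are hypothetical names. Since `cnt_hbvp_cands > cbvp_index`, the remainder is at most `cbvp_index`, so the computed position always lies inside the table.

```python
def pick_by_modulo(hist, cbvp_index, fallback):
    """Choose a filler BV for candidate-list position cbvp_index.

    hist:     list of BVs in the history table (IntraHMVP).
    fallback: callable used when the table has too few entries,
              standing in for the other filling embodiments.
    """
    cnt_hbvp_cands = len(hist)
    if cnt_hbvp_cands > cbvp_index:
        return hist[cnt_hbvp_cands % (cbvp_index + 1)]
    return fallback(hist)
```

With five history entries, position 2 selects entry `5 % 3 = 2` and position 3 selects entry `5 % 4 = 1`; position 5 falls through to the fallback.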
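The candidate motion information list determination method provided by the embodiments of this application is described further with reference to the preceding embodiments. FIG. 9 is an optional schematic flowchart of the candidate motion information list determination method provided by an embodiment of this application. It can be understood that the steps shown in FIG. 9 may be performed by various devices that run the candidate motion information list determination apparatus, for example a dedicated terminal, a server, or a server cluster with the candidate motion information list determination function. The steps shown in FIG. 9 are described below.
Step 901: A historical motion information list is used to record, in a first-in first-out (FIFO, First Input First Output) manner, the motion information of historical prediction units (such as decoded blocks or decoded strings) during decoding.
Step 902: When the displacement vector of the target prediction unit (such as the current block or current string) is decoded, the candidate motion information list is determined based on the historical motion information list combined with other motion information.
Step 903: The position (or index) of the predicted displacement vector of the current prediction unit in the candidate motion information list is obtained from the bitstream, and the predicted displacement vector of the current prediction unit is determined.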
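Steps 901 to 903 can be sketched as follows. This is a simplified illustration: `record` and `decode_predictor` are hypothetical names, the "other motion information" is passed in as a plain list, the most recent history entries are tried first, and duplicate pruning and the default maximum length of 7 are assumptions for the example.

```python
from collections import deque

def record(history, vector):
    """Step 901: FIFO recording; a bounded deque drops its oldest entry when full."""
    history.append(vector)

def decode_predictor(history, extra, index, max_len=7):
    """Steps 902-903: build the candidate list, then pick the signalled entry.

    history: deque of past displacement vectors (oldest -> newest).
    extra:   other motion information, e.g. spatial candidates or (0, 0).
    index:   position parsed from the bitstream.
    """
    candidates = []
    for v in list(reversed(history)) + list(extra):
        if v not in candidates and len(candidates) < max_len:
            candidates.append(v)
    return candidates[index]
```

A history created as `deque(maxlen=12)` mirrors the maximum IntraHMVP length mentioned earlier in the text.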
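FIG. 10 is an optional schematic flowchart of the candidate motion information list determination method provided by an embodiment of this application. In some embodiments of this application, determining the candidate motion information list based on the historical motion information list combined with other motion information may include the following steps:
Step 1001: Determine a target parsing order, and determine the corresponding ISC coding blocks by examining multiple neighboring block positions of the current block in the target parsing order.
FIG. 11 is a schematic diagram of spatial neighboring blocks provided by an embodiment of this application. Decoded ISC blocks may be searched for in the order {A,B,C,D,E}, and the SV information of each spatially adjacent ISC block is recorded. The recorded SV information may include the first or last SV of the coding block as determined by the target parsing order. In the embodiments of this application, a spatial neighboring block of the current codec block is a codec block that belongs to the same picture as the current codec block and is spatially close to it. "Close" here may mean that the distance to the current codec block is less than a threshold, which can be set according to the actual situation. The way the distance between two codec blocks is computed can also be set flexibly: for example, it may be the distance between the top-left coordinates of the two blocks, the distance between their top-left abscissas (or ordinates), the distance between their centers, or the shortest distance between the two blocks; the embodiments of this application do not limit this. In one example, assume that the top-left coordinates of the current codec block are (x0,y0) and those of another codec block are (x1,y1). If |x0-x1|<a or |y0-y1|<b, the other codec block is determined to be a spatial neighboring block of the current codec block. The values of a and b may be equal or different; for example, a=b=8.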
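The example condition |x0-x1|<a or |y0-y1|<b translates directly into code. The function name `is_spatial_neighbor` is hypothetical; the thresholds default to the example values a=b=8 from the text.

```python
def is_spatial_neighbor(cur, other, a=8, b=8):
    """Return True if `other` counts as a spatial neighboring block of `cur`.

    cur, other: (x, y) top-left coordinates of the two codec blocks.
    """
    (x0, y0), (x1, y1) = cur, other
    return abs(x0 - x1) < a or abs(y0 - y1) < b
```

Note that, because the two tests are joined by "or", a block that is far away horizontally still qualifies when its top-left ordinate is within b of the current block's, as in `is_spatial_neighbor((16, 16), (100, 20))`.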
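Spatial neighboring blocks include spatially adjacent blocks and spatially non-adjacent blocks. A spatially adjacent block of the current codec block is a codec block that belongs to the same picture as the current codec block and is both close to and adjacent to it. "Close and adjacent" may mean sharing an edge or a vertex with the current codec block. For example, as shown in FIG. 11, the spatially adjacent blocks of the current codec block F may include the codec blocks A, B, C, D, and E. A spatially non-adjacent block of the current codec block is a codec block that belongs to the same picture as the current codec block and is close to but not adjacent to it. "Close and non-adjacent" may mean that the distance to the current codec block is less than the threshold but no edge or vertex is shared with the current codec block. For example, as shown in FIG. 11, the spatially non-adjacent blocks of the current codec block F may include the codec blocks A', B', C', D', and E'.
In the embodiments of this application, the spatial neighboring blocks include ISC blocks. An "ISC block" is a codec block whose motion vector prediction uses the ISC prediction mode. In some embodiments, the spatial neighboring blocks include at least one of the following: spatially adjacent ISC blocks and spatially non-adjacent ISC blocks. A spatially adjacent ISC block of the current codec block is a spatially adjacent block of the current codec block that is an ISC block. A spatially non-adjacent ISC block of the current codec block is a spatially non-adjacent block of the current codec block that is an ISC block.
In some embodiments, the spatial neighboring blocks also include IBC blocks. An "IBC block" is a codec block whose motion vector prediction uses the IBC prediction mode. In some embodiments, the spatial neighboring blocks also include at least one of the following: spatially adjacent IBC blocks and spatially non-adjacent IBC blocks. A spatially adjacent IBC block of the current codec block is a spatially adjacent block of the current codec block that is an IBC block. A spatially non-adjacent IBC block of the current codec block is a spatially non-adjacent block of the current codec block that is an IBC block.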
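Step 1002: Determine whether the motion information in the candidate motion information list has been completely filled. If it has, perform step 1003; otherwise, perform step 1004.
Step 1003: Perform the decoding process and determine the predicted displacement vector of the current prediction unit.
Step 1004: Determine the spatial candidate motion information list, and fill the candidate motion information list with the corresponding filling information through the first filling process.
The first filling process may be used for filling based on the spatial motion information list.
In some embodiments of this application, the filling information may be the motion information of an ISC block, including at least one of the following:
1. the SV of the first codec string in the ISC block as determined by the target parsing order, where the target parsing order may be the first target parsing order {A,B,C,D,E};
2. the SV of the last codec string in the ISC block as determined by the target parsing order;
3. the mean of the SV of the first codec string and the SV of the last codec string in the ISC block as determined by the target parsing order;
4. the weighted mean of the SV of the first codec string and the SV of the last codec string in the ISC block as determined by the target parsing order;
5. the SVs of multiple codec strings in the ISC block.
It should be noted that the motion information of the ISC block listed above is merely illustrative and explanatory; the embodiments of this application do not exclude other implementations.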
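In some embodiments of this application, the vectors of spatially adjacent and non-adjacent blocks may be searched in a third target parsing order {A->A',B->B',C->C',D->D',E->E'}. For each position X (covering the adjacent block and the non-adjacent block, X=A,B,C,D,E), it is first determined whether the adjacent block X is an ISC block. When the adjacent block X is an ISC block, the SV of that ISC block is obtained, and traversal continues in the target parsing order with the next position, until filling is complete.
When the adjacent block X is not an ISC block, it is determined whether X' is an ISC or IBC block. When the non-adjacent block X' is an ISC or IBC block, the SV or BV of X' is obtained; otherwise, traversal continues in the target parsing order.
In some embodiments of this application, the SV and/or BV information of the spatially non-adjacent blocks may instead be determined first, followed by the SV information of the spatially adjacent blocks. An optional fourth target parsing order is {A'->A,B'->B,C'->C,D'->D,E'->E}. For each position X (covering the adjacent block and the non-adjacent block, X=A,B,C,D,E), it is first determined whether the non-adjacent block X' is an ISC or IBC block. When the non-adjacent block X' is an ISC or IBC block, its SV or BV is obtained and traversal continues in the target parsing order; otherwise, it is determined whether the adjacent block X is an ISC block. If it is, its SV is obtained; otherwise, the position is regarded as unavailable and traversal continues in the target parsing order.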
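The third and fourth target parsing orders differ only in which block of each pair is examined first. The sketch below illustrates both; `collect_spatial`, the `blocks` mapping, and the mode strings are hypothetical, and each block is assumed to contribute a single representative vector.

```python
def collect_spatial(blocks, order="adjacent_first", max_len=7):
    """Gather one vector per position A..E from adjacent/non-adjacent blocks.

    blocks: dict mapping "A", "A'", ... to (mode, vector); missing keys
            mean the block is unavailable.
    order:  "adjacent_first" (third order) or "non_adjacent_first" (fourth order).
    """
    out = []
    for pos in "ABCDE":
        adj, non = blocks.get(pos), blocks.get(pos + "'")
        adj_ok = adj is not None and adj[0] == "ISC"           # adjacent: ISC only
        non_ok = non is not None and non[0] in ("ISC", "IBC")  # non-adjacent: ISC or IBC
        if order == "adjacent_first":
            pick = adj if adj_ok else non if non_ok else None
        else:
            pick = non if non_ok else adj if adj_ok else None
        if pick is not None:
            out.append(pick[1])
        if len(out) == max_len:   # stop once the list is full
            break
    return out
```

A position where neither block qualifies is simply skipped, matching the "position unavailable" case in the text.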
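When there is no spatially adjacent ISC block, or the SV of the spatially adjacent ISC block is unavailable, the candidate motion information list is filled based on the motion information of decoded, spatially non-adjacent IBC or ISC blocks.
For the positions of the spatially non-adjacent IBC or ISC blocks, refer to FIG. 11 and Table 1. The spatially non-adjacent blocks may be selected in a second target parsing order {A',B',C',D',E'}. For example, following the second target parsing order, the BV of a spatially non-adjacent IBC block, or the first or last SV (in scan order) of a spatially non-adjacent ISC block, is used. When the SV information of a spatially non-adjacent ISC block is used, filling with that SV information proceeds in the same way as the filling process shown in step 1004.
Here neig_x_pos and neig_y_pos denote the abscissa and ordinate of the top-left sample of the spatially non-adjacent block, cur_x_pos and cur_y_pos denote the abscissa and ordinate of the top-left sample of the current block F, and cu_width and cu_height denote the width and height of the current block F.
When a spatial block is found to be in ISC mode, either a designated SV of that block is placed in the candidate list, or all of its stored SVs are placed in the candidate list in order.
In some embodiments, if the candidate list is still not full, the candidate motion information list may optionally be filled with (0,0).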
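Step 1005: Fill the candidate motion information list with the corresponding filling information through the second filling process.
Step 1006: Determine the spatial candidate motion information list, and fill the candidate motion information list with the corresponding filling information through the first filling process; when the motion information in the candidate motion information list is still incomplete, fill the candidate motion information list with the corresponding filling information through the second filling process.
Any one of step 1004, step 1005, and step 1006 may be performed.
In some embodiments of this application, when the candidate motion information list is filled based on both spatial motion vectors and the historical motion information table, the filling position in the candidate motion information list may be determined first, and the corresponding filling information afterwards. For example, determining the filling position in the candidate motion information list may include at least one of the following:
placing the spatial candidate motion information before the historical motion information; or placing the spatial candidate motion information after the historical motion information; or obtaining at least one spatially adjacent or non-adjacent motion information block for filling.
In some embodiments of this application, the corresponding spatial vector may be filled at a fixed position in the candidate motion information list. For example, when a candidate motion vector of some class derived from historical motion information does not exist in the CBVP list, the spatial vectors may be filled, in order, into the position of the corresponding class in the candidate motion information list. For example, when the class-i candidate motion vector in the CBVP list, derived from historical motion information to the left of, above, to the upper left of, to the upper right of, or to the lower left of the current block, does not exist, the spatial vector at the corresponding position (left, above, upper left, upper right, or lower left) can be filled.
In some embodiments of this application, when the spatial vector at the corresponding position does not exist either, the candidate motion information list may be filled with (0,0), or it may be filled through the second filling process.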
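The fallback chain described above (historical candidate, then spatial vector at the same position, then (0,0) or the second filling process) could look roughly like this. All names are hypothetical, only the five positional classes are modelled, and duplicate pruning is omitted for brevity.

```python
DIRECTIONS = ["left", "above", "above_left", "above_right", "below_left"]

def fill_class_slots(history_by_dir, spatial_by_dir, second_fill=None):
    """Fill one slot per positional class.

    history_by_dir: dict direction -> vector from the history table (may lack keys).
    spatial_by_dir: dict direction -> vector of the spatial block at that position.
    second_fill:    optional callable standing in for the second filling process;
                    when absent, (0, 0) is used.
    """
    slots = []
    for d in DIRECTIONS:
        v = history_by_dir.get(d)
        if v is None:
            v = spatial_by_dir.get(d)        # spatial vector takes the class's slot
        if v is None:
            v = second_fill() if second_fill else (0, 0)
        slots.append(v)
    return slots
```

Keeping each class at a fixed slot means an index parsed from the bitstream retains its directional meaning even when the history table lacks that class.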
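The candidate motion information list determination method provided by this application is described below using the transmission of short videos through an instant messaging client as an example. FIG. 12 is a schematic diagram of a usage scenario of the candidate motion information list determination method provided by an embodiment of this application. The video to be compressed and transmitted is a short video. The terminals (including terminal 10-1 and terminal 10-2) are provided with a client of software capable of displaying the short video, for example a short-video playback client or plug-in, through which the user can obtain and display the target video. The terminals connect to the short-video server 200 through a network 300, which may be a wide area network, a local area network, or a combination of the two, and which uses wireless links for data transmission. A user may also upload a video through a WeChat mini program on the terminal for other users on the network to watch. In this process the terminal compresses the video to be uploaded, or the server compresses the stored video, to save transmission time, reduce the storage space occupied, and improve the compression efficiency of the video. FIG. 13 is an optional schematic diagram of video compression in an embodiment of this application: the user compresses the target video to be processed through the compression plug-in of the instant messaging client, and the candidate motion information list determination method provided by this application may be stored, through a corresponding storage medium, in the plug-in of the instant messaging client for the user to invoke. FIG. 14 is an optional schematic diagram of the presentation of a compressed video in an embodiment of this application: when the user obtains video information stored on the server through the video client, the server may compress the target video to be transmitted using the candidate motion information list determination method provided by this application, to save the transmission time of the target video and reduce the storage space it occupies.
With continued reference to FIG. 15, FIG. 15 is an optional schematic flowchart of the candidate motion information list determination method provided by an embodiment of this application. In some embodiments of this application, the target video may be compressed using only the first filling process, only the second filling process, or the first and second filling processes together.
In the compression of short videos, the historical motion information table is the intra-prediction historical motion information table, which serves the processing performed during intra prediction. Filling the candidate motion information list by means of the intra-prediction historical motion information table may include the following steps: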
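Step 1501: When the number of candidate displacement vectors in the intra-prediction historical motion information table is greater than or equal to two, trigger displacement vector prediction to derive the predicted displacement vector.
When the number of candidate displacement vectors in the intra-prediction historical motion information table is less than two, the index, in the CBVP list, of the class of the best candidate BV need not be encoded in the bitstream.
Step 1502: Through the second filling process, fill based on the displacement vectors in the historical motion information table.
In some embodiments of this application, the mean of the first and last candidate BVs in the historical motion information table IntraHMVP may be filled; or the weighted mean of the first and last candidate BVs in IntraHMVP may be filled; or the first BV in IntraHMVP may be filled.
Step 1503: Fill dynamically according to the parity of the index of the filling position in the candidate motion information list.
In some embodiments of this application, when the index cbvp_index of the filling position in the candidate motion information list is odd, the mean of the first and last candidate BVs in the historical motion information table IntraHMVP is filled; otherwise, the first BV in IntraHMVP is filled.
In some embodiments of this application, when the index cbvp_index of the filling position in the candidate motion information list is even, the mean of the first and last candidate BVs in IntraHMVP is filled; otherwise, the first BV in IntraHMVP is filled.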
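The two parity variants of step 1503 can be covered by one small function. The name `fill_by_parity` and the `mean_on_odd` switch are hypothetical, and floor division is an assumed rounding rule for the mean.

```python
def fill_by_parity(hist, cbvp_index, mean_on_odd=True):
    """Pick a filler BV for candidate-list position cbvp_index.

    mean_on_odd=True:  odd index -> mean of first and last BV, else first BV.
    mean_on_odd=False: even index -> mean of first and last BV, else first BV.
    """
    first, last = hist[0], hist[-1]
    is_odd = cbvp_index % 2 == 1
    if is_odd == mean_on_odd:
        return tuple((a + b) // 2 for a, b in zip(first, last))
    return first
```

Alternating between two different fillers in this way avoids padding consecutive positions with the same vector.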
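Step 1504: Fill dynamically based on the component values of the displacement vectors.
In some embodiments of this application, the horizontal component of the block vector may be filled with the horizontal value of the first BV in IntraHMVP and the vertical component with the vertical value of the last BV in IntraHMVP; or the horizontal component of the block vector may be filled with the horizontal value of the last BV in IntraHMVP and the vertical component with the vertical value of the first BV in IntraHMVP.
Step 1505: When the number of candidate BVs in IntraHMVP is greater than the index of the filling position in the candidate motion information list, fill the (cnt_hbvp_cands%(cbvp_index+1))-th BV in IntraHMVP.
Because the number of candidate BVs in IntraHMVP changes continuously as video compression proceeds, when the number of candidate BVs in IntraHMVP is greater than the index of the filling position in the candidate motion information list, the (cnt_hbvp_cands%(cbvp_index+1))-th BV in IntraHMVP may be filled; otherwise, any one of steps 1502 to 1504 of the preceding embodiment may be performed.
The embodiments of this application have the following beneficial effects:
The embodiments of this application determine the target motion information list and the number of displacement vectors in the intra-prediction historical motion information table; when the target motion information list is not full, at least one piece of motion information is obtained based on the intra-prediction historical motion information table, and the candidate motion information list is filled based on the at least one piece of motion information, where the candidate motion information list is used to provide candidate predicted displacement vectors for the current codec block. In this way, more numerous and more effective displacement vectors can be provided in the candidate motion information list, which yields better displacement vector prediction, improves video compression performance, and improves the user experience.
The above are merely embodiments of this application and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.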
Claims (27)
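- A method for determining a candidate motion information list, the method comprising: determining a target motion information list and a historical motion information table, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy (IBC) prediction, and intra string copy (ISC) prediction; when the target motion information list is not full, obtaining at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate predicted displacement vectors for a current codec block.
- The method according to claim 1, wherein obtaining at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information, comprises: determining a corresponding displacement vector based on the historical motion information table, and filling the candidate motion information list based on the displacement vector; or filling dynamically based on the parity of the index of the filling position in the candidate motion information list; or determining a corresponding displacement vector based on the historical motion information table, and filling the displacement vector in the prediction mode based on the displacement vector, so as to fill the candidate motion information list; or determining a corresponding displacement vector based on the historical motion information table, and filling dynamically based on the number of displacement vectors and the index of the filling position in the candidate motion information list.
- The method according to claim 2, wherein determining a corresponding displacement vector based on the historical motion information table, and filling the candidate motion information list based on the displacement vector, comprises: determining, based on the historical motion information table, the first displacement vector and the last displacement vector in the historical motion information table, and filling the candidate motion information list based on the mean of the first displacement vector and the last displacement vector; or determining, based on the historical motion information table, the first displacement vector and the last displacement vector in the historical motion information table, and filling the candidate motion information list based on the weighted average of the first displacement vector and the last displacement vector; or determining, based on the historical motion information table, the first displacement vector in the historical motion information table, and filling the candidate motion information list based on the first displacement vector.
- The method according to claim 2, wherein filling dynamically based on the parity of the index of the filling position in the candidate motion information list comprises: when the index of the filling position in the candidate motion information list is odd, determining the first displacement vector and the last displacement vector in the historical motion information table, and filling the candidate motion information list based on the mean of the first displacement vector and the last displacement vector; or determining, based on the historical motion information table, the first displacement vector in the historical motion information table, and filling the candidate motion information list based on the first displacement vector.
- The method according to claim 2, wherein filling dynamically based on the parity of the index of the filling position in the candidate motion information list comprises: when the index of the filling position in the candidate motion information list is even, determining the first displacement vector and the last displacement vector in the historical motion information table, and filling the candidate motion information list based on the mean of the first displacement vector and the last displacement vector; or determining, based on the historical motion information table, the first displacement vector in the historical motion information table, and filling the candidate motion information list based on the first displacement vector.
- The method according to claim 2, wherein determining a corresponding displacement vector based on the historical motion information table, and filling the displacement vector in the prediction mode based on the displacement vector, comprises: determining, based on the historical motion information table, the horizontal value of the first displacement vector and the vertical value of the last displacement vector in the historical motion information table; filling the horizontal component of the displacement vector in the prediction mode based on the horizontal value of the first displacement vector; and filling the vertical component of the displacement vector in the prediction mode based on the vertical value of the last displacement vector.
- The method according to claim 2, wherein determining a corresponding displacement vector based on the historical motion information table, and filling the displacement vector in the prediction mode based on the displacement vector, comprises: determining, based on the historical motion information table, the vertical value of the first displacement vector and the horizontal value of the last displacement vector in the historical motion information table; filling the vertical component of the displacement vector in the prediction mode based on the vertical value of the first displacement vector; and filling the horizontal component of the displacement vector in the prediction mode based on the horizontal value of the last displacement vector.
- The method according to claim 2, wherein filling dynamically based on the number of displacement vectors and the index of the filling position in the candidate motion information list comprises: when the number of candidate displacement vectors in the historical motion information table is greater than the index of the filling position in the candidate motion information list, determining the position of the displacement vector to be filled in the historical motion information table; and determining, based on the position of the displacement vector to be filled in the historical motion information table, the corresponding displacement vector for dynamic filling, wherein the position of the displacement vector to be filled in the historical motion information table is determined based on the number of candidate displacement vectors and the index of the filling position in the candidate motion information list.
- The method according to claim 8, wherein the method further comprises: when the number of candidate displacement vectors in the historical motion information table is less than or equal to the index of the filling position in the candidate motion information list, performing at least one of the following: determining a corresponding displacement vector based on the historical motion information table, and filling the candidate motion information list based on the displacement vector; or filling dynamically based on the parity of the index of the filling position in the candidate motion information list; or determining a corresponding displacement vector based on the historical motion information table, and filling the displacement vector in the prediction mode based on the displacement vector, so as to fill the candidate motion information list.
- The method according to claim 1, wherein the method further comprises: obtaining the number of displacement vectors in the historical motion information table; and when the number of displacement vectors in the historical motion information table is greater than or equal to two, triggering displacement vector prediction to derive a predicted displacement vector.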
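- The method according to claim 1, wherein the target motion information list not being full comprises: the target motion information list is not full after being filled based on a spatial motion information list; or the target motion information list is not full without having been filled based on a spatial motion information list.
- The method according to claim 11, wherein the method further comprises: when the target motion information list is not full, the candidate motion information list is filled in at least one of the following ways: filling the candidate motion information list based on the historical motion information table; or filling the candidate motion information list based on the spatial motion information list; or filling the candidate motion information list based on the historical motion information table and the spatial motion information list.
- The method according to any one of claims 1 to 12, wherein the method further comprises: when the candidate motion information list is not completely filled, constructing a spatial motion information list, wherein the spatial motion information list includes motion information of spatial neighboring blocks of the current codec block, and the spatial neighboring blocks include at least an intra string copy (ISC) block; and when the motion information in the candidate motion information list is insufficient, obtaining at least one piece of motion information from the spatial motion information list, and filling the candidate motion information list with the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate predicted displacement vectors for the current codec block.
- The method according to claim 13, wherein the candidate motion information list is used to provide candidate predicted block vectors (BVP) for an IBC block; or the candidate motion information list is used to provide candidate predicted string vectors (SVP) for an ISC string; or the candidate motion information list is used to provide candidate BVPs for an IBC block and candidate SVPs for an ISC string.
- The method according to claim 13, wherein, when the spatial neighboring blocks include an ISC block, the motion information of the ISC block includes at least one of the following: the string vector (SV) of the first codec string in scan order in the ISC block; the SV of the last codec string in scan order in the ISC block; the mean of the SV of the first codec string and the SV of the last codec string in scan order in the ISC block; the weighted mean of the SV of the first codec string and the SV of the last codec string in scan order in the ISC block; and the SVs of multiple codec strings in the ISC block.
- The method according to claim 13, wherein the method further comprises: the position in the candidate motion information list of the motion information derived from the spatial motion information list precedes the position in the candidate motion information list of the motion information derived from the historical motion information table; or the position in the candidate motion information list of the motion information derived from the spatial motion information list follows the position in the candidate motion information list of the motion information derived from the historical motion information table; or the motion information derived from the spatial motion information list is located at a set position in the candidate motion information list; or, when at least one position in the candidate motion information list has no motion information, the motion information derived from the spatial motion information list is filled into the at least one position, wherein, during filling, the motion information is filled into the at least one position sequentially in the order of the motion information in the spatial motion information list, or the motion information at a designated position in the spatial motion information list is filled into the at least one position.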
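- A method for determining a candidate motion information list, the method comprising: determining a positional relationship between a historical motion information table and a spatial motion information list, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy (IBC) prediction, and intra string copy (ISC) prediction; when a target motion information list is not full, obtaining at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and filling the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate predicted displacement vectors for a current codec block.
- The method according to claim 17, wherein determining the positional relationship between the historical motion information table and the spatial motion information list comprises: determining that the position in the candidate motion information list of the motion information derived from the spatial motion information list precedes the position in the candidate motion information list of the motion information derived from the historical motion information table; or determining that the position in the candidate motion information list of the motion information derived from the spatial motion information list follows the position in the candidate motion information list of the motion information derived from the historical motion information table; or determining that the motion information derived from the spatial motion information list is located at a set position in the candidate motion information list; or, when at least one position in the candidate motion information list has no motion information, determining that the motion information derived from the spatial motion information list is filled into the at least one position.
- The method according to claim 18, wherein filling the motion information derived from the spatial motion information list into the at least one position comprises: filling the at least one position sequentially in the order of the motion information in the spatial motion information list; or filling the motion information at a designated position in the spatial motion information list into the at least one position.
- The method according to any one of claims 18 to 19, wherein, when the target motion information list is not full, obtaining at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and filling the candidate motion information list based on the at least one piece of motion information, comprises: determining the spatial motion information list, wherein the spatial motion information list includes motion information of spatial neighboring blocks of the current codec block, and the spatial neighboring blocks include at least an intra string copy (ISC) block; and when the motion information in the candidate motion information list is insufficient, obtaining at least one piece of motion information from the spatial motion information list, and filling the candidate motion information list with the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate predicted displacement vectors for the current codec block.
- The method according to any one of claims 18 to 19, wherein, when the target motion information list is not full, obtaining at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and filling the candidate motion information list based on the at least one piece of motion information, comprises: determining the target motion information list and the number of displacement vectors in the historical motion information table; and when the target motion information list is not full, obtaining at least one piece of motion information based on the historical motion information table, and filling the candidate motion information list based on the at least one piece of motion information.
- The method according to any one of claims 18 to 19, wherein, when the target motion information list is not full, obtaining at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and filling the candidate motion information list based on the at least one piece of motion information, comprises: when the target motion information list is not full, filling the candidate motion information list based on the historical motion information table; or filling the candidate motion information list based on the spatial motion information list; or filling the candidate motion information list based on the historical motion information table and the spatial motion information list.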
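- An apparatus for determining a candidate motion information list, the apparatus comprising: a first information processing module, configured to determine a target motion information list and a historical motion information table, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy (IBC) prediction, and intra string copy (ISC) prediction; and a first information filling module, configured to obtain at least one piece of motion information based on the historical motion information table when the target motion information list is not full; the first information filling module being further configured to fill the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate predicted displacement vectors for a current codec block.
- An apparatus for determining a candidate motion information list, the apparatus comprising: a second information processing module, configured to determine a positional relationship between a historical motion information table and a spatial motion information list, wherein the historical motion information table is used for at least one of inter prediction, intra prediction, intra block copy (IBC) prediction, and intra string copy (ISC) prediction; and a second information filling module, configured to, when a target motion information list is not full, obtain at least one piece of motion information based on at least one of the historical motion information table and the spatial motion information list, and fill the candidate motion information list based on the at least one piece of motion information; wherein the candidate motion information list is used to provide candidate predicted displacement vectors for a current codec block.
- An electronic device, comprising: a memory, configured to store executable instructions; and a processor, configured to implement, when running the executable instructions stored in the memory, the method for determining a candidate motion information list according to any one of claims 1 to 16, or the method for determining a candidate motion information list according to any one of claims 17 to 22.
- A computer-readable storage medium storing executable instructions which, when executed by a processor, implement the method for determining a candidate motion information list according to any one of claims 1 to 16, or the method for determining a candidate motion information list according to any one of claims 17 to 22.
- A computer program product, comprising a computer program or instructions which, when executed by a processor, implement the method for determining a candidate motion information list according to any one of claims 1 to 16, or the method for determining a candidate motion information list according to any one of claims 17 to 22.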
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21879196.0A EP4114004A4 (en) | 2020-10-18 | 2021-09-16 | METHOD AND APPARATUS FOR CANDIDATE MOVEMENT INFORMATION LIST DETERMINATION, ELECTRONIC DEVICE AND STORAGE MEDIA |
US17/948,094 US20230016630A1 (en) | 2020-10-18 | 2022-09-19 | Method and apparatus for determining candidate motion information list, electronic device, and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011114059.5A CN114374849A (zh) | 2020-10-18 | 2020-10-18 | 一种候选运动信息列表确定方法、装置、电子设备及存储介质 |
CN202011114059.5 | 2020-10-18 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/948,094 Continuation US20230016630A1 (en) | 2020-10-18 | 2022-09-19 | Method and apparatus for determining candidate motion information list, electronic device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022078150A1 true WO2022078150A1 (zh) | 2022-04-21 |
Family
ID=81139040
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/118839 WO2022078150A1 (zh) | 2020-10-18 | 2021-09-16 | 候选运动信息列表确定方法、装置、电子设备及存储介质 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20230016630A1 (zh) |
EP (1) | EP4114004A4 (zh) |
CN (1) | CN114374849A (zh) |
WO (1) | WO2022078150A1 (zh) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110519600A (zh) * | 2019-08-21 | 2019-11-29 | 浙江大华技术股份有限公司 | 帧内帧间联合预测方法、装置、编解码器及存储装置 |
CN110545424A (zh) * | 2019-08-21 | 2019-12-06 | 浙江大华技术股份有限公司 | 基于mmvd模式的帧间预测方法、视频编码方法及相关装置、设备 |
WO2020006969A1 (zh) * | 2018-07-02 | 2020-01-09 | 华为技术有限公司 | 运动矢量预测方法以及相关装置 |
CN111567045A (zh) * | 2017-10-10 | 2020-08-21 | 韩国电子通信研究院 | 使用帧间预测信息的方法和装置 |
- 2020-10-18 CN CN202011114059.5A patent/CN114374849A/zh active Pending
- 2021-09-16 EP EP21879196.0A patent/EP4114004A4/en active Pending
- 2021-09-16 WO PCT/CN2021/118839 patent/WO2022078150A1/zh unknown
- 2022-09-19 US US17/948,094 patent/US20230016630A1/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111567045A (zh) * | 2017-10-10 | 2020-08-21 | 韩国电子通信研究院 | 使用帧间预测信息的方法和装置 |
WO2020006969A1 (zh) * | 2018-07-02 | 2020-01-09 | 华为技术有限公司 | 运动矢量预测方法以及相关装置 |
CN110519600A (zh) * | 2019-08-21 | 2019-11-29 | 浙江大华技术股份有限公司 | 帧内帧间联合预测方法、装置、编解码器及存储装置 |
CN110545424A (zh) * | 2019-08-21 | 2019-12-06 | 浙江大华技术股份有限公司 | 基于mmvd模式的帧间预测方法、视频编码方法及相关装置、设备 |
Non-Patent Citations (1)
Title |
---|
X. XU (TENCENT), X. LI, S. LIU (TENCENT), Y. HAN (QUALCOMM), W.-J. CHIEN (QUALCOMM), M. KARCZEWICZ (QUALCOMM), H. GAO (HUAWEI), S.: "CE8-related: Combination test of JVET-N0176/JVET-N0317/JVET-N0382 on simplification of IBC vector prediction", 14. JVET MEETING; 20190319 - 20190327; GENEVA; (THE JOINT VIDEO EXPLORATION TEAM OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16 ), no. JVET-N0843, 25 March 2019 (2019-03-25), XP030204945 * |
Also Published As
Publication number | Publication date |
---|---|
US20230016630A1 (en) | 2023-01-19 |
EP4114004A1 (en) | 2023-01-04 |
EP4114004A4 (en) | 2023-08-30 |
CN114374849A (zh) | 2022-04-19 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21879196; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021879196; Country of ref document: EP; Effective date: 20220929 |
| | NENP | Non-entry into the national phase | Ref country code: DE |