WO2023231775A1 - Filtering method, filtering model training method and related devices - Google Patents

Filtering method, filtering model training method and related devices

Info

Publication number
WO2023231775A1
Authority
WO
WIPO (PCT)
Prior art keywords
filtering
filter
block
models
sample
Prior art date
Application number
PCT/CN2023/094769
Other languages
English (en)
French (fr)
Inventor
王珅
陈焕浜
杨海涛
宋利
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023231775A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation

Definitions

  • the present application relates to the field of coding and decoding technology, and in particular to a filtering method, a filtering model training method and related devices.
  • Codec technology is widely used in fields such as multimedia services, broadcasting, video communications, and storage.
  • During encoding, the image is divided into multiple non-overlapping coding blocks, and the multiple coding blocks are encoded in sequence.
  • During decoding, each reconstruction block is parsed from the code stream in order, and the reconstructed image is then determined.
  • To improve the quality of the reconstructed image, the reconstruction blocks need to be filtered.
  • When the encoding end encodes a coding block by the intra prediction method or the inter prediction method, in order to ensure the coding quality of subsequent coding blocks, the encoding end also needs to filter the reconstruction block.
  • a filter model is trained for each quantization parameter in advance.
  • Filter models corresponding to multiple quantization parameters adjacent to the quantization parameter of the image are selected from these pre-trained filter models to obtain multiple filter models.
  • a target filter model is selected from the plurality of filter models, and the reconstructed block is filtered through the target filter model.
  • The encoding end can also encode the index of the target filter model into the code stream and send it to the decoding end. After the decoding end receives the code stream sent by the encoding end, it can determine the reconstruction block and the index of the target filter model by parsing the code stream, and then filter the reconstruction block through the target filter model based on the index of the target filter model.
  • In the above method, one quantization parameter corresponds to one filtering model.
  • However, coding blocks with different contents in the same image may require different filtering. Therefore, in order to achieve a good filtering effect for each coding block in the same image, the filtering model corresponding to each quantization parameter needs a more complex network structure.
  • As the network structure becomes more complex, the speed of filtering according to the above method is reduced, which may in turn affect the encoding and decoding speed of the image.
  • Embodiments of the present application provide a filtering method, a filtering model training method and related devices, which can improve filtering performance on the basis of a simplified network model and meet the filtering effects of coding blocks of different qualities and different contents in the same image.
  • the technical solutions are as follows:
  • a filtering method is provided, which is applied to the encoding side.
  • K groups of filter models are determined according to the quantization parameter of the target image.
  • Each group of filter models in the K groups includes M filter models, and the same group of filter models corresponds to the same quantization parameter.
  • Different groups of filter models correspond to different quantization parameters, and K and M are both integers greater than 1. The reconstruction block corresponding to the current coding block in the target image is determined, and the target filter model is determined from the K groups of filter models.
  • The target filter model is the filter model that yields the smallest coding distortion after filtering the reconstruction block, and the coding distortion after filtering the reconstruction block through the target filter model is smaller than the coding distortion of the unfiltered reconstruction block. The reconstruction block is then filtered based on the target filter model.
  • the encoding end obtains K reference quantization parameters from the target correspondence according to the quantization parameters of the target image. Since one quantization parameter corresponds to a set of filter models, the encoding end can determine K sets of filter models based on the K reference quantization parameters.
  • the target correspondence is used to indicate the correspondence between the image quantization parameters and the reference quantization parameters.
  • the target correspondence is the correspondence between the quantization parameter range and the reference quantization parameter, or the target correspondence is the correspondence between the image quantization parameter and the reference quantization parameter.
  • When the target correspondence is between quantization parameter ranges and reference quantization parameters, the encoding end only needs to store the quantization parameter ranges rather than every quantization parameter in sequence. This helps save storage space at the encoding end, thereby improving the efficiency with which the encoding end determines the K groups of filter models.
  • When the target correspondence is the correspondence between individual image quantization parameters and reference quantization parameters, the K reference quantization parameters correlate more strongly with the quantization parameter of the image. Therefore, the K groups of filter models determined by the encoding end according to the target correspondence match the quantization parameter of the target image more closely, which can further improve the filtering effect.
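The correspondence lookup described above can be sketched as follows. This is only a hedged illustration: the QP ranges, the reference QPs, K=2, M=2, and the model names are all invented for the example and are not taken from the application.

```python
# Hypothetical target correspondence: each image-QP range maps to K
# reference QPs (here K=2). Values are illustrative only.
QP_RANGE_TO_REF_QPS = {
    range(0, 27): (22, 27),
    range(27, 37): (27, 32),
    range(37, 52): (37, 42),
}

# Hypothetical model store: one pre-trained group of M filter models
# per reference QP (here M=2, with string stand-ins for real models).
MODEL_GROUPS = {
    22: ["model_22_a", "model_22_b"],
    27: ["model_27_a", "model_27_b"],
    32: ["model_32_a", "model_32_b"],
    37: ["model_37_a", "model_37_b"],
    42: ["model_42_a", "model_42_b"],
}

def select_k_groups(image_qp: int):
    """Return the K groups of filter models for a given image QP."""
    for qp_range, ref_qps in QP_RANGE_TO_REF_QPS.items():
        if image_qp in qp_range:
            return [MODEL_GROUPS[qp] for qp in ref_qps]
    raise ValueError(f"unsupported QP {image_qp}")

groups = select_k_groups(30)  # K=2 groups, each containing M=2 models
```

Storing ranges rather than individual QPs, as the paragraph above notes, keeps the table small while still resolving every QP to a set of groups.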
  • the quantization parameter corresponding to the coding block determines the coding quality of the coding block, that is, the smaller the quantization parameter, the higher the coding quality, and the larger the quantization parameter, the lower the coding quality.
  • the same set of filter models corresponds to the same quantization parameter, and different sets of filter models correspond to different quantization parameters. Therefore, multiple coding blocks encoded by the same quantization parameter have the same coding quality, and the multiple coding blocks with the same coding quality can be filtered by the same set of filtering models.
  • Multiple coding blocks encoded with different quantization parameters have different coding qualities, and the multiple coding blocks with different coding qualities can be filtered by different sets of filtering models. That is, the same set of filter models is suitable for coding blocks with the same coding quality, and different sets of filter models are suitable for coding blocks with different coding qualities.
  • After determining the K sets of filter models based on the quantization parameter of the target image, the encoding end also needs to encode the quantization parameters corresponding to the K sets of filter models into the code stream.
  • the encoding end determines the filtering indication information corresponding to the reconstruction block based on the current coding block, the reconstruction block and the K group of filter models, and the filtering indication information is used to indicate whether the reconstruction block requires filtering.
  • When the filtering indication information indicates that the reconstruction block requires filtering, the target filter model is determined from the K sets of filter models.
  • The encoding end inputs the reconstruction block to each filter model in the K groups of filter models to obtain K*M filter blocks, and determines, based on the current coding block, the reconstruction block and the K*M filter blocks, the rate-distortion cost corresponding to the reconstruction block and the rate-distortion cost corresponding to each filter block. If the rate-distortion cost corresponding to the reconstruction block is not less than the smallest of the rate-distortion costs corresponding to the filter blocks, the filtering indication information is determined to be the first indication information, which indicates that the reconstruction block requires filtering. If the rate-distortion cost corresponding to the reconstruction block is less than the rate-distortion cost corresponding to every filter block, the filtering indication information is determined to be the second indication information, which indicates that the reconstruction block does not require filtering.
  • The rate-distortion cost indicates the degree of image distortion between the reconstruction block and the original coding block, or between a filter block and the original coding block. If the rate-distortion cost corresponding to the reconstruction block is less than the rate-distortion cost corresponding to every filter block, the image distortion between the reconstruction block and the original coding block is already the smallest, so the reconstructed image restored from the reconstruction block deviates least from the original image and the reconstruction block does not need to be filtered. If the rate-distortion cost corresponding to the reconstruction block is not less than the smallest of the rate-distortion costs corresponding to the filter blocks, the image distortion between that filter block and the original coding block is the smallest, so the reconstructed image restored from the filter block deviates least from the original image and the reconstruction block needs to be filtered.
  • Here, the rate-distortion cost is used to indicate coding distortion as an example.
  • When the filtering indication information indicates that the reconstruction block requires filtering, the rate-distortion costs corresponding to the filter blocks are compared, and the filter model corresponding to the filter block with the smallest rate-distortion cost is determined as the target filter model.
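As a rough illustration of the encoder-side selection just described, the following sketch models the rate-distortion cost as J = D + λ·R, a common formulation that the application does not itself commit to. The filter models are stand-in callables, and the bit counts, λ value, and function names are all assumptions for illustration.

```python
def rd_cost(block, reference, bits, lam=0.5):
    """Rate-distortion cost: SSD distortion plus lambda-weighted rate."""
    distortion = sum((a - b) ** 2 for a, b in zip(block, reference))
    return distortion + lam * bits

def choose_filtering(coding_block, recon_block, model_groups, lam=0.5):
    """Return (needs_filtering, target_index) for one reconstruction block.

    model_groups is K groups of M callables; needs_filtering mirrors the
    first/second indication information, target_index the target model.
    """
    # Cost of signalling "no filtering": only the 1-bit indication flag
    # (an assumed, illustrative rate).
    best = (rd_cost(recon_block, coding_block, bits=1, lam=lam), None)
    index = 0
    for group in model_groups:          # K groups
        for model in group:             # M models per group
            filtered = model(recon_block)
            # Filtering also spends bits on the flag plus a model index
            # (4 bits here, again purely illustrative).
            cost = rd_cost(filtered, coding_block, bits=1 + 4, lam=lam)
            if cost < best[0]:
                best = (cost, index)
            index += 1
    return best[1] is not None, best[1]
```

When no filter block beats the unfiltered reconstruction block, the function returns `(False, None)`, matching the second indication information above.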
  • coding distortion can also be indicated through other parameters, which is not limited in the embodiments of the present application.
  • After the encoding end determines the filtering indication information corresponding to the reconstruction block based on the current coding block, the reconstruction block and the K groups of filter models, it also needs to encode the filtering indication information into the code stream.
  • After the encoding end determines the target filter model from the K sets of filter models, it also needs to encode the target index into the code stream; the target index is used to indicate the target filter model.
  • The above describes the case in which the filtering indication information indicates that the reconstruction block needs to be filtered.
  • In some embodiments, the filtering indication information may instead indicate that the reconstruction block does not require filtering; in that case, the reconstruction block is not filtered.
  • Since the same set of filter models is suitable for coding blocks of the same coding quality, different sets of filter models are suitable for coding blocks of different coding qualities, and different filter models in the same set are suitable for coding blocks of different contents, the target filter model can be selected from the K sets of filter models based on both the coding quality and the content of the coding block. The reconstruction block is then filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
  • In this way, the filtering performance can be improved on the basis of a simplified network model, meeting the filtering needs of coding blocks of different qualities and different contents in the same image.
  • a filtering method is provided and applied to the decoding end.
  • K groups of filter models are determined.
  • Each group of filter models in the K groups of filter models includes M filter models, and the same group of filter models corresponds to the same quantization parameter, and different groups of filter models correspond to different quantization parameters.
  • K and M are both integers greater than 1
  • the reconstruction block is determined based on the code stream
  • the target filter model in the K group of filter models is determined
  • the reconstruction block is filtered based on the target filter model.
  • the decoder determines K sets of filter models based on the quantization parameters of the target image to which the reconstruction block belongs.
  • In some embodiments, the encoding end also encodes the quantization parameters corresponding to the K sets of filter models into the code stream. Therefore, after receiving the code stream, the decoding end can parse the quantization parameters corresponding to the K groups of filter models from the code stream, and then determine the K groups of filter models based on these quantization parameters.
  • the decoding end determines the filtering indication information of the reconstruction block, and the filtering indication information is used to indicate whether the reconstruction block requires filtering.
  • When the filtering indication information indicates that the reconstruction block requires filtering, the target filter model in the K groups of filter models is determined.
  • Since the encoding end determines the filtering indication information corresponding to the reconstruction block based on the current coding block, the reconstruction block and the K groups of filter models, and encodes the filtering indication information into the code stream, the decoding end can, after receiving the code stream, parse the filtering indication information from it and determine whether the reconstruction block needs to be filtered. When the filtering indication information indicates that the reconstruction block requires filtering, the decoding end can parse the target index from the code stream and then determine the target filter model based on the target index.
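The decoder-side flow described above can be sketched as follows. The bitstream is abstracted to a dictionary of already-parsed syntax elements; the field names and the (group, model) index layout are purely illustrative assumptions, not the application's actual syntax.

```python
def decode_block(stream, model_groups_by_qp):
    """Apply the signalled filtering decision to one reconstruction block.

    stream: dict standing in for parsed syntax elements.
    model_groups_by_qp: pre-trained group (list of M models) per QP.
    """
    # Recover the K groups from the QPs the encoder wrote into the stream.
    groups = [model_groups_by_qp[qp] for qp in stream["group_qps"]]
    recon = stream["recon_block"]
    if not stream["needs_filtering"]:
        return recon                    # second indication information
    k, m = stream["target_index"]       # parsed index of the target model
    return groups[k][m](recon)          # filter through the target model
```

The decoder never re-runs the rate-distortion search; it only follows the indication flag and target index the encoder signalled.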
  • Since the same set of filter models is suitable for coding blocks of the same coding quality, different sets of filter models are suitable for coding blocks of different coding qualities, and different filter models in the same set are suitable for coding blocks of different contents, the target filter model can be selected from the K sets of filter models based on both the coding quality and the content of the coding block. The reconstruction block is then filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
  • In this way, the filtering performance can be improved on the basis of a simplified network model, meeting the filtering needs of coding blocks of different qualities and different contents in the same image.
  • a filter model training method is provided.
  • a training sample set is obtained.
  • The training sample set includes a plurality of sample coding blocks and a reconstruction block corresponding to each sample coding block, and the images to which the plurality of sample coding blocks belong share the same quantization parameter.
  • the filter model to be trained is trained to obtain an initial filter model.
  • the training sample set is divided into M initial sample subsets, and each initial sample subset includes at least two sample coding blocks and reconstruction blocks corresponding to the at least two sample coding blocks.
  • Based on the M initial sample subsets, the initial filter model is trained separately on each subset to obtain M optimized filter models.
  • Based on the training sample set, the M optimized filter models are further trained to obtain a set of filter models.
  • the plurality of sample coding blocks are divided from a plurality of sample images, or the plurality of sample coding blocks are divided from one sample image. That is, the plurality of sample coding blocks may come from the same sample image or may come from different sample images, as long as the quantization parameters of the images to which the plurality of sample coding blocks belong are the same. Since the multiple sample coding blocks are obtained by dividing the image into multiple non-overlapping coding blocks, the contents of the multiple sample coding blocks are different.
  • the sample coding blocks included in each initial sample subset are at least two consecutive sample coding blocks in the sorting result.
  • the M optimized filtering models are trained through loop iteration.
  • the i-th iteration processing in the loop iteration method includes the following steps:
  • the training sample set is divided into M optimized sample subsets.
  • The M optimized sample subsets correspond one-to-one to the M filter models of the i-th iteration.
  • the reconstruction blocks corresponding to the multiple sample coding blocks are input to the M filtering models of the i-th iteration process to obtain M filter blocks corresponding to each sample coding block.
  • This application trains the M optimized filtering models through loop iteration.
  • When the iteration count i of the M filter models is less than the iteration-count threshold, it indicates that the optimized filter models obtained by the current training are not yet reliable. The trained M filter models of the i-th iteration are then used as the M filter models of the (i+1)-th iteration, and the (i+1)-th iteration continues.
  • When the iteration count i of the M filter models is greater than or equal to the iteration-count threshold, it indicates that the optimized filter models obtained by the current training are reliable. The iterative processing then stops, and the trained M filter models of the i-th iteration are taken as one group of filter models.
  • the iteration number threshold is set in advance.
  • the iteration number threshold is a specified number of iterations or a maximum number of iterations, which can be set according to different requirements. This is not limited in the embodiments of the present application.
  • In some embodiments, some of the filter models of the i-th iteration continue to be trained, while the other filter models stop iterative processing.
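The loop iteration described above resembles an alternating assign-and-refine procedure: each sample is reassigned to the filter model that currently filters it best, then each model is refined on its own subset. The following one-dimensional sketch abstracts "filtering error" and "model refinement" into placeholder callables; everything here is an assumption made for illustration, not the application's actual training procedure.

```python
def iterate_once(samples, models, filter_error, refine):
    """One iteration: repartition samples among models, then refine each."""
    M = len(models)
    subsets = [[] for _ in range(M)]
    for sample in samples:
        # Assign the sample to the model with the smallest filtering error,
        # mirroring "input each reconstruction block to the M filter models".
        best = min(range(M), key=lambda j: filter_error(models[j], sample))
        subsets[best].append(sample)
    # Refine each model on its own optimized sample subset.
    return [refine(models[j], subsets[j]) for j in range(M)]

def train_group(samples, models, filter_error, refine, max_iters=10):
    """Loop until the iteration-count threshold is reached."""
    for _ in range(max_iters):
        models = iterate_once(samples, models, filter_error, refine)
    return models
```

With scalar "models" and absolute error, this reduces to 1-D k-means-style clustering, which makes the partitioning behavior easy to check by hand.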
  • Since the quantization parameters of the images to which the sample coding blocks in the training sample set belong are the same, training an untrained filter model based on this training sample set yields a set of filter models suitable for coding blocks of the same coding quality.
  • the M filter models included in the set of filter models are suitable for coding blocks with different contents.
  • In a fourth aspect, a filtering device is provided, which has the function of implementing the behavior of the filtering method in the first aspect.
  • The filtering device includes at least one module, and the at least one module is used to implement the filtering method provided in the first aspect.
  • In a fifth aspect, a filtering device is provided, which has the function of implementing the behavior of the filtering method in the second aspect.
  • The filtering device includes at least one module, and the at least one module is used to implement the filtering method provided in the second aspect.
  • In a sixth aspect, a filter model training device is provided, which has the function of implementing the behavior of the filter model training method in the third aspect.
  • The filter model training device includes at least one module, and the at least one module is used to implement the filter model training method provided in the third aspect.
  • In a seventh aspect, an encoding end device is provided, which includes a processor and a memory, and the memory is used to store a computer program for executing the filtering method provided in the first aspect.
  • the processor is configured to execute a computer program stored in the memory to implement the filtering method described in the first aspect.
  • the coding end device may also include a communication bus, which is used to establish a connection between the processor and the memory.
  • In an eighth aspect, a decoding end device is provided, which includes a processor and a memory, and the memory is used to store a computer program for executing the filtering method provided in the second aspect.
  • the processor is configured to execute a computer program stored in the memory to implement the filtering method described in the second aspect.
  • the decoding end device may also include a communication bus, which is used to establish a connection between the processor and the memory.
  • In a ninth aspect, a filter model training device is provided, which includes a processor and a memory, and the memory is used to store a computer program for executing the filter model training method provided in the third aspect.
  • the processor is configured to execute the computer program stored in the memory to implement the filter model training method described in the third aspect.
  • the filter model training device may also include a communication bus, which is used to establish a connection between the processor and the memory.
  • In a tenth aspect, a computer-readable storage medium is provided, in which instructions are stored. When the instructions are run on a computer, they cause the computer to perform the steps of the filtering method described in the first aspect, the steps of the filtering method described in the second aspect, or the steps of the filter model training method described in the third aspect.
  • In an eleventh aspect, a computer program product containing instructions is provided. When the instructions are run on a computer, the computer is caused to perform the steps of the filtering method described in the first aspect, the steps of the filtering method described in the second aspect, or the steps of the filter model training method described in the third aspect.
  • In a twelfth aspect, a computer program is provided. When the computer program is run on a computer, it causes the computer to perform the steps of the filtering method described in the first aspect, the steps of the filtering method described in the second aspect, or the steps of the filter model training method described in the third aspect.
  • Figure 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • Figure 2 is an exemplary structural block diagram of an encoding end provided by an embodiment of the present application.
  • Figure 3 is an exemplary structural block diagram of a decoding end provided by an embodiment of the present application.
  • Figure 4 is a flow chart of a filtering method provided by an embodiment of the present application.
  • Figure 5 is a flow chart of another filtering method provided by an embodiment of the present application.
  • Figure 6 is a flow chart of a filter model training method provided by an embodiment of the present application.
  • Figure 7 is a schematic structural diagram of a filtering device provided by an embodiment of the present application.
  • Figure 8 is a schematic structural diagram of another filtering device provided by an embodiment of the present application.
  • Figure 9 is a schematic structural diagram of a filter model training device provided by an embodiment of the present application.
  • Figure 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • Encoding refers to the process of compressing the image to be encoded into a code stream, where the image is a static image, a dynamic image, or any video frame included in a video.
  • Decoding refers to the process of restoring the encoded code stream into a reconstructed image according to specific grammatical rules and processing methods.
  • Coding block refers to the coding area obtained by dividing the image to be coded. An image can be divided into multiple coding blocks, and these multiple coding blocks together form one image. Each coding block can be coded independently. For example, the size of the coding block is 128*128.
  • Quantization refers to the process of mapping the continuous values of a signal into multiple discrete amplitudes. Quantization can effectively reduce the value range of the signal, thereby obtaining better compression effects, and quantization is the root cause of distortion.
  • The quantization parameter (QP) is an important parameter that controls the degree of quantization and reflects the degree of compression of the image.
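As a toy illustration of why a larger quantization step (driven by a larger QP) loses more detail, consider scalar quantization. The step values here are arbitrary stand-ins, not the codec's actual mapping from QP to step size.

```python
def quantize(value, step):
    """Map a continuous value to a discrete level (round to nearest)."""
    return round(value / step)

def dequantize(level, step):
    """Restore an approximate value from the discrete level."""
    return level * step

# A small step preserves the signal better than a large one:
small = dequantize(quantize(13, step=2), step=2)   # -> 12, error 1
large = dequantize(quantize(13, step=8), step=8)   # -> 16, error 3
```

The coarser step maps more distinct input values to the same level, which is exactly the compression-versus-distortion trade-off the definition above describes.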
  • Intra-frame prediction refers to predicting the current coding block based on the reconstruction blocks corresponding to coding blocks that have been coded before the current coding block in the same image. For example, the current coding block is predicted through the reconstruction blocks corresponding to the coded blocks to the left of and above the current coding block.
  • Inter-frame prediction refers to determining the reconstructed image corresponding to the image that has been encoded before the current image as the reference image, and predicting the current encoding block based on the reconstruction block in the reference image that is similar to the current encoding block.
  • FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application.
  • the implementation environment includes a source device 10 , a destination device 20 , a link 30 and a storage device 40 .
  • the source device 10 is used to encode each coding block in the image, and is also used to filter the reconstructed block of the coding block during the coding process according to the intra prediction method or the inter prediction method.
  • the destination device 20 is used to parse the code stream to determine the reconstruction block, and is also used to filter the reconstruction block.
  • the source device 10 is used to encode images to generate code streams. Therefore, the source device 10 is also called an image encoding device, or an image encoding end.
  • the destination device 20 is used to decode the code stream generated by the source device 10 . Therefore, the destination device 20 is also called an image decoding device, or an image decoding terminal.
  • the link 30 is used to receive the code stream generated by the source device 10 and transmit the code stream to the destination device 20 .
  • the storage device 40 is used to receive the code stream generated by the source device 10 and store the code stream. Under such conditions, the destination device 20 can directly obtain the code stream from the storage device 40 .
  • Alternatively, the storage device 40 corresponds to a file server or another intermediate storage device capable of saving the code stream generated by the source device 10. Under such conditions, the destination device 20 can stream or download the code stream stored in the storage device 40.
  • Source device 10 and destination device 20 each include one or more processors and a memory coupled to the one or more processors. The memory includes random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, or any other media that can be used to store desired program code in the form of instructions or data structures accessible by a computer.
  • Source device 10 and destination device 20 each include a desktop computer, a mobile computing device, a notebook (e.g., laptop) computer, a tablet computer, a set-top box, a telephone handset such as a so-called "smart" phone, a television, a camera, a display device, a digital media player, a video game console, a vehicle-mounted computer, or the like.
  • Link 30 includes one or more media or devices capable of transmitting a code stream from source device 10 to destination device 20 .
  • the link 30 includes one or more communication media that enable the source device 10 to directly send the code stream to the destination device 20 in real time.
  • The source device 10 modulates the code stream according to a communication standard (such as a wireless communication protocol) and sends the code stream to the destination device 20.
  • the one or more communication media includes wireless and/or wired communication media.
  • the one or more communication media includes a radio frequency (radio frequency, RF) spectrum or one or more physical transmission lines.
  • the one or more communication media can form part of a packet-based network, such as a local area network, a wide area network, or a global network (eg, the Internet).
  • the one or more communication media include routers, switches, base stations, or other equipment that facilitates communication from the source device 10 to the destination device 20 , which are not specifically limited in the embodiments of the present application.
  • the storage device 40 is used to store the received code stream sent by the source device 10 , and the destination device 20 can directly obtain the code stream from the storage device 40 .
  • the storage device 40 includes any of a variety of distributed or locally accessed data storage media, such as a hard drive, a Blu-ray disc, a digital versatile disc (DVD), a compact disc read-only memory (CD-ROM), flash memory, volatile or non-volatile memory, or any other suitable digital storage medium for storing the code stream.
  • the storage device 40 corresponds to a file server or another intermediate storage device capable of saving the code stream generated by the source device 10 , and the destination device 20 can stream or download the code stream stored in the storage device 40 .
  • a file server is any type of server capable of storing code streams and sending the code streams to destination device 20 .
  • the file server includes a network server, a file transfer protocol (FTP) server, a network attached storage (NAS) device or a local disk drive, etc.
  • Destination device 20 can obtain the code stream over any standard data connection, including an Internet connection.
  • The standard data connection includes a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., a digital subscriber line (DSL), a cable modem, etc.), or a combination of both suitable for retrieving the code stream stored on the file server.
  • the transmission of the code stream from the storage device 40 may be streaming transmission, downloading transmission, or a combination of both.
  • the implementation environment shown in Figure 1 is only one possible implementation, and the technology of the embodiments of the present application is applicable not only to the source device 10 shown in Figure 1, which can encode images, and the destination device 20, which decodes the code stream, but also to other devices capable of encoding images and decoding code streams, which are not specifically limited in the embodiments of the present application.
  • the source device 10 includes a data source 120 , an encoder 100 and an output interface 140 .
  • the output interface 140 includes a modulator/demodulator (modem) and/or a transmitter, where the transmitter is also called a sender.
  • Data source 120 includes an image capture device (e.g., a video camera), an archive containing previously captured images, a feed interface for receiving images from an image content provider, and/or a computer graphics system for generating images, or a combination of these image sources.
  • the data source 120 is used to send images to the encoder 100, and the encoder 100 is used to encode the images sent by the data source 120 to obtain a code stream.
  • the encoder sends the code stream to the output interface.
  • the source device 10 sends the code stream directly to the destination device 20 via the output interface 140 .
  • the code stream may also be stored on the storage device 40 for later acquisition by the destination device 20 and used for decoding and/or display.
  • the destination device 20 includes an input interface 240 , a decoder 200 and a display device 220 .
  • input interface 240 includes a receiver and/or modem.
  • the input interface 240 may receive the code stream via the link 30 and/or from the storage device 40, and then send it to the decoder 200.
  • the decoder 200 is used to decode the received code stream to obtain a reconstructed image.
  • the decoder sends the reconstructed image to display device 220.
  • Display device 220 may be integrated with destination device 20 or may be external to destination device 20 . Generally, the display device 220 displays the reconstructed image.
  • the display device 220 is any one of multiple types of display devices.
  • the display device 220 is a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display device.
  • encoder 100 and decoder 200 may each be integrated with an audio encoder and decoder, and include an appropriate multiplexer-demultiplexer (MUX-DEMUX) unit or other hardware and software for encoding both audio and video in a common data stream or in separate data streams.
  • the MUX-DEMUX unit may conform to the ITU H.223 multiplexer protocol, or other protocols such as user datagram protocol (UDP), if applicable.
  • the encoder 100 and the decoder 200 may each be any of the following circuits: one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic, hardware, or any combination thereof. If the technology of the embodiments of the present application is implemented partially in software, the device may store instructions for the software in a suitable non-volatile computer-readable storage medium, and may execute the instructions in hardware using one or more processors to implement the technology of the embodiments of the present application. Any of the foregoing (including hardware, software, a combination of hardware and software, etc.) may be considered one or more processors. Each of the encoder 100 and the decoder 200 may be included in one or more encoders or decoders, either of which can be integrated as part of a combined encoder/decoder (codec) in the respective device.
  • Embodiments of the present application may generally refer to encoder 100 as “signaling” or “sending” certain information to another device, such as decoder 200.
  • the term "signaling" or "sending" may generally refer to the transmission of syntax elements and/or other data used to decode a code stream. This transfer can occur in real time or near real time. Alternatively, this communication may occur over a period of time, such as when the syntax elements are stored to a computer-readable storage medium in the encoded bitstream at encoding time; the decoding device may then retrieve the syntax elements at any time after they are stored to such media.
  • the encoding end includes a predictor, a transformer, a quantizer, an entropy coder, an inverse quantizer, an inverse transformer, a filter, and a memory.
  • the predictor is an intra predictor or an inter predictor. That is, for the current coding block in the target image to be encoded, the encoding end can perform intra prediction on the current coding block through the intra predictor, or perform inter prediction on the current coding block through the inter predictor.
  • the first reference reconstruction block is obtained from the memory, and based on the first reference reconstruction block, the current coding block is intra-predicted through the intra predictor to obtain the prediction block corresponding to the current coding block. The first reference reconstruction block is the reconstruction block corresponding to a coding block that was coded before the current coding block in the target image.
  • the second reference reconstruction block is obtained from the memory, and then, based on the second reference reconstruction block, the current coding block is inter-predicted through the inter predictor to obtain the prediction block corresponding to the current coding block. The second reference reconstruction block is a reconstruction block similar to the current coding block in an image that was encoded before the target image.
  • After the encoding end determines the prediction block corresponding to the current coding block through the intra predictor or the inter predictor according to the above method, the difference between the current coding block and the prediction block is determined as the residual block. Then, the residual block is transformed through a transformer to obtain a transformed residual block, and the transformed residual block is quantized through a quantizer to obtain a quantized transformed residual block. Finally, the quantized residual block and the prediction indication information are encoded into the code stream through the entropy encoder, and the code stream is sent to the decoder. The prediction indication information is used to indicate the prediction mode used when predicting the current coding block.
  • the quantized transformed residual block needs to be inversely quantized through an inverse quantizer to obtain the transformed residual block, which is then inversely transformed through an inverse transformer to obtain the reconstructed residual block.
  • the reconstruction residual block and the prediction block are added to obtain the reconstruction block corresponding to the current coding block.
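The reconstruction path described above (inverse quantization, inverse transform, then adding the prediction block) can be sketched as follows. This is an illustrative sketch only: the uniform scalar dequantizer and the identity placeholder for the inverse transform are assumptions, whereas a real codec would apply the inverse of its DCT/DST/KLT transform here.

```python
def dequantize(qcoeffs, qstep):
    # Inverse quantization: scale quantized coefficients back up
    # (illustrative uniform quantizer with step size qstep).
    return [c * qstep for c in qcoeffs]

def inverse_transform(coeffs):
    # Placeholder inverse transform; a real codec applies an inverse
    # DCT/DST/KLT here. Identity is used purely for illustration.
    return list(coeffs)

def reconstruct_block(qcoeffs, prediction, qstep):
    # Reconstruction block = reconstructed residual + prediction block,
    # matching the path: inverse quantizer -> inverse transformer -> add.
    residual = inverse_transform(dequantize(qcoeffs, qstep))
    return [p + r for p, r in zip(prediction, residual)]
```

The reconstruction block produced this way is what the filter operates on in the steps that follow.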
  • After the encoding end determines the reconstruction block corresponding to the current coding block, it filters that reconstruction block through the filter according to the filtering method provided in the embodiments of the present application to obtain the filter block corresponding to the current coding block, and then stores the filter block in the memory to facilitate encoding of the next coding block.
  • QP is an important parameter that controls the degree of quantization. Therefore, when the encoding end encodes the current encoding block, it needs to determine the QP corresponding to the current encoding block. In practical applications, the QPs corresponding to different coding blocks in the same image may or may not be the same. For example, the encoding end divides the target image into multiple non-overlapping coding blocks, and for any one of the multiple coding blocks, the QP of the target image is used as the QP corresponding to the coding block. For another example, the encoding end divides the target image into multiple non-overlapping coding blocks, and for any one of the multiple coding blocks, the QP of the coding block is adaptively adjusted based on the QP of the target image.
  • the plurality of coding blocks may be coding blocks of the same size, or may be coding blocks of different sizes. That is, the encoding end divides the target image into encoding blocks of the same size, or the encoding end divides the target image into encoding blocks of different sizes according to the content of the target image.
  • the shape of the coding block is a square, or the shape of the coding block is other shapes. The embodiment of the present application does not limit the shape of the coding block.
  • the transformer is any one of a discrete cosine transform (DCT) transformer, a discrete sine transform (DST) transformer, or a Karhunen-Loève transform (KLT) transformer.
  • FIG. 3 is an exemplary structural block diagram of a decoding end provided by an embodiment of the present application.
  • the decoding end includes an entropy decoder, predictor, inverse quantizer, inverse transformer, memory and filter.
  • the predictor is an intra predictor or an inter predictor. That is, for the target image, when the encoding end performs intra prediction on each coding block in the target image, the decoding end also needs to determine the prediction block through the intra predictor. When the encoding end performs inter-frame prediction on each coding block in the target image, the decoding end also needs to determine the prediction block through the inter-frame predictor.
  • After the decoding end receives the code stream, it decodes the received code stream through the entropy decoder to obtain the quantized transformed residual block and the prediction indication information.
  • the prediction indication information is used to indicate the prediction mode used for the current coding block. Then, based on the prediction indication information, the decoding end determines whether to perform prediction through the intra predictor or through the inter predictor. When it is determined to perform prediction through the intra predictor, the decoder obtains the first reference reconstruction block from the memory and determines the prediction block corresponding to the current coding block through the intra predictor.
  • When it is determined to perform prediction through the inter predictor, the decoder obtains the second reference reconstruction block from the memory and determines the prediction block corresponding to the current coding block through the inter predictor. Then, the quantized transformed residual block is passed sequentially through the inverse quantizer and the inverse transformer to obtain the reconstructed residual block, and the reconstructed residual block and the prediction block are added to obtain the reconstruction block corresponding to the current coding block.
  • the decoder can also filter the reconstruction block through the filter according to the filtering method provided in the embodiments of this application.
  • Figure 4 is a flow chart of a filtering method provided by an embodiment of the present application. This method is applied to the encoding side. Please refer to Figure 4. The method includes the following steps.
  • Step 401 Determine K groups of filter models according to the quantization parameter of the target image.
  • Each group of filter models in the K groups of filter models includes M filter models; the same group of filter models corresponds to the same quantization parameter, different groups of filter models correspond to different quantization parameters, and K and M are both integers greater than 1.
  • the encoding end obtains K reference quantization parameters from the target correspondence according to the quantization parameters of the target image. Since one quantization parameter corresponds to a set of filter models, the encoding end can determine K sets of filter models based on the K reference quantization parameters.
  • the target correspondence is used to indicate the correspondence between the image quantization parameters and the reference quantization parameters.
  • the target correspondence is the correspondence between the quantization parameter range and the reference quantization parameter, or the target correspondence is the correspondence between the image quantization parameter and the reference quantization parameter.
  • When the target correspondence is the correspondence between the quantization parameter range and the reference quantization parameter, the encoding end first determines the quantization parameter range in which the quantization parameter of the target image falls, obtaining the target quantization parameter range. Then, based on the target quantization parameter range, the K reference quantization parameters corresponding to the target quantization parameter range are obtained from the target correspondence.
  • the target correspondence is shown in Table 1 below.
  • one quantization parameter range corresponds to three reference quantization parameters.
  • Table 1 takes as an example that each quantization parameter range corresponds to three reference quantization parameters, that is, the number of reference quantization parameters corresponding to each quantization parameter range is the same.
  • the number of reference quantization parameters corresponding to each quantization parameter range may also be different.
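The range-based lookup described above can be sketched as follows. The ranges and reference quantization parameters in this table are hypothetical placeholders, since Table 1's actual values are not reproduced in this text; only the lookup mechanism itself follows the description.

```python
# Hypothetical range-to-reference-QP table in the spirit of Table 1;
# each quantization parameter range maps to K = 3 reference QPs.
QP_RANGE_TABLE = [
    ((0, 21), (17, 22, 27)),
    ((22, 31), (22, 27, 32)),
    ((32, 51), (32, 37, 42)),
]

def reference_qps(image_qp):
    # Find the quantization parameter range containing the image QP
    # (the "target quantization parameter range"), then return the
    # K reference QPs associated with that range.
    for (lo, hi), refs in QP_RANGE_TABLE:
        if lo <= image_qp <= hi:
            return refs
    raise ValueError("QP outside all ranges in the table")
```

Because only range boundaries are stored, the table stays small regardless of how many individual QP values fall inside each range, which is the storage saving the text mentions.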
  • When the target correspondence is the correspondence between the image quantization parameter and the reference quantization parameter, the K reference quantization parameters corresponding to the quantization parameter of the target image are obtained from the target correspondence.
  • the target correspondence relationship is shown in Table 2 below.
  • one image quantization parameter corresponds to three reference quantization parameters.
  • Table 2 takes as an example that each image quantization parameter corresponds to three reference quantization parameters, that is, the number of reference quantization parameters corresponding to each image quantization parameter is the same.
  • the number of reference quantization parameters corresponding to each image quantization parameter may also be different.
  • When the target correspondence uses quantization parameter ranges, the encoding end only needs to store the ranges and does not need to store each quantization parameter individually. This helps save storage space at the encoding end, thereby improving the efficiency with which the encoding end determines the K groups of filter models.
  • When the target correspondence is the correspondence between the image quantization parameter and the reference quantization parameter, the correlation between the K reference quantization parameters and the image quantization parameter is stronger. Therefore, the K groups of filter models determined by the encoding end according to the target correspondence are more closely related to the quantization parameter of the target image, which can further improve the filtering effect.
  • the quantization parameter corresponding to the coding block determines the coding quality of the coding block, that is, the smaller the quantization parameter, the higher the coding quality, and the larger the quantization parameter, the lower the coding quality.
  • the same set of filter models corresponds to the same quantization parameter, and different sets of filter models correspond to different quantization parameters. Therefore, multiple coding blocks encoded by the same quantization parameter have the same coding quality, and the multiple coding blocks with the same coding quality can be filtered by the same set of filtering models.
  • Multiple coding blocks encoded with different quantization parameters have different coding qualities, and the multiple coding blocks with different coding qualities can be filtered by different sets of filtering models. That is, the same set of filter models is suitable for coding blocks with the same coding quality, and different sets of filter models are suitable for coding blocks with different coding qualities.
  • After determining the K groups of filter models based on the quantization parameter of the target image, the encoding end also needs to encode the quantization parameters corresponding to the K groups of filter models into the code stream.
  • the decoder can parse the quantization parameters corresponding to the K group of filter models from the code stream, and determine the K group of filter models based on the quantization parameters corresponding to the K group of filter models.
  • the structure of the filtering model may be a convolutional neural network (CNN) structure, or it may be other structures.
  • the embodiments of this application do not limit the structure of the filtering model.
  • Step 402 Determine the reconstruction block corresponding to the current coding block in the target image.
  • Step 403 Determine the target filter model from the K group of filter models.
  • the target filter model refers to the filter model with the smallest coding distortion after filtering the reconstruction block, and the coding distortion after filtering the reconstruction block through the target filter model is less than the coding distortion of the unfiltered reconstruction block.
  • the encoding end can determine the target filter model from the K groups of filter models according to the following steps (1)-(2).
  • the filtering indication information is used to indicate whether the reconstruction block requires filtering.
  • the encoding end inputs the reconstruction block to each filter model in the K group of filter models to obtain K*M filter blocks, and determines the reconstruction block based on the current encoding block, the reconstruction block and the K*M filter blocks The corresponding rate distortion cost, and the rate distortion cost corresponding to each filter block. If the rate distortion cost corresponding to the reconstruction block is not less than the rate distortion cost corresponding to each filter block, the filtering indication information is determined to be the first indication information, and the first indication information is used to indicate that the reconstruction block requires filtering. If the rate distortion cost corresponding to the reconstruction block is less than the rate distortion cost corresponding to each filter block, the filtering indication information is determined to be the second indication information, and the second indication information is used to indicate that the reconstruction block does not require filtering.
  • In the above formula (1), J = D + λ × R, where J represents the rate distortion cost, D represents the error between the pixel values of the pixels in the reconstruction block and the pixel values of the pixels in the current coding block, λ represents the distortion parameter (usually a default value), and R represents the number of bits required to encode the current coding block into the code stream when the reconstruction block does not require filtering.
  • the rate distortion cost corresponding to each filter block can also be determined according to the above formula (1).
  • In that case, D in the above formula (1) represents the error between the pixel values of the pixels in the filter block and the pixel values of the pixels in the current coding block, and R represents the number of bits required to encode the current coding block into the code stream when the reconstruction block requires filtering.
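The rate distortion comparison built on formula (1) can be sketched as follows. The λ value, distortion values, and bit counts below are illustrative numbers, not values from the embodiment; only the J = D + λ × R computation and the comparison rule follow the description.

```python
def rd_cost(distortion, bits, lmbda):
    # Formula (1): J = D + lambda * R
    return distortion + lmbda * bits

# For the unfiltered reconstruction block, R excludes filter model index
# bits; for a filter block, R additionally includes them (illustrative
# numbers assumed here).
recon_cost = rd_cost(distortion=120.0, bits=300, lmbda=0.5)
filter_cost = rd_cost(distortion=80.0, bits=310, lmbda=0.5)

# If the reconstruction block's cost is not less than a filter block's
# cost, filtering reduces coding distortion and is worth its extra bits.
needs_filtering = recon_cost >= filter_cost
```

Here the filter block spends 10 extra bits but removes enough distortion that its overall cost is lower, so filtering would be indicated.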
  • the number of bits required to encode the current coding block into the code stream includes the number of bits required to encode the filter indication information, the number of bits required to encode the residual block after quantization transformation, and The number of bits required to encode prediction indication information.
  • the number of bits required to encode the current coding block into the code stream includes the number of bits required to encode the filtering indication information, the number of bits required to encode the quantized transformed residual block, the number of bits required to encode the prediction indication information, and the number of bits required to encode the filter model index.
  • the encoding end stores a correspondence between the filter model index and the number of bits required to encode the filter model index. Therefore, after the encoding end determines the K*M filter models, it can, based on the model indexes of the K*M filter models and the stored correspondence, obtain the number of bits required to encode each of the K*M filter model indexes, and then determine the rate distortion cost corresponding to each filter block according to the above formula (1).
  • the above content is based on the example that the number of bits required to encode different filter model indexes is different, that is, different filter model indexes correspond to different numbers of encoding bits.
  • the number of bits required to encode different filter model indexes may also be the same, that is, different filter model indexes correspond to the same number of encoding bits.
  • the rate distortion cost corresponding to the K*M filter blocks mainly depends on the pixel error between the K*M filter blocks and the current coding block.
  • Similarly, the rate distortion cost corresponding to the reconstruction block mainly depends on the error between the pixel values of the pixels in the reconstruction block and the pixel values of the pixels in the current coding block.
  • the error between the pixel values of the K*M filter blocks and the pixel values of the pixels in the current coding block is any one of the sum of absolute differences (SAD), the sum of absolute transformed differences (SATD), or the mean squared error (MSE).
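Two of the error measures named above, SAD and MSE, can be sketched directly over flattened pixel lists; SATD is omitted here because it additionally requires a transform (typically a Hadamard transform) before summing, which is beyond this sketch.

```python
def sad(block_a, block_b):
    # Sum of absolute differences between co-located pixels.
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def mse(block_a, block_b):
    # Mean squared error between co-located pixels.
    n = len(block_a)
    return sum((a - b) ** 2 for a, b in zip(block_a, block_b)) / n
```

Either measure can serve as the D term of formula (1); SAD is cheaper to compute, while MSE penalizes large per-pixel errors more heavily.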
  • the rate distortion cost is used to indicate the degree of image distortion between the reconstruction block and the original coding block, and between each filter block and the original coding block. If the rate distortion cost corresponding to the reconstruction block is less than the rate distortion cost corresponding to each filter block, the image distortion between the reconstruction block and the original coding block is smallest, so the image distortion between the reconstructed image restored based on the reconstruction block and the original image is smallest; in this case, there is no need to filter the reconstruction block. If the rate distortion cost corresponding to the reconstruction block is not less than the rate distortion cost corresponding to each filter block, the image distortion between a filter block and the original coding block is smallest, so the image distortion between the reconstructed image restored based on the filter block and the original image is smallest; in this case, the reconstruction block needs to be filtered.
  • the first indication information and the second indication information may be in various forms, such as numerical values, characters, etc.
  • the first indication information and the second indication information are numerical values
  • the first indication information is 0 and the second indication information is 1.
  • the first indication information and the second indication information can also be reversed, or have other numerical values, which are not limited in the embodiments of the present application.
  • After the encoding end determines the filtering indication information corresponding to the reconstruction block based on the current coding block, the reconstruction block, and the K groups of filter models, it also needs to encode the filtering indication information into the code stream. In this way, after the decoder receives the code stream, it can determine whether the reconstruction block needs to be filtered based on the code stream.
  • The above description takes the rate distortion cost as an example of indicating coding distortion.
  • When the filtering indication information indicates that the reconstruction block requires filtering, the rate distortion costs corresponding to the filter blocks are compared, and the filter model corresponding to the filter block with the smallest rate distortion cost is determined as the target filter model.
  • coding distortion can also be indicated through other parameters, which is not limited in the embodiments of the present application.
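The selection rule above (filter only when some filter block beats the unfiltered reconstruction block, and pick the filter model with the smallest rate distortion cost) can be sketched as follows; the flat model-index keys are an assumption made for illustration.

```python
def select_target_filter(recon_cost, filter_costs):
    # filter_costs: mapping from filter model index to the rate distortion
    # cost of the corresponding filter block (one entry per K*M model).
    best_idx = min(filter_costs, key=filter_costs.get)
    if recon_cost < filter_costs[best_idx]:
        # The unfiltered reconstruction block is cheapest: no filtering
        # (the second indication information would be signaled).
        return None
    # Otherwise signal filtering (first indication information) together
    # with the index of the target filter model.
    return best_idx
```

Returning None here stands in for signaling that filtering is not required, so no filter model index needs to be written to the code stream in that case.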
  • the decoder can determine the target filtering model for filtering the reconstructed block based on the code stream.
  • each filtering model corresponds to a model index
  • different filtering models correspond to different model indexes.
  • the target index includes a target model index
  • the target model index is used to indicate the target filter model in the K group of filter models.
  • the same group of filtering models corresponds to the same quality index
  • different groups of filtering models correspond to different quality indexes.
  • Different filter models in the same group of filter models correspond to different content indexes
  • filter models in different groups may have the same content index.
  • the target index includes a target quality index and a target content index. The target quality index is used to indicate a group of filter models to which the target filter model belongs, and the target content index is used to indicate which model in the group of filter models the target filter model is.
  • the filtering indication information indicates that the reconstruction block needs to be filtered.
  • the filtering indication information may also indicate that the reconstruction block does not require filtering. If the filtering indication information indicates that the reconstruction block does not require filtering, the reconstruction block is not filtered.
  • Step 404 Filter the reconstruction block based on the target filtering model.
  • the reconstruction block is input to the target filtering model, and the target filtering model outputs the filtering block according to the relevant algorithm, thereby filtering the reconstruction block.
  • Since each group of filter models includes M filter models, the same group of filter models corresponds to the same quantization parameter, and different groups of filter models correspond to different quantization parameters, the same group of filter models is suitable for coding blocks of the same coding quality, different groups of filter models are suitable for coding blocks of different coding qualities, and different filter models within the same group are suitable for coding blocks of different content. In this way, after K groups of filter models are determined based on the quantization parameter of the target image, the target filter model can be selected from the K groups of filter models based on the coding quality and the content of the coding block, and the reconstruction block is then filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
  • the filtering performance can be improved on the basis of a simplified network model to meet the filtering effect of coding blocks of different quality and content in the same image.
  • FIG 5 is a flow chart of another filtering method provided by an embodiment of the present application. This method is applied to the decoding end. Please refer to Figure 5. The method includes the following steps.
  • Step 501 Determine K groups of filter models.
  • Each group of filter models in the K group of filter models includes M filter models, and the same group of filter models corresponds to the same quantization parameter, and different groups of filter models correspond to different quantization parameters.
  • K and M are both integers greater than 1.
  • the decoder determines K sets of filter models based on the quantization parameters of the target image to which the reconstruction block belongs.
  • the quantization parameters corresponding to the K sets of filter models are also encoded into the code stream. Therefore, after receiving the code stream, the decoder can parse the quantization parameters corresponding to the K group of filter models from the code stream, and then determine the K group of filter models based on the quantization parameters corresponding to the K group of filter models.
  • Step 502 Determine the reconstruction block based on the code stream.
  • After receiving the code stream, the decoder parses the reconstruction block corresponding to the current coding block from the code stream.
  • the process of the decoder parsing the reconstruction block corresponding to the current encoding block from the code stream can refer to the relevant description in Figure 3 above, which will not be described again here.
  • Step 503 Determine the target filter model in the K group of filter models.
  • the decoding end determines filtering indication information of the reconstruction block, and the filtering indication information is used to indicate whether the reconstruction block requires filtering.
  • When the filtering indication information indicates that the reconstruction block requires filtering, the target filter model is determined from the K groups of filter models.
• Since the encoding end determines the filtering indication information corresponding to the reconstruction block based on the current coding block, the reconstruction block and the K groups of filtering models, it also encodes the filtering indication information into the code stream. Therefore, after receiving the code stream, the decoder can parse the filtering indication information from the code stream, and then determine whether the reconstruction block needs to be filtered based on the filtering indication information.
  • the decoder can also parse the target index from the code stream, and then determine the target filter model based on the target index.
  • the target index includes a target model index, or includes a target quality index and a target content index.
• In these two cases, the process of the decoder determining the target filter model based on the target index is different, so the two cases are explained separately below.
  • the target index includes the target model index.
  • the decoder directly selects the corresponding filter model from the K group of filter models based on the target model index, and determines the selected filter model as the target filter model.
• the target index includes the target quality index and the target content index.
  • the decoder first selects a corresponding set of filter models from the K sets of filter models based on the target quality index. Then, based on the target content index, the filter model corresponding to the target content index is determined from the selected set of filter models to obtain the target filter model.
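The two index forms described above can be sketched as follows; the flat-index layout (group index times M plus model index) and all names are assumptions made purely for illustration:

```python
# Hypothetical sketch of resolving the target index. Flat model index:
# index k*M + m addresses model m of group k. Two-part index: a quality
# index selects the group, a content index selects the model within it.
def select_by_model_index(groups, model_index, M):
    return groups[model_index // M][model_index % M]

def select_by_quality_content(groups, quality_index, content_index):
    return groups[quality_index][content_index]

groups = [["g0m0", "g0m1"], ["g1m0", "g1m1"]]  # K = 2 groups, M = 2 models
target_a = select_by_model_index(groups, 3, M=2)     # flat index 3 -> group 1, model 1
target_b = select_by_quality_content(groups, 1, 1)   # same model via two-part index
```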
  • Step 504 Filter the reconstruction block based on the target filtering model.
  • the reconstruction block is input to the target filtering model, and the target filtering model outputs the filtering block according to the relevant algorithm, thereby filtering the reconstruction block.
• Since each group of filter models includes M filter models, the same group of filter models corresponds to the same quantization parameter, and different groups of filter models correspond to different quantization parameters. That is, the same group of filter models is suitable for coding blocks of the same coding quality, different groups of filter models are suitable for coding blocks of different coding qualities, and different filter models in the same group are suitable for coding blocks of different contents.
• In this way, the target filter model can be selected from the K groups of filter models based on the coding quality and the content of the coding block, and the reconstruction block can then be filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
• In other words, the filtering performance can be improved on the basis of a simplified network model, so as to meet the filtering requirements of coding blocks of different qualities and contents in the same image.
  • FIG. 6 is a flow chart of a filter model training method provided by an embodiment of the present application. Please refer to Figure 6. The method includes the following steps.
  • Step 601 Obtain a training sample set.
  • the training sample set includes multiple sample coding blocks and a reconstruction block corresponding to each sample coding block.
• the images to which the multiple sample coding blocks belong all have the same quantization parameter.
  • the plurality of sample coding blocks are divided from a plurality of sample images, or the plurality of sample coding blocks are divided from one sample image. That is, the plurality of sample coding blocks may come from the same sample image or may come from different sample images, as long as the quantization parameters of the images to which the plurality of sample coding blocks belong are the same. Since the multiple sample coding blocks are obtained by dividing the image into multiple non-overlapping coding blocks, the contents of the multiple sample coding blocks are different.
  • Step 602 Based on the training sample set, train the filter model to be trained to obtain an initial filter model.
• The reconstruction blocks corresponding to the multiple sample coding blocks included in the training sample set are used as inputs of the filter model to be trained, the multiple sample coding blocks are used as the outputs of the filter model to be trained, and the filter model to be trained is trained so as to obtain the initial filter model.
  • Step 603 Divide the training sample set into M initial sample subsets.
  • Each initial sample subset includes at least two sample coding blocks and reconstruction blocks corresponding to the at least two sample coding blocks.
• In one implementation, the multiple sample coding blocks are sorted in order of the peak signal-to-noise ratio of their corresponding filter blocks, and the sample coding blocks included in each initial sample subset are at least two consecutive sample coding blocks in the sorting result.
• the peak signal-to-noise ratio of the filter block corresponding to the sample coding block is determined according to the following formula (2):
• PSNR = 10 × log10((2^n − 1)^2 / MSE)  …(2)
• where PSNR represents the peak signal-to-noise ratio of the filter block corresponding to the sample coding block, n represents the number of bits required to encode each pixel in the sample coding block (usually 8), and MSE represents the mean square error between the pixel values of the pixels in the sample coding block and the pixel values of the corresponding pixels in the filter block.
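A minimal Python version of formula (2), assuming for illustration that blocks are represented as flattened lists of pixel values:

```python
import math

def psnr(orig, filt, n=8):
    """Peak signal-to-noise ratio of a filter block against its sample
    coding block, per formula (2): PSNR = 10*log10((2**n - 1)**2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(orig, filt)) / len(orig)
    if mse == 0:
        return float("inf")  # identical blocks: infinite PSNR
    return 10 * math.log10((2 ** n - 1) ** 2 / mse)

# flattened 2x2 blocks for illustration; each pixel differs by 1, so MSE = 1
value = psnr([100, 110, 120, 130], [101, 109, 121, 129])
```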
  • the training sample set is evenly divided into M initial sample subsets, and each initial sample subset includes the same number of sample coding blocks.
• the training sample set can also be divided into M initial sample subsets according to other criteria, which is not limited in the embodiments of this application.
  • the training sample set includes 16 sample coding blocks and a reconstruction block corresponding to each sample coding block. It is assumed that the 16 sample coding blocks are B0-B15, and the corresponding reconstruction blocks of the 16 sample coding blocks are C0-C15.
• The 16 reconstruction blocks C0-C15 are input into the initial filtering model respectively to obtain 16 filter blocks L0-L15. Then the peak signal-to-noise ratios of L0-L15 are determined according to the above formula (2) to obtain 16 peak signal-to-noise ratios PSNR0-PSNR15. Then B0-B15 are sorted in order of the magnitude of PSNR0-PSNR15, and B0-B15 are divided evenly into 4 initial sample subsets according to the sorting result, with each initial sample subset including 4 sample coding blocks.
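The sort-and-divide step in this example can be sketched as follows; the block labels and PSNR values are illustrative stand-ins, not data from the patent:

```python
# Sketch of step 603: sort sample coding blocks by the PSNR of their
# filter blocks, then split the sorted list evenly into M subsets of
# consecutive blocks.
def divide_by_psnr(blocks, psnrs, M):
    order = sorted(range(len(blocks)), key=lambda i: psnrs[i])
    sorted_blocks = [blocks[i] for i in order]
    size = len(blocks) // M
    return [sorted_blocks[i * size:(i + 1) * size] for i in range(M)]

blocks = [f"B{i}" for i in range(16)]
psnrs = [40, 31, 35, 33, 42, 30, 38, 36, 34, 41, 32, 39, 37, 43, 29, 44]
subsets = divide_by_psnr(blocks, psnrs, M=4)  # 4 subsets of 4 blocks each
```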
• Sorting the multiple sample coding blocks in order of peak signal-to-noise ratio and then dividing the training sample set into M initial sample subsets is only one example.
  • the training sample set can also be divided into M initial sample subsets in other ways.
  • the pixel mean value corresponding to each sample coding block is determined, and the pixel mean value refers to the average pixel value of the pixel points in the sample coding block.
  • the multiple sample coding blocks are sorted in order of the size of the pixel mean, and the training sample set is divided into M initial sample subsets according to the sorting results.
  • the pixel variance corresponding to each sample coding block is determined.
  • the pixel variance refers to the variance of the pixel values of the pixels in the sample coding block. Then, the multiple sample coding blocks are sorted according to the order of pixel variance, and the training sample set is divided into M initial sample subsets according to the sorting results.
  • Step 604 Based on the M initial sample subsets, train the initial filter models respectively to obtain M optimized filter models.
• The reconstruction blocks corresponding to at least two sample coding blocks included in the initial sample subset are used as the input of the initial filter model, and the at least two sample coding blocks are used as the output of the initial filter model, so as to train the initial filter model to obtain an optimized filter model.
• For each of the M initial sample subsets, the initial filter model can be trained according to the above steps, thereby obtaining M optimized filter models.
• For example, when M = 4, the initial filter model is trained separately on the 4 initial sample subsets, and four optimized filter models can be obtained: filter model A, filter model B, filter model C and filter model D.
  • Step 605 Based on the training sample set, train the M optimized filter models to obtain a set of filter models.
  • the M optimized filtering models are trained through loop iteration.
  • the i-th iteration processing in the loop iteration method includes the following steps:
  • the training sample set is divided into M optimized sample subsets.
• the M optimized sample subsets are in one-to-one correspondence with the M filter models processed in the i-th iteration.
  • the reconstruction blocks corresponding to the multiple sample coding blocks are input to the M filtering models of the i-th iteration process to obtain M filter blocks corresponding to each sample coding block.
  • the process of determining the peak signal-to-noise ratio of the M filter blocks corresponding to each sample coding block can refer to the relevant description of determining the peak signal-to-noise ratio according to formula (2) in the above step 603, which will not be described again here.
• For each sample coding block, determine the filter model corresponding to the maximum peak signal-to-noise ratio among the peak signal-to-noise ratios of the M filter blocks corresponding to the sample coding block, and then divide the sample coding block into the optimized sample subset corresponding to that filter model.
• For example, assume that the training sample set includes 16 sample coding blocks and a reconstruction block corresponding to each sample coding block, and that the reconstruction blocks corresponding to the 16 sample coding blocks are C0-C15. Taking reconstruction block C0 among the 16 reconstruction blocks as an example, C0 is input into the four filter models processed in the i-th iteration, and the four filter models output four filter blocks corresponding to sample coding block B0, namely L0A, L0B, L0C and L0D.
• The four peak signal-to-noise ratios corresponding to sample coding block B0 are then determined according to the above formula (2) as PSNR0A, PSNR0B, PSNR0C and PSNR0D.
• If the peak signal-to-noise ratio PSNR0C is the largest, sample coding block B0 is divided into the optimized sample subset corresponding to filter model C.
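The per-block re-division rule above (assign each sample coding block to the filter model whose filter block scores the highest PSNR) can be sketched as follows; the PSNR values are illustrative:

```python
# Sketch of the re-division step in each iteration: each sample coding
# block goes to the subset of the filter model with the highest PSNR.
def redivide(block_ids, psnr_table, M):
    """psnr_table[b] holds the M PSNRs of block b's M filter blocks."""
    subsets = [[] for _ in range(M)]
    for b in block_ids:
        best_model = max(range(M), key=lambda m: psnr_table[b][m])
        subsets[best_model].append(b)
    return subsets

psnr_table = {
    "B0": [33.1, 34.0, 36.2, 35.5],  # model C (index 2) scores highest
    "B1": [37.0, 31.2, 30.8, 32.4],  # model A (index 0) scores highest
}
subsets = redivide(["B0", "B1"], psnr_table, M=4)
```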
• For each optimized sample subset, the reconstruction blocks corresponding to the sample coding blocks included in the optimized sample subset are used as the input of the corresponding filter model, and the sample coding blocks are used as the output of the corresponding filter model, so as to train that filter model.
  • the embodiment of the present application trains the M optimized filtering models through loop iteration.
• When the iteration number i of the M filter models is less than the iteration number threshold, it indicates that the optimized filter models currently trained are not yet reliable, so the M filter models trained in the i-th iteration are used as the M filter models of the (i+1)-th iteration, and the (i+1)-th iteration is performed.
• When the iteration number i of the M filter models is greater than or equal to the iteration number threshold, it indicates that the optimized filter models obtained by the current training are reliable, so the iterative processing is stopped, and the M filter models trained in the i-th iteration are regarded as a group of filter models.
  • the iteration number threshold is set in advance.
  • the iteration number threshold is a specified number of iterations or a maximum number of iterations, which can be set according to different requirements. This is not limited in the embodiments of the present application.
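A toy skeleton of the loop-iteration training with an iteration-number threshold. The `redivide` and `train_on` stand-ins below (a nearest-model partition and a subset mean) are assumptions purely to make the loop runnable; they are not the actual filter-model training:

```python
# Skeleton of the loop-iteration training in step 605: re-divide the
# sample set, train each model on its subset, stop at the threshold.
def loop_train(models, sample_set, redivide, train_on, max_iters):
    i = 1
    while True:
        subsets = redivide(models, sample_set)        # one subset per model
        models = [train_on(m, s) for m, s in zip(models, subsets)]
        if i >= max_iters:                            # threshold reached:
            return models                             # models form the group
        i += 1                                        # otherwise keep iterating

# toy stand-ins: "models" are scalars pulled toward their subset means
trained = loop_train(
    models=[0.0, 10.0],
    sample_set=[1.0, 2.0, 11.0, 12.0],
    redivide=lambda ms, xs: [[x for x in xs if abs(x - ms[0]) <= abs(x - ms[1])],
                             [x for x in xs if abs(x - ms[0]) > abs(x - ms[1])]],
    train_on=lambda m, s: sum(s) / len(s) if s else m,
    max_iters=3,
)
```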
• In a possible implementation, only part of the M filter models processed in the i-th iteration are trained, and the other filter models stop iterative processing.
• Since the quantization parameters of the images to which the multiple sample coding blocks included in the training sample set belong are the same, an untrained filter model is trained based on the training sample set, and the resulting group of filter models is suitable for coding blocks of the same coding quality.
  • the M filter models included in the set of filter models are suitable for coding blocks with different contents.
• In this way, the target filter model can be selected from the K groups of filter models based on the coding quality and the content of the coding block, and the reconstruction block can then be filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
• In other words, the filtering performance can be improved on the basis of a simplified network model, so as to meet the filtering requirements of coding blocks of different qualities and contents in the same image.
  • FIG. 7 is a schematic structural diagram of a filtering device provided by an embodiment of the present application.
  • the filtering device can be implemented by software, hardware, or a combination of the two to become part or all of the encoding end device.
• The encoding end device can be the source device shown in Figure 1. Referring to FIG. 7, the device includes: a first determination module 701, a second determination module 702, a third determination module 703 and a first filtering module 704.
• the first determination module 701 is used to determine K groups of filter models according to the quantization parameters of the target image.
  • Each group of filter models in the K group of filter models includes M filter models, and the same group of filter models corresponds to the same quantized parameter.
  • Different groups of filter models correspond to different quantization parameters, and K and M are both integers greater than 1.
  • the second determination module 702 is used to determine the reconstruction block corresponding to the current coding block in the target image. For the detailed implementation process, refer to the corresponding content in each of the above embodiments, and will not be described again here.
  • the third determination module 703 is used to determine the target filter model from the K group of filter models.
• The target filter model refers to the filter model with the smallest coding distortion after filtering the reconstruction block; the coding distortion obtained by filtering the reconstruction block through the target filter model is smaller than the coding distortion of the reconstruction block itself.
  • the first filtering module 704 is used to filter the reconstruction block based on the target filtering model.
  • the detailed implementation process refer to the corresponding content in each of the above embodiments, and will not be described again here.
  • the third determination module 703 includes:
  • a first determination unit configured to determine filtering indication information corresponding to the reconstruction block based on the current coding block, the reconstruction block and the K group of filter models, where the filtering indication information is used to indicate whether the reconstruction block requires filtering;
  • the second determination unit is configured to determine a target filter model from the K group of filter models when the filter indication information indicates that the reconstruction block requires filtering.
  • the first determination unit is specifically used to:
• determine, based on the reconstruction block and the K*M filter blocks, the rate distortion cost corresponding to the reconstruction block and the rate distortion cost corresponding to each filter block;
• when the smallest rate distortion cost among the rate distortion costs corresponding to the K*M filter blocks is less than the rate distortion cost corresponding to the reconstruction block, the filtering indication information is determined to be first indication information, where the first indication information is used to indicate that the reconstruction block requires filtering;
• otherwise, the filtering indication information is determined to be second indication information, where the second indication information is used to indicate that the reconstruction block does not require filtering.
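A hedged sketch of the rate-distortion comparison behind the filtering indication information; the cost values are illustrative, and real RD costs would combine distortion with bit rate:

```python
# Sketch of the encoder-side decision: compare the rate distortion cost
# of the unfiltered reconstruction block with the best cost among the
# K*M filter blocks.
def filtering_indication(recon_cost, filter_costs):
    """Return True (first indication: filter) when some filter block
    beats the unfiltered reconstruction block, else False (second)."""
    return min(filter_costs) < recon_cost

needs_filtering = filtering_indication(100.0, [120.0, 95.0, 104.0])  # 95 < 100
skip_filtering = filtering_indication(90.0, [120.0, 95.0, 104.0])    # 95 > 90
```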
  • the device also includes:
• The second filtering module is configured not to filter the reconstruction block when the filtering indication information indicates that the reconstruction block does not require filtering.
  • the third determination module 703 also includes:
• an encoding unit, used to encode the filtering indication information into the code stream.
  • the device also includes:
  • the first encoding module is used to encode the target index into the code stream, and the target index is used to indicate the target filter model.
  • the device also includes:
  • the second encoding module is used to encode the quantization parameters corresponding to the K groups of filter models into the code stream.
• Since each group of filter models includes M filter models, the same group of filter models corresponds to the same quantization parameter, and different groups of filter models correspond to different quantization parameters. That is, the same group of filter models is suitable for coding blocks of the same coding quality, different groups of filter models are suitable for coding blocks of different coding qualities, and different filter models in the same group are suitable for coding blocks of different contents.
• In this way, the target filter model can be selected from the K groups of filter models based on the coding quality and the content of the coding block, and the reconstruction block can then be filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
• In other words, the filtering performance can be improved on the basis of a simplified network model, so as to meet the filtering requirements of coding blocks of different qualities and contents in the same image.
• It should be noted that when the filtering device provided in the above embodiment performs filtering, the division of the above functional modules is only used as an example for illustration. In practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the filtering device provided in the above embodiments and the filtering method embodiments belong to the same concept, and the specific implementation process can be found in the method embodiments, which will not be described again here.
• FIG. 8 is a schematic structural diagram of another filtering device provided by an embodiment of the present application.
  • the filtering device can be implemented by software, hardware, or a combination of the two to become part or all of the decoding end device.
• The decoding end device can be the destination device shown in Figure 1. Referring to Figure 8, the device includes: a first determination module 801, a second determination module 802, a third determination module 803 and a filtering module 804.
  • the first determination module 801 is used to determine K groups of filter models.
• Each group of filter models in the K groups of filter models includes M filter models, the same group of filter models corresponds to the same quantization parameter, different groups of filter models correspond to different quantization parameters, and K and M are both integers greater than 1.
  • the second determination module 802 is used to determine the reconstruction block based on the code stream.
  • the detailed implementation process refer to the corresponding content in each of the above embodiments, and will not be described again here.
  • the third determination module 803 is used to determine the target filter model in the K group of filter models. For the detailed implementation process, refer to the corresponding content in each of the above embodiments, and will not be described again here.
  • the filtering module 804 is used to filter the reconstruction block based on the target filtering model.
  • the first determination module 801 is specifically used to:
• the K groups of filter models are determined according to the quantization parameters of the target image to which the reconstruction block belongs.
  • the first determination module 801 is specifically used to:
• the quantization parameters corresponding to the K groups of filter models are parsed from the code stream, and the K groups of filter models are determined based on those quantization parameters.
  • the third determination module 803 includes:
  • a first determination unit used to determine filtering indication information of the reconstruction block, where the filtering indication information is used to indicate whether the reconstruction block requires filtering
  • the second determination unit is configured to determine the target filter model in the K group of filter models when the filter indication information indicates that the reconstruction block requires filtering.
• the first determination unit is specifically used to: parse the filtering indication information of the reconstruction block from the code stream.
• the third determination module 803 is specifically used to: parse the target index from the code stream, and determine the target filter model based on the target index.
• Since each group of filter models includes M filter models, the same group of filter models corresponds to the same quantization parameter, and different groups of filter models correspond to different quantization parameters. That is, the same group of filter models is suitable for coding blocks of the same coding quality, different groups of filter models are suitable for coding blocks of different coding qualities, and different filter models in the same group are suitable for coding blocks of different contents.
• In this way, the target filter model can be selected from the K groups of filter models based on the coding quality and the content of the coding block, and the reconstruction block can then be filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
• In other words, the filtering performance can be improved on the basis of a simplified network model, so as to meet the filtering requirements of coding blocks of different qualities and contents in the same image.
• It should be noted that when the filtering device provided in the above embodiment performs filtering, the division of the above functional modules is only used as an example for illustration. In practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the filtering device provided in the above embodiments and the filtering method embodiments belong to the same concept, and the specific implementation process can be found in the method embodiments, which will not be described again here.
  • FIG 9 is a schematic structural diagram of a filter model training device provided by an embodiment of the present application.
  • the filter model training device can be implemented by software, hardware, or a combination of both to become part or all of the filter model training device.
  • the device includes: an acquisition module 901, a first training module 902, a dividing module 903, a second training module 904 and a third training module 905.
  • the acquisition module 901 is used to acquire a training sample set.
  • the training sample set includes multiple sample coding blocks and a reconstruction block corresponding to each sample coding block.
• the images to which the multiple sample coding blocks belong all have the same quantization parameter.
  • the first training module 902 is used to train the filter model to be trained based on the training sample set to obtain an initial filter model.
  • the detailed implementation process refer to the corresponding content in each of the above embodiments, and will not be described again here.
  • the dividing module 903 is used to divide the training sample set into M initial sample subsets.
  • Each initial sample subset includes at least two sample coding blocks and reconstruction blocks corresponding to the at least two sample coding blocks.
  • the second training module 904 is used to separately train the initial filtering models based on the M initial sample subsets to obtain M optimized filtering models.
  • the detailed implementation process refer to the corresponding content in each of the above embodiments, and will not be described again here.
  • the third training module 905 is used to train the M optimized filter models based on the training sample set to obtain a set of filter models.
  • the detailed implementation process refer to the corresponding content in each of the above embodiments, and will not be described again here.
  • the dividing module 903 is specifically used to:
• the multiple sample coding blocks are sorted in order of peak signal-to-noise ratio, and the training sample set is divided into M initial sample subsets, where the sample coding blocks included in each initial sample subset are at least two consecutive sample coding blocks in the sorting result.
  • the third training module 905 is specifically used for:
  • the M optimized filtering models are trained through a loop iteration method; wherein, the i-th iteration process in the loop iteration method includes the following steps:
• the training sample set is divided into M optimized sample subsets, and the M optimized sample subsets are in one-to-one correspondence with the M filter models processed in the i-th iteration, where the M filter models processed in the first iteration are the M optimized filter models;
• based on the M optimized sample subsets, the M filter models processed in the i-th iteration are trained;
• if i is less than the iteration number threshold, the M filter models trained in the i-th iteration are used as the M filter models of the (i+1)-th iteration, and the (i+1)-th iteration is performed;
• if i is greater than or equal to the iteration number threshold, the M filter models trained in the i-th iteration are determined as a group of filter models.
  • the third training module 905 is specifically used for:
• the training sample set is divided into the M optimized sample subsets, where each sample coding block is divided into the optimized sample subset corresponding to the filter model whose filter block has the largest peak signal-to-noise ratio among the M filter blocks corresponding to that sample coding block.
• Since the quantization parameters of the images to which the multiple sample coding blocks included in the training sample set belong are the same, an untrained filter model is trained based on the training sample set, and the resulting group of filter models is suitable for coding blocks of the same coding quality.
  • the M filter models included in the set of filter models are suitable for coding blocks with different contents.
• In this way, the target filter model can be selected from the K groups of filter models based on the coding quality and the content of the coding block, and the reconstruction block can then be filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.
• In other words, the filtering performance can be improved on the basis of a simplified network model, so as to meet the filtering requirements of coding blocks of different qualities and contents in the same image.
• It should be noted that when the filter model training device provided in the above embodiment performs filter model training, the division of the above functional modules is only used as an example for illustration. In practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the filter model training device provided by the above embodiments and the filter model training method embodiments belong to the same concept. Please refer to the method embodiments for the specific implementation process, which will not be described again here.
  • Figure 10 is a schematic block diagram of a computer device 1000 used in an embodiment of the present application.
  • the computer device 1000 may include a processor 1001, a memory 1002 and a bus system 1003.
  • the processor 1001 and the memory 1002 are connected through a bus system 1003.
  • the memory 1002 is used to store instructions
• the processor 1001 is used to execute the instructions stored in the memory 1002 to perform the filtering method and the filter model training method described in the embodiments of this application. To avoid repetition, details are not described here again.
  • the processor 1001 can be a central processing unit (CPU).
• the processor 1001 can also be another general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the memory 1002 may include a ROM device or a RAM device. Any other suitable type of storage device may also be used as memory 1002.
  • Memory 1002 may include code and data 10021 accessed by processor 1001 using bus 1003 .
  • the memory 1002 may further include an operating system 10023 and an application program 10022.
  • the application program 10022 includes at least one program that allows the processor 1001 to execute the filtering method or the filtering model training method described in the embodiments of this application.
  • the application program 10022 may include applications 1 to N, which further include applications that perform the filtering method or the filtering model training method described in the embodiments of this application.
  • bus system 1003 may also include a power bus, a control bus, a status signal bus, etc.
• However, for clarity of description, the various buses are labeled as the bus system 1003 in the figure.
  • computer device 1000 may also include one or more output devices, such as display 1004.
• the display 1004 may be a touch-sensitive display that combines a display with a tactile unit operable to sense touch input.
  • Display 1004 may be connected to processor 1001 via bus 1003 .
  • the computer device 1000 can execute the filtering method in the embodiment of the present application, and can also execute the filtering model training method in the embodiment of the present application.
• Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another (e.g., based on a communication protocol).
  • computer-readable media generally may correspond to (1) non-transitory tangible computer-readable storage media, or (2) communication media, such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementing the techniques described in this application.
  • a computer program product may include computer-readable media.
  • such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage, flash memory or any other medium that can be used to store the desired program code in the form of instructions or data structures and accessible by a computer.
  • any connection is properly termed a computer-readable medium.
• For example, if coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave are used to transmit instructions from a website, server, or other remote source, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media.
  • disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, DVD, and Blu-ray disc, where disks typically reproduce data magnetically, while discs reproduce data optically using lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • DSPs: digital signal processors
  • ASICs: application specific integrated circuits
  • FPGAs: field programmable logic arrays
  • a "processor" used to execute instructions may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein.
  • the functionality described by the various illustrative logical blocks, modules, and steps described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec.
  • the techniques may be entirely implemented in one or more circuits or logic elements.
  • various illustrative logical blocks, units, and modules in the encoder and decoder can be understood as corresponding circuit devices or logical elements.
  • the techniques of the present application may be implemented in a wide variety of devices or apparatuses, including wireless handsets, integrated circuits (ICs), or a set of ICs (e.g., a chipset).
  • ICs: integrated circuits
  • a set of ICs: e.g., a chipset
  • Various components, modules, or units are described in the embodiments of this application to emphasize the functional aspects of an apparatus for performing the disclosed techniques, but they do not necessarily need to be implemented by different hardware units. Indeed, as described above, the various units may be combined in a codec hardware unit, or provided by a collection of interoperating hardware units (including one or more processors as described above), in conjunction with suitable software and/or firmware.
  • in the above embodiments, implementation may be in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on the computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrated with one or more available media.
  • the available media may be magnetic media (such as floppy disks, hard disks, or tapes), optical media (such as digital versatile discs (DVDs)), or semiconductor media (such as solid state disks (SSDs)), etc.
  • the computer-readable storage media mentioned in the embodiments of this application may be non-volatile storage media, in other words, may be non-transitory storage media.
  • an encoding end device including a memory and a processor
  • the memory is used to store computer programs
  • the processor is used to execute the computer programs stored in the memory to implement the above-described filtering method.
  • a decoding end device includes a memory and a processor
  • the memory is used to store computer programs
  • the processor is used to execute the computer programs stored in the memory to implement the above-mentioned filtering method.
  • a filter model training device includes a memory and a processor
  • the memory is used to store computer programs
  • the processor is used to execute the computer programs stored in the memory to implement the above-mentioned filter model training method.
  • a computer-readable storage medium is provided, and instructions are stored in the storage medium.
  • when the instructions are run on a computer, the computer is caused to perform the steps of the above-described methods.
  • a computer program which when executed implements the method described above.
  • the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.), and signals involved in the embodiments of this application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the relevant data comply with the relevant laws, regulations, and standards of the relevant countries and regions.
  • the quantization parameters, filter models, current coding blocks, and reconstruction blocks involved in the embodiments of this application are all obtained with sufficient authorization.


Abstract

This application discloses a filtering method, a filter model training method, and a related apparatus, belonging to the field of coding and decoding technology. The method includes: determining K groups of filter models according to the quantization parameter of a target image; determining the reconstructed block corresponding to the current coding block in the target image; determining a target filter model from the K groups of filter models; and filtering the reconstructed block based on the target filter model. The same group of filter models is suitable for coding blocks of the same coding quality, different groups of filter models are suitable for coding blocks of different coding quality, and different filter models within the same group are suitable for coding blocks of different content. Therefore, for the reconstructed block corresponding to the current coding block, a target filter model can be selected from the K groups of filter models in light of both the coding quality and the content of the coding block, and the reconstructed block can then be filtered based on the target filter model, thereby reducing coding distortion and improving filtering performance.

Description

Filtering method, filter model training method, and related apparatus
This application claims priority to Chinese patent application No. 202210616061.5, filed on May 31, 2022 and entitled "Filtering method, filter model training method, and related apparatus", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of coding and decoding technology, and in particular to a filtering method, a filter model training method, and a related apparatus.
Background
Coding and decoding technology is widely used in multimedia services, broadcasting, video communication, storage, and other fields. During encoding, an image is divided into multiple non-overlapping coding blocks, which are encoded in sequence. During decoding, each reconstructed block is parsed from the bitstream in sequence, and the reconstructed image is then determined. In some cases, however, adjacent reconstructed blocks may suffer from excessive non-smoothness or pixel discontinuity, causing image distortion between the reconstructed image and the original image; the reconstructed blocks therefore need to be filtered. Moreover, when the encoding end encodes coding blocks using intra prediction or inter prediction, the encoding end also needs to filter the reconstructed blocks to guarantee the coding quality of subsequent coding blocks.
In the related art, one filter model is trained in advance for each of multiple quantization parameters. When the encoding end filters a reconstructed block, it selects, from these pre-trained filter models, the filter models corresponding to multiple quantization parameters adjacent to the quantization parameter of the image, obtaining multiple filter models. A target filter model is then selected from these filter models, and the reconstructed block is filtered with the target filter model. The encoding end may also encode the index of the target filter model into the bitstream and send it to the decoding end. After receiving the bitstream, the decoding end parses it to determine the reconstructed block and the index of the target filter model, and then filters the reconstructed block with the target filter model based on that index.
Since one quantization parameter corresponds to one filter model, while coding blocks with different content within the same image may require different filtering, the filter model for each quantization parameter must have a rather complex network structure to satisfy the filtering needs of all coding blocks in an image. This slows down filtering under the above method and may in turn affect the coding and decoding speed of the image.
Summary
本申请实施例提供了一种滤波方法、滤波模型训练方法及相关装置,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。所述技术方案如下:
第一方面,提供了一种滤波方法,应用于编码端。在该方法中,根据目标图像的量化参数,确定出K组滤波模型,所述K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数,确定所述目标图像中当前编码块对应的重建块,从所述K组滤波模型中确定目标 滤波模型,所述目标滤波模型是指对所述重建块进行滤波后编码失真最小的滤波模型,且通过所述目标滤波模型对所述重建块进行滤波后的编码失真小于所述重建块的编码失真,基于所述目标滤波模型,对所述重建块进行滤波。
可选地,编码端根据目标图像的量化参数,从目标对应关系中获取K个参考量化参数。由于一个量化参数对应一组滤波模型,所以,编码端能够基于该K个参考量化参数,确定出K组滤波模型。
其中,目标对应关系用于指示图像量化参数与参考量化参数之间的对应关系。作为一种示例,目标对应关系为量化参数范围与参考量化参数之间的对应关系,或者,目标对应关系为图像量化参数与参考量化参数之间的对应关系。
在目标对应关系为量化参数范围与参考量化参数之间的对应关系时,由于位于同一个量化参数范围内的各个量化参数所对应的参考量化参数是相同的。所以,编码端只需要存储量化参数范围,并不需要依次存储各个量化参数。这样,有利于节省编码端的存储空间,进而提高编码端确定K组滤波模型的效率。
在目标对应关系为图像量化参数与参考量化参数之间的对应关系时,由于一个图像量化参数对应K个参考量化参数,这K个参考量化参数与该图像量化参数之间的相关性更强。所以,编码端按照目标对应关系确定出的K组滤波模型与目标图像的量化参数相关性更强,从而能够进一步提高滤波效果。
由于编码块对应的量化参数决定了编码块的编码质量,即量化参数越小,编码质量越高,量化参数越大,编码质量越低。而且,同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数。所以,通过同一个量化参数编码后的多个编码块的编码质量相同,该多个编码质量相同的编码块能够通过同一组滤波模型进行滤波。通过不同的量化参数编码后的多个编码块的编码质量不同,该多个编码质量不同的编码块能够通过不同组的滤波模型进行滤波。也即是,同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块。
可选地,根据目标图像的量化参数确定出K组滤波模型之后,编码端还需要将该K组滤波模型对应的量化参数编入码流。
可选地,编码端基于当前编码块、该重建块和该K组滤波模型,确定该重建块对应的滤波指示信息,滤波指示信息用于指示该重建块是否需要滤波。在滤波指示信息指示该重建块需要滤波的情况下,从该K组滤波模型中确定目标滤波模型。
编码端将该重建块输入至该K组滤波模型中的每个滤波模型,以得到K*M个滤波块,基于当前编码块、该重建块和该K*M个滤波块,确定该重建块对应的率失真代价,以及每个滤波块对应的率失真代价。如果该重建块对应的率失真代价不小于每个滤波块对应的率失真代价,则确定滤波指示信息为第一指示信息,第一指示信息用于指示该重建块需要滤波。如果该重建块对应的率失真代价小于每个滤波块对应的率失真代价,则确定滤波指示信息为第二指示信息,第二指示信息用于指示该重建块不需要滤波。
由于率失真代价用于指示重建块与原始编码块之间图像失真的程度,以及滤波块与原始编码块之间图像失真的程度。如果该重建块对应的率失真代价小于每个滤波块对应的率失真代价,则表明该重建块与原始编码块之间的图像失真最小,这样,基于该重建块复原的重建图像与原始图像之间的图像失真最小。此时,不需要对该重建块进行滤波。如果该重建块对 应的率失真代价不小于每个滤波块对应的率失真代价,则表明滤波块与原始编码块之间的图像失真最小,这样,基于滤波块复原的重建图像与原始图像之间的图像失真最小。此时,需要对该重建块进行滤波。
基于上文描述,以率失真代价指示编码失真为例。在滤波指示信息指示该重建块需要滤波的情况下,比较每个滤波块对应的率失真代价,将率失真代价最小的滤波块所对应的滤波模型,确定为目标滤波模型。当然,在实际应用中,还能够通过其他的参数指示编码失真,本申请实施例对此不做限定。
可选地,编码端基于当前编码块、该重建块和该K组滤波模型,确定出该重建块对应的滤波指示信息之后,还需要将滤波指示信息编入码流。
可选地,编码端从该K组滤波模型中确定出目标滤波模型之后,还需要将目标索引编入码流,目标索引用于指示目标滤波模型。
需要说明的是,以上内容是以滤波指示信息指示该重建块需要滤波的情况为例。当然,在实际应用中,滤波指示信息还可能指示该重建块不需要滤波。在滤波指示信息指示该重建块不需要滤波的情况下,不对该重建块进行滤波。
由于同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块,而且同一组滤波模型中的不同滤波模型适用于不同内容的编码块。这样,在基于目标图像的量化参数确定出K组滤波模型之后,对于当前编码块对应的重建块,能够结合编码质量和编码块的内容,从该K组滤波模型中选择出目标滤波模型,进而基于目标滤波模型对重建块进行滤波,从而减小编码失真,提高滤波性能。而且,对于同一个图像中不同编码质量、不同内容的编码块来说,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。
第二方面,提供了一种滤波方法,应用于解码端。在该方法中,确定出K组滤波模型,所述K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数,基于码流确定重建块,确定所述K组滤波模型中的目标滤波模型,基于所述目标滤波模型,对所述重建块进行滤波。
可选地,解码端根据该重建块所属的目标图像的量化参数,确定出K组滤波模型。
可选地,由于编码端根据目标图像的量化参数,确定出K组滤波模型之后,还将该K组滤波模型对应的量化参数编入码流。所以,解码端接收到码流之后,能够从码流中解析出该K组滤波模型对应的量化参数,进而基于该K组滤波模型对应的量化参数,确定出该K组滤波模型。
解码端确定该重建块的滤波指示信息,滤波指示信息用于指示该重建块是否需要滤波。在滤波指示信息指示该重建块需要滤波的情况下,确定该K组滤波模型中的目标滤波模型。
由于编码端基于当前编码块、该重建块和该K组滤波模型,确定出该重建块对应的滤波指示信息之后,还将滤波指示信息编入码流。所以,解码端接收到该码流之后,能够从该码流中解析出滤波指示信息,进而基于滤波指示信息判断是否需要对该重建块进行滤波。在滤波指示信息指示该重建块需要滤波的情况下,解码端能够从码流中解析出目标索引,进而基于目标索引确定目标滤波模型。
由于同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块,而且同一组滤波模型中的不同滤波模型适用于不同内容的编码块。这样,在基于目标图像的量化参数确定出K组滤波模型之后,对于当前编码块对应的重建块,能够结合编码质量和编码块的内容,从该K组滤波模型中选择出目标滤波模型,进而基于目标滤波模型对重建块进行滤波,从而减小编码失真,提高滤波性能。而且,对于同一个图像中不同编码质量、不同内容的编码块来说,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。
第三方面,提供了一种滤波模型训练方法,在该方法中,获取训练样本集,所述训练样本集包括多个样本编码块和每个样本编码块对应的重建块,所述多个样本编码块所属的图像的量化参数为同一个量化参数。基于所述训练样本集,对待训练的滤波模型进行训练,以得到初始滤波模型。将所述训练样本集划分为M个初始样本子集,每个初始样本子集包括至少两个样本编码块和所述至少两个样本编码块对应的重建块。基于所述M个初始样本子集,对所述初始滤波模型分别进行训练,以得到M个优化滤波模型。基于所述训练样本集,对所述M个优化滤波模型进行训练,以得到一组滤波模型。
该多个样本编码块为从多个样本图像划分得到的,或者,该多个样本编码块为从一个样本图像划分得到的。即,该多个样本编码块可能来自同一个样本图像,也可能来自不同的样本图像,只要该多个样本编码块所属的图像的量化参数相同即可。由于该多个样本编码块是将图像划分为多个不重叠的编码块得到的,所以,该多个样本编码块的内容不同。
将该多个样本编码块对应的重建块输入至初始滤波模型,以得到每个样本编码块对应的滤波块,基于该多个样本编码块和每个样本编码块对应的滤波块,确定每个样本编码块对应的滤波块的峰值信噪比,按照峰值信噪比的大小顺序,对该多个样本编码块进行排序,按照排序结果,将该训练样本集划分为M个初始样本子集,每个初始样本子集包括的样本编码块为排序结果中连续的至少两个样本编码块。
基于该训练样本集,通过循环迭代方式,对该M个优化滤波模型进行训练。其中,循环迭代方式中的第i次迭代处理包括如下步骤:
(1)基于该多个样本编码块和每个样本编码块对应的重建块,将该训练样本集划分为M个优化样本子集,该M个优化样本子集与第i次迭代处理的M个滤波模型一一对应,其中,第一次迭代处理的M个滤波模型为该M个优化滤波模型。
将该多个样本编码块对应的重建块输入至第i次迭代处理的M个滤波模型,以得到每个样本编码块对应的M个滤波块,基于该多个样本编码块和每个样本编码块对应的M个滤波块,确定每个样本编码块对应的M个滤波块的峰值信噪比,基于每个样本编码块对应的M个滤波块的峰值信噪比,将该训练样本集划分为该M个优化样本子集,其中,每个样本编码块位于其对应的M个滤波块中最大峰值信噪比的滤波块所对应的滤波模型的优化样本子集中。
(2)基于该M个优化样本子集,对第i次迭代处理的M个滤波模型进行训练。
(3)如果i小于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型作为第i+1次迭代处理的M个滤波模型,并执行第i+1次迭代处理。
(4)如果i大于或等于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型确定为一组滤波模型。
本申请是通过循环迭代的方式对该M个优化滤波模型进行训练,在该M个滤波模型的迭代次数i小于迭代次数阈值时,表明当前训练得到的优化滤波模型不可靠,则将经训练的第i次迭代处理的M个滤波模型作为第i+1次迭代处理的M个滤波模型,继续执行第i+1次迭代处理。在该M个滤波模型的迭代次数i大于或等于迭代次数阈值时,表明当前训练得到的优化滤波模型可靠,则停止迭代处理,并将经训练的第i次迭代处理的M个滤波模型作为一组滤波模型。
其中,迭代次数阈值是事先设置的,该迭代次数阈值为指定迭代次数,或者为最大迭代次数,能够按照不同的需求来设置,本申请实施例对此不做限定。
需要说明的是,在基于每个样本编码块对应的M个滤波块的峰值信噪比,将该训练样本集划分为M个优化样本子集的过程中,可能存在将该训练样本集只划分为1个优化样本子集的情况。即,该训练样本集中每个样本编码块对应的M个滤波块的峰值信噪比中最大峰值信噪比均对应同一个滤波模型。此时,基于划分得到的1个优化样本子集,对第i次迭代处理的该滤波模型进行训练,其他滤波模型则停止迭代处理。
在本申请中,由于训练样本集包括的多个样本编码块所属的图像的量化参数相同,所以,基于该训练样本集对未经训练的滤波模型进行训练,得到的一组滤波模型适用于相同编码质量的编码块。此外,由于训练样本集包括的多个样本编码块的内容不同,所以,该组滤波模型包括的M个滤波模型适用于不同内容的编码块。
第四方面,提供了一种滤波装置,所述滤波装置具有实现上述第一方面中滤波方法行为的功能。所述滤波装置包括至少一个模块,该至少一个模块用于实现上述第一方面所提供的滤波方法。
第五方面,提供了一种滤波装置,所述滤波装置具有实现上述第二方面中滤波方法行为的功能。所述滤波装置包括至少一个模块,该至少一个模块用于实现上述第二方面所提供的滤波方法。
第六方面,提供了一种滤波模型训练装置,所述滤波模型训练装置具有实现上述第三方面中滤波模型训练方法行为的功能。所述滤波模型训练装置包括至少一个模块,该至少一个模块用于实现上述第三方面所提供的滤波模型训练方法。
第七方面,提供了一种编码端设备,所述编码端设备包括处理器和存储器,所述存储器用于存储执行上述第一方面所提供的滤波方法的计算机程序。所述处理器被配置为用于执行所述存储器中存储的计算机程序,以实现上述第一方面所述的滤波方法。
可选地,所述编码端设备还可以包括通信总线,该通信总线用于该处理器与存储器之间建立连接。
第八方面,提供了一种解码端设备,所述解码端设备包括处理器和存储器,所述存储器用于存储执行上述第二方面所提供的滤波方法的计算机程序。所述处理器被配置为用于执行所述存储器中存储的计算机程序,以实现上述第二方面所述的滤波方法。
可选地,所述解码端设备还可以包括通信总线,该通信总线用于该处理器与存储器之间建立连接。
第九方面,提供了一种滤波模型训练设备,所述滤波模型训练设备包括处理器和存储器,所述存储器用于存储执行上述第三方面所提供的滤波模型训练方法的计算机程序。所述处理器被配置为用于执行所述存储器中存储的计算机程序,以实现上述第三方面所述的滤波模型训练方法。
可选地,所述滤波模型训练设备还可以包括通信总线,该通信总线用于该处理器与存储器之间建立连接。
第十方面,提供了一种计算机可读存储介质,所述存储介质内存储有指令,当所述指令在计算机上运行时,使得计算机执行上述第一方面所述的滤波方法的步骤、上述第二方面所述的滤波方法的步骤或者执行上述第三方面所述的滤波模型训练方法的步骤。
第十一方面,提供了一种包含指令的计算机程序产品,当所述指令在计算机上运行时,使得计算机执行上述第一方面所述的滤波方法的步骤、上述第二方面所述的滤波方法的步骤或者执行上述第三方面所述的滤波模型训练方法的步骤。或者说,提供了一种计算机程序,当所述计算机程序在计算机上运行时,使得计算机执行上述第一方面所述的滤波方法的步骤、上述第二方面所述的滤波方法的步骤或者执行上述第三方面所述的滤波模型训练方法的步骤。
上述第四方面至第十一方面所获得的技术效果与第一方面、第二方面或第三方面中对应的技术手段获得的技术效果近似,在这里不再赘述。
Brief Description of the Drawings
Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of this application;
Fig. 2 is an exemplary structural block diagram of an encoding end provided by an embodiment of this application;
Fig. 3 is an exemplary structural block diagram of a decoding end provided by an embodiment of this application;
Fig. 4 is a flowchart of a filtering method provided by an embodiment of this application;
Fig. 5 is a flowchart of another filtering method provided by an embodiment of this application;
Fig. 6 is a flowchart of a filter model training method provided by an embodiment of this application;
Fig. 7 is a schematic structural diagram of a filtering apparatus provided by an embodiment of this application;
Fig. 8 is a schematic structural diagram of another filtering apparatus provided by an embodiment of this application;
Fig. 9 is a schematic structural diagram of a filter model training apparatus provided by an embodiment of this application;
Fig. 10 is a schematic structural diagram of a computer device provided by an embodiment of this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described in further detail below with reference to the accompanying drawings.
Before the filtering method provided by the embodiments of this application is explained in detail, the terms and the implementation environment involved in the embodiments of this application are introduced.
For ease of understanding, the terms involved in the embodiments of this application are explained first.
Encoding: the process of compressing an image to be encoded into a bitstream. The image is a static image, a dynamic image, or any video frame of a video.
Decoding: the process of restoring an encoded bitstream into a reconstructed image according to specific syntax rules and processing methods.
Coding block: a coding region obtained by partitioning the image to be encoded. An image can be divided into multiple coding blocks, which together make up the image. Each coding block can be encoded independently; for example, the size of a coding block is 128*128.
Quantization: the process of mapping the continuous values of a signal onto multiple discrete amplitudes. Quantization effectively reduces the value range of the signal and thus achieves better compression, and it is the root cause of distortion.
The quantization parameter (QP) is an important parameter controlling the degree of quantization and reflects how much the image is compressed. Generally, the smaller the QP, the finer the quantization, the more image detail is preserved, and the higher the coded quality, at the cost of a higher coding bit rate. The larger the QP, the coarser the quantization, the more severe the loss of image detail, the lower the coded quality, the more obvious the distortion, and the lower the coding bit rate. In other words, the quantization parameter and the coding bit rate are negatively correlated.
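As context for the QP description above, the effect of the quantization step on detail loss can be sketched with a toy uniform quantizer. This is an illustrative sketch only: the step mapping (step size doubling every 6 QP, as in H.264/HEVC-style codecs) is general background, not something defined by this application.

```python
# Toy uniform quantizer: quantization maps continuous values to discrete
# levels, and a larger QP (larger step) discards more detail, which is the
# root cause of distortion described above.

def qstep(qp):
    # Illustrative mapping: step size doubles every 6 QP (H.264/HEVC-style).
    return 2 ** ((qp - 4) / 6.0)

def quantize(value, qp):
    return round(value / qstep(qp))  # continuous value -> discrete level

def dequantize(level, qp):
    return level * qstep(qp)  # lossy reconstruction

def quant_error(value, qp):
    return abs(value - dequantize(quantize(value, qp), qp))
```

For example, the step at QP 28 is twice the step at QP 22, so the worst-case reconstruction error doubles, matching the negative correlation between QP and coded quality described above.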
Intra prediction: predicting the current coding block based on the reconstructed blocks of already-encoded coding blocks that precede it within the same image, for example the reconstructed blocks of the already-encoded blocks to the left of and above the current coding block.
Inter prediction: taking the reconstructed image of an image encoded before the current image as a reference image, and predicting the current coding block based on a reconstructed block in the reference image that is similar to the current coding block.
Next, the implementation environment involved in the embodiments of this application is introduced.
编解码技术在多媒体服务、广播、视频通信和存储等领域存在广泛的应用。在编码的过程中,将图像划分为多个不重叠的编码块,按照顺序依次对该多个编码块进行编码。在解码的过程中,按照顺序依次从码流中解析出各个重建块,进而确定重建图像。但是在某些情况下,相邻的重建块之间可能存在过度不平滑或者像素不连续的问题,导致重建图像与原始图像之间出现图像失真的现象,因此,需要对重建块进行滤波。而且,在编码端按照帧内预测方式或者帧间预测方式对编码块进行编码的情况下,为了保证后续编码块的编码质量,编码端也需要对重建块进行滤波。
请参考图1,图1是本申请实施例提供的一种实施环境的示意图。该实施环境包括源装置10、目的地装置20、链路30和存储装置40。其中,源装置10用于对图像中的各个编码块进行编码,而且在按照帧内预测方式或者帧间预测方式进行编码的过程中,还用于对编码块的重建块进行滤波。目的地装置20用于对码流进行解析,以确定重建块,还用于对重建块进行滤波。
由于源装置10用于对图像进行编码来产生码流。因此,源装置10也被称为图像编码装置,或者图像编码端。目的地装置20用于对由源装置10所产生的码流进行解码。因此,目的地装置20也被称为图像解码装置,或者图像解码端。
链路30用于接收源装置10所产生的码流,并将该码流传输给目的地装置20。存储装置40用于接收源装置10所产生的码流,并将该码流进行存储,这样的条件下,目的地装置20能够直接从存储装置40中获取码流。或者,存储装置40对应于文件服务器或能够保存由源 装置10产生的码流的另一中间存储装置,这样的条件下,目的地装置20能够经由流式传输或下载存储装置40存储的码流。
源装置10和目的地装置20均包括一个或多个处理器以及耦合到该一个或多个处理器的存储器,该存储器包括随机存取存储器(random access memory,RAM)、只读存储器(read-only memory,ROM)、带电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、快闪存储器、可用于以可由计算机存取的指令或数据结构的形式存储所要的程序代码的任何其它媒体等。例如,源装置10和目的地装置20均包括桌上型计算机、移动计算装置、笔记型(例如,膝上型)计算机、平板计算机、机顶盒、例如所谓的“智能”电话等电话手持机、电视机、相机、显示装置、数字媒体播放器、视频游戏控制台、车载计算机或其类似者。
链路30包括能够将码流从源装置10传输到目的地装置20的一个或多个媒体或装置。在一种可能的实现方式中,链路30包括能够使源装置10实时地将码流直接发送到目的地装置20的一个或多个通信媒体。在本申请实施例中,源装置10根据通信标准来调制码流,该通信标准为无线通信协议等,并且将码流发送给目的地装置20。该一个或多个通信媒体包括无线和/或有线通信媒体,例如该一个或多个通信媒体包括射频(radio frequency,RF)频谱或一个或多个物理传输线。该一个或多个通信媒体能够形成基于分组的网络的一部分,基于分组的网络为局域网、广域网或全球网络(例如,因特网)等。该一个或多个通信媒体包括路由器、交换器、基站或促进从源装置10到目的地装置20的通信的其它设备等,本申请实施例对此不做具体限定。
在一种可能的实现方式中,存储装置40用于将接收到的由源装置10发送的码流进行存储,目的地装置20能够直接从存储装置40中获取码流。这样的条件下,存储装置40包括多种分布式或本地存取的数据存储媒体中的任一者,例如,该多种分布式或本地存取的数据存储媒体中的任一者为硬盘驱动器、蓝光光盘、数字多功能光盘(digital versatile disc,DVD)、只读光盘(compact disc read-only memory,CD-ROM)、快闪存储器、易失性或非易失性存储器,或用于存储码流的任何其它合适的数字存储媒体等。
在一种可能的实现方式中,存储装置40对应于文件服务器或能够保存由源装置10产生的码流的另一中间存储装置,目的地装置20可经由流式传输或下载存储装置40存储的图像。文件服务器为能够存储码流并且将码流发送给目的地装置20的任意类型的服务器。在一种可能的实现方式中,文件服务器包括网络服务器、文件传输协议(file transfer protocol,FTP)服务器、网络附属存储(network attached storage,NAS)装置或本地磁盘驱动器等。目的地装置20能够通过任意标准数据连接(包括因特网连接)来获取码流。任意标准数据连接包括无线信道(例如,Wi-Fi连接)、有线连接(例如,数字用户线路(digital subscriber line,DSL)、电缆调制解调器等),或适合于获取存储在文件服务器上的码流的两者的组合。码流从存储装置40的传输可为流式传输、下载传输或两者的组合。
图1所示的实施环境仅为一种可能的实现方式,并且本申请实施例的技术不仅适用于图1所示的能够对图像进行编码的源装置10,以及对码流进行解码的目的地装置20,还适用于其他能够对图像进行编码和对码流进行解码的装置,本申请实施例对此不做具体限定。
在图1所示的实施环境中,源装置10包括数据源120、编码器100和输出接口140。在一些实施例中,输出接口140包括调节器/解调器(调制解调器)和/或发送器,其中发送器也 称为发射器。数据源120包括图像捕获装置(例如,摄像机等)、含有先前捕获的图像的存档、用于从图像内容提供者接收图像的馈入接口,和/或用于产生图像的计算机图形系统,或图像的这些来源的组合。
数据源120用于向编码器100发送图像,编码器100用于对接收到由数据源120发送的图像进行编码,得到码流。编码器将码流发送给输出接口。在一些实施例中,源装置10经由输出接口140将码流直接发送到目的地装置20。在其它实施例中,码流还可存储到存储装置40上,供目的地装置20以后获取并用于解码和/或显示。
在图1所示的实施环境中,目的地装置20包括输入接口240、解码器200和显示装置220。在一些实施例中,输入接口240包括接收器和/或调制解调器。输入接口240可经由链路30和/或从存储装置40接收码流,然后再发送给解码器200,解码器200用于对接收到的码流进行解码,得到重建图像。解码器将重建图像发送给显示装置220。显示装置220可与目的地装置20集成或可在目的地装置20外部。一般来说,显示装置220显示重建图像。显示装置220为多种类型中的任一种类型的显示装置,例如,显示装置220为液晶显示器(liquid crystal display,LCD)、等离子显示器、有机发光二极管(organic light-emitting diode,OLED)显示器或其它类型的显示装置。
尽管图1中未示出,但在一些方面,编码器100和解码器200可各自与音频编码器和解码器集成,且包括适当的多路复用器-多路分用器(multiplexer-demultiplexer,MUX-DEMUX)单元或其它硬件和软件,用于共同数据流或单独数据流中的音频和视频两者的编码。在一些实施例中,如果适用的话,那么MUX-DEMUX单元可符合ITU H.223多路复用器协议,或例如用户数据报协议(user datagram protocol,UDP)等其它协议。
编码器100和解码器200各自可为以下各项电路中的任一者:一个或多个微处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)、离散逻辑、硬件或其任何组合。如果部分地以软件来实施本申请实施例的技术,那么装置可将用于软件的指令存储在合适的非易失性计算机可读存储媒体中,且可使用一个或多个处理器在硬件中执行所述指令从而实施本申请实施例的技术。前述内容(包括硬件、软件、硬件与软件的组合等)中的任一者可被视为一个或多个处理器。编码器100和解码器200中的每一者都包括在一个或多个编码器或解码器中,所述编码器或所述解码器中的任一者能够集成为相应装置中的组合编码器/解码器(编码解码器)的一部分。
本申请实施例可大体上将编码器100称为将某些信息“发信号通知”或“发送”到例如解码器200的另一装置。术语“发信号通知”或“发送”可大体上指代用于对码流进行解码的语法元素和/或其它数据的传送。此传送可实时或几乎实时地发生。替代地,此通信可经过一段时间后发生,例如可在编码时在经编码位流中将语法元素存储到计算机可读存储媒体时发生,解码装置接着可在所述语法元素存储到此媒体之后的任何时间检索所述语法元素。
请参考图2,图2是本申请实施例提供的一种编码端的示例性结构框图。该编码端包括预测器、变换器、量化器、熵编码器、反量化器、反变换器、滤波器以及存储器。该预测器为帧内预测器或者帧间预测器。也即是,对于待编码的目标图像中的当前编码块来说,编码端能够通过帧内预测器对当前编码块进行帧内预测,还能够通过帧间预测器对当前编码块进 行帧间预测。当编码端对当前编码块进行帧内预测时,从存储器中获取第一参考重建块,并基于第一参考重建块,通过帧内预测器对当前编码块进行帧内预测,以得到当前编码块对应的预测块,第一参考重建块为目标图像中位于当前编码块之前已编码的编码块所对应的重建块。或者,当编码端对当前编码块进行帧间预测时,从存储器中获取第二参考重建块,进而基于第二重建块,通过帧间预测器对当前编码块进行预测,以得到当前编码块对应的预测块,第二参考重建块为目标图像之前已编码的图像中与当前编码块相似的重建块。
编码端按照上述方法通过帧内预测器或者帧间预测器确定得到当前编码块对应的预测块之后,将当前编码块与预测块之间的差值确定为残差块。然后,将残差块通过变换器进行变换,以得到变换后的残差块,进而将变换后的残差块通过量化器进行量化,以得到量化变换后的残差块。最后,通过熵编码器将量化变换后的残差块和预测指示信息编入码流,并将码流发送给解码端,该预测指示信息用于指示对当前编码块进行预测时所采用的预测模式。
为了保证与当前编码块相邻的下一个编码块的编码质量,在编码端对下一个编码块进行编码之前,还需要将量化变换后的残差块通过反量化器进行反量化,以得到变换后的残差块,进而将变换后的残差块通过反变换器进行反变换,以得到重建残差块。然后,将重建残差块与预测块相加得到当前编码块对应的重建块。编码端确定出当前编码块对应的重建块之后,通过滤波器,按照本申请实施例提供的滤波方法,对当前编码块对应的重建块进行滤波,以得到当前编码块对应的滤波块,进而将当前编码块对应的滤波块存储至存储器中,以便于对下一个编码块进行编码。
基于上文描述,QP是控制量化程度的重要参数,所以,编码端对当前编码块进行编码的过程中,需要确定当前编码块对应的QP。在实际应用中,同一张图像中不同编码块对应的QP可能相同,也可能不相同。例如,编码端将目标图像划分为多个不重叠的编码块,对于该多个编码块中的任一编码块,将目标图像的QP作为该编码块对应的QP。又例如,编码端将目标图像划分为多个不重叠的编码块,对于该多个编码块中的任一编码块,以目标图像的QP为基准,适应性地调整该编码块的QP。
其中,该多个编码块可能是大小相同的编码块,也可能是大小不相同的编码块。即,编码端将目标图像划分为大小相同的编码块,或者,编码端根据目标图像的内容,将目标图像划分为大小不相同的编码块。编码块的形状为方形,或者编码块的形状为其他形状,本申请实施例对编码块的形状不做限定。
可选地,变换器为离散余弦变换(discrete cosine transform,DCT)器、离散正弦变换(discrete sine transform,DST)器或者K-L变换(karhunen-loève transform,KLT)器中的任一种变换器。
请参考图3,图3是本申请实施例提供的一种解码端的示例性结构框图。该解码端包括熵解码器、预测器、反量化器、反变换器、存储器以及滤波器。该预测器为帧内预测器或者帧间预测器。也即是,对于目标图像来说,在编码端对目标图像中的各个编码块进行帧内预测的情况下,解码端也需要通过帧内预测器来确定预测块。在编码端对目标图像中的各个编码块进行帧间预测的情况下,解码端也需要通过帧间预测器来确定预测块。
解码端接收到码流之后,通过熵解码器对接收到的码流进行解码,以得到量化变换后的残差块和预测指示信息,该预测指示信息用于指示对当前编码块进行预测时所采用的预测模 式。然后,解码端基于该预测指示信息,确定通过帧内预测器进行预测,还是通过帧间预测器进行预测。在确定通过帧内预测器进行预测的情况下,解码端从存储器中获取第一参考重建块,并通过帧内预测器确定当前编码块对应的预测块。在确定通过帧间预测器进行预测的情况下,解码端从存储器中获取第二参考重建块,并通过帧间预测器确定当前编码块对应的预测块。然后,将量化变换后的残差块依次通过反量化器和反变换器得到重建残差块,将重建残差块与预测块相加得到当前编码块对应的重建块。
为了避免当前编码块对应的重建块与当前编码块之间出现图像失真,以及避免相邻重建块之间存在过度不平滑,或者像素不连续的问题,解码端还能够通过滤波器,按照本申请实施例提供的滤波方法,对该重建块进行滤波。
需要说明的是,本申请实施例描述的业务场景是为了更加清楚的说明本申请实施例的技术方案,并不构成对于本申请实施例提供的技术方案的限定,本领域普通技术人员可知,随着新业务场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
接下来对本申请实施例提供的滤波方法进行详细地解释说明。
图4是本申请实施例提供的一种滤波方法的流程图。该方法应用于编码端,请参考图4,该方法包括如下步骤。
步骤401:根据目标图像的量化参数,确定出K组滤波模型,该K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数。
在一些实施例中,编码端根据目标图像的量化参数,从目标对应关系中获取K个参考量化参数。由于一个量化参数对应一组滤波模型,所以,编码端能够基于该K个参考量化参数,确定出K组滤波模型。
其中,目标对应关系用于指示图像量化参数与参考量化参数之间的对应关系。作为一种示例,目标对应关系为量化参数范围与参考量化参数之间的对应关系,或者,目标对应关系为图像量化参数与参考量化参数之间的对应关系。
在目标对应关系为量化参数范围与参考量化参数之间的对应关系时,编码端先确定目标图像的量化参数所在的量化参数范围,以得到目标量化参数范围。然后,基于目标量化参数范围,从目标对应关系中获取目标量化参数范围所对应的K个参考量化参数。
例如,目标对应关系如下述表1所示。在表1中,1个量化参数范围对应3个参考量化参数。表1是以各个量化参数范围均对应3个参考量化参数为例,即各个量化参数范围对应的参考量化参数的数量是相同的。当然,在实际应用中,各个量化参数范围对应的参考量化参数的数量也可能不相同。
表1
在目标对应关系为图像量化参数与参考量化参数之间的对应关系时,编码端直接基于目 标图像的量化参数,从目标对应关系中获取目标图像的量化参数所对应的K个参考量化参数。
例如,目标对应关系如下述表2所示。在表2中,1个图像量化参数对应3个参考量化参数。表2是以各个图像量化参数均对应3个参考量化参数为例,即各个图像量化参数对应的参考量化参数的数量是相同的。当然,在实际应用中,各个图像量化参数对应的参考量化参数的数量也可能不相同。
表2
在目标对应关系为量化参数范围与参考量化参数之间的对应关系时,由于位于同一个量化参数范围内的各个量化参数所对应的参考量化参数是相同的。所以,编码端只需要存储量化参数范围,并不需要依次存储各个量化参数。这样,有利于节省编码端的存储空间,进而提高编码端确定K组滤波模型的效率。
在目标对应关系为图像量化参数与参考量化参数之间的对应关系时,由于一个图像量化参数对应K个参考量化参数,这K个参考量化参数与该图像量化参数之间的相关性更强。所以,编码端按照目标对应关系确定出的K组滤波模型与目标图像的量化参数相关性更强,从而能够进一步提高滤波效果。
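As a minimal sketch of the target correspondence lookup described above, the QP-range variant can be organized as a small range table. The ranges and reference QP values below are illustrative placeholders (the contents of Tables 1 and 2 are not reproduced in this text), and the names are assumptions for illustration:

```python
# Hypothetical target correspondence: each image-QP range maps to the K
# reference QPs that identify the K groups of filter models (here K = 3).

QP_RANGE_TABLE = [
    # (qp_low, qp_high, reference QPs for this range) -- placeholder values
    (0, 21, [17, 22, 27]),
    (22, 31, [22, 27, 32]),
    (32, 51, [32, 37, 42]),
]

def reference_qps_for_image(image_qp):
    """Return the K reference QPs whose filter-model groups apply to an
    image whose quantization parameter is image_qp."""
    for low, high, ref_qps in QP_RANGE_TABLE:
        if low <= image_qp <= high:
            return ref_qps
    raise ValueError(f"QP {image_qp} outside supported range")
```

Storing ranges rather than individual QPs is what saves storage at the encoding end, as noted above.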
由于编码块对应的量化参数决定了编码块的编码质量,即量化参数越小,编码质量越高,量化参数越大,编码质量越低。而且,同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数。所以,通过同一个量化参数编码后的多个编码块的编码质量相同,该多个编码质量相同的编码块能够通过同一组滤波模型进行滤波。通过不同的量化参数编码后的多个编码块的编码质量不同,该多个编码质量不同的编码块能够通过不同组的滤波模型进行滤波。也即是,同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块。
在一些实施例中,根据目标图像的量化参数确定出K组滤波模型之后,编码端还需要将该K组滤波模型对应的量化参数编入码流。这样,解码端接收到码流之后,能够从码流中解析出该K组滤波模型对应的量化参数,并基于该K组滤波模型对应的量化参数,确定出该K组滤波模型。
滤波模型的结构可能为卷积神经网络(convolutional neural networks,CNN)结构,或者也可能为其他的结构,本申请实施例对滤波模型的结构不做限定。
步骤402:确定目标图像中当前编码块对应的重建块。
编码端确定目标图像中当前编码块对应的重建块的过程,可以参考上述图2中的相关描述,此处不再赘述。
步骤403:从该K组滤波模型中确定目标滤波模型,目标滤波模型是指对该重建块进行滤波后编码失真最小的滤波模型,且通过目标滤波模型对该重建块进行滤波后的编码失真小于该重建块的编码失真。
在一些实施例中,编码端能够按照下述步骤(1)-(2),从该K组滤波模型中确定目标 滤波模型。
(1)基于当前编码块、该重建块和该K组滤波模型,确定该重建块对应的滤波指示信息,滤波指示信息用于指示该重建块是否需要滤波。
编码端将该重建块输入至该K组滤波模型中的每个滤波模型,以得到K*M个滤波块,基于当前编码块、该重建块和该K*M个滤波块,确定该重建块对应的率失真代价,以及每个滤波块对应的率失真代价。如果该重建块对应的率失真代价不小于每个滤波块对应的率失真代价,则确定滤波指示信息为第一指示信息,第一指示信息用于指示该重建块需要滤波。如果该重建块对应的率失真代价小于每个滤波块对应的率失真代价,则确定滤波指示信息为第二指示信息,第二指示信息用于指示该重建块不需要滤波。
作为一种示例,能够按照下述公式(1)来确定该重建块对应的率失真代价。
J=D+λR    (1)
其中,在上述公式(1)中,J代表率失真代价,D代表该重建块中像素点的像素值与当前编码块中像素点的像素值之间的误差,λ代表失真参数,通常为默认值,R代表不需要对该重建块进行滤波时将当前编码块编入码流所需的比特数。当然,在实际应用中,也能够按照上述公式(1)来确定每个滤波块对应的率失真代价。此时,上述公式(1)中的D代表该滤波块中像素点的像素值与当前编码块中像素点的像素值之间的误差,R代表需要对该重建块进行滤波时将当前编码块编入码流所需的比特数。
在不需要对该重建块进行滤波时,将当前编码块编入码流所需的比特数包括编码滤波指示信息所需的比特数、编码量化变换后的残差块所需的比特数,以及编码预测指示信息所需的比特数。在需要对该重建块进行滤波时,将当前编码块编入码流所需的比特数包括编码滤波指示信息所需的比特数、编码量化变换后的残差块所需的比特数、编码预测指示信息所需的比特数,以及编码滤波模型索引所需的比特数。
在一些实施例中,编码端存储有滤波模型索引与编码滤波模型索引所需的比特数之间的对应关系。所以,在编码端确定出该K*M个滤波模型之后,能够基于该K*M个滤波模型的模型索引,从存储的滤波模型索引与编码滤波模型索引所需的比特数之间的对应关系中,获取编码该K*M个滤波模型索引所需的比特数,进而按照上述公式(1)确定每个滤波块对应的率失真代价。
以上内容是以编码不同的滤波模型索引所需的比特数不同为例,即不同的滤波模型索引对应不同的编码比特数。当然,在实际应用中,编码不同的滤波模型索引所需的比特数也可能相同,即不同的滤波模型索引对应相同的编码比特数。这样,在按照上述公式(1)确定每个滤波块对应的率失真代价时,该K*M个滤波块对应的率失真代价主要取决于该K*M个滤波块与当前编码块之间的像素误差。
需要说明的是,在按照上述公式(1)确定该重建块对应的率失真代价,以及每个滤波块对应的率失真代价时,该重建块中像素点的像素值与当前编码块中像素点的像素值之间的误差,以及该K*M个滤波块中像素点的像素值与当前编码块中像素点的像素值之间的误差为绝对误差和(sum of absolute differences,SAD)、绝对变换误差和(sum of absolute transformed differences,SATD)或者均方差(mean squared error,MSE)中的任一种。
由于率失真代价用于指示重建块与原始编码块之间图像失真的程度,以及滤波块与原始编码块之间图像失真的程度。如果该重建块对应的率失真代价小于每个滤波块对应的率失真 代价,则表明该重建块与原始编码块之间的图像失真最小,这样,基于该重建块复原的重建图像与原始图像之间的图像失真最小。此时,不需要对该重建块进行滤波。如果该重建块对应的率失真代价不小于每个滤波块对应的率失真代价,则表明滤波块与原始编码块之间的图像失真最小,这样,基于滤波块复原的重建图像与原始图像之间的图像失真最小。此时,需要对该重建块进行滤波。
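The rate-distortion comparison around Eq. (1) can be sketched as follows. This is an illustrative Python sketch, not the codec's implementation: blocks are flat lists of pixel values, SAD stands in for the distortion D, and lambda and the bit counts R are placeholder inputs.

```python
def sad(a, b):
    # Sum of absolute differences, one of the distortion measures for D.
    return sum(abs(x - y) for x, y in zip(a, b))

def rd_cost(orig, candidate, bits, lam):
    # Rate-distortion cost J = D + lambda * R from Eq. (1).
    return sad(orig, candidate) + lam * bits

def filtering_decision(orig, recon, filtered_blocks, bits_off, bits_on, lam=0.5):
    """Return (needs_filtering, best_model_index).

    Filtering is signalled unless the unfiltered reconstruction strictly
    beats every one of the K*M filtered blocks in RD cost."""
    j_recon = rd_cost(orig, recon, bits_off, lam)
    j_filtered = [rd_cost(orig, f, bits_on, lam) for f in filtered_blocks]
    best = min(range(len(j_filtered)), key=j_filtered.__getitem__)
    return j_recon >= j_filtered[best], best
```

In the sketch, the boolean result corresponds to the first/second indication information described above, and the best index identifies the candidate for the target filter model.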
第一指示信息和第二指示信息的形式包括多种,比如数值、字符等等。在第一指示信息和第二指示信息为数值的情况下,第一指示信息为0,第二指示信息为1。当然,第一指示信息和第二指示信息也可以反过来,或者为其他的数值,本申请实施例对此不做限定。
在一些实施例中,编码端基于当前编码块、该重建块和该K组滤波模型,确定出该重建块对应的滤波指示信息之后,还需要将滤波指示信息编入码流。这样,解码端接收到码流之后,能够基于码流确定该重建块是否需要进行滤波。
(2)在滤波指示信息指示该重建块需要滤波的情况下,从该K组滤波模型中确定目标滤波模型。
基于上文描述,以率失真代价指示编码失真为例。在滤波指示信息指示该重建块需要滤波的情况下,比较每个滤波块对应的率失真代价,将率失真代价最小的滤波块所对应的滤波模型,确定为目标滤波模型。当然,在实际应用中,还能够通过其他的参数指示编码失真,本申请实施例对此不做限定。
在一些实施例中,编码端从该K组滤波模型中确定出目标滤波模型之后,还需要将目标索引编入码流,目标索引用于指示目标滤波模型。这样,解码端接收到码流之后,能够基于码流确定对该重建块进行滤波的目标滤波模型。
作为一种示例,为了区分不同的滤波模型,每个滤波模型对应一个模型索引,不同的滤波模型对应不同的模型索引。此时,目标索引包括目标模型索引,目标模型索引用于指示K组滤波模型中的目标滤波模型。
作为另一种示例,由于同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数。所以,为了区分不同的滤波模型,同一组滤波模型对应同一个质量索引,不同组滤波模型对应不同的质量索引。同一组滤波模型中不同的滤波模型对应不同的内容索引,不同组的滤波模型可能存在相同的内容索引。此时,目标索引包括目标质量索引和目标内容索引。目标质量索引用于指示目标滤波模型所属的一组滤波模型,目标内容索引用于指示目标滤波模型为这组滤波模型中的哪个模型。
需要说明的是,以上内容是以滤波指示信息指示该重建块需要滤波的情况为例。当然,在实际应用中,滤波指示信息还可能指示该重建块不需要滤波。在滤波指示信息指示该重建块不需要滤波的情况下,不对该重建块进行滤波。
步骤404:基于目标滤波模型,对该重建块进行滤波。
将该重建块输入至目标滤波模型,目标滤波模型按照相关算法输出滤波块,以此来实现对该重建块进行滤波。
在本申请实施例中,由于每组滤波模型包括M个滤波模型,同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,也即是,同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块,而且同一组滤波模型中的不同滤波模型适用于不同内容的编码块。这样,在基于目标图像的量化参数确定出K组滤波 模型之后,对于当前编码块对应的重建块,能够结合编码质量和编码块的内容,从该K组滤波模型中选择出目标滤波模型,进而基于目标滤波模型对重建块进行滤波,从而减小编码失真,提高滤波性能。而且,对于同一个图像中不同编码质量、不同内容的编码块来说,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。
图5是本申请实施例提供的另一种滤波方法的流程图。该方法应用于解码端,请参考图5,该方法包括如下步骤。
步骤501:确定出K组滤波模型,该K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数。
在一些实施例中,解码端根据该重建块所属的目标图像的量化参数,确定出K组滤波模型。详细实现过程参考上述步骤401的相关描述,此处不再赘述。
在另一些实施例中,由于编码端根据目标图像的量化参数,确定出K组滤波模型之后,还将该K组滤波模型对应的量化参数编入码流。所以,解码端接收到码流之后,能够从码流中解析出该K组滤波模型对应的量化参数,进而基于该K组滤波模型对应的量化参数,确定出该K组滤波模型。
步骤502:基于码流确定重建块。
解码端接收到码流之后,从该码流中解析出当前编码块对应的重建块。解码端从码流中解析出当前编码块对应的重建块的过程,可以参考上述图3中的相关描述,此处不再赘述。
步骤503:确定该K组滤波模型中的目标滤波模型。
在一些实施例中,解码端确定该重建块的滤波指示信息,滤波指示信息用于指示该重建块是否需要滤波。在滤波指示信息指示该重建块需要滤波的情况下,确定该K组滤波模型中的目标滤波模型。
由于编码端基于当前编码块、该重建块和该K组滤波模型,确定出该重建块对应的滤波指示信息之后,还将滤波指示信息编入码流。所以,解码端接收到该码流之后,能够从该码流中解析出滤波指示信息,进而基于滤波指示信息判断是否需要对该重建块进行滤波。
在滤波指示信息指示该重建块需要滤波的情况下,由于编码端从该K组滤波模型中确定目标滤波模型之后,还将用于指示目标滤波模型的目标索引编入码流。所以,解码端还能够从码流中解析出目标索引,进而基于目标索引确定目标滤波模型。
基于上文描述,目标索引包括目标模型索引,或者,包括目标质量索引和目标内容索引。在不同的情况下,解码端基于目标索引确定目标滤波模型的过程有所不同,因此接下来将分为以下两种情况分别进行说明。
第一种情况,目标索引包括目标模型索引。此时,解码端直接基于目标模型索引,从该K组滤波模型中选择对应的滤波模型,并将选择出的滤波模型确定为目标滤波模型。
第二种情况,目标索引包括目标质量索引和目标内容索引。此时,解码端先基于目标质量索引,从该K组滤波模型中选择对应的一组滤波模型。然后,基于目标内容索引从选择出的这组滤波模型中,确定目标内容索引所对应的滤波模型,以得到目标滤波模型。
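The two index schemes above can be sketched as follows; the group layout and model names are illustrative assumptions, not the patent's bitstream syntax.

```python
# K groups of M filter models; each model is represented here by a name.
K, M = 3, 4
model_groups = [[f"model_q{q}_c{c}" for c in range(M)] for q in range(K)]

def select_by_model_index(model_index):
    # Case 1: a single flat index over all K*M models.
    return model_groups[model_index // M][model_index % M]

def select_by_quality_content(quality_index, content_index):
    # Case 2: the quality index picks the group, the content index picks
    # the model within that group.
    return model_groups[quality_index][content_index]
```

Both schemes address the same K*M models; in case 2 the quality index identifies the group (i.e., the quantization parameter) and the content index identifies the model within it.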
步骤504:基于目标滤波模型,对该重建块进行滤波。
将该重建块输入至目标滤波模型,目标滤波模型按照相关算法输出滤波块,以此来实现对该重建块进行滤波。
在本申请实施例中,由于每组滤波模型包括M个滤波模型,同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,也即是,同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块,而且同一组滤波模型中的不同滤波模型适用于不同内容的编码块。这样,在基于目标图像的量化参数确定出K组滤波模型之后,对于当前编码块对应的重建块,能够结合编码质量和编码块的内容,从该K组滤波模型中选择出目标滤波模型,进而基于目标滤波模型对重建块进行滤波,从而减小编码失真,提高滤波性能。而且,对于同一个图像中不同编码质量、不同内容的编码块来说,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。
在编码端和解码端按照上述步骤的相关内容对重建块进行滤波之前,还需要对未经训练的滤波模型进行训练,以得到一个量化参数对应的一组滤波模型,该组滤波模型包括M个滤波模型。图6是本申请实施例提供的一种滤波模型训练方法的流程图,请参考图6,该方法包括如下步骤。
步骤601:获取训练样本集,该训练样本集包括多个样本编码块和每个样本编码块对应的重建块,该多个样本编码块所属的图像的量化参数为同一个量化参数。
该多个样本编码块为从多个样本图像划分得到的,或者,该多个样本编码块为从一个样本图像划分得到的。即,该多个样本编码块可能来自同一个样本图像,也可能来自不同的样本图像,只要该多个样本编码块所属的图像的量化参数相同即可。由于该多个样本编码块是将图像划分为多个不重叠的编码块得到的,所以,该多个样本编码块的内容不同。
其中,获取每个样本编码块对应的重建块的过程,可以参考上述图2中编码端确定当前编码块对应的重建块的相关描述,此处不再赘述。
步骤602:基于该训练样本集,对待训练的滤波模型进行训练,以得到初始滤波模型。
将该训练样本集包括的多个样本编码块对应的重建块作为待训练的滤波模型的输入,将该多个样本编码块作为待训练的滤波模型的输出,对待训练的滤波模型进行训练,以得到初始滤波模型。
步骤603:将该训练样本集划分为M个初始样本子集,每个初始样本子集包括至少两个样本编码块和该至少两个样本编码块对应的重建块。
将该多个样本编码块对应的重建块输入至初始滤波模型,以得到每个样本编码块对应的滤波块,基于该多个样本编码块和每个样本编码块对应的滤波块,确定每个样本编码块对应的滤波块的峰值信噪比,按照峰值信噪比的大小顺序,对该多个样本编码块进行排序,按照排序结果,将该训练样本集划分为M个初始样本子集,每个初始样本子集包括的样本编码块为排序结果中连续的至少两个样本编码块。
对于该多个样本编码块中的任一样本编码块，按照下述公式（2）来确定该样本编码块对应的滤波块的峰值信噪比。
PSNR=10·log10((2^n-1)^2/MSE)    (2)
其中,在上述公式(2)中,PSNR代表该样本编码块对应的滤波块的峰值信噪比,n代表该样本编码块中编码每个像素所需的比特数,通常为8,MSE代表该样本编码块中像素点的像素值与对应的滤波块中像素点的像素值之间的均方差。
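A minimal sketch of Eq. (2), assuming blocks are flat lists of integer pixel values and n bits per pixel:

```python
import math

def psnr(orig_block, filtered_block, n=8):
    """Peak signal-to-noise ratio per Eq. (2):
    PSNR = 10 * log10((2**n - 1)**2 / MSE)."""
    diffs = [(a - b) ** 2 for a, b in zip(orig_block, filtered_block)]
    mse = sum(diffs) / len(diffs)
    peak = (2 ** n - 1) ** 2
    return float("inf") if mse == 0 else 10 * math.log10(peak / mse)
```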
示例地,按照该多个样本编码块的排序结果,将该训练样本集平均划分为M个初始样本子集,每个初始样本子集包括的样本编码块的数量相同。当然,在实际应用中,按照峰值信噪比的大小顺序,对该多个样本编码块进行排序之后,还能够按照其他的标准将该训练样本集划分为M个初始样本子集,本申请实施例对此不做限定。
例如,该训练样本集包括16个样本编码块和每个样本编码块对应的重建块。假设,该16个样本编码块为B0-B15,该16个样本编码块对应的重建块为C0-C15。将C0-C15这16个重建块分别输入至初始滤波模型,得到的16个滤波块为L0-L15,进而按照上述公式(2)分别确定L0-L15的峰值信噪比,得到的16个峰值信噪比为PSNR0-PSNR15。然后,按照PSNR0-PSNR15的大小顺序,将B0-B15进行排序,并按照B0-B15的排序结果,将B0-B15平均划分为4个初始样本子集,每个初始样本子集包括4个样本编码块。
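The sort-and-split step in the example above can be sketched as follows (illustrative; it assumes the number of sample coding blocks divides evenly into M subsets):

```python
def split_by_psnr(block_ids, psnr_of, m):
    """Sort sample coding blocks by the PSNR their filtered outputs achieve
    under the initial filter model, then cut the sorted list into m equal
    contiguous initial sample subsets."""
    order = sorted(block_ids, key=lambda b: psnr_of[b])
    size = len(order) // m
    return [order[i * size:(i + 1) * size] for i in range(m)]
```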
需要说明的是,按照峰值信噪比的大小顺序,对该多个样本编码块进行排序,进而将该训练样本集划分为M个初始样本子集为一种示例。在另一些实施例中,还能够按照其他的方式,将该训练样本集划分为M个初始样本子集。例如,确定每个样本编码块对应的像素均值,该像素均值是指该样本编码块中像素点的像素值的平均值。然后,按照像素均值的大小顺序,对该多个样本编码块进行排序,按照排序结果,将该训练样本集划分为M个初始样本子集。又例如,确定每个样本编码块对应的像素方差,该像素方差是指该样本编码块中像素点的像素值的方差。然后,按照像素方差的大小顺序,对该多个样本编码块进行排序,按照排序结果,将该训练样本集划分为M个初始样本子集。
步骤604:基于该M个初始样本子集,对初始滤波模型分别进行训练,以得到M个优化滤波模型。
对于该M个初始样本子集中的任一初始样本子集,将该初始样本子集包括的至少两个样本编码块对应的重建块作为初始滤波模型的输入,将该至少两个样本编码块作为初始滤波模型的输出,对初始滤波模型进行训练,以得到一个优化滤波模型。这样,对于该M个初始样本子集中的每一个初始样本子集,均能够按照上述步骤对初始滤波模型进行训练,从而得到M个优化滤波模型。
基于上文描述,假设将该训练样本集B0-B15平均划分为4个初始样本子集,那么基于该4个初始样本子集,对初始滤波模型分别进行训练,能够得到滤波模型A、滤波模型B、滤波模型C以及滤波模型D这4个优化滤波模型。
步骤605:基于该训练样本集,对该M个优化滤波模型进行训练,以得到一组滤波模型。
基于该训练样本集,通过循环迭代方式,对该M个优化滤波模型进行训练。其中,循环迭代方式中的第i次迭代处理包括如下步骤:
(1)基于该多个样本编码块和每个样本编码块对应的重建块,将该训练样本集划分为M个优化样本子集,该M个优化样本子集与第i次迭代处理的M个滤波模型一一对应,其中, 第一次迭代处理的M个滤波模型为该M个优化滤波模型。
将该多个样本编码块对应的重建块输入至第i次迭代处理的M个滤波模型,以得到每个样本编码块对应的M个滤波块,基于该多个样本编码块和每个样本编码块对应的M个滤波块,确定每个样本编码块对应的M个滤波块的峰值信噪比,基于每个样本编码块对应的M个滤波块的峰值信噪比,将该训练样本集划分为该M个优化样本子集,其中,每个样本编码块位于其对应的M个滤波块中最大峰值信噪比的滤波块所对应的滤波模型的优化样本子集中。
其中,确定每个样本编码块对应的M个滤波块的峰值信噪比的过程,可以参考上述步骤603中按照公式(2)来确定峰值信噪比的相关描述,此处不再赘述。对于该多个样本编码块中的任一样本编码块,确定该样本编码块对应的M个滤波块的峰值信噪比中最大峰值信噪比对应的滤波模型,进而将该样本编码块划分至该滤波模型对应的优化样本子集中。
基于上文描述,该训练样本集包括16个样本编码块和每个样本编码块对应的重建块。假设,该16个样本编码块对应的重建块为C0-C15。以该16个重建块中的重建块C0为例,将重建块C0分别输入至第i次迭代处理的4个滤波模型中,4个滤波模型输出样本编码块B0对应的4个滤波块为L0A、L0B、L0C和L0D。假设,按照上述公式(2)确定出样本编码块B0对应的4个峰值信噪比为PSNR0A、PSNR0B、PSNR0C和PSNR0D。其中,峰值信噪比PSNR0C最大,则将样本编码块B0划分至滤波模型C对应的优化样本子集中。
(2)基于该M个优化样本子集,对第i次迭代处理的M个滤波模型进行训练。
对于该M个优化样本子集中的任一优化样本子集,将该优化样本子集包括的样本编码块对应的重建块作为相应的滤波模型的输入,将该样本编码块作为相应的滤波模型的输出,对相应的滤波模型进行训练。
(3)如果i小于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型作为第i+1次迭代处理的M个滤波模型,并执行第i+1次迭代处理。
(4)如果i大于或等于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型确定为一组滤波模型。
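Steps (1) to (4) form a clustering-style loop: assign each sample to the model that filters it best, then retrain each model on its assigned samples. The toy sketch below replaces each neural filter model with a single scalar correction so the loop structure is runnable; it is not the patent's actual training procedure.

```python
# Toy version of the iterative training: models are scalar corrections,
# assignment picks the model with the smallest error (i.e., the highest
# PSNR), and "training" refits each model on its optimized sample subset.

def train_groups(samples, models, num_iters=10):
    for _ in range(num_iters):  # num_iters plays the role of the iteration threshold
        # Step (1): partition samples into one optimized subset per model.
        subsets = [[] for _ in models]
        for s in samples:
            best = min(range(len(models)), key=lambda k: abs(s - models[k]))
            subsets[best].append(s)
        # Step (2): retrain each model on its subset; a model with an empty
        # subset stops updating, mirroring the single-subset case noted in
        # the description.
        for k, subset in enumerate(subsets):
            if subset:
                models[k] = sum(subset) / len(subset)
    return models
```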
本申请实施例是通过循环迭代的方式对该M个优化滤波模型进行训练,在该M个滤波模型的迭代次数i小于迭代次数阈值时,表明当前训练得到的优化滤波模型不可靠,则将经训练的第i次迭代处理的M个滤波模型作为第i+1次迭代处理的M个滤波模型,继续执行第i+1次迭代处理。在该M个滤波模型的迭代次数i大于或等于迭代次数阈值时,表明当前训练得到的优化滤波模型可靠,则停止迭代处理,并将经训练的第i次迭代处理的M个滤波模型作为一组滤波模型。
其中,迭代次数阈值是事先设置的,该迭代次数阈值为指定迭代次数,或者为最大迭代次数,能够按照不同的需求来设置,本申请实施例对此不做限定。
需要说明的是,在基于每个样本编码块对应的M个滤波块的峰值信噪比,将该训练样本集划分为M个优化样本子集的过程中,可能存在将该训练样本集只划分为1个优化样本子集的情况。即,该训练样本集中每个样本编码块对应的M个滤波块的峰值信噪比中最大峰值信噪比均对应同一个滤波模型。此时,基于划分得到的1个优化样本子集,对第i次迭代处理的该滤波模型进行训练,其他滤波模型则停止迭代处理。
在本申请实施例中,由于训练样本集包括的多个样本编码块所属的图像的量化参数相同,所以,基于该训练样本集对未经训练的滤波模型进行训练,得到的一组滤波模型适用于相同 编码质量的编码块。此外,由于训练样本集包括的多个样本编码块的内容不同,所以,该组滤波模型包括的M个滤波模型适用于不同内容的编码块。这样,在基于目标图像的量化参数确定出K组滤波模型之后,对于当前编码块对应的重建块,能够结合编码质量和编码块的内容,从该K组滤波模型中选择出目标滤波模型,进而基于目标滤波模型对重建块进行滤波,从而减小编码失真,提高滤波性能。而且,对于同一个图像中不同编码质量、不同内容的编码块来说,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。
图7是本申请实施例提供的一种滤波装置的结构示意图,该滤波装置可以由软件、硬件或者两者的结合实现成为编码端设备的部分或者全部,该编码端设备可以为图1所示的源装置。参见图7,该装置包括:第一确定模块701、第二确定模块702、第三确定模块703和第一滤波模块704。
第一确定模块701,用于根据目标图像的量化参数,确定出K组滤波模型,该K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第二确定模块702,用于确定目标图像中当前编码块对应的重建块。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第三确定模块703,用于从该K组滤波模型中确定目标滤波模型,目标滤波模型是指对该重建块进行滤波后编码失真最小的滤波模型,且通过目标滤波模型对该重建块进行滤波后的编码失真小于该重建块的编码失真。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第一滤波模块704,用于基于目标滤波模型,对该重建块进行滤波。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
可选地,第三确定模块703包括:
第一确定单元,用于基于当前编码块、该重建块和该K组滤波模型,确定该重建块对应的滤波指示信息,滤波指示信息用于指示该重建块是否需要滤波;
第二确定单元,用于在滤波指示信息指示该重建块需要滤波的情况下,从该K组滤波模型中确定目标滤波模型。
可选地,第一确定单元具体用于:
将该重建块输入至该K组滤波模型中的每个滤波模型,以得到K*M个滤波块;
基于当前编码块、该重建块和该K*M个滤波块,确定该重建块对应的率失真代价,以及每个滤波块对应的率失真代价;
如果该重建块对应的率失真代价不小于每个滤波块对应的率失真代价,则确定滤波指示信息为第一指示信息,第一指示信息用于指示该重建块需要滤波;
如果该重建块对应的率失真代价小于每个滤波块对应的率失真代价,则确定滤波指示信息为第二指示信息,第二指示信息用于指示该重建块不需要滤波。
可选地,该装置还包括:
第二滤波模块,用于在滤波指示信息指示该重建块不需要滤波的情况下,不对该重建块 进行滤波。
可选地,第三确定模块703还包括:
编码单元,用于将滤波指示信息编入码流。
可选地,该装置还包括:
第一编码模块,用于将目标索引编入码流,目标索引用于指示目标滤波模型。
可选地,该装置还包括:
第二编码模块,用于将该K组滤波模型对应的量化参数编入码流。
在本申请实施例中,由于每组滤波模型包括M个滤波模型,同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,也即是,同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块,而且同一组滤波模型中的不同滤波模型适用于不同内容的编码块。这样,在基于目标图像的量化参数确定出K组滤波模型之后,对于当前编码块对应的重建块,能够结合编码质量和编码块的内容,从该K组滤波模型中选择出目标滤波模型,进而基于目标滤波模型对重建块进行滤波,从而减小编码失真,提高滤波性能。而且,对于同一个图像中不同编码质量、不同内容的编码块来说,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。
需要说明的是:上述实施例提供的滤波装置在进行滤波时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的滤波装置与滤波方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图8是本申请实施例提供的另一种滤波装置的结构示意图,该滤波装置可以由软件、硬件或者两者的结合实现成为解码端设备的部分或者全部,该解码端设备可以为图1所示的目的装置。参见图8,该装置包括:第一确定模块801、第二确定模块802、第三确定模块803和滤波模块804。
第一确定模块801,用于确定出K组滤波模型,该K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第二确定模块802,用于基于码流确定重建块。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第三确定模块803,用于确定该K组滤波模型中的目标滤波模型。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
滤波模块804,用于基于目标滤波模型,对该重建块进行滤波。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
可选地,第一确定模块801具体用于:
根据该重建块所属的目标图像的量化参数,确定出该K组滤波模型。
可选地,第一确定模块801具体用于:
从码流中解析出该K组滤波模型对应的量化参数;
基于该K组滤波模型对应的量化参数,确定出该K组滤波模型。
可选地,第三确定模块803包括:
第一确定单元,用于确定该重建块的滤波指示信息,滤波指示信息用于指示该重建块是否需要滤波;
第二确定单元,用于在滤波指示信息指示该重建块需要滤波的情况下,确定该K组滤波模型中的目标滤波模型。
可选地,第一确定单元具体用于:
从码流中解析出滤波指示信息。
可选地,第三确定模块803具体用于:
从码流中解析出目标索引,目标索引用于指示目标滤波模型;
基于目标索引确定目标滤波模型。
在本申请实施例中,由于每组滤波模型包括M个滤波模型,同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,也即是,同一组滤波模型适用于相同编码质量的编码块,不同组的滤波模型适用于不同编码质量的编码块,而且同一组滤波模型中的不同滤波模型适用于不同内容的编码块。这样,在基于目标图像的量化参数确定出K组滤波模型之后,对于当前编码块对应的重建块,能够结合编码质量和编码块的内容,从该K组滤波模型中选择出目标滤波模型,进而基于目标滤波模型对重建块进行滤波,从而减小编码失真,提高滤波性能。而且,对于同一个图像中不同编码质量、不同内容的编码块来说,能够在简化网络模型的基础上提高滤波性能,满足同一图像中不同质量、不同内容的编码块的滤波效果。
需要说明的是:上述实施例提供的滤波装置在进行滤波时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的滤波装置与滤波方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图9是本申请实施例提供的一种滤波模型训练装置的结构示意图,该滤波模型训练装置可以由软件、硬件或者两者的结合实现成为滤波模型训练设备的部分或者全部。参见图9,该装置包括:获取模块901、第一训练模块902、划分模块903、第二训练模块904和第三训练模块905。
获取模块901,用于获取训练样本集,该训练样本集包括多个样本编码块和每个样本编码块对应的重建块,该多个样本编码块所属的图像的量化参数为同一个量化参数。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第一训练模块902,用于基于该训练样本集,对待训练的滤波模型进行训练,以得到初始滤波模型。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
划分模块903,用于将该训练样本集划分为M个初始样本子集,每个初始样本子集包括至少两个样本编码块和该至少两个样本编码块对应的重建块。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第二训练模块904,用于基于该M个初始样本子集,对初始滤波模型分别进行训练,以得到M个优化滤波模型。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
第三训练模块905,用于基于该训练样本集,对该M个优化滤波模型进行训练,以得到一组滤波模型。详细实现过程参考上述各个实施例中对应的内容,此处不再赘述。
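上述“初始训练—划分子集分别优化—迭代精调”的整体流程可以用如下极简的 Python 骨架示意（`train_fn`、`split_initial`、`refine` 均为假设性的占位接口，仅用于说明各训练模块之间的数据流向）：

```python
import copy

def train_group(train_set, base_model, M, train_fn, split_initial, refine):
    """训练得到一组（M 个）滤波模型的流程骨架（示意）。"""
    # 第一训练模块：基于训练样本集训练得到初始滤波模型
    initial = train_fn(base_model, train_set)
    # 划分模块：将训练样本集划分为 M 个初始样本子集（例如按 PSNR 排序划分）
    subsets = split_initial(initial, train_set, M)
    # 第二训练模块：基于各初始样本子集分别训练，得到 M 个优化滤波模型
    optimized = [train_fn(copy.deepcopy(initial), s) for s in subsets]
    # 第三训练模块：基于整个训练样本集迭代精调，得到一组滤波模型
    return refine(optimized, train_set)
```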
可选地,划分模块903具体用于:
将该多个样本编码块对应的重建块输入至初始滤波模型,以得到每个样本编码块对应的滤波块;
基于该多个样本编码块和每个样本编码块对应的滤波块,确定每个样本编码块对应的滤波块的峰值信噪比;
按照峰值信噪比的大小顺序,对该多个样本编码块进行排序;
按照排序结果,将该训练样本集划分为M个初始样本子集,每个初始样本子集包括的样本编码块为排序结果中连续的至少两个样本编码块。
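划分模块903的“按 PSNR 排序后连续切分”步骤可以用如下 Python 示意（假设样本块简化为等长像素列表、按 PSNR 升序排列；真实实现中的块表示与排序方向以具体实施例为准）：

```python
import math

def psnr(orig, filtered, peak=255.0):
    """峰值信噪比（简化为等长像素列表的情形）。"""
    mse = sum((o - f) ** 2 for o, f in zip(orig, filtered)) / len(orig)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

def split_by_psnr(blocks, recons, model, M):
    """按初始滤波模型输出的 PSNR 排序后，连续切分为 M 个初始样本子集。"""
    # 将每个样本编码块对应的重建块输入初始滤波模型，按滤波块 PSNR 排序
    scored = sorted(zip(blocks, recons),
                    key=lambda br: psnr(br[0], model(br[1])))
    n = len(scored)
    size = n // M
    # 每个初始样本子集由排序结果中连续的样本构成（最后一个子集吸收余数）
    return [scored[i * size: n if i == M - 1 else (i + 1) * size]
            for i in range(M)]
```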
可选地,第三训练模块905具体用于:
基于该训练样本集,通过循环迭代方式,对该M个优化滤波模型进行训练;其中,循环迭代方式中的第i次迭代处理包括如下步骤:
基于该多个样本编码块和每个样本编码块对应的重建块,将该训练样本集划分为M个优化样本子集,该M个优化样本子集与第i次迭代处理的M个滤波模型一一对应,其中,第一次迭代处理的M个滤波模型为该M个优化滤波模型;
基于该M个优化样本子集,对第i次迭代处理的M个滤波模型进行训练;
如果i小于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型作为第i+1次迭代处理的M个滤波模型,并执行第i+1次迭代处理;
如果i大于或等于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型确定为一组滤波模型。
可选地,第三训练模块905具体用于:
将该多个样本编码块对应的重建块输入至第i次迭代处理的M个滤波模型,以得到每个样本编码块对应的M个滤波块;
基于该多个样本编码块和每个样本编码块对应的M个滤波块,确定每个样本编码块对应的M个滤波块的峰值信噪比;
基于每个样本编码块对应的M个滤波块的峰值信噪比,将该训练样本集划分为该M个优化样本子集,其中,每个样本编码块位于其对应的M个滤波块中最大峰值信噪比的滤波块所对应的滤波模型的优化样本子集中。
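第三训练模块905的循环迭代（每轮按“哪个模型滤波后 PSNR 最高”重划样本子集，再分别训练对应模型，类似 EM 式的交替划分/训练）可以用如下假设性 Python 示意（`train_fn`、`psnr_fn` 为说明用占位接口）：

```python
def refine_models(blocks, recons, models, train_fn, psnr_fn, max_iter):
    """循环迭代训练 M 个滤波模型（示意）。"""
    for _ in range(max_iter):
        # 将训练样本集划分为 M 个优化样本子集，与 M 个滤波模型一一对应
        subsets = [[] for _ in models]
        for blk, rec in zip(blocks, recons):
            scores = [psnr_fn(blk, m(rec)) for m in models]   # M 个滤波块的 PSNR
            # 样本归入 PSNR 最大的滤波块所对应模型的子集
            subsets[scores.index(max(scores))].append((blk, rec))
        # 基于各优化样本子集分别训练对应的滤波模型
        models = [train_fn(m, s) for m, s in zip(models, subsets)]
    return models
```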
在本申请实施例中，由于训练样本集包括的多个样本编码块所属的图像的量化参数相同，所以，基于该训练样本集对未经训练的滤波模型进行训练，得到的一组滤波模型适用于相同编码质量的编码块。此外，由于训练样本集包括的多个样本编码块的内容不同，所以，该组滤波模型包括的M个滤波模型适用于不同内容的编码块。这样，在基于目标图像的量化参数确定出K组滤波模型之后，对于当前编码块对应的重建块，能够结合编码质量和编码块的内容，从该K组滤波模型中选择出目标滤波模型，进而基于目标滤波模型对重建块进行滤波，从而减小编码失真，提高滤波性能。而且，对于同一个图像中不同编码质量、不同内容的编码块来说，能够在简化网络模型的基础上提高滤波性能，满足同一图像中不同质量、不同内容的编码块的滤波效果。
需要说明的是:上述实施例提供的滤波模型训练装置在进行滤波模型训练时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的滤波模型训练装置与滤波模型训练方法实施例属于同一构思,其具体实现过程详见方法实施例,这里不再赘述。
图10为用于本申请实施例的一种计算机设备1000的示意性框图。其中,该计算机设备1000可以包括处理器1001、存储器1002和总线系统1003。其中,处理器1001和存储器1002通过总线系统1003相连,该存储器1002用于存储指令,该处理器1001用于执行该存储器1002存储的指令,以执行本申请实施例描述的滤波方法以及滤波模型训练方法。为避免重复,这里不再详细描述。
在本申请实施例中,该处理器1001可以是中央处理单元(central processing unit,CPU),该处理器1001还可以是其他通用处理器、DSP、ASIC、FPGA或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
该存储器1002可以包括ROM设备或者RAM设备。任何其他适宜类型的存储设备也可以用作存储器1002。存储器1002可以包括由处理器1001使用总线1003访问的代码和数据10021。存储器1002可以进一步包括操作系统10023和应用程序10022，该应用程序10022包括允许处理器1001执行本申请实施例描述的滤波方法或滤波模型训练方法的至少一个程序。例如，应用程序10022可以包括应用1至N，其进一步包括执行本申请实施例中描述的滤波方法或滤波模型训练方法的应用。
该总线系统1003除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都标为总线系统1003。
可选地,计算机设备1000还可以包括一个或多个输出设备,诸如显示器1004。在一个示例中,显示器1004可以是触感显示器,其将显示器与可操作地感测触摸输入的触感单元合并。显示器1004可以经由总线1003连接到处理器1001。
需要指出的是,计算机设备1000可以执行本申请实施例中的滤波方法,也可执行本申请实施例中的滤波模型训练方法。
本领域技术人员能够领会,结合本文公开描述的各种说明性逻辑框、模块和算法步骤所描述的功能可以硬件、软件、固件或其任何组合来实施。如果以软件来实施,那么各种说明性逻辑框、模块、和步骤描述的功能可作为一或多个指令或代码在计算机可读媒体上存储或传输,且由基于硬件的处理单元执行。计算机可读媒体可包含计算机可读存储媒体,其对应于有形媒体,例如数据存储媒体,或包括任何促进将计算机程序从一处传送到另一处的媒体(例如,基于通信协议)的通信媒体。以此方式,计算机可读媒体大体上可对应于(1)非暂时性的有形计算机可读存储媒体,或(2)通信媒体,例如信号或载波。数据存储媒体可为可由一或多个计算机或一或多个处理器存取以检索用于实施本申请中描述的技术的指令、代码和/或数据结构的任何可用媒体。计算机程序产品可包含计算机可读媒体。
作为实例而非限制，此类计算机可读存储媒体可包括RAM、ROM、EEPROM、CD-ROM或其它光盘存储装置、磁盘存储装置或其它磁性存储装置、快闪存储器或可用来存储指令或数据结构的形式的所要程序代码并且可由计算机存取的任何其它媒体。并且，任何连接被恰当地称作计算机可读媒体。举例来说，如果使用同轴缆线、光纤缆线、双绞线、数字订户线(DSL)或例如红外线、无线电和微波等无线技术从网站、服务器或其它远程源传输指令，那么同轴缆线、光纤缆线、双绞线、DSL或例如红外线、无线电和微波等无线技术包含在媒体的定义中。但是，应理解，所述计算机可读存储媒体和数据存储媒体并不包括连接、载波、信号或其它暂时媒体，而是实际上针对于非暂时性有形存储媒体。如本文中所使用，磁盘和光盘包含压缩光盘(CD)、激光光盘、光学光盘、DVD和蓝光光盘，其中磁盘通常以磁性方式再现数据，而光盘利用激光以光学方式再现数据。以上各项的组合也应包含在计算机可读媒体的范围内。
可通过例如一或多个数字信号处理器(DSP)、通用微处理器、专用集成电路(ASIC)、现场可编程逻辑阵列(FPGA)或其它等效集成或离散逻辑电路等一或多个处理器来执行指令。因此,如本文中所使用的术语“处理器”可指前述结构或适合于实施本文中所描述的技术的任一其它结构中的任一者。另外,在一些方面中,本文中所描述的各种说明性逻辑框、模块、和步骤所描述的功能可以提供于经配置以用于编码和解码的专用硬件和/或软件模块内,或者并入在组合编解码器中。而且,所述技术可完全实施于一或多个电路或逻辑元件中。在一种示例下,编码器及解码器中的各种说明性逻辑框、单元、模块可以理解为对应的电路器件或逻辑元件。
本申请实施例的技术可在各种各样的装置或设备中实施,包含无线手持机、集成电路(IC)或一组IC(例如,芯片组)。本申请实施例中描述各种组件、模块或单元是为了强调用于执行所揭示的技术的装置的功能方面,但未必需要由不同硬件单元实现。实际上,如上文所描述,各种单元可结合合适的软件和/或固件组合在编码解码器硬件单元中,或者通过互操作硬件单元(包含如上文所描述的一或多个处理器)来提供。
也就是说,在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意结合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络或其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如:同轴电缆、光纤、数据用户线(digital subscriber line,DSL))或无线(例如:红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质,或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如:软盘、硬盘、磁带)、光介质(例如:数字通用光盘(digital versatile disc,DVD))或半导体介质(例如:固态硬盘(solid state disk,SSD))等。值得注意的是,本申请实施例提到的计算机可读存储介质可以为非易失性存储介质,换句话说,可以是非瞬时性存储介质。
在一些实施例中,提供了一种编码端设备,该编码端设备包括存储器和处理器;
存储器用于存储计算机程序，处理器用于执行存储器中存储的计算机程序，以实现上述所述的滤波方法。
在一些实施例中,提供了一种解码端设备,解码端设备包括存储器和处理器;
存储器用于存储计算机程序,处理器用于执行存储器中存储的计算机程序,以实现上述所述的滤波方法。
在一些实施例中,提供了一种滤波模型训练设备,该滤波模型训练设备包括存储器和处理器;
存储器用于存储计算机程序,处理器用于执行存储器中存储的计算机程序,以实现上述所述的滤波模型训练方法。
在一些实施例中,提供了一种计算机可读存储介质,该存储介质内存储有指令,当该指令在所述计算机上运行时,使得计算机执行上述所述的方法的步骤。
在一些实施例中,提供了一种计算机程序,该计算机程序被执行时实现上述所述的方法。
应当理解的是,本文提及的“多个”是指两个或两个以上。在本申请实施例的描述中,除非另有说明,“/”表示或的意思,例如,A/B可以表示A或B;本文中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,为了便于清楚描述本申请实施例的技术方案,在本申请实施例中,采用了“第一”、“第二”等字样对功能和作用基本相同的相同项或相似项进行区分。本领域技术人员可以理解“第一”、“第二”等字样并不对数量和执行次序进行限定,并且“第一”、“第二”等字样也并不限定一定不同。
需要说明的是,本申请实施例所涉及的信息(包括但不限于用户设备信息、用户个人信息等)、数据(包括但不限于用于分析的数据、存储的数据、展示的数据等)以及信号,均为经用户授权或者经过各方充分授权的,且相关数据的收集、使用和处理需要遵守相关国家和地区的相关法律法规和标准。例如,本申请实施例中涉及到的量化参数、滤波模型、当前编码块以及重建块都是在充分授权的情况下获取的。
以上所述仅为本申请的示例性实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (39)

  1. 一种滤波方法,其特征在于,应用于编码端,所述方法包括:
    根据目标图像的量化参数,确定出K组滤波模型,所述K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数;
    确定所述目标图像中当前编码块对应的重建块;
    从所述K组滤波模型中确定目标滤波模型,所述目标滤波模型是指对所述重建块进行滤波后编码失真最小的滤波模型,且通过所述目标滤波模型对所述重建块进行滤波后的编码失真小于所述重建块的编码失真;
    基于所述目标滤波模型,对所述重建块进行滤波。
  2. 如权利要求1所述的方法,其特征在于,所述从所述K组滤波模型中确定目标滤波模型,包括:
    基于所述当前编码块、所述重建块和所述K组滤波模型,确定所述重建块对应的滤波指示信息,所述滤波指示信息用于指示所述重建块是否需要滤波;
    在所述滤波指示信息指示所述重建块需要滤波的情况下,从所述K组滤波模型中确定目标滤波模型。
  3. 如权利要求2所述的方法,其特征在于,所述基于所述当前编码块、所述重建块和所述K组滤波模型,确定所述重建块对应的滤波指示信息,包括:
    将所述重建块输入至所述K组滤波模型中的每个滤波模型,以得到K*M个滤波块;
    基于所述当前编码块、所述重建块和所述K*M个滤波块,确定所述重建块对应的率失真代价,以及每个滤波块对应的率失真代价;
    如果所述重建块对应的率失真代价不小于每个滤波块对应的率失真代价,则确定所述滤波指示信息为第一指示信息,所述第一指示信息用于指示所述重建块需要滤波;
    如果所述重建块对应的率失真代价小于每个滤波块对应的率失真代价,则确定所述滤波指示信息为第二指示信息,所述第二指示信息用于指示所述重建块不需要滤波。
  4. 如权利要求2或3所述的方法,其特征在于,所述方法还包括:
    在所述滤波指示信息指示所述重建块不需要滤波的情况下,不对所述重建块进行滤波。
  5. 如权利要求2-4任一所述的方法,其特征在于,所述基于所述当前编码块、所述重建块和所述K组滤波模型,确定所述重建块对应的滤波指示信息之后,还包括:
    将所述滤波指示信息编入码流。
  6. 如权利要求1-5任一所述的方法,其特征在于,所述从所述K组滤波模型中确定目标滤波模型之后,还包括:
    将目标索引编入码流,所述目标索引用于指示所述目标滤波模型。
  7. 如权利要求1-6任一所述的方法,其特征在于,所述根据目标图像的量化参数,确定出K组滤波模型之后,还包括:
    将所述K组滤波模型对应的量化参数编入码流。
  8. 一种滤波方法,其特征在于,应用于解码端,所述方法包括:
    确定出K组滤波模型,所述K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数;
    基于码流确定重建块;
    确定所述K组滤波模型中的目标滤波模型;
    基于所述目标滤波模型,对所述重建块进行滤波。
  9. 如权利要求8所述的方法,其特征在于,所述确定出K组滤波模型,包括:
    根据所述重建块所属的目标图像的量化参数,确定出所述K组滤波模型。
  10. 如权利要求8所述的方法,其特征在于,所述确定出K组滤波模型,包括:
    从所述码流中解析出所述K组滤波模型对应的量化参数;
    基于所述K组滤波模型对应的量化参数,确定出所述K组滤波模型。
  11. 如权利要求8-10任一所述的方法,其特征在于,所述确定所述K组滤波模型中的目标滤波模型,包括:
    确定所述重建块的滤波指示信息,所述滤波指示信息用于指示所述重建块是否需要滤波;
    在所述滤波指示信息指示所述重建块需要滤波的情况下,确定所述K组滤波模型中的目标滤波模型。
  12. 如权利要求11所述的方法,其特征在于,所述确定所述重建块的滤波指示信息,包括:
    从所述码流中解析出所述滤波指示信息。
  13. 如权利要求8-12任一所述的方法,其特征在于,所述确定所述K组滤波模型中的目标滤波模型,包括:
    从所述码流中解析出目标索引,所述目标索引用于指示所述目标滤波模型;
    基于所述目标索引确定所述目标滤波模型。
  14. 一种滤波模型训练方法,其特征在于,所述方法包括:
    获取训练样本集,所述训练样本集包括多个样本编码块和每个样本编码块对应的重建块,所述多个样本编码块所属的图像的量化参数为同一个量化参数;
    基于所述训练样本集,对待训练的滤波模型进行训练,以得到初始滤波模型;
    将所述训练样本集划分为M个初始样本子集,每个初始样本子集包括至少两个样本编码块和所述至少两个样本编码块对应的重建块;
    基于所述M个初始样本子集,对所述初始滤波模型分别进行训练,以得到M个优化滤波模型;
    基于所述训练样本集,对所述M个优化滤波模型进行训练,以得到一组滤波模型。
  15. 如权利要求14所述的方法,其特征在于,所述将所述训练样本集划分为M个初始样本子集,包括:
    将所述多个样本编码块对应的重建块输入至所述初始滤波模型,以得到每个样本编码块对应的滤波块;
    基于所述多个样本编码块和每个样本编码块对应的滤波块,确定每个样本编码块对应的滤波块的峰值信噪比;
    按照峰值信噪比的大小顺序,对所述多个样本编码块进行排序;
    按照排序结果,将所述训练样本集划分为M个初始样本子集,每个初始样本子集包括的样本编码块为所述排序结果中连续的至少两个样本编码块。
  16. 如权利要求14或15所述的方法,其特征在于,所述基于所述训练样本集,对所述M个优化滤波模型进行训练,以得到一组滤波模型,包括:
    基于所述训练样本集,通过循环迭代方式,对所述M个优化滤波模型进行训练;其中,所述循环迭代方式中的第i次迭代处理包括如下步骤:
    基于所述多个样本编码块和每个样本编码块对应的重建块,将所述训练样本集划分为M个优化样本子集,所述M个优化样本子集与第i次迭代处理的M个滤波模型一一对应,其中,第一次迭代处理的M个滤波模型为所述M个优化滤波模型;
    基于所述M个优化样本子集,对所述第i次迭代处理的M个滤波模型进行训练;
    如果所述i小于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型作为第i+1次迭代处理的M个滤波模型,并执行第i+1次迭代处理;
    如果所述i大于或等于所述迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型确定为一组滤波模型。
  17. 如权利要求16所述的方法,其特征在于,所述基于所述多个样本编码块和每个样本编码块对应的重建块,将所述训练样本集划分为M个优化样本子集,包括:
    将所述多个样本编码块对应的重建块输入至所述第i次迭代处理的M个滤波模型,以得到每个样本编码块对应的M个滤波块;
    基于所述多个样本编码块和每个样本编码块对应的M个滤波块,确定每个样本编码块对应的M个滤波块的峰值信噪比;
    基于每个样本编码块对应的M个滤波块的峰值信噪比,将所述训练样本集划分为所述M个优化样本子集,其中,每个样本编码块位于其对应的M个滤波块中最大峰值信噪比的滤波块所对应的滤波模型的优化样本子集中。
  18. 一种滤波装置,其特征在于,应用于编码端,所述装置包括:
    第一确定模块,用于根据目标图像的量化参数,确定出K组滤波模型,所述K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数;
    第二确定模块,用于确定所述目标图像中当前编码块对应的重建块;
    第三确定模块,用于从所述K组滤波模型中确定目标滤波模型,所述目标滤波模型是指对所述重建块进行滤波后编码失真最小的滤波模型,且通过所述目标滤波模型对所述重建块进行滤波后的编码失真小于所述重建块的编码失真;
    第一滤波模块,用于基于所述目标滤波模型,对所述重建块进行滤波。
  19. 如权利要求18所述的装置,其特征在于,所述第三确定模块包括:
    第一确定单元,用于基于所述当前编码块、所述重建块和所述K组滤波模型,确定所述重建块对应的滤波指示信息,所述滤波指示信息用于指示所述重建块是否需要滤波;
    第二确定单元,用于在所述滤波指示信息指示所述重建块需要滤波的情况下,从所述K组滤波模型中确定目标滤波模型。
  20. 如权利要求19所述的装置,其特征在于,所述第一确定单元具体用于:
    将所述重建块输入至所述K组滤波模型中的每个滤波模型,以得到K*M个滤波块;
    基于所述当前编码块、所述重建块和所述K*M个滤波块,确定所述重建块对应的率失真代价,以及每个滤波块对应的率失真代价;
    如果所述重建块对应的率失真代价不小于每个滤波块对应的率失真代价,则确定所述滤波指示信息为第一指示信息,所述第一指示信息用于指示所述重建块需要滤波;
    如果所述重建块对应的率失真代价小于每个滤波块对应的率失真代价,则确定所述滤波指示信息为第二指示信息,所述第二指示信息用于指示所述重建块不需要滤波。
  21. 如权利要求19或20所述的装置,其特征在于,所述装置还包括:
    第二滤波模块,用于在所述滤波指示信息指示所述重建块不需要滤波的情况下,不对所述重建块进行滤波。
  22. 如权利要求19-21任一所述的装置,其特征在于,所述第三确定模块还包括:
    编码单元,用于将所述滤波指示信息编入码流。
  23. 如权利要求18-22任一所述的装置,其特征在于,所述装置还包括:
    第一编码模块,用于将目标索引编入码流,所述目标索引用于指示所述目标滤波模型。
  24. 如权利要求18-23任一所述的装置,其特征在于,所述装置还包括:
    第二编码模块,用于将所述K组滤波模型对应的量化参数编入码流。
  25. 一种滤波装置,其特征在于,应用于解码端,所述装置包括:
    第一确定模块,用于确定出K组滤波模型,所述K组滤波模型中的每组滤波模型包括M个滤波模型,且同一组滤波模型对应同一个量化参数,不同组滤波模型对应不同的量化参数,K和M均为大于1的整数;
    第二确定模块,用于基于码流确定重建块;
    第三确定模块,用于确定所述K组滤波模型中的目标滤波模型;
    滤波模块,用于基于所述目标滤波模型,对所述重建块进行滤波。
  26. 如权利要求25所述的装置,其特征在于,所述第一确定模块具体用于:
    根据所述重建块所属的目标图像的量化参数,确定出所述K组滤波模型。
  27. 如权利要求25所述的装置,其特征在于,所述第一确定模块具体用于:
    从所述码流中解析出所述K组滤波模型对应的量化参数;
    基于所述K组滤波模型对应的量化参数,确定出所述K组滤波模型。
  28. 如权利要求25-27任一所述的装置,其特征在于,所述第三确定模块包括:
    第一确定单元,用于确定所述重建块的滤波指示信息,所述滤波指示信息用于指示所述重建块是否需要滤波;
    第二确定单元,用于在所述滤波指示信息指示所述重建块需要滤波的情况下,确定所述K组滤波模型中的目标滤波模型。
  29. 如权利要求28所述的装置,其特征在于,所述第一确定单元具体用于:
    从所述码流中解析出所述滤波指示信息。
  30. 如权利要求25-29任一所述的装置,其特征在于,所述第三确定模块具体用于:
    从所述码流中解析出目标索引,所述目标索引用于指示所述目标滤波模型;
    基于所述目标索引确定所述目标滤波模型。
  31. 一种滤波模型训练装置,其特征在于,所述装置包括:
    获取模块,用于获取训练样本集,所述训练样本集包括多个样本编码块和每个样本编码块对应的重建块,所述多个样本编码块所属的图像的量化参数为同一个量化参数;
    第一训练模块,用于基于所述训练样本集,对待训练的滤波模型进行训练,以得到初始滤波模型;
    划分模块,用于将所述训练样本集划分为M个初始样本子集,每个初始样本子集包括至少两个样本编码块和所述至少两个样本编码块对应的重建块;
    第二训练模块,用于基于所述M个初始样本子集,对所述初始滤波模型分别进行训练,以得到M个优化滤波模型;
    第三训练模块,用于基于所述训练样本集,对所述M个优化滤波模型进行训练,以得到一组滤波模型。
  32. 如权利要求31所述的装置,其特征在于,所述划分模块具体用于:
    将所述多个样本编码块对应的重建块输入至所述初始滤波模型,以得到每个样本编码块对应的滤波块;
    基于所述多个样本编码块和每个样本编码块对应的滤波块,确定每个样本编码块对应的滤波块的峰值信噪比;
    按照峰值信噪比的大小顺序,对所述多个样本编码块进行排序;
    按照排序结果,将所述训练样本集划分为M个初始样本子集,每个初始样本子集包括的样本编码块为所述排序结果中连续的至少两个样本编码块。
  33. 如权利要求31或32所述的装置,其特征在于,所述第三训练模块具体用于:
    基于所述训练样本集,通过循环迭代方式,对所述M个优化滤波模型进行训练;其中,所述循环迭代方式中的第i次迭代处理包括如下步骤:
    基于所述多个样本编码块和每个样本编码块对应的重建块,将所述训练样本集划分为M个优化样本子集,所述M个优化样本子集与第i次迭代处理的M个滤波模型一一对应,其中,第一次迭代处理的M个滤波模型为所述M个优化滤波模型;
    基于所述M个优化样本子集,对所述第i次迭代处理的M个滤波模型进行训练;
    如果所述i小于迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型作为第i+1次迭代处理的M个滤波模型,并执行第i+1次迭代处理;
    如果所述i大于或等于所述迭代次数阈值,则将经训练的第i次迭代处理的M个滤波模型确定为一组滤波模型。
  34. 如权利要求33所述的装置,其特征在于,所述第三训练模块具体用于:
    将所述多个样本编码块对应的重建块输入至所述第i次迭代处理的M个滤波模型,以得到每个样本编码块对应的M个滤波块;
    基于所述多个样本编码块和每个样本编码块对应的M个滤波块,确定每个样本编码块对应的M个滤波块的峰值信噪比;
    基于每个样本编码块对应的M个滤波块的峰值信噪比,将所述训练样本集划分为所述M个优化样本子集,其中,每个样本编码块位于其对应的M个滤波块中最大峰值信噪比的滤波块所对应的滤波模型的优化样本子集中。
  35. 一种编码端设备,其特征在于,所述编码端设备包括存储器和处理器;
    所述存储器用于存储计算机程序,所述处理器用于执行所述存储器中存储的计算机程序,以实现权利要求1-7任一所述方法的步骤。
  36. 一种解码端设备,其特征在于,所述解码端设备包括存储器和处理器;
    所述存储器用于存储计算机程序,所述处理器用于执行所述存储器中存储的计算机程序,以实现权利要求8-13任一所述方法的步骤。
  37. 一种滤波模型训练设备,其特征在于,所述滤波模型训练设备包括存储器和处理器,所述存储器用于存储计算机程序,所述处理器被配置为用于执行所述存储器中存储的计算机程序,以实现权利要求14-17任一项所述方法的步骤。
  38. 一种计算机可读存储介质,其特征在于,所述存储介质内存储有指令,当所述指令在所述计算机上运行时,使得所述计算机执行权利要求1-17任一所述的方法的步骤。
  39. 一种计算机程序,其特征在于,所述计算机程序包括指令,当所述指令在所述计算机上运行时,使得所述计算机执行权利要求1-17任一项所述方法的步骤。
PCT/CN2023/094769 2022-05-31 2023-05-17 滤波方法、滤波模型训练方法及相关装置 WO2023231775A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210616061.5 2022-05-31
CN202210616061.5A CN117201782A (zh) 2022-05-31 2022-05-31 滤波方法、滤波模型训练方法及相关装置

Publications (1)

Publication Number Publication Date
WO2023231775A1 true WO2023231775A1 (zh) 2023-12-07

Family

ID=88991214

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/094769 WO2023231775A1 (zh) 2022-05-31 2023-05-17 滤波方法、滤波模型训练方法及相关装置

Country Status (3)

Country Link
CN (1) CN117201782A (zh)
TW (1) TW202349966A (zh)
WO (1) WO2023231775A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110351568A (zh) * 2019-06-13 2019-10-18 天津大学 一种基于深度卷积网络的视频环路滤波器
CN111010568A (zh) * 2018-10-06 2020-04-14 华为技术有限公司 插值滤波器的训练方法、装置及视频图像编解码方法、编解码器
CN113422966A (zh) * 2021-05-27 2021-09-21 绍兴市北大信息技术科创中心 一种多模型cnn环路滤波方法
US20210329286A1 (en) * 2020-04-18 2021-10-21 Alibaba Group Holding Limited Convolutional-neutral-network based filter for video coding
CN113766249A (zh) * 2020-06-01 2021-12-07 腾讯科技(深圳)有限公司 视频编解码中的环路滤波方法、装置、设备及存储介质


Also Published As

Publication number Publication date
CN117201782A (zh) 2023-12-08
TW202349966A (zh) 2023-12-16

Similar Documents

Publication Publication Date Title
CN114501010B (zh) 图像编码方法、图像解码方法及相关装置
US8942292B2 (en) Efficient significant coefficients coding in scalable video codecs
US20150189269A1 (en) Recursive block partitioning
US20080165844A1 (en) Cavlc enhancements for svc cgs enhancement layer coding
JP6464192B2 (ja) ディスプレイストリーム圧縮(dsc)のための平坦度検出のためのシステムおよび方法
EP2191650A1 (en) Architecture for multi-stage decoding of a cabac bitstream
CN113132728B (zh) 编码方法及编码器
CN111669588A (zh) 一种超低时延的超高清视频压缩编解码方法
US11323706B2 (en) Method and apparatus for aspect-ratio dependent filtering for intra-prediction
WO2023231775A1 (zh) 滤波方法、滤波模型训练方法及相关装置
CN115866297A (zh) 视频处理方法、装置、设备及存储介质
WO2021180220A1 (zh) 图像编码和解码方法及装置
CN104159106B (zh) 视频编码方法和视频解码方法及其装置
CN107945108A (zh) 视频处理方法及装置
CN113259673B (zh) 伸缩性视频编码方法、装置、设备及存储介质
WO2023185305A1 (zh) 编码方法、装置、存储介质及计算机程序产品
TWI821013B (zh) 視頻編解碼方法及裝置
WO2023185806A9 (zh) 一种图像编解码方法、装置、电子设备及存储介质
WO2022258036A1 (zh) 编解码方法、装置、设备、存储介质及计算机程序
US20240121392A1 (en) Neural-network media compression using quantized entropy coding distribution parameters
WO2024164590A1 (zh) 编解码网络模型的量化方法和相关装置
US20230262267A1 (en) Entropy coding for neural-based media compression
US20240323416A1 (en) Sliding-window rate-distortion optimization in neural network-based video coding
WO2022120829A1 (zh) 图像编码及解码方法和装置、图像处理装置、可移动平台
WO2024091925A1 (en) Improved entropy bypass coding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 23814960; Country of ref document: EP; Kind code of ref document: A1)