WO2023221599A1 - Image filtering method, apparatus and device - Google Patents

Image filtering method, apparatus and device

Info

Publication number
WO2023221599A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
image block
filtered
block
training
Prior art date
Application number
PCT/CN2023/079134
Other languages
English (en)
French (fr)
Other versions
WO2023221599A9 (zh)
Inventor
王力强
常仁杰
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Publication of WO2023221599A1
Publication of WO2023221599A9

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/20 Image enhancement or restoration using local operators
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/002 Image coding using neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • the embodiments of the present application relate to the field of image processing technology, and in particular, to an image filtering method, device and equipment.
  • Digital video technology can be incorporated into a variety of video devices, such as digital televisions, smartphones, computers, e-readers, or video players.
  • video data involves a large amount of data.
  • video devices therefore implement video compression technology so that video data can be transmitted or stored more efficiently.
  • neural network-based filters are used to filter reconstructed images.
  • some neural network-based filters have poor filtering effects.
  • This application provides an image filtering method, device and equipment.
  • the technical solution is as follows.
  • this application provides an image filtering method, including: obtaining an image to be filtered; determining a neural network-based filter; dividing the image to be filtered according to the blocking method corresponding to the neural network-based filter to obtain N image blocks to be filtered, where the blocking method is the blocking method of the training images used by the neural network-based filter during training and N is a positive integer; and using the neural network-based filter to filter the N image blocks to be filtered respectively, to obtain a filtered image.
  • this application provides an image filtering device, including:
  • an acquisition unit, used to obtain the image to be filtered;
  • a dividing unit, used to determine a neural network-based filter and divide the image to be filtered according to the blocking method corresponding to the neural network-based filter to obtain N image blocks to be filtered, where the blocking method is the blocking method of the training images used by the neural network-based filter during training, and N is a positive integer;
  • a filtering unit, configured to use the neural network-based filter to filter the N image blocks to be filtered respectively, to obtain a filtered image.
  • an electronic device including: a processor and a memory.
  • the memory is used to store a computer program.
  • the processor is used to call and run the computer program stored in the memory to perform the methods provided in the embodiments of the various aspects of the present application.
  • a chip for implementing the methods of various aspects provided by this application.
  • the chip includes: a processor, configured to call and run a computer program from a memory, so that the device installed with the chip executes the method provided by the embodiments of various aspects in this application.
  • a computer-readable storage medium for storing a computer program, which causes the computer to execute the methods provided by the embodiments of various aspects of the present application.
  • a computer program product including computer program instructions, which cause a computer to execute the methods provided by the embodiments of various aspects of the present application.
  • a computer program which, when run on a computer, causes the computer to execute the methods provided by the embodiments of various aspects in this application.
  • in the technical solution of the present application, the image to be filtered is divided according to the blocking method corresponding to the neural network-based filter to obtain N image blocks to be filtered, where N is a positive integer, and the neural network-based filter is used to filter the N image blocks to be filtered to obtain the filtered image.
  • the above-mentioned blocking method is the blocking method of the training images used by the neural network-based filter in the training process. That is, in the embodiments of the present application, by keeping the blocking method used by the neural network-based filter during actual use consistent with the blocking method used during training, the neural network-based filter can achieve its best filtering performance, thereby improving the filtering effect on the image.
  • Figure 1 is a schematic diagram of an application scenario involved in the embodiment of the present application.
  • Figure 2 is a schematic block diagram of a video encoding and decoding system related to an embodiment of the present application
  • Figure 3 is a schematic diagram of the coding framework provided by the embodiment of the present application.
  • Figure 4 is a schematic diagram of the decoding framework provided by the embodiment of the present application.
  • Figures 5 to 7 are schematic diagrams of image types.
  • Figure 8 is a schematic diagram of filtering involved in the embodiment of the present application.
  • Figures 9 to 12 show an image blocking method.
  • Figure 13 is a flow chart of an image filtering method provided by an embodiment of the present application.
  • Figures 14 to 17 are schematic diagrams of image blocking according to embodiments of the present application.
  • Figures 18 to 21 are schematic diagrams of another image blocking method involved in the embodiments of the present application.
  • Figures 22 to 25 are schematic diagrams of another image blocking method involved in the embodiments of the present application.
  • Figures 26 to 29 are schematic diagrams of image block expansion related to embodiments of the present application.
  • Figures 30 to 33 are schematic diagrams of another image block expansion involved in the embodiments of the present application.
  • Figures 34 and 35 are schematic diagrams of a spatial reference image block involved in the embodiment of the present application.
  • Figures 36 and 37 are schematic diagrams of another spatial domain reference image block involved in the embodiment of the present application.
  • Figures 38 and 39 are schematic diagrams of another spatial domain reference image block involved in the embodiment of the present application.
  • Figures 40 and 41 are schematic diagrams of another spatial domain reference image block involved in the embodiment of the present application.
  • Figures 42 and 43 are schematic diagrams of a time domain reference image block involved in the embodiment of the present application.
  • Figures 44 and 45 are schematic diagrams of another time domain reference image block involved in the embodiment of the present application.
  • Figure 46 is a schematic flow chart of an image filtering method provided by an embodiment of the present application.
  • Figure 47 is a schematic block diagram of an image filtering device provided by an embodiment of the present application.
  • Figure 48 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 1 is a schematic diagram of an application scenario related to an embodiment of the present application, including an electronic device 100 with a neural network-based filter 200 installed on the electronic device 100 .
  • the electronic device 100 obtains the image to be filtered;
  • the image to be filtered is input into the neural network-based filter 200 for filtering.
  • the electronic device 100 includes a display device, so that the electronic device 100 can display the filtered image through the display device.
  • the embodiment of the present application does not limit the specific type of the electronic device 100, and it can be any device with data processing functions.
  • the electronic device 100 may be a terminal device, including, for example, a smartphone, a desktop computer, a mobile computing device, a notebook (e.g., laptop) computer, a tablet computer, a set-top box, a television, a camera, a display device, a digital media player, a video game console, a vehicle-mounted computer, etc.
  • the above-mentioned electronic device 100 may also be a server.
  • the server can be one or more.
  • when there are multiple servers, there are at least two servers used to provide different services, and/or at least two servers used to provide the same service, for example in a load-balancing manner; this is not limited in the embodiments of the present application.
  • the above server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network), big data, and artificial intelligence platforms.
  • the server can also be a node of a blockchain.
  • in some embodiments, the above-mentioned server is a cloud server with powerful computing resources that is highly virtualized and highly distributed.
  • the above-mentioned image to be filtered is collected by an image acquisition device.
  • the image acquisition device sends the acquired image to be filtered to the electronic device 100, and the electronic device 100 filters the captured image to be filtered through a neural network-based filter.
  • the electronic device 100 has an image acquisition function, so that the electronic device 100 can acquire images and input the acquired images to be filtered into a filter based on a neural network for filtering.
  • the above-mentioned electronic device 100 may be a coding device, and the image to be filtered may be understood as a reconstructed image.
  • the encoding device encodes and reconstructs the current image to obtain a reconstructed image, and inputs the reconstructed image into a filter based on a neural network for filtering.
  • the electronic device 100 may be a decoding device.
  • the decoding device decodes the code stream and performs image reconstruction to obtain a reconstructed image. Then, the reconstructed image is input to a filter based on a neural network for filtering.
  • embodiments of the present application can be applied to various scenarios, including but not limited to cloud technology (such as cloud gaming), artificial intelligence, smart transportation, assisted driving, etc.
  • the present application can be applied to the fields of image coding and decoding, video coding and decoding, hardware video coding and decoding, dedicated circuit video coding and decoding, real-time video coding and decoding, etc.
  • AVC: H.264/Advanced Video Coding
  • AVS: Audio Video coding Standard
  • HEVC: High Efficiency Video Coding
  • VVC: Versatile Video Coding
  • the solution of this application can be incorporated into other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its scalable video coding (SVC) and multi-view video coding (MVC) extensions. It should be understood that the technology of this application is not limited to any specific codec standard or technology.
  • Figure 2 is a schematic block diagram of a video encoding and decoding system related to an embodiment of the present application. It should be noted that Figure 2 is only an example, and the video encoding and decoding system in the embodiment of the present application includes but is not limited to what is shown in Figure 2 .
  • the video coding and decoding system includes an encoding device 110 and a decoding device 120.
  • the encoding device is used to encode the video data (which can be understood as compression) to generate a code stream, and transmit the code stream to the decoding device.
  • the decoding device decodes the code stream generated by the encoding device to obtain decoded video data.
  • the encoding device 110 in the embodiment of the present application can be understood as a device with a video encoding function
  • the decoding device 120 can be understood as a device with a video decoding function. That is, the encoding device 110 and the decoding device 120 in the embodiments of the present application cover a wide range of devices, including, for example, smartphones, desktop computers, mobile computing devices, notebook (e.g., laptop) computers, tablet computers, set-top boxes, televisions, cameras, display devices, digital media players, video game consoles, vehicle-mounted computers, and the like.
  • the encoding device 110 may transmit the encoded video data (eg, code stream) to the decoding device 120 via the channel 130 .
  • Channel 130 may include one or more media and/or devices capable of transmitting encoded video data from encoding device 110 to decoding device 120 .
  • channel 130 includes one or more communication media that enables encoding device 110 to transmit encoded video data directly to decoding device 120 in real time.
  • encoding device 110 may modulate the encoded video data according to the communication standard and transmit the modulated video data to decoding device 120.
  • the communication media includes wireless communication media, such as the radio frequency spectrum.
  • the communication media may also include wired communication media, such as one or more physical transmission lines.
  • channel 130 includes a storage medium that can store video data encoded by encoding device 110 .
  • Storage media include a variety of local access data storage media, such as optical disks, DVDs, flash memories, etc.
  • the decoding device 120 may obtain the encoded video data from the storage medium.
  • channel 130 may include a storage server that may store video data encoded by encoding device 110 .
  • the decoding device 120 may download the stored encoded video data from the storage server.
  • the storage server may store the encoded video data and may transmit the encoded video data to the decoding device 120; it may be, for example, a web server (e.g., for a website), a File Transfer Protocol (FTP) server, etc.
  • the encoding device 110 includes a video encoder 112 and an output interface 113.
  • the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.
  • the encoding device 110 may include a video source 111 in addition to the video encoder 112 and the output interface 113.
  • Video source 111 may include at least one of a video capture device (e.g., a video camera), a video archive, a video input interface for receiving video data from a video content provider, or a computer graphics system for generating video data.
  • the video encoder 112 encodes the video data from the video source 111 to generate a code stream.
  • Video data may include one or more pictures or a sequence of pictures.
  • the code stream contains the encoding information of an image or image sequence in the form of a bit stream.
  • Encoded information may include encoded image data and associated data.
  • the associated data may include a sequence parameter set (SPS), a picture parameter set (PPS), and other syntax structures.
  • An SPS can contain parameters that apply to one or more sequences.
  • a PPS can contain parameters that apply to one or more images.
  • a syntax structure refers to a collection of zero or more syntax elements arranged in a specified order in a code stream.
  • the video encoder 112 transmits the encoded video data directly to the decoding device 120 via the output interface 113 .
  • the encoded video data can also be stored on a storage medium or storage server for subsequent reading by the decoding device 120 .
  • decoding device 120 includes input interface 121 and video decoder 122.
  • the decoding device 120 may also include a display device 123.
  • the input interface 121 includes a receiver and/or a modem. Input interface 121 may receive encoded video data over channel 130.
  • the video decoder 122 is used to decode the encoded video data to obtain decoded video data, and transmit the decoded video data to the display device 123 .
  • the display device 123 displays the decoded video data.
  • Display device 123 may be integrated with decoding device 120 or external to decoding device 120 .
  • Display device 123 may include a variety of display devices, such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display, or another type of display device.
  • Figure 2 is only an example, and the technical solution of the embodiment of the present application is not limited to Figure 2.
  • the technology of the present application can also be applied to one-sided video encoding or one-sided video decoding, that is, video encoding only or video decoding only.
  • Figure 3 is a schematic diagram of the coding framework provided by the embodiment of the present application.
  • this encoding framework can be used for lossy compression of images, and can also be used for lossless compression of images.
  • the lossless compression can be visually lossless compression or mathematically lossless compression.
  • This encoding framework can be applied to image data in luminance-chrominance (YCbCr, YUV) format.
  • the coding framework reads video data and, for each frame of image in the video data, divides the frame into several coding tree units (CTUs).
  • a CTU may also be called a "tree block", "largest coding unit" (LCU) or "coding tree block" (CTB).
  • each CTU can be associated with a block of pixels of equal size within the image.
  • each pixel can correspond to one luminance (luma) sample and two chrominance (chroma) samples. Therefore, each CTU can be associated with one block of luma samples and two blocks of chroma samples.
  • the size of a CTU is, for example, 128×128, 64×64 or 32×32.
  • a CTU can be further divided into several coding units (Coding Units, CUs) for encoding, and a CU can be a rectangular block or a square block.
  • a CU can be further divided into a prediction unit (PU) and a transform unit (TU), thus enabling coding, prediction, and transformation to be separated and making processing more flexible.
  • the CTU is divided into CUs in a quad-tree manner
  • the CU is divided into TUs and PUs in a quad-tree manner.
  • Video encoders and video decoders can support various PU sizes. Assuming that the size of a specific CU is 2N×2N, the video encoder and video decoder can support PU sizes of 2N×2N or N×N for intra prediction, and symmetric PUs of 2N×2N, 2N×N, N×2N, N×N or similar sizes for inter prediction. The video encoder and video decoder can also support asymmetric PUs of 2N×nU, 2N×nD, nL×2N and nR×2N for inter prediction.
  • the coding framework includes: prediction unit 11, residual generation unit 12, transformation unit 13, quantization unit 14, inverse quantization unit 15, inverse transformation unit 16, reconstruction unit 17, filter unit 18 and entropy coding unit 19.
  • the prediction unit 11 includes an inter prediction unit 101 and an intra prediction unit 102.
  • the inter prediction unit 101 includes a motion estimation unit 1011 and a motion compensation unit 1012. It should be noted that the coding framework can contain more, fewer, or different functional components.
  • the current block may be called the current coding unit (CU) or the current prediction unit (PU), etc.
  • the prediction block may also be called a predicted image block or an image prediction block
  • the reconstructed image block may also be called a reconstruction block or an image reconstructed image block.
  • after receiving a video, the encoding end divides each frame of image constituting the video into multiple image blocks to be encoded. For the current image block to be encoded, the prediction unit 11 first predicts the current image block to be encoded by referring to reconstructed image blocks, to obtain prediction information of the current image block to be encoded. The encoding end can use inter-frame prediction or intra-frame prediction technology to obtain the prediction information.
  • the motion estimation unit 1011 in the inter prediction unit 101 may search the reference pictures in the list of reference pictures to find the reference block of the image block to be encoded.
  • the motion estimation unit 1011 may generate an index indicating the reference block and a motion vector indicating a spatial displacement between the image block to be encoded and the reference block.
  • the motion estimation unit 1011 may output the index of the reference block and the motion vector as motion information of the image block to be encoded.
  • the motion compensation unit 1012 may obtain the prediction information of the image block to be encoded based on the motion information of the image block to be encoded.
  • the intra prediction unit 102 may use an intra prediction mode to generate prediction information for the current image block to be encoded. There are currently 15 intra prediction modes, including Planar mode, DC mode and 13 angle prediction modes.
  • the intra prediction unit 102 may also adopt intra block copy (Intra Block Copy, IBC), intra string copy (Intra String Copy, ISC) technology, etc.
  • the intra-frame prediction modes used by HEVC include planar mode (Planar), DC and 33 angle modes, for a total of 35 prediction modes.
  • the intra-frame modes used by VVC include Planar, DC and 65 angle modes, for a total of 67 prediction modes.
  • the intra-frame modes used by AVS3 include DC, Plane, Bilinear and 63 angle modes, for a total of 66 prediction modes.
  • the residual generation unit 12 is used to subtract the prediction information from the original signal of the current image block to be encoded to obtain a residual signal; after prediction, the amplitude of the residual signal is much smaller than that of the original signal.
  • the transform unit 13 and the quantization unit 14 are used to transform and quantize the residual signal; after transform and quantization, the transform and quantization coefficients are obtained.
  • the entropy encoding unit 19 is used to encode the quantized coefficients and other indication information in encoding through entropy encoding technology to obtain a code stream.
  • the encoding end also needs to reconstruct the current image block to be encoded to provide reference pixels for encoding subsequent image blocks to be encoded.
  • the inverse quantization unit 15 and the inverse transform unit 16 perform inverse quantization and inverse transformation on the transform quantization coefficient of the current image block to be encoded to obtain a reconstructed residual signal
  • the reconstruction unit 17 adds the reconstructed residual signal to the prediction information corresponding to the current image block to be encoded to obtain a reconstructed signal of the current image block to be encoded, and obtains the reconstructed image block based on the reconstructed signal.
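As an editor's illustrative aside (not part of the patent text), the reconstruction step just described, adding the reconstructed residual to the prediction, can be sketched in a few lines of numpy; the function name and the 8-bit sample assumption are hypothetical:

```python
import numpy as np

def reconstruct_block(prediction, residual, bit_depth=8):
    # Reconstruction: prediction + reconstructed residual, clipped to the
    # valid sample range [0, 2^bit_depth - 1].
    rec = prediction.astype(np.int32) + residual.astype(np.int32)
    return np.clip(rec, 0, (1 << bit_depth) - 1).astype(np.uint8)
```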
  • the filtering unit 18 can filter the reconstructed image blocks using, for example, a deblocking filter (DBF), sample adaptive offset (SAO) or an adaptive loop filter (ALF).
  • the reconstructed image blocks may be stored in the decoded image cache, and inter prediction unit 101 may use reference images containing the reconstructed pixel blocks to perform inter prediction on PUs of other images. Additionally, intra prediction unit 102 may perform intra prediction on other PUs in the same image as the CU using reconstructed image blocks in the decoded image buffer.
  • Figure 4 is a schematic diagram of a decoding framework provided by an embodiment of the present application.
  • the decoding framework includes: an entropy decoding unit 21 , a prediction unit 22 , an inverse quantization unit 23 , an inverse transform unit 24 , a reconstruction unit 25 , and a filtering unit 26 .
  • the prediction unit 22 includes a motion compensation unit 221 and an intra prediction unit 222.
  • the entropy decoding unit 21 performs entropy decoding on the code stream to obtain the transform and quantization coefficients of the current image block to be reconstructed; then the inverse quantization unit 23 and the inverse transform unit 24 perform inverse quantization and inverse transformation on the transform and quantization coefficients to obtain the reconstructed residual signal of the current image block to be reconstructed.
  • the prediction unit 22 predicts the current image block to be reconstructed and obtains prediction information of the current image block to be reconstructed. If the prediction unit 22 adopts inter prediction, the motion compensation unit 221 may construct a first reference picture list (List 0) and a second reference picture list (List 1) according to syntax elements parsed from the code stream.
  • the entropy decoding unit 21 can parse the motion information of the image block to be reconstructed.
  • the motion compensation unit 221 may determine one or more reference blocks of the image block to be reconstructed according to the motion information.
  • the motion compensation unit 221 may generate prediction information of the image block to be reconstructed according to the one or more reference blocks. If the prediction unit 22 adopts intra prediction, the entropy decoding unit 21 can parse the index of the intra prediction mode used, and the intra prediction unit 222 can perform intra prediction according to the indexed mode to obtain the prediction information of the image block to be reconstructed.
  • the intra prediction unit 222 may also adopt IBC or ISC technology.
  • the reconstruction unit 25 is configured to add the prediction information and the above-mentioned reconstructed residual signal to obtain the reconstructed signal of the current image block to be reconstructed, and then obtain the current reconstructed image block corresponding to the current image block to be reconstructed based on the reconstructed signal; the current reconstructed image block can be used to predict other subsequent image blocks to be reconstructed.
  • the filtering unit 26 at the decoding end may filter the current reconstructed image block.
  • the block division information determined by the encoding end, as well as mode information or parameter information for prediction, transformation, quantization, entropy coding, loop filtering, etc., are carried in the code stream when necessary.
  • by parsing the code stream and analyzing the existing information, the decoding end determines the same block division information and the same prediction, transformation, quantization, entropy coding, loop filtering and other mode information or parameter information as the encoding end, thereby ensuring that the decoded image obtained by the encoding end is the same as the decoded image obtained by the decoding end.
  • encoded images are divided into full intra-frame encoded images and inter-frame encoded images, as shown in Figures 5 to 7, where an image is organized into a frame, slices and tiles.
  • the dotted box in Figure 5 represents the boundary of the largest coding unit (CTU);
  • the black solid line in Figure 6 represents the boundary of the slice;
  • the black solid line in Figure 7 represents the boundary of the tile.
  • the reference information for prediction of the full intra-frame coded image all comes from the spatial domain information of the current image.
  • the time domain reference information of other reference frames can be referred to during the prediction process of inter-frame coded images.
  • the main technology involved in the embodiments of this application is a neural network-based filter, such as a neural network-based loop filter (Neural Network Loop Filter, NNLF).
  • NNLF Neural Network Loop Filter
  • the image to be filtered is input into the trained neural network-based loop filter NNLF for filtering to obtain the filtered image.
  • to train the filter parameters, an image needs to be input, and a target image is designated as the optimization target.
  • the input images and target images are spatially aligned.
  • the input image is selected from the reconstructed distorted image, and the target image is selected from the original image as the best optimization target.
  • the loss function is used during model training.
  • the loss function measures the difference between the predicted value and the true value. The larger the loss value, the greater the difference, and the goal of training is to reduce the loss.
  • commonly used loss functions include the L1 norm loss function, the L2 norm loss function and the smooth L1 loss function.
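As a hedged sketch (the patent does not give an implementation), these three losses are available as standard PyTorch functionals; `pred` stands for the filter output and `target` for the supervision image block, with hypothetical shapes:

```python
import torch
import torch.nn.functional as F

pred = torch.rand(1, 3, 128, 128)    # filtered output (hypothetical shape)
target = torch.rand(1, 3, 128, 128)  # target (supervision) image block

l1_loss = F.l1_loss(pred, target)                # L1 norm loss
l2_loss = F.mse_loss(pred, target)               # L2 norm (MSE) loss
smooth_l1_loss = F.smooth_l1_loss(pred, target)  # smooth L1 loss
```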
  • when the neural network-based filter is actually used, the entire frame is generally not input directly for filtering; instead, the image is divided into sub-images, and the sub-images are then input one by one into the neural network-based filter for filtering.
  • correspondingly, during training, the input image and the target image are also divided to obtain matching pairs composed of an input image block and a target image block, and the neural network-based filter is then trained based on the matching pairs.
  • in one example, the matching pairs formed by input image blocks and target image blocks are selected by random cropping within a frame of image; as shown in Figures 9 and 10, random cropping is used to crop the input image and the target image to obtain image blocks No. 1 to No. 5.
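A minimal sketch of such random cropping, assuming numpy arrays and a hypothetical helper name; the same crop window is applied to the input image and the target image so the resulting matching pair stays spatially aligned:

```python
import numpy as np

def random_matching_pair(input_img, target_img, patch=128, rng=None):
    # input_img and target_img are spatially aligned H x W x C arrays.
    rng = rng or np.random.default_rng()
    h, w = input_img.shape[:2]
    y = rng.integers(0, h - patch + 1)
    x = rng.integers(0, w - patch + 1)
    window = (slice(y, y + patch), slice(x, x + patch))
    return input_img[window], target_img[window]
```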
  • CTU is usually used as the basic unit for filtering. For example, as shown in Figure 11 and Figure 12, five CTUs A to E are used for actual filtering.
  • in view of this, the embodiments of the present application divide the image to be filtered according to the same blocking method as the training images used in the training process of the neural network-based filter, to obtain N image blocks to be filtered.
  • for each of the N image blocks to be filtered, the neural network-based filter is used to filter the image block to be filtered, to obtain the final filtered image.
  • in this way, the blocking method used in the actual use of the neural network-based filter is consistent with the blocking method used in training, so that the neural network-based filter can achieve its best filtering performance, thereby improving the filtering effect of the image.
  • Figure 13 is a flow chart of an image filtering method provided by an embodiment of the present application. As shown in Figure 13, the embodiment of the present application includes the following steps:
  • the method of obtaining the image to be filtered includes but is not limited to the following situations:
  • Case 1: the image to be filtered may be collected by an image acquisition device, such as a camera.
  • the image to be filtered may be generated by an image generating device, such as drawn by an image drawing device.
  • Case 2: for video encoding and decoding scenarios, the methods of obtaining the image to be filtered include at least the following:
  • the image to be filtered can be the image before encoding; that is, at the encoding side, the image to be filtered is first filtered before encoding, and the filtered image is then encoded.
  • the image 1 to be filtered is first input into a filter based on a neural network for filtering, and the filtered image 2 is obtained.
  • the image 2 is divided into blocks to obtain multiple coding blocks.
  • Each coding block is predicted using an inter-frame or intra-frame prediction method to obtain a prediction block of the coding block.
  • the difference between the coding block and the prediction block is taken to obtain the residual block; the residual block is transformed and quantized to obtain the quantized coefficients, and finally the quantized coefficients are encoded.
  • the image to be filtered can be a reconstructed image, that is, the current image is reconstructed to obtain a reconstructed image of the current image; the reconstructed image is determined as the image to be filtered.
  • specifically, the encoding end divides the current image into blocks to obtain multiple coding blocks.
  • prediction methods such as inter-frame or intra-frame prediction are used to obtain the prediction block of each coding block, and the difference between the coding block and the prediction block is taken to obtain the residual block, which is transformed and quantized to obtain the quantized coefficients.
  • the encoding end also performs inverse quantization on the quantized coefficients to obtain the transform coefficients of the coding block, inversely transforms the transform coefficients to obtain the residual block, and adds the residual block and the prediction block to obtain the reconstructed block of the coding block.
  • the reconstructed blocks of all coding blocks in the current image are combined to obtain the reconstructed image of the current image.
  • the reconstructed image is used as the image to be filtered and input into the filter based on the neural network for filtering to obtain the filtered image.
  • the image to be filtered may be a reconstructed image, that is, the current image is reconstructed to obtain a reconstructed image of the current image; the reconstructed image is determined as the image to be filtered.
  • the decoder decodes the received code stream to obtain the quantization coefficient of the current block in the current image, then inversely quantizes the quantization coefficient to obtain the transform coefficient of the current block, and inversely transforms the transform coefficient to obtain the residual block.
  • the current block is predicted using an inter-frame or intra-frame prediction method to obtain a prediction block of the current block, and the residual block and the prediction block are added to obtain a reconstructed block of the current block. Combine the reconstructed blocks of all blocks in the current image to obtain the reconstructed image of the current image.
  • the reconstructed image is used as the image to be filtered and input into the filter based on the neural network for filtering to obtain the filtered image.
  • the methods for determining the image to be filtered include but are not limited to the above.
  • the present application can also obtain the image to be filtered through other methods, and the embodiment of the present application does not limit this.
  • the above-mentioned blocking method is the blocking method of the training images used by the neural network-based filter in the training process; that is, the image to be filtered is divided based on the same blocking method as the training images used by the neural network-based filter in the training process, to obtain N image blocks to be filtered, where N is a positive integer.
  • before using a neural network-based filter to filter the image to be filtered, the filter first needs to be determined.
  • the above-mentioned neural network-based filter is preset or default, so that the preset or default neural network-based filter can be directly used for filtering.
  • in other embodiments, the neural network-based filter is determined from multiple candidate filters, where the network structures of at least two candidate filters among the multiple candidate filters are not exactly the same.
  • for example, candidate filter 1 determines one CTU in the input image as an input block, determines the CTU at the same position in the target image as a target block, and then uses the input block as the input and the target block as the target to train candidate filter 1.
  • candidate filter 2 determines 2 CTUs in the input image as an input block, determines the 2 CTUs at the same position in the target image as a target block, and then uses the input block as the input and the target block as the target to train candidate filter 2.
  • the methods of determining a neural network-based filter from multiple candidate filters include but are not limited to the following:
  • Method 1: determine any one of the multiple candidate filters as the neural network-based filter in this step.
  • Method 2: use the candidate filter with the best filtering effect among the multiple candidate filters as the neural network-based filter of the present application.
  • for example, use each of the above multiple candidate filters to filter the image to be filtered to obtain a filtered image under each candidate filter, compare the multiple filtered images to determine the filtered image with the best effect, and then determine the candidate filter corresponding to the filtered image with the best effect as the neural network-based filter in this step.
  • the method of determining the image effect of the filtered image is not limited.
  • the image effect of the filtered image is determined by determining image indicators such as image clarity, sharpness, and artifacts.
  • Method 3: use the candidate filter with the smallest distortion among the multiple candidate filters as the neural network-based filter of the present application.
  • for example, use each of the multiple candidate filters to filter the image to be filtered to obtain a filtered image under each candidate filter, compare the filtered image under each candidate filter with the image to be filtered to determine the distortion corresponding to each candidate filter, and then determine the candidate filter with the smallest distortion as the neural network-based filter in this step.
  • the embodiments of the present application do not limit the above method of determining the distortion corresponding to the candidate filter.
  • the difference between the filtered image under the candidate filter and the image to be filtered is determined as the distortion corresponding to the candidate filter.
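A hedged sketch of Method 3, treating each candidate filter as a callable and using MSE as the distortion measure (both are assumptions for illustration; the patent leaves the distortion measure open):

```python
import numpy as np

def select_min_distortion_filter(candidate_filters, image_to_filter):
    # Each candidate filter is modeled as a callable mapping the image to
    # be filtered to a filtered image of the same shape; distortion is
    # measured as MSE between the filtered image and the image to be filtered.
    distortions = [
        np.mean((f(image_to_filter).astype(np.float64)
                 - image_to_filter.astype(np.float64)) ** 2)
        for f in candidate_filters
    ]
    return candidate_filters[int(np.argmin(distortions))]
```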
  • the methods of determining filters based on neural networks include but are not limited to the above.
  • in some embodiments, after the neural network-based filter is determined, the blocking method of the training images used by the neural network-based filter in the training process is obtained.
  • for example, the file of the neural network-based filter includes the blocking method of the training images used during training, so that the blocking method of the training images used by this neural network-based filter during training can be read directly from the file of the neural network-based filter.
  • the blocking method of the neural network-based filter during actual use is consistent with the blocking method used during training.
  • the embodiments of the present application use the blocking method of the training images used in the training process of the neural network-based filter to divide the image to be filtered, to obtain N image blocks to be filtered. For example, during the training process of the neural network-based filter, each CTU in the training image is determined as a training image block for model training.
  • correspondingly, during actual filtering, each CTU in the image to be filtered is also determined as an image block to be filtered, thereby ensuring that the blocking method of the neural network-based filter during actual use is consistent with the blocking method during the training process, allowing the neural network-based filter to exert its best performance and thereby improving the filtering effect.
  • the embodiments of the present application do not limit the specific type of blocking method of the training image used in the training process of the neural network-based filter, and it can be any blocking method.
  • in some embodiments, if the blocking method of the training images used by the above neural network-based filter during training includes determining M CTUs in the training image as one training image block, where M is a positive integer, then in the above S802, dividing the image to be filtered according to the blocking method corresponding to the neural network-based filter to obtain N image blocks to be filtered includes the following S802-A:
  • S802-A Determine M CTUs in the image to be filtered as an image block to be filtered, and obtain N image blocks to be filtered.
  • N image blocks to be filtered are determined from the image to be filtered by using M CTUs as one image block to be filtered.
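For M = 1, S802-A amounts to treating each CTU-aligned tile of the image to be filtered as one image block; a minimal sketch under that assumption (the CTU size and helper name are hypothetical), where blocks at the right and bottom borders may be smaller than a full CTU:

```python
import numpy as np

def split_into_ctu_blocks(img, ctu=128):
    # Returns (position, block) pairs; each block is one CTU-aligned tile.
    h, w = img.shape[:2]
    return [((y, x), img[y:y + ctu, x:x + ctu])
            for y in range(0, h, ctu)
            for x in range(0, w, ctu)]
```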
  • the above training image includes an input image and a target image, where the target image can be understood as a supervised image.
  • a CTU of the input image is determined as an input image block.
  • the CTU at the same position in the target image is determined as a target image block.
  • the input image block and the target image block form a matching pair; that is, the position of the input image block of a matching pair in the input image is consistent with the position of the target image block in the target image.
  • the input image block is input into a filter based on the neural network for filtering, and a filtered image block corresponding to the input image block is obtained.
  • the filtered image patch of the input image patch is compared with the target image patch, the loss is calculated, and the parameters of the neural network-based filter are adjusted based on the loss.
  • the above-mentioned neural network-based filter determines a CTU in the training image as a training image block during the training process.
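One supervised training step on such a matching pair might look like the following sketch (PyTorch; the model, optimizer and the choice of L1 loss are all assumptions for illustration, not specified by the patent):

```python
import torch.nn.functional as F

def train_step(model, optimizer, input_block, target_block):
    # Filter the input block, compare against the target block, and
    # update the filter parameters based on the loss.
    optimizer.zero_grad()
    loss = F.l1_loss(model(input_block), target_block)
    loss.backward()
    optimizer.step()
    return loss.item()
```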
  • the filter based on the neural network is trained through a supervised training method
  • the above training image includes an input image and a target image, where the target image can be understood as a supervised image.
  • the 4 CTUs of the input image are determined as an input image block.
  • the 4 CTUs of the target image are determined as a target image block.
  • the input image block and the target image block form a matching pair; that is, the position of the input image block of a matching pair in the input image is consistent with the position of the target image block in the target image.
  • the input image block is input into a filter based on the neural network for filtering, and a filtered image block corresponding to the input image block is obtained.
  • the filtered image patch of the input image patch is compared with the target image patch, the loss is calculated, and the parameters of the neural network-based filter are adjusted based on the loss.
  • the above-mentioned neural network-based filter determines 4 CTUs in the training image as a training image block during the training process.
  • correspondingly, in the actual filtering process, as shown in Figure 21, four CTUs in the image to be filtered are determined as one image block to be filtered, and N image blocks to be filtered are obtained.
  • M can also be any positive integer such as 2, 3, 5, etc., and the embodiments of the present application do not limit this.
  • in some embodiments, if the blocking method of the training images used in training the above-mentioned neural network-based filter includes determining P incomplete CTUs in the training image as one training image block, where P is a positive integer, then in the above S802, dividing the image to be filtered according to the same blocking method as the training images used in the training process of the neural network-based filter to obtain N image blocks to be filtered includes the following S802-B:
  • S802-B Determine P incomplete CTUs in the image to be filtered as an image block to be filtered, and obtain N image blocks to be filtered.
  • P incomplete CTUs are used as an image block to be filtered, and N image blocks to be filtered are determined from the image to be filtered.
  • for example, four incomplete CTUs of the training image are determined as a training image block, and the neural network-based filter is trained.
  • the filter based on the neural network is trained through a supervised training method
  • the above training image includes an input image and a target image, where the target image can be understood as a supervised image.
  • the four incomplete CTUs of the input image are determined as one input image block
  • the four incomplete CTUs of the target image are determined as one target image block.
  • the input image patch and the target image patch form a matching pair.
  • the input image block is input into the neural network-based filter for filtering, and the filtered image block corresponding to the input image block is obtained.
  • the filtered image block of the input image block is compared with the target image block, the loss is calculated, and the parameters of the neural network-based filter are adjusted based on the loss. Then, referring to the above method, the input image block in the next matching pair continues to be used as the input and the target image block in the next matching pair as the target to train the neural network-based filter, obtaining the trained neural network-based filter.
  • that is to say, the above-mentioned neural network-based filter determines four incomplete CTUs in the training image as one training image block during the training process.
  • correspondingly, during actual filtering, four incomplete CTUs in the image to be filtered are determined as one image block to be filtered, and N image blocks to be filtered are thus obtained.
  • the above takes P = 4 as an example.
  • the above P can also be any positive integer such as 1, 2, 3, 5, etc., and the embodiments of the present application do not limit this.
  • in some embodiments, the P incomplete CTUs need not all be adjacent; that is, none of the P incomplete CTUs are adjacent, or some of the P incomplete CTUs are adjacent and some are not.
  • the blocking method of the training image includes determining M CTUs in the training image as one training image block, or determining P incomplete CTUs in the training image as one training image block.
  • the block division method of training images involved in the embodiment of the present application includes but is not limited to the above examples, and the embodiment of the present application does not limit this.
  • the same blocking method as that of the training image is used to divide the image to be filtered to obtain N image blocks to be filtered.
  • the image block to be filtered is input into a filter based on a neural network for filtering, and a filtered image block of the image block to be filtered is obtained.
  • the filtered image block of each of the N image blocks to be filtered can be determined, and the filtered image blocks of the N image blocks to be filtered constitute the filtered image.
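Putting the division and filtering steps together for the M = 1 case, a hedged sketch of the block-wise filtering loop, reusing the hypothetical split_into_ctu_blocks helper from the earlier sketch, with nn_filter standing in for the neural network-based filter and assumed to preserve block shape:

```python
import numpy as np

def filter_image(img, nn_filter, ctu=128):
    out = np.empty_like(img)
    for (y, x), block in split_into_ctu_blocks(img, ctu):
        # Filter each image block and write it back at the same position.
        bh, bw = block.shape[:2]
        out[y:y + bh, x:x + bw] = nn_filter(block)
    return out
```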
  • in some embodiments, the above-mentioned neural network-based filter is trained on expanded image blocks of the training image blocks. That is to say, in the embodiments of the present application, during the training process of the filter, in addition to dividing the training image according to the above-mentioned blocking method to obtain the training image blocks, the training image blocks can also be expanded outward to obtain expanded image blocks, and the expanded image blocks are used to train the neural network-based filter.
  • at this time, the above S803 includes the following steps S803-A1 to S803-A3:
  • S803-A1: for each of the N image blocks to be filtered, expand the image block to be filtered outward using the same expansion method as the training image blocks, to obtain an expanded image block to be filtered;
  • S803-A2: input the expanded image block to be filtered into the neural network-based filter for filtering, to obtain a filtered expanded image block;
  • S803-A3: determine the image area corresponding to the image block to be filtered in the filtered expanded image block as the filtered image block of the image block to be filtered.
  • the embodiment of the present application uses the same expansion method as the training image block to expand the image block to be filtered.
  • the same blocking method as that of the training image is used to divide the image to be filtered to obtain N image blocks to be filtered.
  • for each of the N image blocks to be filtered, the same expansion method as the training image blocks is used to expand the image block to be filtered outward to obtain an expanded image block to be filtered.
  • the expanded image block to be filtered is input into the neural network-based filter for filtering, and a filtered expanded image block is obtained.
  • the filtered expanded image block is then cropped; specifically, the image area corresponding to the image block to be filtered in the filtered expanded image block is determined as the filtered image block of the image block to be filtered. According to the above steps, the filtered image block of each of the N image blocks to be filtered can be determined, and the filtered image blocks of these N image blocks to be filtered are spliced to obtain the final filtered image.
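A minimal sketch of S803-A1 to S803-A3 under stated assumptions (the pad width, all names, and the choice to clip the expansion at the image boundary rather than pad beyond it are illustrative only):

```python
import numpy as np

def filter_block_with_expansion(img, y, x, h, w, nn_filter, pad=8):
    # S803-A1: expand the block at (y, x) of size h x w outward by `pad`
    # pixels on each side, clipped to the image boundary.
    H, W = img.shape[:2]
    y0, x0 = max(y - pad, 0), max(x - pad, 0)
    y1, x1 = min(y + h + pad, H), min(x + w + pad, W)
    expanded = img[y0:y1, x0:x1]
    # S803-A2: filter the expanded block (shape assumed preserved).
    filtered = nn_filter(expanded)
    # S803-A3: crop out the area corresponding to the original block.
    return filtered[y - y0:y - y0 + h, x - x0:x - x0 + w]
```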
The embodiments of the present application do not limit the specific expansion method of the training image blocks.

In some embodiments, the expansion method of the training image blocks includes expanding at least one boundary area of the training image block outward. In this case, expanding the image block to be filtered according to the expansion method of the training image blocks in S803-A1 includes: expanding at least one boundary area of the image block to be filtered outward, obtaining the expanded image block to be filtered.

In some examples, as shown in Figure 26, during the training of the neural-network-based filter the training image block is expanded outward on all four sides, and the expanded training image blocks are used to train the neural-network-based filter.
In one possible implementation, if the neural-network-based filter is trained by a supervised training method, the above training image includes an input image and a target image, where the target image can be understood as the supervision image. During training, as shown in Figure 27, one CTU of the input image is determined as an input image block, and the input image block is expanded outward on all four sides to obtain an expanded input image block. As shown in Figure 28, one CTU of the target image is determined as a target image block, and the target image block is expanded outward on all four sides to obtain an expanded target image block. The expanded input image block is then input into the neural-network-based filter for filtering, and a filtered image block corresponding to the expanded input image block is obtained. The filtered image block is compared with the expanded target image block, the loss is computed, and the parameters of the neural-network-based filter are adjusted according to the loss. Training then continues in the same way, with the expanded input image block of the next matching pair as input and the expanded target image block of that pair as target, until a trained neural-network-based filter is obtained.
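One iteration of the supervised procedure just described might look like the following sketch. The L1 loss and the optimizer interface are assumptions made for illustration; the embodiments do not fix a particular loss:

```python
import torch
import torch.nn.functional as F

def train_step(nn_filter: torch.nn.Module, optimizer: torch.optim.Optimizer,
               expanded_input: torch.Tensor, expanded_target: torch.Tensor) -> float:
    """One supervised iteration on a matching pair: filter the expanded
    input block, compare the result with the expanded target block, and
    update the filter parameters from the loss gradient."""
    optimizer.zero_grad()
    prediction = nn_filter(expanded_input)
    loss = F.l1_loss(prediction, expanded_target)  # L1 assumed; L2 or smooth-L1 also usable
    loss.backward()
    optimizer.step()
    return loss.item()
```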
  • the above-mentioned neural network-based filter determines a CTU in the training image as a training image block, and expands the training image block outwards, and obtains the expanded training image patches.
  • a CTU in the image to be filtered is determined as an image block to be filtered, and then, using the above-mentioned expansion method of the training image block, the surrounding areas of the image block to be filtered are outward Expand to obtain an expanded image block to be filtered.
  • the filtered expanded image block is then The image area corresponding to the image block to be filtered is determined as the filtered image block of the image block to be filtered.
  • the left boundary and upper boundary of the training image block are expanded outward, and the expanded training image block is used to filter the neural network-based filter. filters are trained.
In one possible implementation, if the neural-network-based filter is trained by a supervised training method, the training image includes an input image and a target image, where the target image can be understood as the supervision image. During training, as shown in Figure 31, one CTU of the input image is determined as an input image block, and its left and upper boundaries are expanded outward to obtain an expanded input image block. As shown in Figure 32, one CTU of the target image is determined as a target image block, and its left and upper boundaries are expanded outward to obtain an expanded target image block. The expanded input image block is input into the neural-network-based filter for filtering, and a filtered image block corresponding to the expanded input image block is obtained. This filtered image block is compared with the expanded target image block, the loss is computed, and the parameters of the neural-network-based filter are adjusted according to the loss. Training then continues in the same way with the next matching pair until a trained neural-network-based filter is obtained.

As can be seen, in this example one CTU in the training image is determined as a training image block, and the left and upper boundaries of the training image block are expanded outward to obtain the expanded training image block. Correspondingly, in the actual filtering process, as shown in Figure 33, one CTU in the image to be filtered is determined as an image block to be filtered; then, using the above expansion method of the training image blocks, the left and upper boundaries of the image block to be filtered are expanded outward to obtain an expanded image block to be filtered. The expanded image block is input into the neural-network-based filter for filtering to obtain a filtered expanded image block, and the image area corresponding to the image block to be filtered within the filtered expanded image block is determined as the filtered image block of the image block to be filtered.

In some examples, the expansion method of the training image blocks may also expand other boundaries of the training image block outward, which is not limited by the embodiments of the present application.
In some embodiments, to further improve the filtering effect of the neural-network-based filter, a reference image block of the input image block is also input during training. To keep the actual filtering process consistent with the training process, the above S803 then includes the following steps S803-B1 and S803-B2:

S803-B1. For each of the N image blocks to be filtered, determine a reference image block of the image block to be filtered.

S803-B2. Input the image block to be filtered and the reference image block of the image block to be filtered into the neural-network-based filter for filtering, and obtain the filtered image block of the image block to be filtered.

In the embodiments of the present application, if during training the input information of the neural-network-based filter includes an input image block and a reference image block of the input image block, then during actual filtering the input information includes, in addition to the image block to be filtered, a reference image block of the image block to be filtered. The embodiments of the present application do not limit the method of determining the reference image block of the image block to be filtered.
In some embodiments, the reference image block of the image block to be filtered is determined differently from the reference image block of the training image block.

In some embodiments, the reference image block of the image block to be filtered is determined in the same manner as the reference image block of the input image block. In this case, determining the reference image block of the image block to be filtered in S803-B1 includes the following steps S803-B11 and S803-B12:

S803-B11. Obtain the determination method of the reference image block of the input image block, where the determination method determines the corresponding reference image block based on at least one of the spatial-domain information and the temporal-domain information of the input image block.

S803-B12. Determine the reference image block of the image block to be filtered according to the determination method of the reference image block of the input image block.

In the embodiments of the present application, the determination method of the reference image block of the input image block can be read from the file of the neural-network-based filter. In some embodiments, if the training device and the actual filtering device of the neural-network-based filter are the same device, the determination method of the reference image block of the input image block is stored on that device. After the determination method is obtained, it is used to determine the reference image block of the image block to be filtered. That is to say, in the embodiments of the present application, the determination method of the reference image block of the image block to be filtered is consistent with the determination method of the reference image block of the input image block.
The embodiments of the present application do not limit the specific type of the reference image block.

In some embodiments, if the reference image block of the input image block includes at least one of a temporal reference image block and a spatial reference image block of the input image block, then the reference image block of the image block to be filtered includes at least one of a temporal reference image block and a spatial reference image block of the image block to be filtered.

The spatial reference image block may be selected as an image area at a fixed position relative to the current input image block; that is, the spatial reference image block of the input image block and the input image block are in the same frame, namely both within the input image. The difference between a temporal reference image block and a spatial reference image block is that the temporal reference image block and the current input image block are in different frames; the reference position of the temporal reference image block may be chosen as the block at the same spatial position as in the current input image.

In the embodiments of the present application, the type of the reference image block of the image block to be filtered is the same as the type of the reference image block of the input image block.
Example 1: if the reference image block of the input image block includes a spatial reference image block, then the reference image block of the image block to be filtered also includes a spatial reference image block. In this case, the above S803-B12 includes the following step:

S803-B12-A. Determine the spatial reference image block of the image block to be filtered according to the determination method of the spatial reference image block of the input image block.

In this example, the spatial reference image block of the image block to be filtered is determined according to the determination method of the spatial reference image block of the input image block, which enables accurate determination of the spatial reference image block of the image block to be filtered.

In one possible implementation, the determination method of the spatial reference image block of the input image block can be used to determine the spatial reference image block of the image block to be filtered, which ensures that the input information of the neural-network-based filter during training is consistent with that during actual filtering and improves the filtering performance of the neural-network-based filter.
In some embodiments, if the spatial reference image block of the input image block includes at least one of the upper-left image block, the left image block and the above image block of the input image block in the input image, then the spatial reference image block of the image block to be filtered includes at least one of the upper-left image block, the left image block and the above image block of the image block to be filtered in the image to be filtered. In this case, S803-B12-A includes determining at least one of the upper-left image block, the left image block and the above image block of the image block to be filtered in the image to be filtered as the spatial reference image block of the image block to be filtered.

For example, as shown in Figure 34, if the spatial reference image block of the input image block includes the upper-left image block of the input image block in the input image, then, as shown in Figure 35, the upper-left image block of the image block to be filtered in the image to be filtered is determined as the spatial reference image block of the image block to be filtered.

As another example, as shown in Figure 36, if the spatial reference image block of the input image block includes the image block located to the left of the input image block in the input image, then, as shown in Figure 37, the image block located to the left of the image block to be filtered in the image to be filtered is determined as the spatial reference image block of the image block to be filtered.

As another example, as shown in Figure 38, if the spatial reference image block of the input image block includes the image block located above the input image block in the input image, then, as shown in Figure 39, the image block located above the image block to be filtered in the image to be filtered is determined as the spatial reference image block of the image block to be filtered.

As another example, as shown in Figure 40, if the spatial reference image block of the input image block includes the upper-left image block, the left image block and the above image block of the input image block in the input image, then, as shown in Figure 41, the upper-left image block, the left image block and the above image block of the image block to be filtered in the image to be filtered are determined as the spatial reference image blocks of the image block to be filtered.
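The neighbour selection described in these examples can be sketched as follows. This is a minimal illustration; neighbours falling outside the image are simply omitted here, a choice the embodiments do not prescribe:

```python
import torch

def spatial_reference_blocks(image: torch.Tensor, y: int, x: int,
                             ctu: int = 128) -> dict:
    """Collect the upper-left, left and above neighbours of the block at
    (y, x) as its spatial reference image blocks, mirroring the rule used
    for the input image blocks at training time."""
    refs = {}
    if y >= ctu and x >= ctu:
        refs["upper_left"] = image[:, y - ctu:y, x - ctu:x]
    if x >= ctu:
        refs["left"] = image[:, y:y + ctu, x - ctu:x]
    if y >= ctu:
        refs["above"] = image[:, y - ctu:y, x:x + ctu]
    return refs
```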
In the embodiments of the present application, to further improve the filtering effect, during training the spatial reference image block of the input image block is input in addition to the input image block. Accordingly, during actual filtering, to keep the input information consistent between the actual filtering process and the training process, the spatial reference image block of the image block to be filtered is determined using the same determination method as the spatial reference image block of the input image block, and the image block to be filtered together with its spatial reference image block is input into the neural-network-based filter, achieving the filtering of the image to be filtered.
Example 2: if the reference image block of the input image block includes a temporal reference image block, then the reference image block of the image block to be filtered also includes a temporal reference image block. In this case, the above S803-B12 includes the following step:

S803-B12-B. Determine the temporal reference image block of the image block to be filtered according to the determination method of the temporal reference image block of the input image block.

In this example, the temporal reference image block of the image block to be filtered is determined according to the determination method of the temporal reference image block of the input image block, which enables accurate determination of the temporal reference image block of the image block to be filtered.

In one possible implementation, the determination method of the temporal reference image block of the input image block can be used to determine the temporal reference image block of the image block to be filtered, which ensures that the input information of the neural-network-based filter during training is consistent with that during actual filtering and improves the filtering performance of the neural-network-based filter.
In some embodiments, S803-B12-B includes: determining the reference image of the image to be filtered; and determining the image block at the position in that reference image corresponding to the image block to be filtered as the temporal reference image block of the image block to be filtered.

For example, as shown in Figures 42 and 43, the temporal reference image block of the input image block is the image block in the reference image of the input image at the position corresponding to the input image block. That is to say, the position of the temporal reference image block within the reference image of the input image is consistent with the position of the input image block within the input image. In this case, as shown in Figures 44 and 45, the process of determining the temporal reference image block of the image block to be filtered is: first determine the reference image of the image to be filtered, and then determine the image block in that reference image at the position corresponding to the image block to be filtered as the temporal reference image block of the image block to be filtered.
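The co-located lookup of S803-B12-B reduces to a single slice, as this minimal sketch shows; `(y, x)` is the position of the image block to be filtered in the current image:

```python
import torch

def temporal_reference_block(reference_image: torch.Tensor, y: int, x: int,
                             ctu: int = 128) -> torch.Tensor:
    """The temporal reference image block is the co-located block: the
    block at the same (y, x) position in the reference image as the image
    block to be filtered occupies in the current image."""
    return reference_image[:, y:y + ctu, x:x + ctu]
```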
The embodiments of the present application do not limit the type of the reference image. For example, when the method of the embodiment of the present application is applied at the encoding end, the reference image of the image to be filtered may be any encoded image; when applied at the decoding end, it may be any decoded image.

In the embodiments of the present application, to further improve the filtering effect, during training the temporal reference image block of the input image block is input in addition to the input image block. Accordingly, during actual filtering, to keep the input information consistent between the actual filtering process and the training process, the temporal reference image block of the image block to be filtered is determined using the same determination method as the temporal reference image block of the input image block, and the image block to be filtered together with its temporal reference image block is input into the neural-network-based filter, achieving the filtering of the image to be filtered.

In some embodiments, if the reference image block of the input image block includes both a spatial reference image block and a temporal reference image block, then the reference image block of the image block to be filtered also includes a spatial reference image block and a temporal reference image block; their determination follows the determination processes of the spatial and temporal reference image blocks described above and is not repeated here.

According to the above method, the neural-network-based filter is used to filter each of the N image blocks to be filtered, obtaining the filtered image.
In some embodiments, the filtering method of the embodiment of the present application can be applied in the loop-filtering module. In this case, the method further includes generating a reference image for prediction based on the filtered image and storing the generated reference image in the decoding cache, to serve as a reference image for subsequently decoded images. The reference image may be generated from the filtered image either by using the filtered image directly as the reference image, or by reprocessing the filtered image, for example applying other types of filtering, and using the reprocessed image as the reference image. Optionally, the filtered image can also be displayed by a display device.

In some embodiments, the method of the embodiment of the present application can also be applied to video post-processing: a display image is generated based on the filtered image and input to the display device for display, while storing the filtered image or the reprocessed image of the filtered image in the decoding cache is skipped. That is to say, in this case the display image generated from the filtered image is input to the display device for display but is not stored in the decoding cache as a reference image.
Illustratively, after the reconstructed image of the current image is determined by decoding the video, the reconstructed image is stored in the decoding cache as a reference image, or a traditional loop-filtering method is used, for example at least one filter among DBF, SAO and ALF, to filter the reconstructed image, the filtered result being stored in the decoding cache as the reference image. Then, the above reconstructed image is used as the image to be filtered and is filtered with the neural-network-based filter by the method of the embodiment of the present application to obtain the filtered image; a display image is generated from the filtered image and input to the display device for display.

In some embodiments, the above filtered image can be further filtered by at least one filter among DBF, SAO and ALF.

In some embodiments, the image to be filtered in the embodiment of the present application may itself be an image already filtered by at least one filter among DBF, SAO and ALF; the method of the embodiment of the present application then uses the neural-network-based filter to filter that filtered image again.

In the image filtering method provided by the embodiment of the present application, the image to be filtered is obtained; the neural-network-based filter is determined, and the image to be filtered is divided using the same blocking method as the training image used in the training of the neural-network-based filter, obtaining N image blocks to be filtered, where N is a positive integer; the neural-network-based filter is then used to filter each of the N image blocks to be filtered, yielding the filtered image. That is to say, in the embodiment of the present application, the blocking method used when the neural-network-based filter is actually applied is kept consistent with the blocking method used during its training, so that the neural-network-based filter delivers its best filtering performance, thereby improving the filtering effect on the image.
Figure 46 is a schematic flowchart of an image filtering method provided by an embodiment of the present application; it can be understood as a specific embodiment of the filtering method shown in Figure 13 above. As shown in Figure 46, the image filtering method of this embodiment of the present application includes:

S901. Obtain the image to be filtered.

For non-video-codec scenarios, the image to be filtered may be captured by an image acquisition device or drawn by an image rendering apparatus, among other possibilities. For video-codec scenarios, the image to be filtered may be a reconstructed image. For the specific implementation of S901, refer to the description of S801 above, which is not repeated here.

S902. Determine the neural-network-based filter; divide the image to be filtered according to the blocking method corresponding to the neural-network-based filter, and obtain N image blocks to be filtered.

The blocking method is the blocking method of the training image used by the neural-network-based filter during training; that is, the image to be filtered is divided using the same blocking method as the training image used by the neural-network-based filter during training, obtaining N image blocks to be filtered, where N is a positive integer.

For example, if the blocking method of the training image used by the neural-network-based filter during training determines one CTU in the training image as one training image block, then during actual filtering one CTU of the image to be filtered is determined as one image block to be filtered, and N image blocks to be filtered are obtained. For the specific implementation of S902, refer to the description of S802 above, which is not repeated here.
S903. For each of the N image blocks to be filtered, determine the reference image block of the image block to be filtered according to the determination method of the reference image block of the input image block.

In the embodiment of the present application, the reference image block of the image block to be filtered is determined in the same manner as the reference image block of the input image block. In some embodiments, if the reference image block of the input image block includes at least one of a temporal reference image block and a spatial reference image block of the input image block, then the reference image block of the image block to be filtered includes at least one of a temporal reference image block and a spatial reference image block of the image block to be filtered. For the specific implementation of S903, refer to the description of S803-B12 above, which is not repeated here.

S904. Input the image block to be filtered and the reference image block of the image block to be filtered into the neural-network-based filter for filtering, and obtain the filtered image.

In the embodiment of the present application, to further improve the filtering effect, during training the reference image block of the input image block is input in addition to the input image block. Accordingly, during actual filtering, to keep the input information consistent with training, the reference image block of the image block to be filtered is determined using the same determination method as the reference image block of the input image block, and the image block to be filtered together with its reference image block is input into the neural-network-based filter, achieving the filtering of the image to be filtered.

In the image filtering method provided by the embodiment of the present application, the image to be filtered is divided using the same blocking method as the training image used in the training of the neural-network-based filter, obtaining N image blocks to be filtered; for each of the N image blocks to be filtered, the reference image block of the image block to be filtered is determined according to the determination method of the reference image block of the input image block; and the image block to be filtered together with its reference image block is input into the neural-network-based filter for filtering, yielding the filtered image. That is to say, in the embodiment of the present application, the blocking method used when the neural-network-based filter is actually applied is kept consistent with the blocking method used during training, and the determination method of the reference image block of the input image block is kept consistent with that of the reference image block of the image block to be filtered, further improving the filtering effect of the neural-network-based filter.

It should be understood that Figures 13 to 46 are only examples of the present application and should not be understood as limiting the present application.
The method embodiments of the present application are described in detail above with reference to Figures 13 to 46; the apparatus embodiments are described below with reference to Figures 47 and 48.

Figure 47 is a schematic block diagram of an image filtering apparatus provided by an embodiment of the present application. The apparatus 10 may be an electronic device or a part of an electronic device. As shown in Figure 47, the image filtering apparatus 10 may include:

an acquisition unit 11, configured to obtain the image to be filtered;

a dividing unit 12, configured to determine the neural-network-based filter, and to divide the image to be filtered according to the blocking method corresponding to the neural-network-based filter to obtain N image blocks to be filtered, where the blocking method is the blocking method of the training image used by the neural-network-based filter during training and N is a positive integer; and

a filtering unit 13, configured to use the neural-network-based filter to filter each of the N image blocks to be filtered, to obtain the filtered image.
In some embodiments, the blocking method of the training image includes determining M CTUs in the training image as one training image block, where M is a positive integer; the dividing unit 12 is configured to determine M CTUs in the image to be filtered as one image block to be filtered, to obtain the N image blocks to be filtered.

In some embodiments, the blocking method of the training image includes determining P incomplete CTUs in the training image as one training image block, where P is a positive integer; the dividing unit 12 is configured to determine P incomplete CTUs in the image to be filtered as one image block to be filtered, to obtain the N image blocks to be filtered.

In some embodiments, the neural-network-based filter is trained on expanded image blocks of the training image blocks; the filtering unit 13 is configured to, for each of the N image blocks to be filtered, expand the image block to be filtered according to the expansion method of the training image blocks to obtain an expanded image block to be filtered; use the neural-network-based filter to filter the expanded image block to be filtered to obtain a filtered expanded image block; and determine the image area corresponding to the image block to be filtered within the filtered expanded image block as the filtered image block corresponding to the image block to be filtered.

In some embodiments, the expansion method of the training image blocks includes expanding at least one boundary area of the training image block outward; the filtering unit 13 is configured to expand at least one boundary area of the image block to be filtered outward to obtain the expanded image block to be filtered.

In some embodiments, the training image includes an input image, and the input data used when training the neural-network-based filter includes an input image block and a reference image block of the input image block, the input image block being obtained by dividing the input image according to the blocking method; the filtering unit 13 is configured to, for each of the N image blocks to be filtered, determine the reference image block of the image block to be filtered, and to input the image block to be filtered and its reference image block into the neural-network-based filter for filtering, to obtain the filtered image block of the image block to be filtered.
In some embodiments, the filtering unit 13 is configured to obtain the determination method of the reference image block of the input image block, where the determination method determines the corresponding reference image block based on at least one of the spatial-domain information and the temporal-domain information of the input image block; and to determine the reference image block of the image block to be filtered according to the determination method of the reference image block of the input image block.

In some embodiments, the reference image block of the input image block includes at least one of a temporal reference image block and a spatial reference image block of the input image block; the reference image block of the image block to be filtered includes at least one of a temporal reference image block and a spatial reference image block of the image block to be filtered.

In some embodiments, the reference image block of the input image block includes a spatial reference image block of the input image block; the filtering unit 13 is configured to determine the spatial reference image block of the image block to be filtered according to the determination method of the spatial reference image block of the input image block.

In some embodiments, the spatial reference image block of the input image block includes at least one of the upper-left image block, the left image block and the above image block of the input image block in the input image; the filtering unit 13 is configured to determine at least one of the upper-left image block, the left image block and the above image block of the image block to be filtered in the image to be filtered as the spatial reference image block of the image block to be filtered.

In some embodiments, the reference image block of the input image block includes a temporal reference image block of the input image block; the filtering unit 13 is configured to determine the temporal reference image block of the image block to be filtered according to the determination method of the temporal reference image block of the input image block.

In some embodiments, the temporal reference image block of the input image block includes the image block at the position in the reference image of the input image corresponding to the input image block; the filtering unit 13 is specifically configured to determine the reference image of the image to be filtered, and to determine the image block at the position in that reference image corresponding to the image block to be filtered as the temporal reference image block of the image block to be filtered.

In some embodiments, the training image includes an input image and a target image corresponding to the input image; the neural-network-based filter is trained with input image blocks as input data and target image blocks as targets, the input image blocks being obtained by dividing the input image according to the blocking method of the training image, and the target image blocks being obtained by dividing the target image according to the blocking method of the training image.

In some embodiments, the acquisition unit 11 is configured to reconstruct the current image to obtain a reconstructed image of the current image, and to determine the reconstructed image as the image to be filtered.

In some embodiments, the filtering unit 13 is further configured to generate a reference image for prediction based on the filtered image and store it in the decoding cache.

In some embodiments, the filtering unit 13 is further configured to generate a display image based on the filtered image, input the display image to the display device for display, and skip storing the filtered image or the reprocessed image of the filtered image into the decoding cache.
Figure 48 is a schematic block diagram of an electronic device provided by an embodiment of the present application; the electronic device is configured to execute the above method embodiments. As shown in Figure 48, the electronic device 30 may include:

a memory 31 and a processor 32, the memory 31 being configured to store a computer program 33 and transmit the computer program 33 to the processor 32. The processor 32 can call and run the computer program 33 from the memory 31 to implement the methods in the embodiments of the present application; for example, the processor 32 may be configured to perform the above method steps according to instructions in the computer program 33.

In some embodiments, the processor 32 may include, but is not limited to: a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), and so on.
In some embodiments, the memory 31 includes, but is not limited to, volatile and/or non-volatile memory. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM) or flash memory. The volatile memory may be random access memory (RAM), which serves as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synch-link DRAM (SLDRAM) and direct Rambus RAM (DR RAM).
In some embodiments, the computer program 33 can be divided into one or more modules, the one or more modules being stored in the memory 31 and executed by the processor 32 to complete the methods provided by the present application. The one or more modules may be a series of computer-program instruction segments capable of completing particular functions, the instruction segments describing the execution of the computer program 33 in the electronic device.

As shown in Figure 48, the electronic device 30 may further include a transceiver 34, which can be connected to the processor 32 or the memory 31. The processor 32 can control the transceiver 34 to communicate with other devices, for example to send information or data to other devices, or to receive information or data sent by other devices. The transceiver 34 may include a transmitter and a receiver, and may further include one or more antennas.

It should be understood that the components of the electronic device 30 are connected by a bus system, where the bus system includes, in addition to a data bus, a power bus, a control bus and a status-signal bus.
According to an aspect of the present application, a computer storage medium is provided, on which a computer program is stored; when the computer program is executed by a computer, the computer is enabled to perform the methods of the above method embodiments. An embodiment of the present application also provides a computer program product containing instructions which, when executed by a computer, cause the computer to perform the methods of the above method embodiments.

According to another aspect of the present application, a computer program product or computer program is provided, including computer instructions stored in a computer-readable storage medium. The processor of an electronic device reads the computer instructions from the computer-readable storage medium and executes them, causing the electronic device to perform the methods of the above method embodiments.

Abstract

The present application provides an image filtering method, apparatus and device. The method obtains an image to be filtered (S801); determines a neural-network-based filter; divides the image to be filtered according to the blocking method corresponding to the neural-network-based filter to obtain N image blocks to be filtered, N being a positive integer (S802); and uses the neural-network-based filter to filter each of the N image blocks to be filtered, obtaining a filtered image (S803). The blocking method is the blocking method of the training images used by the neural-network-based filter; that is, in the embodiments of the present application, the blocking method used when the neural-network-based filter is actually applied is kept consistent with the blocking method used during its training, so that the neural-network-based filter delivers its best filtering performance, thereby improving the filtering effect on the image.

Description

Image Filtering Method, Apparatus and Device

This application claims priority to Chinese patent application No. 202210551039.7, filed on May 18, 2022 and entitled "图像滤波方法、装置及设备" ("Image filtering method, apparatus and device"), the entire contents of which are incorporated herein by reference.

Technical Field

The embodiments of the present application relate to the field of image processing technology, and in particular to an image filtering method, apparatus and device.

Background

Digital video technology can be incorporated into a variety of video apparatuses, such as digital televisions, smartphones, computers, e-readers or video players. With the development of video technology, video involves a large amount of data; to facilitate the transmission of video data, video apparatuses apply video compression technology so that video data can be transmitted or stored more efficiently.

Video compression introduces loss into the images; to reduce the loss, the reconstructed image is filtered. With the rapid development of neural-network technology, in some scenarios a neural-network-based filter is used to filter the reconstructed image. However, some neural-network-based filters have a poor filtering effect.

Summary

The present application provides an image filtering method, apparatus and device; the technical solutions are as follows.

In one aspect, the present application provides an image filtering method, including:

obtaining an image to be filtered;

determining a neural-network-based filter;

dividing the image to be filtered according to the blocking method corresponding to the neural-network-based filter to obtain N image blocks to be filtered, the blocking method being the blocking method of the training images used by the neural-network-based filter during training, N being a positive integer; and

using the neural-network-based filter to filter each of the N image blocks to be filtered, to obtain a filtered image.

In another aspect, the present application provides an image filtering apparatus, including:

an acquisition unit, configured to obtain an image to be filtered;

a dividing unit, configured to determine a neural-network-based filter and to divide the image to be filtered according to the blocking method corresponding to the neural-network-based filter to obtain N image blocks to be filtered, the blocking method being the blocking method of the training images used by the neural-network-based filter during training, N being a positive integer; and

a filtering unit, configured to use the neural-network-based filter to filter each of the N image blocks to be filtered, to obtain a filtered image.

In another aspect, an electronic device is provided, including a processor and a memory, the memory being configured to store a computer program and the processor being configured to call and run the computer program stored in the memory to perform the methods provided by the embodiments of the various aspects of the present application.

In another aspect, a chip is provided for implementing the methods of the various aspects of the present application. Illustratively, the chip includes a processor configured to call and run a computer program from a memory, so that a device on which the chip is installed performs the methods provided by the embodiments of the various aspects of the present application.

In another aspect, a computer-readable storage medium is provided for storing a computer program, the computer program causing a computer to perform the methods provided by the embodiments of the various aspects of the present application.

In another aspect, a computer program product is provided, including computer program instructions that cause a computer to perform the methods provided by the embodiments of the various aspects of the present application.

In another aspect, a computer program is provided which, when run on a computer, causes the computer to perform the methods provided by the embodiments of the various aspects of the present application.

In summary, in the present application, an image to be filtered is obtained; a neural-network-based filter is determined; the image to be filtered is divided according to the blocking method corresponding to the neural-network-based filter, obtaining N image blocks to be filtered, N being a positive integer; and the neural-network-based filter is used to filter each of the N image blocks to be filtered, obtaining a filtered image. The blocking method is the blocking method of the training images used by the neural-network-based filter during training; that is, in the embodiments of the present application, the blocking method used when the neural-network-based filter is actually applied is kept consistent with the blocking method used during training, so that the neural-network-based filter delivers its best filtering performance, thereby improving the filtering effect on the image.
Brief Description of the Drawings

Figure 1 is a schematic diagram of an application scenario involved in an embodiment of the present application;

Figure 2 is a schematic block diagram of a video coding and decoding system involved in an embodiment of the present application;

Figure 3 is a schematic diagram of the encoding framework provided by an embodiment of the present application;

Figure 4 is a schematic diagram of the decoding framework provided by an embodiment of the present application;

Figures 5 to 7 are schematic diagrams of image types;

Figure 8 is a schematic diagram of filtering involved in an embodiment of the present application;

Figures 9 to 12 show an image blocking method;

Figure 13 is a flowchart of an image filtering method provided by an embodiment of the present application;

Figures 14 to 17 are schematic diagrams of one image blocking method involved in embodiments of the present application;

Figures 18 to 21 are schematic diagrams of another image blocking method involved in embodiments of the present application;

Figures 22 to 25 are schematic diagrams of another image blocking method involved in embodiments of the present application;

Figures 26 to 29 are schematic diagrams of one image block expansion method involved in embodiments of the present application;

Figures 30 to 33 are schematic diagrams of another image block expansion method involved in embodiments of the present application;

Figures 34 and 35 are schematic diagrams of one spatial reference image block involved in embodiments of the present application;

Figures 36 and 37 are schematic diagrams of another spatial reference image block involved in embodiments of the present application;

Figures 38 and 39 are schematic diagrams of another spatial reference image block involved in embodiments of the present application;

Figures 40 and 41 are schematic diagrams of another spatial reference image block involved in embodiments of the present application;

Figures 42 and 43 are schematic diagrams of one temporal reference image block involved in embodiments of the present application;

Figures 44 and 45 are schematic diagrams of another temporal reference image block involved in embodiments of the present application;

Figure 46 is a schematic flowchart of an image filtering method provided by an embodiment of the present application;

Figure 47 is a schematic block diagram of an image filtering apparatus provided by an embodiment of the present application;

Figure 48 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
Detailed Description

Figure 1 is a schematic diagram of an application scenario involved in an embodiment of the present application, including an electronic device 100 on which a neural-network-based filter 200 is installed. After the electronic device 100 obtains an image to be filtered, it inputs the image to be filtered into the neural-network-based filter 200 for filtering.

In some embodiments, the electronic device 100 includes a display apparatus, so that the electronic device 100 can display the filtered image through the display apparatus.

The embodiments of the present application do not limit the specific type of the electronic device 100; it may be any device with data-processing capability.

In some embodiments, the electronic device 100 may be a terminal device, for example a smartphone, desktop computer, mobile computing apparatus, notebook (e.g. laptop) computer, tablet computer, set-top box, television, camera, display apparatus, digital media player, video game console or vehicle-mounted computer.

In some embodiments, the electronic device 100 may also be a server, and there may be one or more servers. When there are multiple servers, at least two servers may provide different services and/or at least two servers may provide the same service, for example providing the same service in a load-balanced manner, which is not limited by the embodiments of the present application.

Optionally, the server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud-computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain-name services, security services, CDN (Content Delivery Network) and big-data and artificial-intelligence platforms. The server may also serve as a node of a blockchain.

In some embodiments, the server is a cloud server with powerful computing resources, characterized by a high degree of virtualization and distribution.

In some embodiments, the image to be filtered is captured by an image acquisition device. For example, the image acquisition device sends the captured image to be filtered to the electronic device 100, which filters it with the neural-network-based filter. As another example, the electronic device 100 itself has an image acquisition function, so that it can capture images and input the captured image to be filtered into the neural-network-based filter for filtering.

In some embodiments, the electronic device 100 may be an encoding device, and the image to be filtered can be understood as a reconstructed image: the encoding device encodes and then reconstructs the current image to obtain a reconstructed image, and inputs the reconstructed image into the neural-network-based filter for filtering.

In some embodiments, the electronic device 100 may be a decoding device: the decoding device decodes the bitstream and performs image reconstruction to obtain a reconstructed image, and then inputs the reconstructed image into the neural-network-based filter for filtering.

The embodiments of the present application can be applied to any scenario in which images need to be filtered.

In some embodiments, the embodiments of the present application can be applied to various scenarios, including but not limited to cloud technology (for example cloud gaming), artificial intelligence, intelligent transportation and assisted driving.

In some embodiments, the present application can be applied to the fields of image coding and decoding, video coding and decoding, hardware video coding and decoding, dedicated-circuit video coding and decoding, real-time video coding and decoding, and so on. For example, the solutions of the present application can be combined with audio-video coding standards (Audio Video coding Standard, AVS), such as the H.264/Audio Video Coding (AVC) standard, the H.265/High Efficiency Video Coding (HEVC) standard and the H.266/Versatile Video Coding (VVC) standard. Alternatively, the solutions of the present application can be combined with other proprietary or industry standards, including ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including the Scalable Video Coding (SVC) and Multi-view Video Coding (MVC) extensions. It should be understood that the techniques of the present application are not limited to any particular coding standard or technique.
For ease of understanding, the video coding and decoding system involved in the embodiments of the present application is first introduced with reference to Figure 2.

Figure 2 is a schematic block diagram of a video coding and decoding system involved in an embodiment of the present application. It should be noted that Figure 2 is only an example; the video coding and decoding systems of the embodiments of the present application include, but are not limited to, what is shown in Figure 2. As shown in Figure 2, the video coding and decoding system includes an encoding device 110 and a decoding device 120. The encoding device encodes (which can be understood as compressing) the video data to generate a bitstream and transmits the bitstream to the decoding device. The decoding device decodes the bitstream generated by the encoding device to obtain decoded video data.

The encoding device 110 of the embodiments of the present application can be understood as a device with a video encoding function, and the decoding device 120 as a device with a video decoding function; that is, the encoding device 110 and the decoding device 120 cover a broad range of apparatuses, including smartphones, desktop computers, mobile computing apparatuses, notebook (e.g. laptop) computers, tablet computers, set-top boxes, televisions, cameras, display apparatuses, digital media players, video game consoles, vehicle-mounted computers and the like.

In some embodiments, the encoding device 110 can transmit the encoded video data (such as the bitstream) to the decoding device 120 via a channel 130. The channel 130 may include one or more media and/or apparatuses capable of transmitting the encoded video data from the encoding device 110 to the decoding device 120.

In one example, the channel 130 includes one or more communication media that enable the encoding device 110 to transmit the encoded video data directly to the decoding device 120 in real time. In this example, the encoding device 110 may modulate the encoded video data according to a communication standard and transmit the modulated video data to the decoding device 120. The communication media include wireless communication media such as the radio-frequency spectrum; optionally, the communication media may also include wired communication media such as one or more physical transmission lines.

In another example, the channel 130 includes a storage medium that can store the video data encoded by the encoding device 110. Storage media include a variety of locally accessible data storage media, such as optical discs, DVDs and flash memory. In this example, the decoding device 120 can obtain the encoded video data from the storage medium.

In another example, the channel 130 may include a storage server that can store the video data encoded by the encoding device 110. In this example, the decoding device 120 can download the stored encoded video data from the storage server. Optionally, the storage server can store the encoded video data and transmit it to the decoding device 120; examples include web servers (e.g. for websites) and File Transfer Protocol (FTP) servers.

In some embodiments, the encoding device 110 includes a video encoder 112 and an output interface 113, where the output interface 113 may include a modulator/demodulator (modem) and/or a transmitter.

In some embodiments, in addition to the video encoder 112 and the output interface 113, the encoding device 110 may also include a video source 111.

The video source 111 may include at least one of a video capture apparatus (e.g. a video camera), a video archive, a video input interface and a computer graphics system, where the video input interface is used to receive video data from a video content provider and the computer graphics system is used to generate video data.

The video encoder 112 encodes the video data from the video source 111 to produce a bitstream. The video data may include one or more pictures or a sequence of pictures. The bitstream contains the coded information of the picture or picture sequence in the form of a bit stream. The coded information may include coded picture data and associated data. The associated data may include a sequence parameter set (SPS), a picture parameter set (PPS) and other syntax structures. The SPS may contain parameters applied to one or more sequences, and the PPS may contain parameters applied to one or more pictures. A syntax structure is a set of zero or more syntax elements arranged in a specified order in the bitstream.

The video encoder 112 transmits the encoded video data directly to the decoding device 120 via the output interface 113. The encoded video data can also be stored on a storage medium or storage server for subsequent reading by the decoding device 120.

In some embodiments, the decoding device 120 includes an input interface 121 and a video decoder 122.

In some embodiments, in addition to the input interface 121 and the video decoder 122, the decoding device 120 may also include a display apparatus 123.

The input interface 121 includes a receiver and/or a modem, and can receive the encoded video data via the channel 130.

The video decoder 122 decodes the encoded video data to obtain decoded video data, and transmits the decoded video data to the display apparatus 123.

The display apparatus 123 displays the decoded video data. It may be integrated with the decoding device 120 or external to it, and may comprise any of a variety of display apparatuses such as a liquid crystal display (LCD), a plasma display, an organic light-emitting diode (OLED) display or another type of display apparatus.

Furthermore, Figure 2 is only an example; the technical solutions of the embodiments of the present application are not limited to Figure 2. For example, the techniques of the present application can also be applied to one-sided video encoding or one-sided video decoding.
The video encoding framework involved in the embodiments of the present application is introduced below.

Figure 3 is a schematic diagram of the encoding framework provided by an embodiment of the present application.

It should be understood that this encoding framework can be used for lossy compression of images or for lossless compression, where the lossless compression may be visually lossless compression or mathematically lossless compression.

The encoding framework can be applied to image data in luminance-chrominance (YCbCr, YUV) format.

For example, the encoding framework reads video data and, for each frame, divides the frame into several coding tree units (CTUs). In some examples, a CTU may be called a "tree block", a "largest coding unit" (LCU) or a "coding tree block" (CTB). Each CTU may be associated with a pixel block of equal size within the picture. Each pixel may correspond to one luminance (luma) sample and two chrominance (chroma) samples, so each CTU may be associated with one luma sample block and two chroma sample blocks. The size of a CTU is, for example, 128x128, 64x64 or 32x32. A CTU can be further divided into several coding units (CUs) for encoding, where a CU can be a rectangular or square block. A CU can be further divided into prediction units (PUs) and transform units (TUs), decoupling coding, prediction and transform and making processing more flexible. In one example, a CTU is divided into CUs by a quadtree, and a CU is divided into TUs and PUs by a quadtree.
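As a small illustration of the CTU partition arithmetic (a sketch; the 64x64 CTU size is just one of the sizes mentioned above):

```python
import math

def ctu_grid(width: int, height: int, ctu: int = 64) -> tuple:
    """Number of CTU columns and rows covering a frame; when the frame
    size is not a multiple of the CTU size, the rightmost/bottom CTUs
    are incomplete."""
    return math.ceil(width / ctu), math.ceil(height / ctu)

# Example: a 1920x1080 frame with 64x64 CTUs gives a 30x17 grid;
# the bottom CTU row is incomplete, since 1080 = 16 * 64 + 56.
```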
The video encoder and video decoder can support various PU sizes. Assuming the size of a particular CU is 2Nx2N, the video encoder and decoder may support a PU size of 2Nx2N or NxN for intra prediction, and symmetric PUs of 2Nx2N, 2NxN, Nx2N, NxN or similar sizes for inter prediction; they may also support asymmetric PUs of 2NxnU, 2NxnD, nLx2N and nRx2N for inter prediction.

As shown in Figure 3, the encoding framework includes: a prediction unit 11, a residual generation unit 12, a transform unit 13, a quantization unit 14, an inverse quantization unit 15, an inverse transform unit 16, a reconstruction unit 17, a filtering unit 18 and an entropy coding unit 19. The prediction unit 11 includes an inter prediction unit 101 and an intra prediction unit 102. The inter prediction unit 101 includes a motion estimation unit 1011 and a motion compensation unit 1012. It should be noted that the encoding framework may contain more, fewer or different functional components.

Optionally, in the present application, the current block may be called the current coding unit (CU) or the current prediction unit (PU), among other names. A prediction block may also be called a predicted image block or an image prediction block, and a reconstructed image block may also be called a reconstruction block or an image reconstruction block.

After the encoding end receives a video, for each frame composing the video, the frame is divided into multiple image blocks to be encoded. For the current image block to be encoded, the prediction unit 11 first predicts the current block by referring to reconstructed image blocks, obtaining prediction information for the current block. The encoding end may use inter prediction or intra prediction techniques to obtain the prediction information.

Illustratively, the motion estimation unit 1011 of the inter prediction unit 101 can search the reference pictures in a reference picture list for a reference block of the image block to be encoded. The motion estimation unit 1011 can generate an index indicating the reference block and a motion vector indicating the spatial displacement between the image block to be encoded and the reference block, and can output the index and the motion vector as the motion information of the image block to be encoded. The motion compensation unit 1012 can obtain the prediction information of the image block to be encoded based on its motion information.

The intra prediction unit 102 can generate prediction information for the current image block to be encoded using intra prediction modes. There are currently 15 intra prediction modes, including Planar mode, DC mode and 13 angular prediction modes. The intra prediction unit 102 may also use intra block copy (IBC), intra string copy (ISC) and similar techniques.

The intra prediction modes used by HEVC are Planar, DC and 33 angular modes, 35 prediction modes in total. The intra modes used by VVC are Planar, DC and 65 angular modes, 67 prediction modes in total. The intra modes used by AVS3 are DC, Plane, Bilinear and 63 angular modes, 66 prediction modes in total.

The residual generation unit 12 subtracts the prediction information from the original signal of the current image block to be encoded, obtaining a residual signal. After prediction, the amplitude of the residual signal is much smaller than that of the original signal.

The transform unit 13 and the quantization unit 14 perform transform and quantization operations on the residual signal, obtaining transform-quantization coefficients.

The entropy coding unit 19 encodes the quantized coefficients, together with other indication information used in encoding, through entropy-coding techniques, obtaining the bitstream.

Furthermore, the encoding end also needs to reconstruct the current image block to be encoded, to provide reference pixels for encoding subsequent image blocks. Illustratively, after the transform-quantization coefficients of the current image block are obtained, the inverse quantization unit 15 and the inverse transform unit 16 dequantize and inverse-transform the coefficients to obtain a reconstructed residual signal; the reconstruction unit 17 adds the reconstructed residual signal to the prediction information of the current image block, obtaining the reconstructed signal of the block, from which the reconstructed image block is obtained.
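As a minimal illustration of the reconstruction relation just described (a sketch, not the encoder's actual implementation):

```python
import torch

def reconstruct_block(prediction: torch.Tensor,
                      reconstructed_residual: torch.Tensor) -> torch.Tensor:
    """Reconstruction as described: the residual recovered by inverse
    quantization and inverse transform is added back to the prediction
    to obtain the reconstructed block."""
    return prediction + reconstructed_residual
```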
Furthermore, the filtering unit 18 can filter the reconstructed image block, using techniques such as deblocking filtering (DBF), sample adaptive offset (SAO) or adaptive loop filtering (ALF). The reconstructed image block can then be used to predict subsequent image blocks to be encoded.

In some embodiments, reconstructed image blocks can be stored in a decoded picture buffer; the inter prediction unit 101 can use reference pictures containing reconstructed pixel blocks to perform inter prediction on PUs of other pictures, and the intra prediction unit 102 can use the reconstructed image blocks in the decoded picture buffer to perform intra prediction on other PUs of the same picture as the CU.
Figure 4 is a schematic diagram of the decoding framework provided by an embodiment of the present application.

As shown in Figure 4, the decoding framework includes: an entropy decoding unit 21, a prediction unit 22, an inverse quantization unit 23, an inverse transform unit 24, a reconstruction unit 25 and a filtering unit 26. The prediction unit 22 includes a motion compensation unit 221 and an intra prediction unit 222.

Illustratively, after the decoding end obtains the bitstream, the entropy decoding unit 21 first entropy-decodes the bitstream to obtain the transform-quantization coefficients of the current image block to be reconstructed; then the inverse quantization unit 23 and the inverse transform unit 24 dequantize and inverse-transform the coefficients, obtaining the reconstructed residual signal of the current block. The prediction unit 22 predicts the current block to obtain its prediction information. If the prediction unit 22 uses inter prediction, the motion compensation unit 221 can construct a first reference picture list (list 0) and a second reference picture list (list 1) according to syntax elements parsed from the bitstream; furthermore, the entropy decoding unit 21 can parse the motion information of the block to be reconstructed, based on which the motion compensation unit 221 determines one or more reference blocks and generates the prediction information of the block. If the prediction unit 22 uses intra prediction, the entropy decoding unit 21 can parse the index of the intra prediction mode used, and the intra prediction unit 222 performs intra prediction with that mode according to the index, obtaining the prediction information of the block to be reconstructed. The intra prediction unit 222 may also use IBC or ISC techniques.

Further, the reconstruction unit 25 adds the prediction information to the reconstructed residual signal, obtaining the reconstructed signal of the current block to be reconstructed, from which the current reconstructed image block corresponding to the current block is obtained; the current reconstructed image block can be used to predict other subsequent blocks to be reconstructed. Similarly to the encoding end, optionally, the filtering unit 26 at the decoding end can filter the current reconstructed image block.

It should be noted that the block-partition information determined at the encoding end, together with mode information or parameter information for prediction, transform, quantization, entropy coding, loop filtering and the like, is carried in the bitstream when necessary. The decoding end parses the bitstream and analyzes the existing information to determine the same block-partition information and the same mode and parameter information for prediction, transform, quantization, entropy coding, loop filtering and so on as the encoding end, thereby ensuring that the decoded image obtained at the encoding end is identical to the decoded image obtained at the decoding end.

The above is the basic flow of a video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of this framework or flow may be optimized. The present application is applicable to the basic flow of the video codec under the block-based hybrid coding framework, but is not limited to this framework or flow.
From the prediction methods of video coding described above, coded pictures are divided into all-intra coded pictures and inter-codable pictures, as shown in Figures 5 to 7, where a picture comprises frames, slices and tiles. The dashed boxes in Figure 5 represent the boundaries of the largest coding units (CTUs), the solid black lines in Figure 6 represent slices, and the solid black lines in Figure 7 represent tile boundaries.

For an all-intra coded picture, all reference information used for prediction comes from the spatial information of the current picture. An inter-codable picture may refer to temporal reference information of other reference frames during prediction.

From the above knowledge of video coding, the traditional in-loop filters are DBF, SAO and ALF; they mainly filter the reconstructed image to reduce blocking artifacts, ringing artifacts and the like, thereby improving the quality of the reconstructed image, the ideal case being to restore the reconstructed image to the original image through filtering. However, since many filter coefficients of traditional filters are designed by hand, there is considerable room for optimization. Given the excellent performance of deep-learning tools in image processing, deep-learning-based filters have been applied in the loop-filter module.

The main technique involved in the embodiments of the present application is the neural-network-based filter, for example the neural-network-based loop filter (NNLF). As shown in Figure 8, the image to be filtered is input into a trained NNLF for filtering, obtaining the filtered image.
In some embodiments, during the training of a neural-network-based filter, an input image must be provided and a target image specified as the optimization target for training the filter parameters. During training, the input image and the target image are spatially aligned. In general, the input image is selected from the reconstructed, distorted images, and the target image is selected from the original images as the best optimization target.

A loss function is used in model training; the loss function measures the difference between the predicted value and the true value — the larger the loss value, the larger the difference — and the goal of training is to reduce the loss. For deep-learning-based coding tools, commonly used loss functions are the L1-norm loss, the L2-norm loss and the smooth-L1 loss.
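For reference, the three losses can be written as follows (standard definitions, with the conventional smooth-L1 transition point of 1; here $d_i = \hat{y}_i - y_i$ denotes the per-sample prediction error):

```latex
L_1 = \sum_i |d_i|, \qquad
L_2 = \sum_i d_i^2, \qquad
L_{\mathrm{smooth}\,L_1} = \sum_i
\begin{cases}
0.5\, d_i^2, & |d_i| < 1 \\
|d_i| - 0.5, & \text{otherwise}
\end{cases}
```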
When a neural-network-based filter is actually used, the whole frame is generally not input for filtering directly; instead, the image is divided into sub-images, which are fed one by one into the neural-network-based filter for filtering.

Therefore, during the training of the neural-network-based filter, the input image and the target image are also divided, yielding matching pairs of input image blocks and target image blocks, and the neural-network-based filter is trained on these matching pairs.

In some embodiments, when dividing the input image and the target image, random cropping within a frame is used to select the matching pairs formed by input image blocks and target image blocks. As shown in Figures 9 and 10, random cropping is applied to the input image and the target image, obtaining image blocks 1 to 5. In the actual filtering process, however, filtering is usually performed with the CTU as the basic unit; for example, as shown in Figures 11 and 12, actual filtering proceeds over the five CTUs A to E.
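A minimal sketch of how such aligned random crops can be drawn (PyTorch-style tensors assumed; the names are illustrative):

```python
import random
import torch

def random_crop_pair(input_image: torch.Tensor, target_image: torch.Tensor,
                     size: int = 128) -> tuple:
    """Draw one spatially aligned (input, target) patch pair by random
    cropping: the same offsets are used in both images so the matching
    pair stays aligned, as the supervised setup requires."""
    _, h, w = input_image.shape
    y = random.randint(0, h - size)
    x = random.randint(0, w - size)
    return (input_image[:, y:y + size, x:x + size],
            target_image[:, y:y + size, x:x + size])
```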
It follows from the above that, because the way image blocks are divided during the training of the neural-network-based filter is inconsistent with the way they are divided in actual use, the filtering effect of the neural-network-based filter is poor.

To solve the above technical problem, in the actual filtering process the embodiments of the present application divide the image to be filtered using the same blocking method as the training images used during the training of the neural-network-based filter, obtaining N image blocks to be filtered, and, for each of the N image blocks to be filtered, use the neural-network-based filter to filter that block, obtaining the final filtered image. That is, in the embodiments of the present application, the blocking method used when the neural-network-based filter is actually applied is kept consistent with the blocking method used during training, so that the neural-network-based filter delivers its best filtering performance, thereby improving the filtering effect on the image.

The technical solutions of the embodiments of the present application are described in detail below through several embodiments. The following embodiments can be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
Figure 13 is a flowchart of an image filtering method provided by an embodiment of the present application. As shown in Figure 13, the embodiment of the present application includes the following steps:

S801. Obtain the image to be filtered.

In the embodiments of the present application, methods of obtaining the image to be filtered include, but are not limited to, the following cases:

Case 1: non-video-codec scenarios. In one example, the image to be filtered may be captured by an image acquisition device, for example shot with a camera. In another example, the image to be filtered may be generated by an image-generation device, for example drawn by an image rendering apparatus.

Case 2: video-codec scenarios, where the ways of obtaining the image to be filtered include at least the following:

Way 1: at the encoding end, the image to be filtered may be the image before encoding; that is, at the encoding end, before encoding the image to be filtered, the image is first filtered, and the filtered image is then encoded. For example, before encoding image-to-be-filtered 1, the encoding end first inputs it into the neural-network-based filter for filtering, obtaining filtered image 2. Image 2 is then divided into multiple coding blocks; each coding block is predicted using inter or intra prediction to obtain its prediction block; the coding block and the prediction block are differenced to obtain the residual block; the residual block is transformed and quantized to obtain quantized coefficients; and the quantized coefficients are finally encoded.

Way 2: at the encoding end, the image to be filtered may be a reconstructed image; that is, the current image is reconstructed to obtain the reconstructed image of the current image, and the reconstructed image is determined as the image to be filtered. For example, the encoding end divides the current image into multiple coding blocks; each coding block is predicted using inter or intra prediction to obtain its prediction block; the coding block and the prediction block are differenced to obtain the residual block; the residual block is transformed and quantized to obtain quantized coefficients, which are then encoded. In addition, the encoding end dequantizes the quantized parameters to obtain the transform coefficients of the coding block, inverse-transforms the transform coefficients to obtain the residual block, and adds the residual block to the prediction block to obtain the reconstruction block of that coding block. The reconstruction blocks of all coding blocks in the current image are combined into the reconstructed image of the current image, which is then input, as the image to be filtered, into the neural-network-based filter for filtering, obtaining the filtered image.

Way 3: at the decoding end, the image to be filtered may be a reconstructed image; that is, the current image is reconstructed to obtain the reconstructed image of the current image, and the reconstructed image is determined as the image to be filtered. For example, the decoding end decodes the received bitstream to obtain the quantized coefficients of the current block of the current image; it then dequantizes them to obtain the transform coefficients of the current block, and inverse-transforms the transform coefficients to obtain the residual block. Further, the current block is predicted using inter or intra prediction to obtain its prediction block, and the residual block is added to the prediction block to obtain the reconstruction block of the current block. The reconstruction blocks of all blocks of the current image are combined into the reconstructed image of the current image, which is then input, as the image to be filtered, into the neural-network-based filter for filtering, obtaining the filtered image.

It should be noted that, in the embodiments of the present application, the ways of determining the image to be filtered include, but are not limited to, the above; the image to be filtered may also be obtained by other methods, which is not limited by the embodiments of the present application.
S802. Determine the neural-network-based filter; divide the image to be filtered according to the blocking method corresponding to the neural-network-based filter, obtaining N image blocks to be filtered.

The blocking method is the blocking method of the training images used by the neural-network-based filter during training; that is, the image to be filtered is divided using the same blocking method as the training images used by the neural-network-based filter during training, obtaining N image blocks to be filtered, where N is a positive integer.

In the embodiments of the present application, before the neural-network-based filter is used to filter the image to be filtered, the filter first needs to be determined.

In some embodiments, the neural-network-based filter is preset or default, so that the preset or default neural-network-based filter can be used directly for filtering.

In some embodiments, the embodiments of the present application include multiple neural-network-based candidate filters, hereinafter referred to as candidate filters, so that one candidate filter can be determined from the multiple candidate filters as the neural-network-based filter of the embodiment of the present application.

In one example, the network structures of at least two of the multiple candidate filters are not entirely the same.

In another example, at least two of the multiple candidate filters use different blocking methods of the training images during training. For example, when candidate filter 1 is trained, one CTU in the input image is determined as an input block and the CTU at the same position in the target image as a target block; candidate filter 1 is then trained with that input block as input and that target block as target. As another example, when candidate filter 2 is trained, two CTUs in the input image are determined as an input block and the two CTUs at the same position in the target image as a target block; candidate filter 2 is then trained with that input block as input and that target block as target.

That is to say, in the embodiments of the present application, the network structure, training parameters, training method and other information of the multiple candidate filters are not entirely the same.
The ways of determining the neural-network-based filter from the multiple candidate filters include, but are not limited to, the following:

Way 1: any one of the candidate filters is determined as the neural-network-based filter of this step.

Way 2: the candidate filter with the best filtering effect among the multiple candidate filters is used as the neural-network-based filter of the present application. Illustratively, each of the multiple candidate filters is used to filter the image to be filtered, obtaining the filtered image under each candidate filter; the filtered images are compared to determine the filtered image with the best effect, and the candidate filter corresponding to that best filtered image is determined as the neural-network-based filter of this step.

The embodiments of the present application do not limit the method of determining the image quality of the filtered image; for example, image indicators such as clarity, sharpness and artifacts can be determined to assess the image quality of the filtered image.

Way 3: the candidate filter with the smallest distortion among the multiple candidate filters is used as the neural-network-based filter of the present application. Illustratively, each of the multiple candidate filters is used to filter the image to be filtered, obtaining the filtered image under each candidate filter; the filtered image under each candidate filter is compared with the image to be filtered to determine the distortion corresponding to each candidate filter, and the candidate filter with the smallest distortion is determined as the neural-network-based filter of this step.

The embodiments of the present application do not limit the method of determining the distortion corresponding to a candidate filter; for example, the difference between the filtered image under a candidate filter and the image to be filtered can be determined as the distortion corresponding to that candidate filter.
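Way 3 can be sketched as follows (a minimal illustration; mean squared error is assumed as the distortion measure, which the embodiments leave open):

```python
import torch

def select_filter(candidates: list, image: torch.Tensor) -> torch.nn.Module:
    """Pick the candidate filter whose output deviates least from the
    image to be filtered (selection way 3 above)."""
    def distortion(f: torch.nn.Module) -> float:
        with torch.no_grad():
            filtered = f(image.unsqueeze(0)).squeeze(0)
        return torch.mean((filtered - image) ** 2).item()  # MSE as the distortion measure
    return min(candidates, key=distortion)
```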
It should be noted that, in the embodiments of the present application, the ways of determining the neural-network-based filter include, but are not limited to, the above.

After the neural-network-based filter is determined by the above method, the blocking method of the training images used by that filter during training is obtained. For example, the file of the neural-network-based filter includes the blocking method of the training images used during training, so the blocking method can be read directly from the filter's file.

In the embodiments of the present application, to improve the filtering performance of the neural-network-based filter, the blocking method used when the filter is actually applied is kept consistent with the blocking method used during training.

Based on this, in the actual filtering process, the embodiments of the present application divide the image to be filtered using the blocking method of the training images used by the neural-network-based filter during training, obtaining N image blocks to be filtered. For example, if during training each CTU in the training image is determined as one training image block for model training, then in actual filtering each CTU in the image to be filtered is also determined as one image block to be filtered, ensuring that the blocking method of the neural-network-based filter in actual use is consistent with that during training, so that the filter can deliver its best performance, improving the filtering effect.

The embodiments of the present application do not limit the specific type of the blocking method of the training images used during the training of the neural-network-based filter; it may be any blocking method.

In some embodiments, if the blocking method of the training images used during the training of the neural-network-based filter includes determining M CTUs in the training image as one training image block, M being a positive integer, then dividing the image to be filtered according to the blocking method corresponding to the neural-network-based filter in S802 to obtain N image blocks to be filtered includes the following S802-A:

S802-A. Determine M CTUs in the image to be filtered as one image block to be filtered, obtaining N image blocks to be filtered.

That is, taking M CTUs as one image block to be filtered, N image blocks to be filtered are determined from the image to be filtered.

The embodiments of the present application do not limit the specific value of M.

Example 1: M = 1. As shown in Figure 14, during the training of the neural-network-based filter, one CTU of the training image is determined as one training image block for training, where the area inside one smallest dashed box in Figure 14 is one CTU.

In one possible implementation, if the neural-network-based filter is trained by a supervised training method, the training image includes an input image and a target image, where the target image can be understood as the supervision image. During training, as shown in Figure 15, one CTU of the input image is determined as an input image block; as shown in Figure 16, one CTU of the target image is determined as a target image block; the input image block and the target image block form a matching pair, which means that the position of the input image block in the input image is consistent with the position of the target image block in the target image. The input image block is then input into the neural-network-based filter for filtering, obtaining the filtered image block corresponding to the input image block. The filtered image block is compared with the target image block, the loss is computed, and the parameters of the neural-network-based filter are adjusted according to the loss. Training then continues in the same way, with the input image block of the next matching pair as input and the target image block of that pair as target, until a trained neural-network-based filter is obtained.

As can be seen, in Example 1 the neural-network-based filter determines one CTU in the training image as one training image block during training; correspondingly, in the actual filtering process, as shown in Figure 17, one CTU in the image to be filtered is determined as one image block to be filtered, so that dividing the image to be filtered yields N image blocks to be filtered.

Example 2: M = 4. As shown in Figure 18, during the training of the neural-network-based filter, four CTUs of the training image are determined as one training image block for training.

In one possible implementation, if the neural-network-based filter is trained by a supervised training method, the training image includes an input image and a target image, where the target image can be understood as the supervision image. In the specific training process, as shown in Figure 19, four CTUs of the input image are determined as one input image block; as shown in Figure 20, four CTUs of the target image are determined as one target image block; the input image block and the target image block form a matching pair, which means that the position of the input image block in the input image is consistent with the position of the target image block in the target image. The input image block is then input into the neural-network-based filter for filtering, obtaining the filtered image block corresponding to the input image block. The filtered image block is compared with the target image block, the loss is computed, and the parameters of the neural-network-based filter are adjusted according to the loss. Training then continues in the same way with the next matching pair until a trained neural-network-based filter is obtained.

As can be seen, in Example 2 the neural-network-based filter determines four CTUs in the training image as one training image block during training; correspondingly, in the actual filtering process, as shown in Figure 21, four CTUs in the image to be filtered are determined as one image block to be filtered, yielding N image blocks to be filtered.

The above takes M = 1 or 4 as examples; in some embodiments, M may also be any positive integer such as 2, 3 or 5, which is not limited by the embodiments of the present application.
In some embodiments, if the blocking method of the training images used during the training of the neural-network-based filter includes determining P incomplete CTUs in the training image as one training image block, P being a positive integer, then dividing the image to be filtered in S802 using the same blocking method as the training images used during training to obtain N image blocks to be filtered includes the following S802-B:

S802-B. Determine P incomplete CTUs in the image to be filtered as one image block to be filtered, obtaining N image blocks to be filtered.

That is, taking P incomplete CTUs as one image block to be filtered, N image blocks to be filtered are determined from the image to be filtered.

The embodiments of the present application do not limit the specific value of P.

For example, assuming P = 4, as shown in Figure 22, during the training of the neural-network-based filter four incomplete CTUs of the training image are determined as one training image block for training.

In one possible implementation, if the neural-network-based filter is trained by a supervised training method, the training image includes an input image and a target image, where the target image can be understood as the supervision image. In the specific training process, as shown in Figure 23, four incomplete CTUs of the input image are determined as one input image block; as shown in Figure 24, four incomplete CTUs of the target image are determined as one target image block; the input image block and the target image block form a matching pair. The input image block is then input into the neural-network-based filter for filtering, obtaining the corresponding filtered image block. The filtered image block is compared with the target image block, the loss is computed, and the parameters of the neural-network-based filter are adjusted according to the loss. Training then continues in the same way with the next matching pair until a trained neural-network-based filter is obtained.

As can be seen, in this example the neural-network-based filter determines four incomplete CTUs in the training image as one training image block during training; correspondingly, in the actual filtering process, as shown in Figure 25, four incomplete CTUs in the image to be filtered are determined as one image block to be filtered, yielding N image blocks to be filtered.

The above takes P = 4 as an example; in some embodiments, P may also be any positive integer such as 1, 2, 3 or 5, which is not limited by the embodiments of the present application.

In the above example the four incomplete CTUs are adjacent; in some embodiments, when P is greater than 1, the P incomplete CTUs need not all be adjacent. That is, none of the P incomplete CTUs may be adjacent to one another, or some of them may be adjacent while others are not.

The above embodiments show that the blocking method of the training image includes determining M CTUs in the training image as one training image block, or determining P incomplete CTUs in the training image as one training image block. It should be noted that the blocking methods of the training image involved in the embodiments of the present application include, but are not limited to, the above examples, and the embodiments of the present application do not limit this.

After the above steps, the image to be filtered is divided using the blocking method of the training images used by the neural-network-based filter during training, obtaining N image blocks to be filtered; then the following S803 is performed.

S803. Use the neural-network-based filter to filter each of the N image blocks to be filtered, obtaining the filtered image.

In the embodiments of the present application, the image to be filtered is divided using the same blocking method as the training images, obtaining N image blocks to be filtered. Each of the N image blocks to be filtered is input into the neural-network-based filter for filtering, obtaining the filtered image block of that block. In this way, the filtered image block of each of the N image blocks to be filtered can be determined, and the filtered image blocks of the N blocks together constitute the filtered image.
Since, during filtering, incorporating adjacent content beyond the block boundary improves the filtering effect in the boundary areas of a block, in order to further improve the filtering effect of the neural-network-based filter, in some embodiments the neural-network-based filter is trained on expanded image blocks of the training image blocks. That is to say, during the training of the filter, in addition to dividing the training image according to the above blocking method to obtain training image blocks, each training image block may also be expanded outward to obtain an expanded image block, and the expanded image blocks are used to train the neural-network-based filter. In this case, S803 includes the following steps S803-A1 to S803-A3:

S803-A1. For each of the N image blocks to be filtered, expand the image block to be filtered according to the expansion method of the training image blocks, obtaining an expanded image block to be filtered;

S803-A2. Use the neural-network-based filter to filter the expanded image block to be filtered, obtaining a filtered expanded image block;

S803-A3. Determine the image area corresponding to the image block to be filtered within the filtered expanded image block as the filtered image block of the image block to be filtered.

In the embodiments of the present application, to further improve the filtering effect on the image to be filtered, the image block to be filtered is expanded using the same expansion method as the training image blocks. Specifically, during actual filtering, the image to be filtered is divided using the same blocking method as the training images, obtaining N image blocks to be filtered. For each of the N image blocks, the same expansion method as the training image blocks is used to expand the block outward, obtaining an expanded image block to be filtered; the expanded block is input into the neural-network-based filter for filtering, obtaining a filtered expanded image block. Because the filtered expanded image block differs in size from the image block to be filtered, the filtered expanded image block is cropped: the image area corresponding to the image block to be filtered within the filtered expanded image block is determined as the filtered image block of that block. Following these steps, the filtered image block of each of the N blocks can be determined, and the filtered image blocks are stitched together to obtain the final filtered image.

The embodiments of the present application do not limit the specific expansion method of the training image blocks.

In some embodiments, the expansion method of the training image blocks includes expanding at least one boundary area of the training image block outward; in this case, expanding the image block to be filtered according to the expansion method of the training image blocks in S803-A1 includes: expanding at least one boundary area of the image block to be filtered outward, obtaining the expanded image block to be filtered.

In some examples, as shown in Figure 26, during the training of the neural-network-based filter the training image block is expanded outward on all four sides, and the expanded training image blocks are used to train the neural-network-based filter.

In one possible implementation, if the neural-network-based filter is trained by a supervised training method, the training image includes an input image and a target image, where the target image can be understood as the supervision image. In the specific training process, as shown in Figure 27, one CTU of the input image is determined as an input image block, which is expanded outward on all four sides, obtaining an expanded input image block. As shown in Figure 28, one CTU of the target image is determined as a target image block, which is expanded outward on all four sides, obtaining an expanded target image block. The expanded input image block is then input into the neural-network-based filter for filtering, obtaining the filtered image block corresponding to the expanded input image block. The filtered image block is compared with the expanded target image block, the loss is computed, and the parameters of the neural-network-based filter are adjusted according to the loss. Training then continues in the same way, with the expanded input image block of the next matching pair as input and the expanded target image block of that pair as target, until a trained neural-network-based filter is obtained.

As can be seen, in this example the neural-network-based filter determines one CTU in the training image as one training image block during training and expands the training image block outward on all sides, obtaining the expanded training image block. Correspondingly, in the actual filtering process, as shown in Figure 29, one CTU in the image to be filtered is determined as one image block to be filtered; then, using the above expansion method of the training image blocks, the block is expanded outward on all sides, obtaining an expanded image block to be filtered; the expanded block is input into the neural-network-based filter for filtering, obtaining a filtered expanded image block; and the image area corresponding to the image block to be filtered within the filtered expanded image block is determined as the filtered image block of that block.

In some examples, as shown in Figure 30, during the training of the neural-network-based filter the left and upper boundaries of the training image block are expanded outward, and the expanded training image blocks are used to train the neural-network-based filter.

In one possible implementation, if the neural-network-based filter is trained by a supervised training method, the training image includes an input image and a target image, where the target image can be understood as the supervision image. In the specific training process, as shown in Figure 31, one CTU of the input image is determined as an input image block, whose left and upper boundaries are expanded outward, obtaining an expanded input image block. As shown in Figure 32, one CTU of the target image is determined as a target image block, whose left and upper boundaries are expanded outward, obtaining an expanded target image block. The expanded input image block is then input into the neural-network-based filter for filtering, obtaining the filtered image block corresponding to the expanded input image block. The filtered image block is compared with the expanded target image block, the loss is computed, and the parameters of the neural-network-based filter are adjusted according to the loss. Training then continues in the same way with the next matching pair until a trained neural-network-based filter is obtained.

As can be seen, in this example one CTU in the training image is determined as one training image block during training, and the left and upper boundaries of the training image block are expanded outward, obtaining the expanded training image block. Correspondingly, in the actual filtering process, as shown in Figure 33, one CTU in the image to be filtered is determined as one image block to be filtered; then, using the above expansion method of the training image blocks, the left and upper boundaries of the block are expanded outward, obtaining an expanded image block to be filtered; the expanded block is input into the neural-network-based filter for filtering, obtaining a filtered expanded image block; and the image area corresponding to the image block to be filtered within the filtered expanded image block is determined as the filtered image block of that block.

In some examples, the expansion method of the training image blocks may also expand other boundaries of the training image block outward, which is not limited by the embodiments of the present application.
In some embodiments, to further improve the filtering effect of the neural-network-based filter, during training a reference image block of the input image block is also input in addition to the input image block. To keep the actual filtering process consistent with the training process, S803 then includes the following steps S803-B1 and S803-B2:

S803-B1. For each of the N image blocks to be filtered, determine a reference image block of the image block to be filtered;

S803-B2. Input the image block to be filtered and the reference image block of the image block to be filtered into the neural-network-based filter for filtering, obtaining the filtered image block of the image block to be filtered.

In the embodiments of the present application, if during training the input information of the neural-network-based filter includes an input image block and its reference image block, then during actual filtering the input information includes, in addition to the image block to be filtered, the reference image block of the image block to be filtered.

The embodiments of the present application do not limit the way the reference image block of the image block to be filtered is determined.

In some embodiments, the reference image block of the image block to be filtered is determined differently from the reference image block of the training image block.

In some embodiments, the reference image block of the image block to be filtered is determined in the same way as the reference image block of the input image block. In this case, determining the reference image block of the image block to be filtered in S803-B1 includes the following steps S803-B11 and S803-B12:

S803-B11. Obtain the determination method of the reference image block of the input image block, the determination method being used to determine the corresponding reference image block based on at least one of the spatial-domain information and the temporal-domain information of the input image block;

S803-B12. Determine the reference image block of the image block to be filtered according to the determination method of the reference image block of the input image block.

In the embodiments of the present application, the determination method of the reference image block of the input image block can be read from the file of the neural-network-based filter. In some embodiments, if the training device and the actual filtering device of the neural-network-based filter are the same device, the determination method of the reference image block of the input image block is stored on that device.

After the determination method of the reference image block of the input image block is obtained, it is used to determine the reference image block of the image block to be filtered. That is to say, in the embodiments of the present application, the determination method of the reference image block of the image block to be filtered is consistent with that of the reference image block of the input image block.

The embodiments of the present application do not limit the specific type of the reference image block.

In some embodiments, if the reference image block of the input image block includes at least one of a temporal reference image block and a spatial reference image block of the input image block, the reference image block of the image block to be filtered includes at least one of a temporal reference image block and a spatial reference image block of the image block to be filtered.

The spatial reference image block may be selected as an image area at a fixed position relative to the current input image block; that is, the spatial reference image block of the input image block and the input image block are in the same frame, namely both within the input image.

The difference between the temporal and spatial reference image blocks is that the temporal reference image block and the current input image block are in different frames; the reference position of the temporal reference image block may be chosen as the reference block at the same spatial position as in the current input image.

In the embodiments of the present application, the type of the reference image block of the image block to be filtered is the same as the type of the reference image block of the input image block.

Example 1: if the reference image block of the input image block includes a spatial reference image block, the reference image block of the image block to be filtered also includes a spatial reference image block. In this case, S803-B12 includes the following step:

S803-B12-A. Determine the spatial reference image block of the image block to be filtered according to the determination method of the spatial reference image block of the input image block.

In this example, if the reference image block of the input image block includes a spatial reference image block, the spatial reference image block of the image block to be filtered is determined according to the determination method of the spatial reference image block of the input image block, enabling accurate determination of the spatial reference image block of the image block to be filtered.

In one possible implementation, the determination method of the spatial reference image block of the input image block can be used to determine the spatial reference image block of the image block to be filtered, which ensures that the input information of the neural-network-based filter is consistent between training and actual filtering, improving the filter's performance.

The embodiments of the present application do not limit the specific type of the spatial reference image block.

In some embodiments, if the spatial reference image block of the input image block includes at least one of the upper-left image block, the left image block and the above image block of the input image block in the input image, the spatial reference image block of the image block to be filtered includes at least one of the upper-left image block, the left image block and the above image block of the image block to be filtered in the image to be filtered. In this case, S803-B12-A includes determining at least one of the upper-left, left and above image blocks of the image block to be filtered in the image to be filtered as the spatial reference image block of the image block to be filtered.

For example, as shown in Figure 34, if the spatial reference image block of the input image block includes the upper-left image block of the input image block in the input image, then, as shown in Figure 35, the upper-left image block of the image block to be filtered in the image to be filtered is determined as the spatial reference image block of the image block to be filtered.

As another example, as shown in Figure 36, if the spatial reference image block of the input image block includes the image block located to the left of the input image block in the input image, then, as shown in Figure 37, the image block located to the left of the image block to be filtered in the image to be filtered is determined as the spatial reference image block of the image block to be filtered.

As another example, as shown in Figure 38, if the spatial reference image block of the input image block includes the image block located above the input image block in the input image, then, as shown in Figure 39, the image block located above the image block to be filtered in the image to be filtered is determined as the spatial reference image block of the image block to be filtered.

As another example, as shown in Figure 40, if the spatial reference image block of the input image block includes the upper-left image block, the left image block and the above image block of the input image block in the input image, then, as shown in Figure 41, the upper-left image block, the left image block and the above image block of the image block to be filtered in the image to be filtered are determined as the spatial reference image blocks of the image block to be filtered.

In the embodiments of the present application, to further improve the filtering effect, during training the spatial reference image block of the input image block is also input in addition to the input image block, improving the filtering effect of the neural-network-based filter. Then, during actual filtering, to keep the input information consistent between the actual filtering process and the training process, the spatial reference image block of the image block to be filtered is determined using the same determination method as the spatial reference image block of the input image block, and the image block to be filtered together with its spatial reference image block is input into the neural-network-based filter, achieving the filtering of the image to be filtered.

Example 2: if the reference image block of the input image block includes a temporal reference image block, the reference image block of the image block to be filtered also includes a temporal reference image block. In this case, S803-B12 includes the following step:

S803-B12-B. Determine the temporal reference image block of the image block to be filtered according to the determination method of the temporal reference image block of the input image block.

In this example, if the reference image block of the input image block includes a temporal reference image block, the temporal reference image block of the image block to be filtered is determined according to the determination method of the temporal reference image block of the input image block, enabling accurate determination of the temporal reference image block of the image block to be filtered.

In one possible implementation, the determination method of the temporal reference image block of the input image block can be used to determine the temporal reference image block of the image block to be filtered, which ensures that the input information of the neural-network-based filter is consistent between training and actual filtering, improving the filter's performance.

In some embodiments, S803-B12-B includes: determining the reference image of the image to be filtered; and determining the image block at the position in that reference image corresponding to the image block to be filtered as the temporal reference image block of the image block to be filtered.

For example, as shown in Figures 42 and 43, the temporal reference image block of the input image block is the image block in the reference image of the input image at the position corresponding to the input image block; that is, the position of the temporal reference image block in the reference image of the input image is consistent with the position of the input image block in the input image. In this case, as shown in Figures 44 and 45, the temporal reference image block of the image block to be filtered is determined by first determining the reference image of the image to be filtered, and then determining the image block in that reference image at the position corresponding to the image block to be filtered as the temporal reference image block of the image block to be filtered.

The embodiments of the present application do not limit the type of the reference image. For example, when the method of the embodiment of the present application is applied at the encoding end, the reference image of the image to be filtered may be any encoded image; when applied at the decoding end, it may be any decoded image.

In the embodiments of the present application, to further improve the filtering effect, during training the temporal reference image block of the input image block is also input in addition to the input image block, improving the filtering effect of the neural-network-based filter. Then, during actual filtering, to keep the input information consistent between the actual filtering process and the training process, the temporal reference image block of the image block to be filtered is determined using the same determination method as the temporal reference image block of the input image block, and the image block to be filtered together with its temporal reference image block is input into the neural-network-based filter, achieving the filtering of the image to be filtered.

In some embodiments, if the reference image block of the input image block includes both a spatial reference image block and a temporal reference image block, the reference image block of the image block to be filtered also includes a spatial reference image block and a temporal reference image block; their determination follows the determination processes of the spatial and temporal reference image blocks described above and is not repeated here.
根据上述方法,使用基于神经网络的滤波器,对N个待滤波图像块分别进行滤波,得到滤波后图像。
在一些实施例中,本申请实施例的滤波方法可应用于环路滤波模块中,此时,本申请实施例的方法还包括基于该滤波后图像生成用于预测的参考图像,并将生成的参考图像存入缓存解码,以作为后续解码图像的参考图像。其中,基于该滤波后图像生成参考图像的方式可以是,将该滤波图像直接作为参考图像,或者对该滤波图像进行再处理,例如进行其他方式的滤波等处理,将再处理后的图像作为参考图像。可选的,该滤波后图像还可以被显示设备显示。
在一些实施例中,本申请实施例的方法还可以应用于视频后处理,即基于该滤波后图像生成显示图像,并将生成的显示图像输入显示设备进行显示,且跳过将该滤波后图像或该滤波后图像的再处理图像存入解码缓存。也就是说,本申请实施例中,上述基于滤波后图像生成显示图像输入显示设备进行显示,但是不作为参考图像存入解码缓存。示例性的,通过解码视频确定出当前图像的重建图像后,将该重建图像作为参考图像存入解码缓存,或者使用传统的环路滤波方法,例如使用DBF、SAO和ALF等至少一个滤波器,对该重建图像进行滤波,且将滤波后的图像作为参考图像,存储解码缓存。接着,将上述重建图像作为待滤波图像,通过本申请实施例的方法,使用基于神经网络的滤波器对该重建图像进行滤波,得到滤波后图像,基于该滤波后图像生成显示图像,将该显示图像输入显示设备进行显示。
In some embodiments, the filtered image may be further filtered by at least one of the DBF, SAO, and ALF filters.
In some embodiments, the image to be filtered in the embodiments of this application may be an image that has already been filtered by at least one of the DBF, SAO, and ALF filters; the neural-network-based filter is then applied to that filtered image by the method of the embodiments of this application.
In the image filtering method provided by the embodiments of this application, an image to be filtered is obtained; a neural-network-based filter is determined, and the image to be filtered is partitioned in the same block partitioning manner as the training images used in training the neural-network-based filter, obtaining N image blocks to be filtered, N being a positive integer; and the neural-network-based filter is used to filter each of the N image blocks to be filtered, obtaining the filtered image. That is, in the embodiments of this application, keeping the block partitioning manner used when the neural-network-based filter is actually deployed consistent with the one used during its training lets the filter deliver its best filtering performance, thereby improving the filtering effect on the image.
FIG. 46 is a schematic flowchart of an image filtering method according to an embodiment of this application. FIG. 46 may be understood as a specific embodiment of the filtering method shown in FIG. 13 above.
As shown in FIG. 46, the image filtering method of this embodiment of this application includes:
S901: Obtain an image to be filtered.
In non-video-codec scenarios, the image to be filtered may be captured by an image acquisition device or rendered by an image rendering apparatus, among other possibilities.
In video codec scenarios, the image to be filtered may be a reconstructed image.
For the specific implementation of S901, refer to the description of S801 above; details are not repeated here.
S902: Determine a neural-network-based filter, and partition the image to be filtered according to the block partitioning manner corresponding to the neural-network-based filter to obtain N image blocks to be filtered.
The block partitioning manner is the partitioning manner of the training images used in training the neural-network-based filter; that is, the image to be filtered is partitioned in the same manner as those training images, obtaining N image blocks to be filtered, N being a positive integer.
For example, if the partitioning manner of the training images determines one CTU of a training image as one training image block, then, during actual filtering, one CTU of the image to be filtered is determined as one image block to be filtered, obtaining the N image blocks to be filtered.
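A minimal sketch of such CTU-aligned tiling (the 128-sample CTU size and the raster-scan order are assumptions for illustration):

    def partition_into_blocks(image, ctu=128):
        # Tile the image into CTU-sized blocks in raster-scan order,
        # matching the tiling used for the training images; blocks at the
        # right and bottom edges may be smaller (incomplete CTUs).
        h, w = image.shape[:2]
        return [image[r:r + ctu, c:c + ctu]
                for r in range(0, h, ctu)
                for c in range(0, w, ctu)]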
For the specific implementation of S902, refer to the description of S802 above; details are not repeated here.
S903: For each of the N image blocks to be filtered, determine the reference image block of the image block to be filtered according to the manner in which the reference image block of the input image block is determined.
In the embodiments of this application, the reference image block of the image block to be filtered is determined in the same manner as the reference image block of the input image block.
In some embodiments, if the reference image block of the input image block includes at least one of a temporal reference image block and a spatial reference image block of the input image block, the reference image block of the image block to be filtered includes at least one of a temporal reference image block and a spatial reference image block of the image block to be filtered.
For the specific implementation of S903, refer to the description of S803-B12 above; details are not repeated here.
S904: Input the image block to be filtered and the reference image block of the image block to be filtered into the neural-network-based filter for filtering, obtaining the filtered image.
In the embodiments of this application, to further improve the filtering effect, the reference image block of the input image block is fed to the network during training in addition to the input image block itself. Accordingly, during actual filtering, to keep the input information consistent with the training process, the reference image block of the image block to be filtered is determined in the same manner as the reference image block of the input image block, and the image block to be filtered together with its reference image block is then input into the neural-network-based filter, achieving the filtering of the image to be filtered.
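One plausible way to feed both inputs to the network, assuming PyTorch tensors of shape (N, C, H, W); the application leaves the fusion scheme open, and stacking along the channel axis is only an assumption of this sketch:

    import torch

    def filter_block(model, block, reference):
        # Stack the block to be filtered and its reference block along the
        # channel axis and run the neural-network-based filter on the pair.
        x = torch.cat([block, reference], dim=1)  # (N, 2C, H, W)
        with torch.no_grad():
            return model(x)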
In the image filtering method provided by the embodiments of this application, the image to be filtered is partitioned in the same block partitioning manner as the training images used in training the neural-network-based filter, obtaining N image blocks to be filtered; for each of the N image blocks to be filtered, the reference image block of the image block to be filtered is determined according to the manner in which the reference image block of the input image block is determined; and the image block to be filtered, together with its reference image block, is input into the neural-network-based filter for filtering, obtaining the filtered image. That is, in the embodiments of this application, the block partitioning manner used when the neural-network-based filter is actually deployed is kept consistent with the one used during its training, and the manner of determining the reference image block of the image block to be filtered is kept consistent with the manner of determining the reference image block of the input image block, further improving the filtering effect of the neural-network-based filter.
It should be understood that FIG. 13 to FIG. 46 are merely examples of this application and should not be construed as limiting this application.
The method embodiments of this application are described in detail above with reference to FIG. 13 to FIG. 46; the apparatus embodiments of this application are described in detail below with reference to FIG. 47 and FIG. 48.
FIG. 47 is a schematic block diagram of an image filtering apparatus according to an embodiment of this application. The apparatus 10 may be an electronic device or a part of an electronic device. As shown in FIG. 47, the image filtering apparatus 10 may include:
an obtaining unit 11, configured to obtain an image to be filtered;
a partitioning unit 12, configured to determine a neural-network-based filter, and partition the image to be filtered according to the block partitioning manner corresponding to the neural-network-based filter to obtain N image blocks to be filtered, the partitioning manner being the partitioning manner of the training images used in training the neural-network-based filter and N being a positive integer; and
a filtering unit 13, configured to filter each of the N image blocks to be filtered using the neural-network-based filter to obtain a filtered image.
In some embodiments, the partitioning manner of the training images includes determining M CTUs of a training image as one training image block, M being a positive integer;
the partitioning unit 12 is configured to determine M CTUs of the image to be filtered as one image block to be filtered, obtaining the N image blocks to be filtered.
In some embodiments, the partitioning manner of the training images includes determining P incomplete CTUs of a training image as one training image block, P being a positive integer;
the partitioning unit 12 is configured to determine P incomplete CTUs of the image to be filtered as one image block to be filtered, obtaining the N image blocks to be filtered.
In some embodiments, the neural-network-based filter is trained on extended image blocks of the training image blocks;
the filtering unit 13 is configured to: for each of the N image blocks to be filtered, extend the image block to be filtered according to the manner in which the training image blocks are extended, obtaining an extended image block to be filtered; filter the extended image block to be filtered using the neural-network-based filter, obtaining a filtered extended image block; and determine the image region of the filtered extended image block corresponding to the image block to be filtered as the filtered image block corresponding to the image block to be filtered.
In some embodiments, the manner in which the training image blocks are extended includes extending at least one boundary region of a training image block outward;
the filtering unit 13 is configured to extend at least one boundary region of the image block to be filtered outward, obtaining the extended image block to be filtered.
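A sketch of this extend-filter-crop flow (the padding width, the replicate padding mode, and the assumption that the model preserves spatial size are all illustrative choices, not requirements of this application):

    import torch
    import torch.nn.functional as F

    def pad_filter_crop(model, block, pad=8):
        # Extend every border of the block outward, as the training blocks
        # were extended; pad order is (left, right, top, bottom) on an
        # NCHW tensor.
        x = F.pad(block, (pad, pad, pad, pad), mode="replicate")
        with torch.no_grad():
            y = model(x)
        # Keep only the region corresponding to the original block.
        return y[..., pad:-pad, pad:-pad]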
In some embodiments, the training images include an input image, the input data used in training the neural-network-based filter includes an input image block and a reference image block of the input image block, and the input image block is obtained by partitioning the input image in the block partitioning manner;
the filtering unit 13 is configured to: for each of the N image blocks to be filtered, determine the reference image block of the image block to be filtered; and input the image block to be filtered, together with its reference image block, into the neural-network-based filter for filtering, obtaining the filtered image block of the image block to be filtered.
In some embodiments, the filtering unit 13 is configured to obtain the manner in which the reference image block of the input image block is determined, this manner being used to determine a corresponding reference image block according to at least one of the spatial information and the temporal information of the input image block, and to determine the reference image block of the image block to be filtered according to that manner.
In some embodiments, the reference image block of the input image block includes at least one of a temporal reference image block and a spatial reference image block of the input image block, and the reference image block of the image block to be filtered includes at least one of a temporal reference image block and a spatial reference image block of the image block to be filtered.
In some embodiments, the reference image block of the input image block includes a spatial reference image block of the input image block;
the filtering unit 13 is configured to determine the spatial reference image block of the image block to be filtered according to the manner in which the spatial reference image block of the input image block is determined.
In some embodiments, the spatial reference image block of the input image block includes at least one of the top-left image block, the left image block, and the above image block of the input image block in the input image;
the filtering unit 13 is configured to determine at least one of the top-left image block, the left image block, and the above image block of the image block to be filtered in the image to be filtered as the spatial reference image block of the image block to be filtered.
In some embodiments, the reference image block of the input image block includes a temporal reference image block of the input image block;
the filtering unit 13 is configured to determine the temporal reference image block of the image block to be filtered according to the manner in which the temporal reference image block of the input image block is determined.
In some embodiments, the temporal reference image block of the input image block includes the image block at the position corresponding to the input image block in the reference image of the input image;
the filtering unit 13 is specifically configured to determine the reference image of the image to be filtered, and determine the image block at the position corresponding to the image block to be filtered in that reference image as the temporal reference image block of the image block to be filtered.
In some embodiments, the training images include an input image and a target image corresponding to the input image; the neural-network-based filter is trained with input image blocks as input data and target image blocks as targets, where the input image blocks are obtained by partitioning the input image in the partitioning manner of the training images and the target image blocks are obtained by partitioning the target image in the partitioning manner of the training images.
In some embodiments, the obtaining unit 11 is configured to reconstruct a current image to obtain a reconstructed image of the current image, and determine the reconstructed image as the image to be filtered.
In some embodiments, the filtering unit 13 is further configured to generate, based on the filtered image, a reference image used for prediction and store it in the decode buffer.
In some embodiments, the filtering unit 13 is further configured to generate a display image based on the filtered image, feed the display image to a display device for display, and skip storing the filtered image, or a reprocessed version of the filtered image, in the decode buffer.
FIG. 48 is a schematic block diagram of an electronic device according to an embodiment of this application; the electronic device is configured to perform the foregoing method embodiments. As shown in FIG. 48, the electronic device 30 may include:
a memory 31 and a processor 32, where the memory 31 is configured to store a computer program 33 and transmit the computer program 33 to the processor 32. In other words, the processor 32 may invoke and run the computer program 33 from the memory 31 to implement the methods in the embodiments of this application.
For example, the processor 32 may be configured to perform the foregoing method steps according to instructions in the computer program 33.
In some embodiments of this application, the processor 32 may include, but is not limited to:
a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.
In some embodiments of this application, the memory 31 includes, but is not limited to:
volatile memory and/or non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synch-link DRAM (SLDRAM), and direct Rambus RAM (DR RAM).
In some embodiments of this application, the computer program 33 may be divided into one or more modules, which are stored in the memory 31 and executed by the processor 32 to carry out the methods provided in this application. The one or more modules may be a series of computer program instruction segments capable of performing particular functions, and the instruction segments describe the execution of the computer program 33 in the electronic device.
As shown in FIG. 48, the electronic device 30 may further include:
a transceiver 34, which may be connected to the processor 32 or the memory 31.
The processor 32 may control the transceiver 34 to communicate with other devices, for example to send information or data to other devices or to receive information or data sent by other devices. The transceiver 34 may include a transmitter and a receiver, and may further include one or more antennas.
It should be understood that the components of the electronic device 30 are connected by a bus system which, in addition to a data bus, includes a power bus, a control bus, and a status signal bus.
According to one aspect of this application, a computer storage medium is provided, storing a computer program that, when executed by a computer, enables the computer to perform the methods of the foregoing method embodiments.
An embodiment of this application further provides a computer program product containing instructions that, when executed by a computer, cause the computer to perform the methods of the foregoing method embodiments.
According to another aspect of this application, a computer program product or computer program is provided, including computer instructions stored in a computer-readable storage medium. A processor of an electronic device reads the computer instructions from the computer-readable storage medium and executes them, causing the electronic device to perform the methods of the foregoing method embodiments.

Claims (20)

  1. An image filtering method, applied to an electronic device, the method comprising:
    obtaining an image to be filtered;
    determining a neural-network-based filter;
    partitioning the image to be filtered according to a block partitioning manner corresponding to the neural-network-based filter to obtain N image blocks to be filtered, wherein the partitioning manner is the partitioning manner of training images used in training the neural-network-based filter, N being a positive integer; and
    filtering each of the N image blocks to be filtered using the neural-network-based filter to obtain a filtered image.
  2. The method according to claim 1, wherein the partitioning manner of the training images comprises determining M coding tree units of a training image as one training image block, M being a positive integer;
    wherein partitioning the image to be filtered according to the block partitioning manner corresponding to the neural-network-based filter to obtain the N image blocks to be filtered comprises:
    determining the N image blocks to be filtered from the image to be filtered by taking M coding tree units as one image block to be filtered.
  3. The method according to claim 1, wherein the partitioning manner of the training images comprises determining P incomplete coding tree units of a training image as one training image block, P being a positive integer;
    wherein partitioning the image to be filtered according to the block partitioning manner corresponding to the neural-network-based filter to obtain the N image blocks to be filtered comprises:
    determining the N image blocks to be filtered from the image to be filtered by taking P incomplete coding tree units as one image block to be filtered.
  4. The method according to any one of claims 1 to 3, wherein the neural-network-based filter is trained on extended image blocks of the training image blocks;
    wherein filtering each of the N image blocks to be filtered using the neural-network-based filter to obtain the filtered image comprises:
    for each of the N image blocks to be filtered, extending the image block to be filtered according to the manner in which the training image blocks are extended, to obtain an extended image block to be filtered;
    filtering the extended image block to be filtered using the neural-network-based filter to obtain a filtered extended image block; and
    determining an image region of the filtered extended image block corresponding to the image block to be filtered as a filtered image block corresponding to the image block to be filtered.
  5. The method according to claim 4, wherein the manner in which the training image blocks are extended comprises extending at least one boundary region of a training image block outward;
    wherein extending the image block to be filtered according to the manner in which the training image blocks are extended to obtain the extended image block to be filtered comprises:
    extending at least one boundary region of the image block to be filtered outward to obtain the extended image block to be filtered.
  6. The method according to any one of claims 1 to 3, wherein the training images comprise an input image, input data used in training the neural-network-based filter comprises an input image block and a reference image block of the input image block, and the input image block is obtained by partitioning the input image in the block partitioning manner;
    wherein filtering each of the N image blocks to be filtered using the neural-network-based filter to obtain the filtered image comprises:
    for each of the N image blocks to be filtered, determining a reference image block of the image block to be filtered; and
    inputting the image block to be filtered and the reference image block of the image block to be filtered into the neural-network-based filter for filtering, to obtain a filtered image block of the image block to be filtered.
  7. The method according to claim 6, wherein determining the reference image block of the image block to be filtered comprises:
    obtaining a manner in which the reference image block of the input image block is determined, the manner being used to determine a corresponding reference image block according to at least one of spatial information and temporal information of the input image block; and
    determining the reference image block of the image block to be filtered according to the manner in which the reference image block of the input image block is determined.
  8. The method according to claim 7, wherein
    the reference image block of the input image block comprises at least one of a temporal reference image block and a spatial reference image block of the input image block; and
    the reference image block of the image block to be filtered comprises at least one of a temporal reference image block and a spatial reference image block of the image block to be filtered.
  9. The method according to claim 8, wherein determining the reference image block of the image block to be filtered according to the manner in which the reference image block of the input image block is determined comprises:
    determining the spatial reference image block of the image block to be filtered according to the manner in which the spatial reference image block of the input image block is determined.
  10. The method according to claim 9, wherein the spatial reference image block of the input image block comprises at least one of a top-left image block, a left image block, and an above image block of the input image block in the input image;
    wherein determining the spatial reference image block of the image block to be filtered according to the manner in which the spatial reference image block of the input image block is determined comprises:
    determining at least one of a top-left image block, a left image block, and an above image block of the image block to be filtered in the image to be filtered as the spatial reference image block of the image block to be filtered.
  11. The method according to claim 8, wherein determining the reference image block of the image block to be filtered according to the manner in which the reference image block of the input image block is determined comprises:
    determining the temporal reference image block of the image block to be filtered according to the manner in which the temporal reference image block of the input image block is determined.
  12. The method according to claim 11, wherein the temporal reference image block of the input image block comprises an image block at a position corresponding to the input image block in a reference image of the input image;
    wherein determining the temporal reference image block of the image block to be filtered according to the manner in which the temporal reference image block of the input image block is determined comprises:
    determining a reference image of the image to be filtered; and
    determining an image block at a position corresponding to the image block to be filtered in the reference image of the image to be filtered as the temporal reference image block of the image block to be filtered.
  13. The method according to any one of claims 1 to 12, wherein the training images comprise an input image and a target image corresponding to the input image, the neural-network-based filter is trained with input image blocks as input data and target image blocks as targets, the input image blocks are obtained by partitioning the input image in the partitioning manner of the training images, and the target image blocks are obtained by partitioning the target image in the partitioning manner of the training images.
  14. The method according to any one of claims 1 to 12, wherein obtaining the image to be filtered comprises:
    reconstructing an original image to obtain a reconstructed image of the original image; and
    determining the reconstructed image as the image to be filtered.
  15. An image filtering apparatus, comprising:
    an obtaining unit, configured to obtain an image to be filtered;
    a partitioning unit, configured to determine a neural-network-based filter, and partition the image to be filtered according to a block partitioning manner corresponding to the neural-network-based filter to obtain N image blocks to be filtered, wherein the partitioning manner is the partitioning manner of training images used in training the neural-network-based filter, N being a positive integer; and
    a filtering unit, configured to filter each of the N image blocks to be filtered using the neural-network-based filter to obtain a filtered image.
  16. The apparatus according to claim 15, wherein the partitioning manner of the training images comprises determining M coding tree units of a training image as one training image block, M being a positive integer;
    wherein the partitioning unit is configured to determine the N image blocks to be filtered from the image to be filtered by taking M coding tree units as one image block to be filtered.
  17. The apparatus according to claim 15, wherein the partitioning manner of the training images comprises determining P incomplete coding tree units of a training image as one training image block, P being a positive integer;
    wherein the partitioning unit is configured to determine the N image blocks to be filtered from the image to be filtered by taking P incomplete coding tree units as one image block to be filtered.
  18. An electronic device, configured to perform the method according to any one of claims 1 to 14.
  19. A computer-readable storage medium, configured to store a computer program, the computer program causing a computer to perform the method according to any one of claims 1 to 14.
  20. A computer program product or a computer program, the computer program product or the computer program comprising computer program instructions, the computer program instructions causing a computer to perform the method according to any one of claims 1 to 14.
PCT/CN2023/079134 2022-05-18 2023-03-01 Image filtering method, apparatus and device WO2023221599A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210551039.7 2022-05-18
CN202210551039.7A CN117151986A (zh) 2022-05-18 Image filtering method, apparatus and device

Publications (2)

Publication Number Publication Date
WO2023221599A1 true WO2023221599A1 (zh) 2023-11-23
WO2023221599A9 WO2023221599A9 (zh) 2024-01-25

Family

ID=88834537

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/079134 WO2023221599A1 (zh) 2022-05-18 2023-03-01 Image filtering method, apparatus and device

Country Status (2)

Country Link
CN (1) CN117151986A (zh)
WO (1) WO2023221599A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108184129A (zh) * 2017-12-11 2018-06-19 北京大学 Video encoding and decoding method and apparatus, and neural network for image filtering
CN110351568A (zh) * 2019-06-13 2019-10-18 天津大学 Video in-loop filter based on a deep convolutional network
CN111741300A (zh) * 2020-05-28 2020-10-02 杭州师范大学 Video processing method
CN114025164A (zh) * 2021-09-30 2022-02-08 浙江大华技术股份有限公司 Image encoding method, image decoding method, encoder, and decoder
CN114501012A (zh) * 2021-12-31 2022-05-13 浙江大华技术股份有限公司 Image filtering and encoding/decoding methods and related devices


Also Published As

Publication number Publication date
CN117151986A (zh) 2023-12-01
WO2023221599A9 (zh) 2024-01-25


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 23806559
Country of ref document: EP
Kind code of ref document: A1