WO2019141255A1 - Method and device for image filtering

Publication number
WO2019141255A1
Authority
WO
WIPO (PCT)
Prior art keywords
image block, distorted, distortion, picture, image
Application number
PCT/CN2019/072412
Other languages
English (en)
Chinese (zh)
Inventor
姚佳宝
Original Assignee
杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Application filed by 杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Publication of WO2019141255A1

Classifications

    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N19/184: Adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
    • H04N19/61: Transform coding in combination with predictive coding
    • H04N19/80: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation

Definitions

  • the present application relates to the field of video, and in particular, to a method and an apparatus for filtering pictures.
  • when encoding the original video picture, the original video picture is processed multiple times and a reconstructed picture is obtained.
  • the resulting reconstructed picture may be pixel-shifted relative to the original video picture, i.e., the reconstructed picture is distorted, resulting in visual impairment or artifacts.
  • in the related art, an in-loop filtering module filters the reconstructed picture of the entire frame. When the reconstructed picture is a high-resolution picture, the resources required to filter it are often high and may exceed what the device can provide; for example, filtering a reconstructed picture of 4K resolution may cause insufficient memory.
  • the embodiment of the present application provides a method and an apparatus for filtering a picture.
  • the technical solution is as follows:
  • an embodiment of the present application provides a method for filtering a picture, where the method includes:
  • the acquiring the plurality of first image blocks by dividing the distorted picture comprises:
  • the plurality of distorted image blocks include a first distorted image block located at a vertex position of the distorted picture, a second distorted image block located on an upper boundary or a lower boundary of the distorted picture, a third distorted image block located on a left or right border of the distorted picture, and a fourth distorted image block other than the first, second, and third distorted image blocks;
  • the width and height of the first distorted image block are equal to W1 − lap and H1 − lap, respectively, where W1 is the target width, H1 is the target height, and lap is the first expanded size; the width and height of the second distorted image block are equal to W1 − 2·lap and H1 − lap, respectively; the width and height of the third distorted image block are W1 − lap and H1 − 2·lap, respectively; and the width and height of the fourth distorted image block are W1 − 2·lap and H1 − 2·lap, respectively.
  • the performing the edge expansion processing on each of the plurality of distorted image blocks according to the first expanded size to obtain the first image block corresponding to each of the distorted image blocks comprises:
  • before using the convolutional neural network model to separately filter each of the distorted image blocks of the distorted picture, the method further includes:
  • the set expanded size is not less than zero and not greater than a second expanded size corresponding to the convolution layer, where the second expanded size is the expanded size of the convolution layer when the convolutional neural network model is trained.
  • the method further includes:
  • the first expanded size is set according to a second expanded size corresponding to each convolution layer included in the convolutional neural network model.
  • the generating a frame of de-distorted picture according to the de-distorted image block corresponding to each of the distorted image blocks comprises:
  • performing edge-trimming processing on the de-distorted image block corresponding to each of the distorted image blocks to obtain a third image block corresponding to each of the distorted image blocks, and composing the third image blocks corresponding to the distorted image blocks into a frame of de-distorted picture.
  • the method further includes:
  • the target width and the target height are determined according to the first expanded size, the width and height of the distorted picture.
  • the embodiment of the present application provides an apparatus for filtering a picture, where the apparatus includes:
  • a first acquiring module configured to acquire a distorted picture, where the distorted picture is distorted with respect to an original video picture input to the video encoding system
  • a second acquiring module configured to obtain a plurality of first image blocks by dividing the distorted picture
  • a filtering module configured to filter each first image block by using a convolutional neural network model to obtain a second image block corresponding to each of the first image blocks;
  • a generating module configured to generate a frame de-distorted picture according to the second image block corresponding to each of the first image blocks.
  • the second obtaining module includes:
  • a dividing unit configured to divide the distorted picture according to a target width and a target height, to obtain a plurality of distorted image blocks included in the distorted picture
  • an edge expansion unit configured to perform edge expansion processing on each of the plurality of distortion image blocks according to the first expansion size to obtain a first image block corresponding to each of the distortion image blocks.
  • the plurality of distorted image blocks include a first distorted image block located at a vertex position of the distorted picture, a second distorted image block located on an upper boundary or a lower boundary of the distorted picture, a third distorted image block located on a left or right border of the distorted picture, and a fourth distorted image block other than the first, second, and third distorted image blocks;
  • the width and height of the first distorted image block are equal to W1 − lap and H1 − lap, respectively, where W1 is the target width, H1 is the target height, and lap is the first expanded size; the width and height of the second distorted image block are equal to W1 − 2·lap and H1 − lap, respectively; the width and height of the third distorted image block are W1 − lap and H1 − 2·lap, respectively; and the width and height of the fourth distorted image block are W1 − 2·lap and H1 − 2·lap, respectively.
  • the edge expansion unit is configured to:
  • the device further includes:
  • a first setting module configured to set an expanded size corresponding to each convolution layer included in the convolutional neural network model, where the set expanded size is not less than zero and not greater than a second expanded size corresponding to the convolution layer, the second expanded size being the expanded size of the convolution layer when the convolutional neural network model is trained;
  • the device further includes:
  • a second setting module configured to set the first expanded size according to a second expanded size corresponding to each convolution layer included in the convolutional neural network model.
  • the generating module includes:
  • an edge-trimming unit configured to perform edge-trimming processing on the de-distorted image block corresponding to each of the distorted image blocks to obtain a third image block corresponding to each of the distorted image blocks;
  • a composing unit configured to compose the third image blocks corresponding to the distorted image blocks into a frame of de-distorted picture.
  • the device further includes:
  • a determining module configured to determine the target width and the target height according to the first expanded size, the width and height of the distorted picture.
  • an embodiment of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect or any implementation of the first aspect.
  • by dividing the distorted picture generated in the video encoding and decoding process, a plurality of distorted image blocks included in the distorted picture are obtained; the convolutional neural network model is then used to filter each distorted image block of the distorted picture separately, obtaining the de-distorted image block corresponding to each distorted image block, and a frame of picture is generated according to the de-distorted image blocks corresponding to the distorted image blocks.
  • the generated frame of picture is the filtered picture. Since the convolutional neural network filters distorted image blocks rather than the entire frame of the distorted picture, the resources required for filtering are reduced, so that the device can meet the resource requirements of filtering.
  • FIG. 1 is a flowchart of a method for filtering a picture according to an embodiment of the present application
  • FIG. 2 is a flowchart of another method for filtering a picture provided by an embodiment of the present application.
  • FIG. 3 is a structural block diagram of a video encoding system according to an embodiment of the present application.
  • FIG. 4 is a structural block diagram of another video encoding system according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a divided image block provided by an embodiment of the present application.
  • FIG. 6 is another schematic diagram of a divided image block provided by an embodiment of the present application.
  • FIG. 7 is another schematic diagram of a divided image block provided by an embodiment of the present application.
  • FIG. 8 is another schematic diagram of a divided image block provided by an embodiment of the present application.
  • FIG. 9 is another schematic diagram of a divided image block provided by an embodiment of the present application.
  • FIG. 10 is another schematic diagram of a divided image block provided by an embodiment of the present application.
  • FIG. 11 is a system architecture diagram of a technical solution provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of data flow of a technical solution provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of obtaining a distortion image color component of a distorted image according to an embodiment of the present application.
  • FIG. 14 is a first schematic diagram of side information components provided by an embodiment of the present application;
  • FIG. 15 is a second schematic diagram of side information components provided by an embodiment of the present application;
  • FIG. 16 is a flowchart of a method for removing distortion of a distorted image according to an embodiment of the present application;
  • FIG. 17 is a flowchart of a method for training a convolutional neural network model provided by an embodiment of the present application;
  • FIG. 18 is a flowchart of another method for filtering a picture according to an embodiment of the present disclosure.
  • FIG. 19 is a structural block diagram of a video encoding system according to an embodiment of the present application.
  • FIG. 20 is a structural block diagram of another video encoding system according to an embodiment of the present application.
  • FIG. 21 is a structural block diagram of another video encoding system according to an embodiment of the present disclosure.
  • FIG. 22 is a schematic diagram of an apparatus for filtering a picture according to an embodiment of the present application.
  • FIG. 23 is a schematic structural diagram of a device according to an embodiment of the present application.
  • an embodiment of the present application provides a method for image filtering, including:
  • Step 101 Acquire a distortion picture generated by a video codec process.
  • Step 102 Acquire a plurality of distorted image blocks by dividing the distorted picture.
  • the entire frame of the video picture may be obtained, and then the entire frame is divided to obtain multiple distorted image blocks.
  • alternatively, part of the image data of the entire frame may be acquired each time; when the acquired image data amounts to one distorted image block, the following operations are performed on that distorted image block. This also realizes dividing the distorted picture into a plurality of distorted image blocks, and can improve the efficiency of video encoding or decoding.
  • Step 103 Filter each of the distorted image blocks using a convolutional neural network model to obtain a de-distorted image block corresponding to each of the distorted image blocks.
  • one or more distorted image blocks can be filtered at the same time, that is, parallel filtering can be implemented to improve filtering efficiency.
  • Step 104 Generate a frame de-distorted picture according to the de-distorted image block corresponding to each of the distorted image blocks.
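  • For illustration only, the following Python sketch (not part of the patented embodiments; the helper names are hypothetical) shows steps 101-104 end to end, assuming non-overlapping, evenly divisible blocks; the embodiments below refine this with overlap and edge expansion:

```python
import numpy as np

def filter_picture(distorted, cnn, target_w, target_h):
    """Steps 101-104 in miniature: divide the distorted picture into
    blocks, filter each block with a CNN, and merge the de-distorted
    blocks into one frame (single-channel picture assumed)."""
    h, w = distorted.shape[:2]
    restored = np.empty_like(distorted)
    for y in range(0, h, target_h):
        for x in range(0, w, target_w):
            block = distorted[y:y + target_h, x:x + target_w]
            # `cnn` is assumed to map a distorted block to a
            # de-distorted block of the same size.
            restored[y:y + target_h, x:x + target_w] = cnn(block)
    return restored
```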
  • the method provided in this embodiment may occur in a video encoding process or in a video decoding process. Therefore, the distorted picture may be a video picture generated during the video encoding process or a video picture generated during the video decoding process.
  • a plurality of distorted image blocks are obtained by dividing the distorted picture generated in the video encoding and decoding process; each distorted image block is then filtered separately using a convolutional neural network model to obtain the de-distorted image block corresponding to each distorted image block, and a frame of de-distorted picture is generated according to those de-distorted image blocks.
  • the generated de-distorted picture is the filtered picture. Since the convolutional neural network filters distorted image blocks rather than the entire frame of the distorted picture, the resources required for filtering are reduced, so that the device can meet them; the resources can be resources such as video memory and/or memory.
  • an embodiment of the present application provides a method for filtering a picture, which may filter a distortion picture generated during an encoding process, including:
  • Step 201 Acquire a distorted picture generated during video encoding.
  • a reconstructed picture may be generated during the video encoding process, and the distorted picture may be the reconstructed picture, or may be a picture obtained by filtering the reconstructed picture.
  • the video coding system includes a prediction module, an adder, a transform unit, a quantization unit, an entropy encoder, an inverse quantization unit, an inverse transform unit, a reconstruction unit, a CNN (convolutional neural network) model, a buffer, and other parts.
  • the encoding process of the video coding system may be: the original picture is input into the prediction module and the adder, and the prediction module predicts the input original picture according to the reference picture in the buffer to obtain prediction data, and inputs the prediction data into the adder, the entropy encoder, and the reconstruction unit.
  • the prediction module includes an intra prediction unit, a motion estimation and motion compensation unit, and a switch.
  • the intra prediction unit may perform intra prediction on the original picture to obtain intra prediction data
  • the motion estimation and motion compensation unit performs inter prediction on the original picture according to the reference picture buffered in the buffer to obtain inter prediction data
  • the switch selects either the intra prediction data or the inter prediction data and outputs it to the adder and the reconstruction unit.
  • the intra prediction data may include intra mode information
  • the inter prediction data may include inter mode information.
  • a filter may be connected between the convolutional neural network model and the reconstruction unit, and the filter may also filter the reconstructed picture generated by the reconstruction unit, and output the filtered reconstructed picture.
  • the filtered reconstructed picture may be obtained, and the filtered reconstructed picture is taken as a distorted picture.
  • the distorted picture is distorted relative to the original video picture.
  • Step 202 Divide the distorted picture according to the target width and the target height to obtain a plurality of distorted image blocks included in the distorted picture.
  • the distorted image blocks obtained by the division in this step may or may not be of equal size.
  • when the distorted image blocks are of equal size, the width of each distorted image block in the distorted picture can be equal to the target width, and the height of each distorted image block can be equal to the target height.
  • when the width of the distorted picture is not an integer multiple of the target width, there is an overlap between two distorted image blocks in each line of distorted image blocks obtained by dividing according to the target width.
  • for example, when the width of the distorted picture is not equal to an integer multiple of the target width, each line obtained by the division includes four distorted image blocks, and in each line there is an overlap between the third and fourth distorted image blocks, where ΔW in the figure is the overlapping width of the third and fourth distorted image blocks.
  • when the height of the distorted picture is equal to an integral multiple of the target height, each column obtained by dividing according to the target height includes, for example, three distorted image blocks, and there is no overlap between the distorted image blocks in a column.
  • when the height of the distorted picture is not equal to an integral multiple of the target height, each column obtained by the division includes, for example, four distorted image blocks, and there is an overlap between the third and fourth distorted image blocks in each column.
  • the obtained plurality of distorted image blocks may include four types: a first distorted image block, a second distorted image block, a third distorted image block, and a fourth distorted image block.
  • the first distorted image blocks are located at the vertex positions of the distorted picture; in FIG. 8 they are the image blocks P1, P5, P16, and P20, whose width and height are equal to W1 − lap and H1 − lap, respectively, where W1 is the target width, H1 is the target height, and lap is the first expanded size.
  • the second distorted image blocks are located on the upper and lower boundaries of the distorted picture and are different from the first distorted image blocks; in FIG. 8 they are the image blocks P2, P3, P4, P17, P18, and P19, whose width and height are equal to W1 − 2·lap and H1 − lap, respectively.
  • the third distorted image blocks are located on the left and right borders of the distorted picture and are different from the first distorted image blocks; in FIG. 8 they are the image blocks P6, P11, P10, and P15, whose width and height are W1 − lap and H1 − 2·lap, respectively.
  • the distorted image blocks other than the first, second, and third distorted image blocks are fourth distorted image blocks; in FIG. 8 they are the image blocks P7, P8, P9, P12, P13, and P14, whose width and height are W1 − 2·lap and H1 − 2·lap, respectively.
  • the last two distorted image blocks in each line of distorted image blocks may partially overlap; for example, the distorted image blocks P4 and P5 in the first line of FIG. 8 partially overlap, and ΔW is their overlapping width.
  • likewise, the last two distorted image blocks in each column of distorted image blocks may partially overlap; for example, the distorted image blocks P11 and P16 in the first column of FIG. 8 partially overlap, and ΔH is their overlapping height.
  • the first expanded size may also be set first, and the target width and the target height are then determined according to the first expanded size and the width and height of the distorted picture.
  • the convolutional neural network model comprises a plurality of convolution layers, each convolution layer corresponding to a second expanded size.
  • the first expanded size is calculated based on the second expanded size corresponding to each convolutional layer.
  • the second expanded size corresponding to each convolution layer may be accumulated to obtain an accumulated value, and the first expanded size is set to be greater than or equal to the accumulated value.
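  • For example, under the common assumption that a convolution layer with an odd kernel size k consumes (k - 1) / 2 pixels per edge (this correspondence is an assumption; the embodiment only states that the per-layer second expanded sizes are accumulated), the first expanded size can be computed as:

```python
def first_expanded_size(kernel_sizes):
    """Accumulate the per-layer second expanded sizes, taken here as
    (k - 1) // 2 for a layer with odd kernel size k, and return the
    accumulated value; any value not smaller than it is also a valid
    first expanded size."""
    return sum((k - 1) // 2 for k in kernel_sizes)

# e.g. layers with 9x9, 5x5 and 5x5 kernels: lap = 4 + 2 + 2 = 8
```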
  • obtaining the distorted picture may mean obtaining an entire frame of the distorted picture and then dividing that frame; or
  • part of the image data may be acquired at a time, and when the acquired image data can form a distorted image block whose width is the target width and whose height is the target height, that distorted image block is output, thereby dividing the distorted picture into a plurality of distorted image blocks of equal size.
  • alternatively, when the acquired image data is the data of a first distorted image block and can constitute the first distorted image block, the first distorted image block is output; when the acquired image data is the data of a second distorted image block and can constitute the second distorted image block, the second distorted image block is output; when the acquired image data is the data of a third distorted image block and can constitute the third distorted image block, the third distorted image block is output; and when the acquired image data is the data of a fourth distorted image block and can constitute the fourth distorted image block, the fourth distorted image block is output. The distorted picture is thereby divided into the four types of distorted image blocks: first, second, third, and fourth.
  • Step 203 Perform a process of expanding each of the distorted image blocks according to the first expanded size to obtain a first image block corresponding to each of the distorted image blocks.
  • the four edges of the target image block are respectively subjected to edge expansion processing according to the first expanded size to obtain the first image block corresponding to the target image block, where the target image block is any one of the plurality of distorted image blocks.
  • the process for determining the target width may include processes 31-34, which are:
  • the preset width range includes integer values greater than 0 and less than the width of the distorted picture.
  • alternatively, the preset width range includes integer values greater than the first expanded size and less than the width of the distorted picture.
  • the first expanded size is typically greater than or equal to 1 pixel. For example, assuming the width of the distorted picture is 10 pixels and the first expanded size is 1 pixel, the preset width range includes the integer values 2, 3, 4, 5, 6, 7, 8, and 9.
  • in the first formula, ΔW is the overlap width corresponding to the selected width value, W1 is the selected width value, W2 is the width of the first image block obtained after edge expansion processing of the distorted image block, W3 is the width of the distorted picture, and % is the remainder operation.
  • when the height of the distorted picture is equal to an integral multiple of the height value, the height value is determined as the target height and the process ends.
  • in the second formula, ΔH is the overlap height corresponding to the selected height value, H1 is the selected height value, H2 is the height of the first image block obtained after edge expansion processing of the distorted image block, and H3 is the height of the distorted picture.
  • when the divided image blocks are not of equal size, this step (step 203) may be:
  • performing edge expansion processing on each target edge of a target distorted image block according to the first expanded size, the target distorted image block being a first, second, or third distorted image block, and the target edge being an edge of the target distorted image block that does not coincide with a boundary of the distorted picture;
  • subjecting all four edges of each fourth distorted image block to edge expansion processing to obtain the first image block corresponding to the fourth distorted image block.
  • the width of the expanded edge is equal to the first expanded size.
  • for example, the target edges of the first distorted image block P1 are its right edge and lower edge; the right and lower edges are each expanded by the first expanded size lap, giving the first image block corresponding to P1 (the dashed box enclosing P1).
  • the target edges of the second distorted image block P2 are its left, right, and lower edges; these are each expanded by the first expanded size lap, giving the first image block corresponding to P2 (the dashed box enclosing P2).
  • the target edges of the third distorted image block P6 are its upper, lower, and right edges; these are each expanded by the first expanded size lap, giving the first image block corresponding to P6 (the dashed box enclosing P6).
  • all four edges of the fourth distorted image block P8 are expanded by the first expanded size lap, giving the first image block corresponding to P8 (the dashed box enclosing P8).
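  • A compact way to express this per-type expansion is sketched below, assuming single-channel blocks; replication padding is used here only as a placeholder, and the expansion strategies the embodiment actually describes are listed further below:

```python
import numpy as np

def expand_block(block, lap, on_top, on_bottom, on_left, on_right):
    """Expand a distorted image block by `lap` pixels on every target
    edge, i.e. every edge that does not coincide with the picture
    boundary; the on_* flags mark edges lying on the boundary."""
    pad = ((0 if on_top else lap, 0 if on_bottom else lap),
           (0 if on_left else lap, 0 if on_right else lap))
    return np.pad(block, pad, mode='edge')

# P1 (top-left corner): only its right and lower edges are expanded.
# p1_expanded = expand_block(p1, lap, True, False, True, False)
```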
  • the width of each first image block obtained in the second case described above is equal to the target width, and the height of each first image block is equal to the target height.
  • before performing step 202, the target width and the target height may be determined as follows:
  • in the third formula, S1 is the first parameter, W1 is a width value in the preset width range, and W3 is the width of the distorted picture.
  • in the fourth formula, S2 is the second parameter, H1 is a height value in the preset height range, and H3 is the height of the distorted picture.
  • the edge of the distorted image block is subjected to edge expansion processing using a preset pixel value.
  • the preset pixel value may be 0, 1, 2, or 3. As shown in FIG. 10, the four edges of the distorted image block P1 may be expanded with a preset pixel value; the width of the expansion on each edge is equal to the first expanded size, and the pixel value of each pixel in the expanded region is the preset pixel value.
  • alternatively, an edge may be expanded using the pixel values of the pixels included in that edge of the distorted image block.
  • for example, the left edge may be expanded using the pixel values of the pixels included in the left edge, so that the pixel value of each pixel in the region obtained by expanding the left edge is the pixel value of a corresponding pixel on the left edge.
  • alternatively, an edge may be expanded using the adjacent image block; for example, the image block adjacent to the right edge of the distorted image block P1 is P4, and the right edge of P1 is expanded using pixels of the neighboring image block P4.
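  • The three expansion strategies just described can be pictured with numpy (a toy sketch; the embodiment does not mandate any particular library, and the values used here are illustrative):

```python
import numpy as np

block = np.arange(16.0).reshape(4, 4)   # toy single-channel block
lap = 1

# Strategy 1: expand every edge with a preset pixel value, e.g. 0.
zero_pad = np.pad(block, lap, mode='constant', constant_values=0)

# Strategy 2: expand an edge with the pixel values of that edge itself.
edge_pad = np.pad(block, lap, mode='edge')

# Strategy 3: expand an edge with pixels of the adjacent image block,
# sketched here by taking a lap-wide strip from a (hypothetical) right
# neighbour and attaching it to the right edge.
neighbour = np.full((4, 4), 7.0)
with_neighbour = np.concatenate([block, neighbour[:, :lap]], axis=1)
```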
  • the convolutional neural network includes a plurality of convolution layers, each convolution layer corresponding to one trimming size and one second expanded size, the trimming size being equal to the second expanded size.
  • each convolution layer trims the input first image block according to its trimming size and, before outputting, performs edge expansion processing according to its second expanded size, such that the size of the first image block input to the convolution layer is equal to the size of the first image block output from the convolution layer.
  • optionally, the expanded size corresponding to each convolution layer may be set before performing this step: for each convolution layer, the set expanded size is not less than 0 and not greater than the second expanded size corresponding to the convolution layer when the convolutional neural network model was trained; that is, the expanded size corresponding to the convolution layer is greater than or equal to 0 and less than or equal to the second expanded size corresponding to the convolution layer.
  • in this case, the size of the second image block corresponding to the first image block output by the convolutional neural network model is greater than or equal to the size of the distorted image block corresponding to the first image block.
  • alternatively, the expanded size corresponding to each convolution layer may be left unset before performing this step; the trimming size corresponding to each convolution layer is then equal to the second expanded size corresponding to that layer, so that after the first image block is input to the convolutional neural network model, the size of the second image block output by the model is equal to the size of the first image block.
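  • This size bookkeeping can be checked with a short PyTorch sketch (kernel sizes and channel counts are illustrative assumptions):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 72, 72)  # a 72x72 first image block
k = 5                          # kernel size; second expanded size = (k - 1) // 2 = 2

# Padding set equal to the second expanded size used in training:
# the output of the layer has the same size as its input.
same = nn.Conv2d(1, 8, k, padding=(k - 1) // 2)
assert same(x).shape[-2:] == (72, 72)

# Padding set to zero: the layer shrinks the block by k - 1 pixels per
# dimension, which is what the prior edge expansion compensates for.
valid = nn.Conv2d(1, 8, k, padding=0)
assert valid(x).shape[-2:] == (68, 68)
```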
  • optionally, a side information component corresponding to the first image block may also be generated, where the side information component represents a distortion feature of the first image block relative to the original picture;
  • the distorted image color component of the first image block and the side information component are input into a pre-established convolutional neural network model for convolution filtering processing to obtain a de-distorted second image block.
  • the system architecture includes a side information component generation module 11, a convolutional neural network 12, and a network training module 13;
  • the convolutional neural network 12 may include the following three-layer structure:
  • the input layer processing unit 121 is configured to receive the input data of the convolutional neural network model, where the input data includes the distorted image color component of the first image block and the side information component of the first image block, and to perform the first layer of convolution filtering processing on the input data;
  • the hidden layer processing unit 122 performs at least one layer of convolution filtering processing on the output data of the input layer processing unit 121;
  • the output layer processing unit 123 performs convolution filtering processing on the output data of the hidden layer processing unit 122, and outputs the result as a de-distorted image color component for generating a de-distorted second image block.
  • the convolutional neural network model can be represented by a convolutional neural network of a preset structure together with a configured network parameter set. After the input data undergoes the convolution filtering processing of the input layer, the hidden layer, and the output layer, the de-distorted second image block is obtained.
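  • A minimal PyTorch rendering of this three-part structure, with a Residual Learning output layer as described further below, might look as follows (channel counts and kernel sizes are assumptions, not values from the embodiment):

```python
import torch
import torch.nn as nn

class DeDistortCNN(nn.Module):
    """Input layer, hidden layer(s), and output layer as in units
    121-123; cv color channels and cm side-information channels are
    concatenated along the channel dimension."""
    def __init__(self, cv=1, cm=1, n1=64):
        super().__init__()
        self.input_layer = nn.Sequential(
            nn.Conv2d(cv + cm, n1, 9, padding=4), nn.ReLU())
        self.hidden = nn.Sequential(
            nn.Conv2d(n1, n1, 5, padding=2), nn.ReLU())
        self.output_layer = nn.Conv2d(n1, cv, 5, padding=2)

    def forward(self, y, m):
        i = torch.cat([y, m], dim=1)          # combine Y and M channels
        f = self.hidden(self.input_layer(i))
        return self.output_layer(f) + y       # Residual Learning: add Y
```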
  • the input data of the convolutional neural network model may include one or more side information components according to actual needs, and may also include one or more distorted image color components, for example at least one of the Y color component, the U color component, and the V color component; correspondingly, the output includes one or more de-distorted image color components.
  • the stored data of each pixel of an image block includes the values of all the color components of the pixel; when obtaining the distorted image color component of a distorted image block, the values of the one or more desired color components can be extracted from the stored data of each pixel as needed.
  • as for the side information component, it represents the distortion features of the first image block relative to the corresponding original image block in the original picture, and is an expression of the distortion features determined by the image processing process.
  • the side information component corresponding to the first image block is used as the input data to be input to the convolutional neural network model.
  • the side information component may also represent the distortion type of the distorted first image block relative to the corresponding original image block in the original picture; for example, the side information component may include the prediction mode of each coding unit in the first image block, since the prediction mode of a coding unit can serve as a side information component that characterizes the type of distortion.
  • the side information component of the first image block may be a side information guide map, which is a matrix structure with the same width and height as the first image block.
  • the side information component includes a side information component for each pixel of the first image block, and the position of a pixel's side information component within the side information component is the same as the position of that pixel in the first image block.
  • for example, the matrix structure of the side information component is the same as the matrix structure of the color component of the distorted first image block, where the coordinates [0, 0] and [0, 1] represent the distortion positions and the matrix element value 1 represents the degree of distortion; that is, the side information component can simultaneously indicate the degree of distortion and the position of the distortion.
  • in another example, the coordinates [0, 0], [0, 1], [2, 0], and [2, 4] represent the distortion positions, and the matrix element values 1 and 2 represent the distortion type; that is, the side information component can simultaneously indicate the distortion type and the position of the distortion.
  • two side information components respectively illustrated in FIG. 14 and FIG. 15 may be included.
  • the first image block is also a matrix, with each element in the matrix being the distorted image color component of the pixel in the first image block.
  • the distorted image color component of a pixel may include the color component of any one or more of the three channels Y, U, and V.
  • the side information component may include side information components respectively corresponding to each of the distorted image color components.
  • the side information component of the pixel in the side information component of the first image block includes the side information component corresponding to each of the distortion image color components in the pixel.
  • the side information components of the first image block can be generated by the following two steps 61 and 62, respectively.
  • Step 61 Determine, for each first image block to be processed, the distortion degree value of each pixel in the first image block.
  • the quantization parameter of each coding unit in the first image block is held by the quantization unit in the video coding system, so the quantization parameter of each coding unit in the first image block can be acquired from the quantization unit.
  • the encoding information of each coding unit in the first image block may be acquired from the encoding information of the current original video picture.
  • Step 62 Generate the side information component corresponding to the first image block from the obtained distortion degree value of each pixel according to the position of each pixel in the first image block, where each component value included in the side information component corresponds to the pixel at the same position in the first image block; that is, the position of a pixel's side information component value within the side information component is the same as the position of that pixel in the first image block.
  • since each component value included in the side information component corresponds to the pixel at the same position in the first image block, the side information component has the same structure as the distorted image color component of the first image block; that is, the matrix representing the side information component is of the same type as the matrix representing the color component of the first image block.
  • specifically, the obtained distortion degree value of each pixel may be normalized based on the pixel value range of the first image block to obtain a processed distortion degree value whose value range is the same as the pixel value range;
  • the processed distortion degree value of each pixel is then determined as the component value at the same position in the side information component corresponding to the first image block.
  • in the normalization formula, norm(x) is the processed distortion degree value obtained after the normalization process, x is the distortion degree value of the pixel, the pixel value range of the first image block is [PIXEL_MIN, PIXEL_MAX], and the value range of the distortion degree values is [QP_MIN, QP_MAX].
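  • The normalization formula itself is not reproduced here; the sketch below assumes a linear min-max mapping from [QP_MIN, QP_MAX] onto [PIXEL_MIN, PIXEL_MAX], which is consistent with the ranges stated above, and the default range values are likewise assumptions:

```python
import numpy as np

def side_info_map(qp_per_pixel, qp_min=0, qp_max=51,
                  pixel_min=0, pixel_max=255):
    """Steps 61-62: turn the per-pixel distortion degree values (e.g.
    the quantization parameter of each pixel's coding unit) into a side
    information component with the same matrix structure as the first
    image block's color component."""
    x = np.asarray(qp_per_pixel, dtype=np.float64)
    return ((x - qp_min) / (qp_max - qp_min)
            * (pixel_max - pixel_min) + pixel_min)
```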
  • taking a convolutional neural network model that includes an input layer, a hidden layer, and an output layer as an example, for any first image block to be processed, the first image block is filtered using the convolutional neural network to obtain a de-distorted second image block; the scheme is described as follows.
  • Step 63 For any first image block to be processed, the distorted image color component of the first image block and the generated side information component are used as the input data of the pre-established convolutional neural network model, and the input layer first performs the first layer of convolution filtering processing to obtain image blocks expressed in sparse form and outputs them.
  • the input data may be input to the network through respective channels: the cv channels of the first image block color component Y and the cm channels of the side information component M may be combined in the dimension of the channel to form the input data I of cv + cm channels, and multidimensional convolution filtering and nonlinear mapping are performed on the input data I (of the form F1(I) = g(W1 * I + B1), by analogy with the hidden-layer formula below) to generate n1 image blocks represented in sparse form.
  • W1 corresponds to n1 convolution filters, that is, n1 convolution filters are applied to the input of the convolution layer of the input layer, and n1 image blocks are output; the size of the convolution kernel of each convolution filter is c1 × f1 × f1, where c1 is the number of input channels and f1 is the spatial size of each convolution kernel.
  • the nonlinear mapping function g() may be a ReLU (Rectified Linear Unit).
  • the hidden layer processing can be expressed as Fi(I) = g(Wi * Fi-1(I) + Bi), i ∈ {2, 3, ..., N};
  • in this formula, Fi(I) represents the output of the i-th convolution layer of the convolutional neural network, * is the convolution operation, Wi is the weight coefficient of the i-th convolution layer's filter bank, Bi is the offset coefficient of that filter bank, and g() is a nonlinear mapping function.
  • Wi corresponds to ni convolution filters, that is, ni convolution filters are applied to the input of the i-th convolution layer, and ni image blocks are output; the size of the convolution kernel of each convolution filter is ci × fi × fi, where ci is the number of input channels and fi is the spatial size of each convolution kernel.
  • Step 65 The output layer aggregates the high-dimensional image blocks FN(I) output by the hidden layer and outputs the de-distorted image color component of the first image block, which is used to generate the de-distorted second image block.
  • the structure of the output layer is not limited in the embodiment of the present invention, and the output layer may be a Residual Learning structure, a Direct Learning structure, or other structures.
  • the processing using the Residual Learning structure can be expressed as F(I) = WN+1 * FN(I) + BN+1 + Y, where:
  • F(I) is the de-distorted image color component output by the output layer, FN(I) is the output of the hidden layer (a high-dimensional image block), * is the convolution operation, WN+1 is the weight coefficient of the convolution layer filter bank of the output layer, BN+1 is the offset coefficient of that filter bank, and Y is the distorted image color component that has not undergone convolution filtering processing and is to be subjected to de-distortion processing.
  • WN+1 corresponds to nN+1 convolution filters, that is, nN+1 convolution filters are applied to the input of the (N+1)-th convolution layer, and nN+1 image blocks are output; nN+1 is the number of output de-distorted image color components, generally equal to the number of input distorted image color components, and if only one de-distorted image color component is output, nN+1 is generally 1.
  • the size of the convolution kernel of each convolution filter is cN+1 × fN+1 × fN+1, where cN+1 is the number of input channels and fN+1 is the spatial size of each convolution kernel.
  • when the Direct Learning structure is used, the de-distorted image color component is directly output, that is, the de-distorted second image block is obtained.
  • the output layer processing can then be expressed as F(I) = WN+1 * FN(I) + BN+1, where:
  • F(I) is the output of the output layer, FN(I) is the output of the hidden layer, * is the convolution operation, WN+1 is the weight coefficient of the convolution layer filter bank of the output layer, and BN+1 is the offset coefficient of that filter bank.
  • WN+1 corresponds to nN+1 convolution filters, that is, nN+1 convolution filters are applied to the input of the (N+1)-th convolution layer, and nN+1 image blocks are output; nN+1 is the number of output de-distorted image color components, generally equal to the number of input distorted image color components, and if only one de-distorted image color component is output, nN+1 is generally 1.
  • the size of the convolution kernel of each convolution filter is cN+1 × fN+1 × fN+1, where cN+1 is the number of input channels and fN+1 is the spatial size of each convolution kernel.
  • the output layer adopts a Residual Learning structure
  • the output layer includes a convolution layer.
  • multiple distorted image blocks can be filtered at the same time, so that parallel filtering can be implemented and the efficiency of video coding is improved.
  • referring to FIG. 17, a method for training the convolutional neural network model is also provided, which specifically includes the following processing steps:
  • Step 71 Acquire a preset training set, where the preset training set includes original sample images, the distorted image color components of a plurality of distorted images corresponding to each original sample image, and the side information component corresponding to each distorted image, the side information component corresponding to a distorted image representing the distortion features of that distorted image relative to the original sample image.
  • the distortion features of the plurality of distorted images are different.
  • the training set may include an original sample image together with a plurality of distorted images of different distortion features obtained by performing image processing on that original sample image, and the side information component corresponding to each distorted image; that is, the training set includes the original sample image, the plurality of distorted images corresponding to it, and the side information components corresponding to the distorted images.
  • the training-related hyperparameters, such as the learning rate and the gradient descent algorithm, may be set appropriately; they may be set in the manner mentioned above or in other manners, and will not be described in detail here.
  • Step 73 Perform forward calculation.
  • the distorted image color component of each distorted image in the preset training set and the corresponding side information component are input into the convolutional neural network of the preset structure for convolution filtering processing, to obtain the de-distorted image color component corresponding to each distorted image.
  • that is, the forward calculation of the convolutional neural network CNN with parameter set Θi is performed on the preset training set Ω, and the output F(Y) of the convolutional neural network is obtained, namely the de-distorted image color component corresponding to each distorted image.
  • for the first forward calculation, the current parameter set is Θ1; in subsequent iterations, the current parameter set Θi is obtained by adjusting the previously used parameter set Θi-1.
  • Step 74 Determine the loss values of the original sample images based on the original image color components of the original sample images and the obtained de-distorted image color components, for example as the MSE (mean square error) between them.
  • Step 75 Determine, based on the loss value, whether the convolutional neural network of the preset structure adopting the current parameter set has converged; if not, proceed to step 76; if it has converged, proceed to step 77.
  • convergence may be determined when the loss value is less than a preset loss value threshold: for example, convergence is determined when the loss value of each original sample image in the plurality of original sample images is less than the preset loss value threshold, or when the loss value of any original sample image is less than the preset loss value threshold.
  • alternatively, convergence may be determined from the difference between the currently calculated loss value and the previously calculated loss value: the difference between the loss value of each original sample image obtained this time and the loss value of that original sample image obtained last time is calculated, and convergence is determined when the difference for each original sample image is less than a preset change threshold, or when the difference for any original sample image is less than the preset change threshold; this is not limited herein.
  • Step 76 Adjust the current parameter set, and perform the forward calculation of step 73 again with the adjusted parameter set.
  • Step 77 The current parameter set is taken as the final parameter set ⁇ final of the output, and the convolutional neural network of the preset structure adopting the final parameter set ⁇ final is used as the trained convolutional neural network model.
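  • Steps 71-77 can be condensed into a short training loop (a sketch only: the optimizer, loss threshold, and epoch budget are assumptions, since the embodiment requires only a gradient descent algorithm and an MSE-style loss):

```python
import torch
import torch.nn as nn

def train(model, loader, lr=1e-4, loss_threshold=1e-4, max_epochs=100):
    """Train on (distorted color component, side info, original) tuples
    until the loss falls below the preset threshold (step 75)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    mse = nn.MSELoss()
    for epoch in range(max_epochs):
        for y, m, original in loader:
            loss = mse(model(y, m), original)   # steps 73-74
            if loss.item() < loss_threshold:    # step 75: converged
                return model                    # step 77: final set
            optimizer.zero_grad()               # step 76: adjust set
            loss.backward()
            optimizer.step()
    return model
```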
  • when the expanded size corresponding to each convolution layer in the convolutional neural network model is set to zero and the first expanded size is equal to the accumulated value of the second expanded sizes corresponding to the convolution layers, the second image block obtained for each first image block is equal in size to the distorted image block corresponding to that first image block; in this case, according to the position, in the distorted picture, of the distorted image block corresponding to each first image block, the second image blocks corresponding to the first image blocks may be composed into a frame of de-distorted picture, and the frame of de-distorted picture is buffered in the buffer as a reference picture.
  • when the expanded size corresponding to each convolution layer is set equal to the second expanded size corresponding to that convolution layer, the second image block obtained for each first image block is equal in size to the first image block; in this case the second image block corresponding to each first image block may be trimmed according to the first expanded size to obtain the de-distorted image block corresponding to each first image block, and, according to the position, in the distorted picture, of the distorted image block corresponding to each first image block, the de-distorted image blocks corresponding to the first image blocks are composed into a frame of de-distorted picture, which is buffered in the buffer as a reference picture.
  • when the trimming is performed, for the second image block corresponding to any first image block, the edges of the second image block that were subjected to edge expansion processing are determined, and the determined edges are trimmed according to the first expanded size to obtain the de-distorted image block corresponding to the first image block; the width of each trimmed edge is equal to the first expanded size.
  • when the expanded size corresponding to each convolution layer is set greater than 0 and smaller than the second expanded size corresponding to that convolution layer, the size of the second image block obtained by filtering is smaller than the size of the first image block and larger than the size of the distorted image block corresponding to the first image block; in this case the second image block corresponding to each first image block is trimmed to obtain the de-distorted image block corresponding to each first image block, and, according to the position, in the distorted picture, of the distorted image block corresponding to each first image block, the de-distorted image blocks corresponding to the first image blocks are composed into a frame of de-distorted picture, which is buffered in the buffer as a reference picture.
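  • The trim-and-compose step can be sketched as follows (a simplification: every edge is trimmed by `lap`, whereas the embodiments trim only the edges that were actually expanded; positions are assumed to be the top-left coordinates of the distorted blocks in the distorted picture):

```python
import numpy as np

def compose_picture(second_blocks, positions, lap, out_h, out_w):
    """Trim each second image block by `lap` on each edge to obtain the
    de-distorted image block, then place it at the position of its
    distorted image block to form one frame of de-distorted picture."""
    picture = np.zeros((out_h, out_w), dtype=second_blocks[0].dtype)
    for block, (y, x) in zip(second_blocks, positions):
        trimmed = block[lap:block.shape[0] - lap,
                        lap:block.shape[1] - lap]
        picture[y:y + trimmed.shape[0], x:x + trimmed.shape[1]] = trimmed
    return picture
```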
  • in summary, a plurality of distorted image blocks are obtained by dividing the distorted picture generated in the video encoding process, one or more distorted image blocks are then filtered simultaneously using the convolutional neural network model to obtain the de-distorted image block corresponding to each distorted image block, and a frame of de-distorted picture is generated according to the de-distorted image blocks corresponding to the distorted image blocks.
  • the generated de-distorted picture is the filtered picture. Since the convolutional neural network filters distorted image blocks rather than the entire frame of the distorted picture, the resources required for filtering are reduced, so that the device can meet them.
  • multiple distortion image blocks can be filtered at the same time, which can improve the filtering efficiency and improve the video coding efficiency.
  • an embodiment of the present application provides a method for filtering a picture, which may filter a distortion picture generated during a decoding process, including:
  • Step 301 Acquire a distorted picture generated during video decoding.
  • a reconstructed picture may be generated during the video decoding process, and the distorted picture may be the reconstructed picture, or may be a picture obtained by filtering the reconstructed picture.
  • the video decoding system includes a prediction module, an entropy decoder, an inverse quantization unit, an inverse transform unit, a reconstruction unit, a convolutional neural network model CNN, and a buffer.
  • During decoding, the video decoding system inputs a bit stream into the entropy decoder, and the entropy decoder decodes the bit stream to obtain mode information, quantization parameters, and residual information; the mode information is input into the prediction module, the quantization parameter is input into the convolutional neural network model, and the residual information is input into the inverse quantization unit.
  • The prediction module performs prediction according to the input mode information and the reference picture in the buffer to obtain prediction data, and inputs the prediction data into the reconstruction unit.
  • the prediction module includes an intra prediction unit, a motion estimation and motion compensation unit, and a switch, and the mode information may include intra mode information and inter mode information.
  • The intra prediction unit may perform intra prediction according to the intra mode information to obtain intra prediction data, and the motion estimation and motion compensation unit may perform inter prediction according to the inter mode information and the reference picture buffered in the buffer to obtain inter prediction data; the switch selects the intra prediction data or the inter prediction data and outputs it to the reconstruction unit.
  • The inverse quantization unit and the inverse transform unit respectively perform inverse quantization and inverse transform processing on the residual information to obtain prediction error information, and input the prediction error information into the reconstruction unit; the reconstruction unit generates a reconstructed picture according to the prediction error information and the prediction data.
  • In this case, the reconstructed picture generated by the reconstruction unit may be acquired, and the reconstructed picture is taken as the distorted picture.
  • a filter may be connected between the convolutional neural network model and the reconstruction unit, and the filter may also filter the reconstructed picture generated by the reconstruction unit, and output the filtered reconstructed picture.
  • the filtered reconstructed picture may be obtained, and the filtered reconstructed picture is taken as a distorted picture.
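  • For orientation only, the decode-side data flow described above can be summarized in the following sketch; every callable here is a hypothetical stand-in, not a real codec API:

        def decode_and_filter(bitstream, entropy_decoder, predictor,
                              inverse_quantize, inverse_transform,
                              cnn_filter, buffer):
            # Entropy decoding yields mode information, quantization
            # parameters, and residual information.
            mode_info, qp, residual = entropy_decoder(bitstream)
            prediction = predictor(mode_info, buffer.reference_picture())
            error = inverse_transform(inverse_quantize(residual))
            reconstructed = prediction + error            # the distorted picture
            de_distorted = cnn_filter(reconstructed, qp)  # the QP also feeds the CNN
            buffer.store(de_distorted)                    # buffered as a reference picture
            return de_distorted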
  • Steps 302-305 are the same as steps 202-205 above, and will not be described in detail herein.
  • In this embodiment, a plurality of distorted image blocks are obtained by dividing a distorted picture generated in the video decoding process, and a convolutional neural network model is then used to filter one or more distorted image blocks at a time, obtaining the de-distorted image block corresponding to each distorted image block; one frame of de-distorted picture is generated from the de-distorted image blocks corresponding to the distorted image blocks.
  • The generated de-distorted picture is the filtered picture. Because the convolutional neural network filters distorted image blocks rather than the entire frame of the distorted picture, the resources required for filtering are reduced, so that the device can meet the resource requirements of filtering.
  • In addition, multiple distorted image blocks can be filtered at the same time, which improves filtering efficiency and thus video decoding efficiency.
  • an embodiment of the present application provides a device 400 for image filtering, where the device 400 includes:
  • a first acquiring module 401 configured to acquire a distorted picture, where the distorted picture is distorted with respect to an original video picture input to the video encoding system;
  • the second obtaining module 402 is configured to obtain a plurality of first image blocks by dividing the distorted picture;
  • a filtering module 403 configured to filter each first image block by using a convolutional neural network model to obtain a second image block corresponding to each of the first image blocks;
  • the generating module 404 is configured to generate a frame de-distorted picture according to the second image block corresponding to each of the first image blocks.
  • the second obtaining module 402 includes:
  • a dividing unit configured to divide the distorted picture according to a target width and a target height, to obtain a plurality of distorted image blocks included in the distorted picture;
  • an edge expansion unit configured to perform edge expansion processing on each of the plurality of distortion image blocks according to the first expansion size to obtain a first image block corresponding to each of the distortion image blocks.
  • The plurality of distorted image blocks include a first distorted image block located at a vertex position of the distorted picture, a second distorted image block located on the upper or lower boundary of the distorted picture, a third distorted image block located on the left or right boundary of the distorted picture, and a fourth distorted image block other than the first distorted image block, the second distorted image block, and the third distorted image block;
  • The width and height of the first distorted image block are equal to W1 - lap and H1 - lap, respectively, where W1 is the target width, H1 is the target height, and lap is the first expanded size; the width and height of the second distorted image block are W1 - 2lap and H1 - lap, respectively; the width and height of the third distorted image block are W1 - lap and H1 - 2lap, respectively; and the width and height of the fourth distorted image block are W1 - 2lap and H1 - 2lap, respectively.
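  • The block geometry described above can be checked with this sketch (a hypothetical helper; row and col index the grid of distorted image blocks):

        def divided_block_size(row, col, n_rows, n_cols, W1, H1, lap):
            # Corner blocks lose lap on one side per axis, boundary blocks
            # lose 2*lap along the boundary axis, and interior blocks lose
            # 2*lap on both axes, so every block is W1 x H1 after expansion.
            on_v_edge = col in (0, n_cols - 1)   # left or right boundary
            on_h_edge = row in (0, n_rows - 1)   # upper or lower boundary
            width = W1 - lap if on_v_edge else W1 - 2 * lap
            height = H1 - lap if on_h_edge else H1 - 2 * lap
            return width, height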
  • the edge expansion unit is configured to:
  • the device 400 further includes:
  • a first setting module configured to set an edge expansion size corresponding to each convolution layer included in the convolutional neural network model, where the set expansion size is not less than zero and not greater than the second expanded size corresponding to the convolution layer, the second expanded size being the expanded size of the convolution layer used when the convolutional neural network model was trained.
  • the device 400 further includes:
  • a second setting module configured to set the first expanded size according to a second expanded size corresponding to each convolution layer included in the convolutional neural network model.
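  • The text does not spell out the exact rule relating the two sizes; one plausible reading, shown below purely as an assumption, is that the first expanded size equals the total padding the network received during training, i.e. the sum of the second expanded sizes over all convolution layers:

        def first_expanded_size(second_expanded_sizes):
            # Assumed rule: lap is the sum of the per-layer second expanded
            # sizes (the total edge growth used in training).
            return sum(second_expanded_sizes)

        # e.g. three convolution layers trained with padding 1 each -> lap = 3
        lap = first_expanded_size([1, 1, 1])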
  • the generating module 404 includes:
  • a trimming unit configured to perform edge trimming processing on the de-distorted image block corresponding to each of the distortion image blocks, to obtain a third image block corresponding to each of the distortion image blocks;
  • a composing unit configured to combine the third image blocks corresponding to the distortion image blocks into one frame of de-distorted picture.
  • a determining module configured to determine the target width and the target height according to the first expanded size and the width and height of the distorted picture.
  • In the embodiments of the present application, a plurality of distorted image blocks are obtained by dividing a distorted picture generated in the video encoding and decoding process, and a convolutional neural network model is then used to filter one or more distorted image blocks at a time, obtaining the de-distorted image block corresponding to each distorted image block and generating one frame of de-distorted picture from the de-distorted image blocks corresponding to the distorted image blocks.
  • The generated de-distorted picture is the filtered picture. Because the convolutional neural network filters distorted image blocks rather than the entire frame of the distorted picture, the resources required for filtering are reduced, so that the device can meet the resource requirements of filtering.
  • FIG. 23 is a block diagram showing the structure of a terminal 500 according to an exemplary embodiment of the present invention.
  • The terminal 500 can be a portable mobile terminal, such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer.
  • Terminal 500 may also be referred to as a user device, a portable terminal, a laptop terminal, a desktop terminal, and the like.
  • the terminal 500 includes a processor 501 and a memory 502.
  • Processor 501 can include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • The processor 501 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array).
  • the processor 501 may also include a main processor and a coprocessor.
  • The main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state.
  • In some embodiments, the processor 501 can be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display.
  • the processor 501 may also include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
  • Memory 502 can include one or more computer readable storage media, which can be non-transitory. Memory 502 can also include high-speed random access memory, as well as non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transitory computer readable storage medium in the memory 502 is configured to store at least one instruction, the at least one instruction being executed by the processor 501 to implement the image filtering method provided by the method embodiments of the present application.
  • the terminal 500 optionally further includes: a peripheral device interface 503 and at least one peripheral device.
  • the processor 501, the memory 502, and the peripheral device interface 503 can be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 503 via a bus, signal line or circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 504, a touch display screen 505, a camera 506, an audio circuit 507, a positioning component 508, and a power source 509.
  • Peripheral device interface 503 can be used to connect at least one peripheral device associated with an I/O (Input/Output) to processor 501 and memory 502.
  • In some embodiments, the processor 501, the memory 502, and the peripheral device interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502, and the peripheral device interface 503 can be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the RF circuit 504 is configured to receive and transmit an RF (Radio Frequency) signal, also referred to as an electromagnetic signal.
  • Radio frequency circuit 504 communicates with the communication network and other communication devices via electromagnetic signals.
  • the RF circuit 504 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal.
  • the radio frequency circuit 504 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like.
  • Radio frequency circuit 504 can communicate with other terminals via at least one wireless communication protocol.
  • the wireless communication protocols include, but are not limited to, the World Wide Web, a metropolitan area network, an intranet, generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • the RF circuit 504 may also include NFC (Near Field Communication) related circuitry, which is not limited in this application.
  • the display screen 505 is used to display a UI (User Interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
  • When display 505 is a touch display screen, display 505 also has the ability to collect touch signals on or above the surface of display 505.
  • the touch signal can be input to the processor 501 as a control signal for processing.
  • display 505 can also be used to provide virtual buttons and/or virtual keyboards, also referred to as soft buttons and/or soft keyboards.
  • In some embodiments, there may be one display screen 505, disposed on the front panel of the terminal 500; in other embodiments, there may be at least two display screens 505, respectively disposed on different surfaces of the terminal 500 or in a folded design; in still other embodiments, the display screen 505 may be a flexible display screen disposed on a curved or folded surface of the terminal 500. The display screen 505 may even be set to a non-rectangular irregular pattern, that is, a shaped screen.
  • the display screen 505 can be prepared by using an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
  • Camera component 506 is used to capture images or video.
  • camera assembly 506 includes a front camera and a rear camera.
  • the front camera is placed on the front panel of the terminal, and the rear camera is placed on the back of the terminal.
  • In some embodiments, there are at least two rear cameras, each being one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize a background blur function through fusion of the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting through fusion of the main camera and the wide-angle camera, or other fusion shooting functions.
  • camera assembly 506 can also include a flash.
  • the flash can be a monochrome temperature flash or a two-color temperature flash.
  • the two-color temperature flash is a combination of a warm flash and a cool flash that can be used for light compensation at different color temperatures.
  • the audio circuit 507 can include a microphone and a speaker.
  • The microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals that are input to the processor 501 for processing, or input to the radio frequency circuit 504 for voice communication.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is then used to convert electrical signals from processor 501 or radio frequency circuit 504 into sound waves.
  • the speaker can be a conventional film speaker or a piezoelectric ceramic speaker.
  • the audio circuit 507 can also include a headphone jack.
  • the location component 508 is used to locate the current geographic location of the terminal 500 to implement navigation or LBS (Location Based Service).
  • The positioning component 508 can be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
  • Power source 509 is used to power various components in terminal 500.
  • the power source 509 can be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
  • the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery.
  • a wired rechargeable battery is a battery that is charged by a wired line
  • a wireless rechargeable battery is a battery that is charged by a wireless coil.
  • the rechargeable battery can also be used to support fast charging technology.
  • terminal 500 also includes one or more sensors 510.
  • the one or more sensors 510 include, but are not limited to, an acceleration sensor 511, a gyro sensor 512, a pressure sensor 513, a fingerprint sensor 514, an optical sensor 515, and a proximity sensor 516.
  • the acceleration sensor 511 can detect the magnitude of the acceleration on the three coordinate axes of the coordinate system established by the terminal 500.
  • the acceleration sensor 511 can be used to detect components of gravity acceleration on three coordinate axes.
  • the processor 501 can control the touch display 505 to display the user interface in a landscape view or a portrait view based on the gravitational acceleration signal acquired by the acceleration sensor 511.
  • the acceleration sensor 511 can also be used for the acquisition of game or user motion data.
  • the gyro sensor 512 can detect the body direction and the rotation angle of the terminal 500, and the gyro sensor 512 can cooperate with the acceleration sensor 511 to collect the 3D motion of the user to the terminal 500. Based on the data collected by the gyro sensor 512, the processor 501 can implement functions such as motion sensing (such as changing the UI according to the user's tilting operation), image stabilization at the time of shooting, game control, and inertial navigation.
  • the pressure sensor 513 may be disposed at a side border of the terminal 500 and/or a lower layer of the touch display screen 505.
  • When the pressure sensor 513 is disposed on the side frame of the terminal 500, the user's holding signal to the terminal 500 can be detected, and the processor 501 performs left/right-hand recognition or a shortcut operation according to the holding signal collected by the pressure sensor 513.
  • When the pressure sensor 513 is disposed at the lower layer of the touch display screen 505, the processor 501 controls an operability control on the UI interface according to the user's pressure operation on the touch display screen 505.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 514 is used to collect the fingerprint of the user.
  • the processor 501 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 514, or the fingerprint sensor 514 identifies the identity of the user according to the collected fingerprint. Upon identifying that the identity of the user is a trusted identity, the processor 501 authorizes the user to perform related sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying and changing settings, and the like.
  • the fingerprint sensor 514 can be disposed on the front, back, or side of the terminal 500. When the physical button or vendor logo is provided on the terminal 500, the fingerprint sensor 514 can be integrated with the physical button or the manufacturer logo.
  • Optical sensor 515 is used to collect ambient light intensity.
  • the processor 501 can control the display brightness of the touch display 505 based on the ambient light intensity acquired by the optical sensor 515. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 505 is raised; when the ambient light intensity is low, the display brightness of the touch display screen 505 is lowered.
  • the processor 501 can also dynamically adjust the shooting parameters of the camera assembly 506 based on the ambient light intensity acquired by the optical sensor 515.
  • Proximity sensor 516, also referred to as a distance sensor, is typically disposed on the front panel of terminal 500. Proximity sensor 516 is used to collect the distance between the user and the front of terminal 500. In one embodiment, when the proximity sensor 516 detects that the distance between the user and the front side of the terminal 500 is gradually decreasing, the processor 501 controls the touch display screen 505 to switch from the bright-screen state to the off-screen state; when the proximity sensor 516 detects that the distance between the user and the front side of the terminal 500 is gradually increasing, the processor 501 controls the touch display screen 505 to switch from the off-screen state to the bright-screen state.
  • The structure shown in FIG. 23 does not constitute a limitation to the terminal 500, which may include more or fewer components than illustrated, combine certain components, or adopt a different component arrangement.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to an image filtering method and device, belonging to the field of video imaging. The method comprises: acquiring a distorted picture, the distorted picture being distorted relative to an original video picture input into a video encoding system; dividing the distorted picture to obtain multiple distorted image blocks included in the distorted picture; filtering each of the distorted image blocks of the distorted picture by means of a convolutional neural network model to obtain de-distorted image blocks respectively corresponding to the distorted image blocks; and generating a frame of de-distorted picture according to the de-distorted image blocks respectively corresponding to the distorted image blocks. The device comprises: a first acquisition module, a second acquisition module, a filtering module, and a generation module. The present invention reduces the amount of resources required for filtering, so that an apparatus can meet the resource requirements of filtering.
PCT/CN2019/072412 2018-01-18 2019-01-18 Method and device for image filtering WO2019141255A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810050422.8 2018-01-18
CN201810050422.8A CN110062225B (zh) Method and device for picture filtering

Publications (1)

Publication Number Publication Date
WO2019141255A1 true WO2019141255A1 (fr) 2019-07-25

Family

ID=67301965

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/072412 WO2019141255A1 (fr) Method and device for image filtering

Country Status (2)

Country Link
CN (1) CN110062225B (fr)
WO (1) WO2019141255A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040213472A1 (en) * 2003-02-17 2004-10-28 Taku Kodama Image compression apparatus, image decompression apparatus, image compression method, image decompression method, program, and recording medium
US20110182469A1 (en) * 2010-01-28 2011-07-28 Nec Laboratories America, Inc. 3d convolutional neural networks for automatic human action recognition
CN105611303A (zh) * 2016-03-07 2016-05-25 京东方科技集团股份有限公司 图像压缩系统、解压缩系统、训练方法和装置、显示装置
CN107018422A (zh) * 2017-04-27 2017-08-04 四川大学 基于深度卷积神经网络的静止图像压缩方法
CN107197260A (zh) * 2017-06-12 2017-09-22 清华大学深圳研究生院 基于卷积神经网络的视频编码后置滤波方法
CN107590804A (zh) * 2017-09-14 2018-01-16 浙江科技学院 基于通道特征和卷积神经网络的屏幕图像质量评价方法

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102214362B (zh) Block-based fast image blending method
CA2997193C (fr) Method and apparatus for neural-network-based processing in video coding


Also Published As

Publication number Publication date
CN110062225A (zh) 2019-07-26
CN110062225B (zh) 2021-06-11

Similar Documents

Publication Publication Date Title
TWI788630B (zh) Three-dimensional face model generation method and apparatus, computer device, and storage medium
US11205282B2 Relocalization method and apparatus in camera pose tracking process and storage medium
CN108810538B (zh) Video encoding method and apparatus, terminal, and storage medium
CN108305236B (zh) Image enhancement processing method and apparatus
CN107945163B (zh) Image enhancement method and apparatus
CN110502954B (zh) Video analysis method and apparatus
US9692959B2 Image processing apparatus and method
CN112633306B (zh) Adversarial image generation method and apparatus
WO2019141193A1 (fr) Video frame data processing method and apparatus
CN111028144B (zh) Video face swapping method and apparatus, and storage medium
CN110933334B (zh) Video noise reduction method and apparatus, terminal, and storage medium
CN112287852A (zh) Face image processing method, display method, apparatus, and device
WO2020083385A1 (fr) Image processing method, device, and system
CN110991457A (zh) Two-dimensional code processing method and apparatus, electronic device, and storage medium
WO2023087637A1 (fr) Video encoding method and apparatus, electronic device, and computer-readable storage medium
CN110232417B (zh) Image recognition method and apparatus, computer device, and computer-readable storage medium
CN112135191A (zh) Video editing method and apparatus, terminal, and storage medium
CN115205164B (zh) Training method for image processing model, video processing method, apparatus, and device
WO2019141258A1 (fr) Video encoding method, video decoding method, device, and system
WO2019141255A1 (fr) Method and device for image filtering
CN112383719B (zh) Image brightness adjustment method, apparatus, device, and readable storage medium
CN115330610A (zh) Image processing method and apparatus, electronic device, and storage medium
CN113379624A (zh) Image generation method, training method for image generation model, apparatus, and device
CN114332709A (zh) Video processing method and apparatus, storage medium, and electronic device
CN108881739B (zh) Image generation method and apparatus, terminal, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19741103

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19741103

Country of ref document: EP

Kind code of ref document: A1