CN114827630A - Method, system, device and medium for learning CU deep partitioning based on frequency domain distribution - Google Patents

Method, system, device and medium for learning CU deep partitioning based on frequency domain distribution

Info

Publication number
CN114827630A
CN114827630A (application CN202210241583.1A)
Authority
CN
China
Prior art keywords
frequency domain
blocks
division
block
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210241583.1A
Other languages
Chinese (zh)
Other versions
CN114827630B (en)
Inventor
许皓淇
曹英烈
周智恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Guangzhou City University of Technology
Original Assignee
South China University of Technology SCUT
Guangzhou City University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT, Guangzhou City University of Technology filed Critical South China University of Technology SCUT
Priority to CN202210241583.1A priority Critical patent/CN114827630B/en
Publication of CN114827630A publication Critical patent/CN114827630A/en
Application granted granted Critical
Publication of CN114827630B publication Critical patent/CN114827630B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 - Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 - Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 - Quantisation
    • H04N19/126 - Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 - Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Image Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a method, a system, a device and a medium for learning CU depth partitioning based on frequency domain distribution. The method comprises the following steps: dividing the image into a plurality of 64x64 blocks and performing a DCT to obtain the frequency domain coefficient distribution matrix F_64; computing a probability score p_64 from F_64 and the weight matrix W_64; if p_64 is less than the division threshold α_64, the downward division is ended, and if it is greater than α_64, the block is further divided into 4 32x32 sub-CU blocks according to the quadtree principle; in the same way, the 32x32 frequency domain coefficient matrix F_32 and W_32 yield a probability score p_32, which is compared with the division threshold α_32 to decide whether to continue dividing; and so on until every CU block either stops partitioning early or has been divided into the smallest 8x8 CU blocks. The invention decides whether to continue dividing by comparing a probability score with a division threshold, so no traversal recursion over all cases is required; this reduces the complexity of CU depth partitioning, saves a large amount of encoding time, and can be widely applied in the technical field of video encoding.

Description

Method, system, device and medium for learning CU deep partition based on frequency domain distribution
Technical Field
The invention relates to the technical field of artificial intelligence and video coding, in particular to a method, a system, a device and a medium for CU deep partitioning based on frequency domain distribution learning.
Background
With the development of internet and communication technology in recent years, the rapid increase of video traffic has brought about great challenges to video coding technology.
In a conventional coding framework (HEVC, for example), every frame to be coded must first be divided into a sequence of CTUs (Coding Tree Units) before the subsequent prediction, transform and quantization operations are performed. Following the quadtree principle, a CTU can be divided downward into CUs (Coding Units) of different sizes, from a maximum of 64x64 down to a minimum of 8x8. The way a CTU is divided determines the efficiency of the subsequent coding.
To obtain the optimal CTU partition, the encoder uses exhaustive traversal recursion: it keeps partitioning each 64x64 CU down to 8x8 CUs and evaluates every candidate partition with the built-in rate-distortion cost function until the best prediction case is selected.
This partitioning scheme wastes a great deal of encoding time and computing resources, and the waste grows markedly as video resolution increases. How to reduce the complexity of CU depth partitioning has therefore become a hot problem in the industry.
Disclosure of Invention
To solve at least one of the technical problems in the prior art to a certain extent, an object of the present invention is to provide a method, a system, an apparatus, and a medium for learning CU depth partitioning based on frequency domain distribution.
The technical scheme adopted by the invention is as follows:
a method for learning CU deep division based on frequency domain distribution comprises the following steps:
obtaining a video image, dividing the video image into a plurality of first CU blocks of size 64x64, obtaining the frequency domain coefficient distribution matrix F_64 of the DCT (discrete cosine transform) of each first CU block, and computing a probability score p_64 from F_64;

if p_64 ≥ α_64, dividing the first CU block downward into 4 second CU blocks of size 32x32, obtaining the frequency domain coefficient distribution matrix F_32 of the DCT of each second CU block, and computing a probability score p_32 from F_32; otherwise, ending the division of the first CU block;

if p_32 ≥ α_32, dividing the second CU block downward into 4 third CU blocks of size 16x16, obtaining the frequency domain coefficient distribution matrix F_16 of the DCT of each third CU block, and computing a probability score p_16 from F_16; otherwise, ending the division of the second CU block;

if p_16 ≥ α_16, dividing the third CU block downward into 4 fourth CU blocks of size 8x8; otherwise, ending the division of the third CU block;

wherein α_N is the division threshold, N = 64, 32, 16.
Further, the probability score p_64^k is obtained by the following formula:

p_64^k = Σ_i Σ_j W_64(i,j) · F_64^k(i,j)

wherein W_64 is the preset frequency domain distribution weight matrix, i denotes the row coordinate of the matrix, and j denotes the column coordinate of the matrix.
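To make the computation concrete, the score above is just an element-wise weighted sum of the 2-D DCT coefficients of a block. The sketch below is a minimal illustration, not part of the patent text: it assumes a NumPy/SciPy environment, takes the learned weight matrix W64 as given (its training is described next), and uses SciPy's dctn as the 2-D DCT.

```python
import numpy as np
from scipy.fft import dctn

def probability_score(block_64x64: np.ndarray, W64: np.ndarray) -> float:
    """Compute p_64 = sum_{i,j} W64(i,j) * F64(i,j) for one 64x64 luma block.

    block_64x64 : 64x64 array of luma samples.
    W64         : 64x64 learned frequency domain distribution weight matrix.
    """
    # F64: frequency domain coefficient distribution matrix (2-D DCT of the block)
    F64 = dctn(block_64x64.astype(np.float64), norm='ortho')
    # Element-wise weighting and summation over all (i, j) positions
    return float(np.sum(W64 * F64))
```

The same routine applies unchanged to 32x32 and 16x16 sub-blocks with W_32 and W_16.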
Further, the frequency domain distribution weight matrix W_64 is obtained by the following method:

obtaining the training set and samples needed by the network, {(F_64^k, L_k)}, where F_64^k denotes the DCT frequency domain coefficient distribution matrix corresponding to the k-th 64x64 CU block, and L_k ∈ {0, 1} indicates whether that CU block is divided downward, 0 meaning no and 1 meaning yes;

training the network according to a preset loss function to obtain the frequency domain distribution weight matrix W_64.

The expression of the preset loss function is given as a formula image in the original publication and is not reproduced here; it measures the discrepancy between the score computed from W_64 and F_64^k and the label L_k.
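Because the exact loss function appears only as that image, the following sketch is one plausible instantiation rather than the patented formulation: it treats the score p_64^k as the logit of a binary split/no-split classifier and learns W_64 by plain gradient descent on a logistic loss. The sample format {(F_64^k, L_k)}, the learning rate and the epoch count are all assumptions made for illustration.

```python
import numpy as np

def train_W64(F_samples: np.ndarray, labels: np.ndarray,
              lr: float = 1e-4, epochs: int = 200) -> np.ndarray:
    """Learn a 64x64 frequency domain weight matrix from (F_64^k, L_k) pairs.

    F_samples : array of shape (K, 64, 64), DCT coefficient matrices.
    labels    : array of shape (K,), L_k in {0, 1} (1 = block was split further).
    Assumes a logistic loss on p_k = sum(W * F_k); the true loss in the
    patent is given only as an image and may differ.
    """
    K = F_samples.shape[0]
    W = np.zeros((64, 64))
    for _ in range(epochs):
        # Scores p_k = sum_{i,j} W(i,j) * F_k(i,j) for the whole batch
        p = np.einsum('ij,kij->k', W, F_samples)
        prob = 1.0 / (1.0 + np.exp(-p))                       # sigmoid of the score
        grad = np.einsum('k,kij->ij', prob - labels, F_samples) / K
        W -= lr * grad
    return W
```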
further, the division threshold α 64 Obtained by the following method:
selecting label samples with L equal to 0 in training set
Figure BDA0003542307010000029
Calculating a probability score from the selected samples
Figure BDA00035423070100000210
From calculated probability scores
Figure BDA00035423070100000211
Obtaining a partition threshold α 64
Further, threshold values are divided
Figure BDA00035423070100000212
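One choice consistent with the surrounding text, and the one assumed in the sketch below, is to take the maximum (or a high percentile) of the scores of the non-split samples, so that such blocks tend to fall below the threshold and stop early; the patent's actual selection rule may differ.

```python
import numpy as np

def select_alpha64(F_samples: np.ndarray, labels: np.ndarray, W64: np.ndarray) -> float:
    """Set the division threshold from non-split (L = 0) training samples.

    Assumption: alpha_64 is the maximum score over samples labelled L = 0;
    np.percentile(p0, 95) would be a softer alternative. The exact rule in
    the patent is not reproduced here.
    """
    p = np.einsum('ij,kij->k', W64, F_samples)   # scores p_64^k for all samples
    p0 = p[labels == 0]                          # keep only the non-split samples
    return float(p0.max())
```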
Further, dividing the video image into first CU blocks of size 64x64 includes:
the video image is divided into several first CU blocks of 64x64 size according to the luminance component.
Further, dividing the video image into first CU blocks of size 64x64 includes:
after dividing the video image into a plurality of first CU blocks of size 64x64, performing pixel interpolation on any remaining pixel area smaller than 64x64 so that it is filled out to a full 64x64 block.
The other technical scheme adopted by the invention is as follows:
a frequency domain distribution-based learning CU depth partitioning system, comprising:
the first dividing module is used for acquiring a video image, dividing the video image into a plurality of first CU blocks with the size of 64x64, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the first CU blocks
Figure BDA0003542307010000031
According to the coefficient distribution matrix of frequency domain
Figure BDA0003542307010000032
Obtaining a probability score
Figure BDA0003542307010000033
A second division module for if
Figure BDA0003542307010000034
The first CU block is divided downwards into 4 second CU blocks with the size of 32x32, and a frequency domain coefficient distribution matrix of DCT transformation of the second CU blocks is obtained
Figure BDA0003542307010000035
According to the coefficient distribution matrix of frequency domain
Figure BDA0003542307010000036
Obtaining a probability score
Figure BDA0003542307010000037
Otherwise, ending the division of the first CU block;
a third division module for if
Figure BDA0003542307010000038
Dividing the second CU block into 4 third CU blocks with the size of 16x16 downwards, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the third CU blocks
Figure BDA0003542307010000039
According to the coefficient distribution matrix of frequency domain
Figure BDA00035423070100000310
Obtaining a probability score
Figure BDA00035423070100000311
Otherwise, ending the division of the second CU block;
a fourth division module for if
Figure BDA00035423070100000312
Divide the third CU block down into 4 fourth CU blocks of size 8x 8; otherwise, ending the division of the third CU block;
wherein ,αN To divide the threshold, N is 64,32,16。
The other technical scheme adopted by the invention is as follows:
an apparatus for learning CU depth partitioning based on frequency domain distribution, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The other technical scheme adopted by the invention is as follows:
a computer readable storage medium in which a processor executable program is stored, which when executed by a processor is for performing the method as described above.
The invention has the beneficial effects that: by comparing a probability score with a division threshold, the invention decides whether to continue dividing and thereby obtains a way of terminating the division early; no traversal recursion over all cases is needed, which reduces the complexity of CU depth partitioning and saves a large amount of coding time.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description is made on the drawings of the embodiments of the present invention or the related technical solutions in the prior art, and it should be understood that the drawings in the following description are only for convenience and clarity of describing some embodiments in the technical solutions of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart illustrating the steps of a method for learning CU depth partitioning based on frequency domain distribution according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a video image in an embodiment of the invention;
FIG. 3 is a pictorial representation of the 64x64 DCT frequency-domain coefficient distribution of a CU block of the video image of FIG. 2;
FIG. 4 is a schematic flow chart of a method for fast CU depth partitioning based on frequency domain distribution learning according to an embodiment of the present invention;
FIG. 5 is a network training flow chart for the frequency domain distribution weight matrix W_64 and the division threshold α_64 according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more and "a plurality" means two or more; greater than, less than, exceeding and the like are understood as excluding the stated number, while above, below, within and the like are understood as including the stated number. If "first" and "second" are used, it is only for the purpose of distinguishing technical features, and is not to be understood as indicating or implying relative importance, implicitly indicating the number of technical features indicated, or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
As shown in fig. 1, the present embodiment provides a method for learning CU depth partitioning based on frequency domain distribution, including the following steps:
s101, obtaining a video image, dividing the video image into a plurality of first CU blocks with the size of 64x64, and obtaining a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the first CU blocks
Figure BDA0003542307010000041
According to the coefficient distribution matrix of frequency domain
Figure BDA0003542307010000042
Obtaining a probability score
Figure BDA0003542307010000043
The luminance component of the video frame image is selected and divided into N64 x64 sized CU blocks. If the remaining areas are less than 64x64 pixels, the areas are interpolated.
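A minimal sketch of this pre-processing step is given below. It pads the luma plane so that its dimensions become multiples of 64 and then cuts it into 64x64 blocks; edge replication is used here as a simple stand-in for the unspecified pixel interpolation of the leftover border area.

```python
import numpy as np

def split_luma_into_64x64(luma: np.ndarray) -> list:
    """Split a luma plane (H x W) into 64x64 blocks, filling the border region.

    Edge replication stands in for the "pixel interpolation" mentioned above;
    any reasonable filling of the remaining sub-64x64 area would work the same way.
    """
    h, w = luma.shape
    pad_h = (-h) % 64                      # rows needed to reach a multiple of 64
    pad_w = (-w) % 64                      # columns needed to reach a multiple of 64
    padded = np.pad(luma, ((0, pad_h), (0, pad_w)), mode='edge')
    blocks = []
    for y in range(0, padded.shape[0], 64):
        for x in range(0, padded.shape[1], 64):
            blocks.append(padded[y:y + 64, x:x + 64])
    return blocks
```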
A DCT is applied to the k-th CU area of size 64x64 to obtain the frequency domain coefficient matrix F_64^k, k = 1, 2, ..., N. The division probability is then calculated as:

p_64^k = Σ_i Σ_j W_64(i,j) · F_64^k(i,j)
s102, if
Figure BDA0003542307010000053
The first CU block is divided downwards into 4 second CU blocks with the size of 32x32, and a frequency domain coefficient distribution matrix of DCT transformation of the second CU blocks is obtained
Figure BDA0003542307010000054
According to the coefficient distribution matrix of frequency domain
Figure BDA0003542307010000055
Obtaining a probability score
Figure BDA0003542307010000056
Otherwise, the division of the first CU block is ended.
If it is
Figure BDA0003542307010000057
The CU area stops partitioning in advance; if it is
Figure BDA0003542307010000058
The CU area (called LCU _64) continues to be divided down into 4CU areas of size 32x32 according to the quadtree principle.
For the continued downward partition of LCU _64, the frequency domain coefficient matrix of the mth CU area with the size of 32x32 is obtained
Figure BDA0003542307010000059
m is 1,2,3, 4. Calculating the division probability:
Figure BDA00035423070100000510
s103, if
Figure BDA00035423070100000511
Dividing the second CU block into 4 third CU blocks with the size of 16x16 downwards, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the third CU blocks
Figure BDA00035423070100000512
According to the coefficient distribution matrix of frequency domain
Figure BDA00035423070100000513
Obtaining a probability score
Figure BDA00035423070100000514
Otherwise, the division of the second CU block is ended.
If it is
Figure BDA00035423070100000515
The CU area stops partitioning in advance; if it is
Figure BDA00035423070100000516
The CU area (called LCU _32) continues to be divided down into 4CU areas of size 16x16 according to the quadtree principle.
Similarly, the LCU _32 is continuously divided downwards to obtain the frequency domain coefficient matrix of the nth CU area with the size of 32x32
Figure BDA00035423070100000517
n is 1,2,3, 4. Calculating the division probability:
Figure BDA00035423070100000518
s104, if
Figure BDA00035423070100000519
Divide the third CU block down into 4 fourth CU blocks of size 8x 8; otherwise, the division of the third CU block is ended.
If it is
Figure BDA00035423070100000520
The CU area stops partitioning in advance; if it is
Figure BDA00035423070100000521
The CU area continues to be divided down into 4CU areas of size 8x8 according to the quadtree principle, ending.
Therefore, the method provides a way of terminating the partition early without traversal recursion over all cases, which reduces the complexity of CU depth partitioning and saves a large amount of coding time. Moreover, by the nature of the frequency domain learning, the probability score used for the division decision is computed from just two matrices, which is simple and convenient for the processor and saves considerable computing resources and time.
The above method is explained in detail below with reference to the drawings and the specific embodiments.
As shown in fig. 4, which is a schematic flow chart of the frequency-domain-distribution-based fast CU depth partitioning method according to an embodiment of the present invention, the method comprises a frequency domain distribution learning network module and a CU depth division decision module. Before the CU depth division decision is made, two key parameters must first be learned by the frequency domain distribution learning network module: the frequency domain distribution weight matrix W_N and the division threshold α_N, N = 64, 32, 16.
FIG. 5 shows the network training flow for learning the frequency domain distribution weight matrix W_64 and the division threshold α_64 provided by an embodiment of the present invention. The remaining frequency domain distribution weight matrices W_32, W_16 and division thresholds α_32, α_16 follow a similar training process, so additional drawings are omitted.
Step A1: obtaining the training set and samples needed by the network, {(F_64^k, L_k)}, where F_64^k denotes the DCT frequency domain coefficient distribution matrix corresponding to the k-th 64x64 CU block, and L_k ∈ {0, 1} indicates whether that CU block is divided downward, 0 meaning no and 1 meaning yes.

Step A2: setting a loss function (given as a formula image in the original publication, not reproduced here) and training the network to obtain the frequency domain distribution weight matrix W_64.

Step A3: according to the currently learned frequency domain distribution weight matrix W_64, selecting the labelled samples with L equal to 0 in the data set and computing their division probability scores p_64^k.

Step A4: observing the probability scores p_64^k and selecting a suitable way to set the division threshold α_64. The specific choice made in this embodiment is given as a formula image in the original publication and yields good experimental results.
Referring to fig. 2 and 3, a region with more high-frequency components tends to be divided into smaller CU blocks, and the frequency domain distribution learning network module learns how to represent the relationship between the richness of the high-frequency components (i.e. the content richness) and the CU division depth. This association is represented by the frequency domain distribution weight matrix W_N and the division threshold α_N.
Step S1: dividing the luminance component of any video image into a plurality of CU blocks of size 64x64, obtaining the frequency domain coefficient distribution matrix F_64 of the DCT (discrete cosine transform) of each CU block, and calculating the corresponding probability score p_64 from F_64;

Step S2: judging the relationship between the probability score p_64 and the division threshold α_64;

Step S3: if p_64 < α_64, ending the partitioning of the CU block in advance;

Step S4: if p_64 ≥ α_64, dividing the CU block downward into 4 32x32 sub-CU blocks according to the quadtree principle;

Step S5: performing a DCT on each of the 4 32x32 sub-CU blocks to obtain the frequency domain coefficient distribution matrix F_32 and calculating the probability score p_32;

Step S6: judging the relationship between the probability score p_32 and the division threshold α_32;

Step S7: if p_32 < α_32, ending the partitioning of the CU block in advance;

Step S8: if p_32 ≥ α_32, dividing the CU block downward into 4 16x16 sub-CU blocks according to the quadtree principle;

Step S9: performing a DCT on each of the 4 16x16 sub-CU blocks to obtain the frequency domain coefficient distribution matrix F_16 and calculating the probability score p_16;

Step S10: judging the relationship between the probability score p_16 and the division threshold α_16;

Step S11: if p_16 < α_16, ending the partitioning of the CU block in advance;

Step S12: if p_16 ≥ α_16, dividing the CU block downward into 4 8x8 sub-CU blocks according to the quadtree principle, and ending the division.
It can be seen that steps S3, S7 and S11 all offer the chance to end partitioning in advance, so in many cases a decision is reached without traversing all partition modes of a CU block, avoiding the waste of encoding time and computing resources. The divisions of different sub-CU blocks are independent of one another, so a program can process them in parallel and decide several sub-CU blocks at the same time, which greatly saves encoding time. Moreover, only one of S3 and S4, one of S7 and S8, and one of S11 and S12 is executed, so even dividing a 64x64 CU block down to the smallest 8x8 CU blocks takes at most 9 steps along any path, involving 3 DCT transforms, 3 matrix operations and 3 threshold comparisons, which greatly reduces the demand on the computing resources of the processor.
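Putting the pieces together, the decision cascade of steps S1 to S12 can be sketched as a short recursive routine. It assumes the weight matrices and thresholds have already been learned and are indexed by block size, and it stops recursing either when the score falls below the threshold for that size or when the 8x8 minimum is reached. The nested-list return value is only one convenient way to record the resulting quadtree and is not prescribed by the patent.

```python
import numpy as np
from scipy.fft import dctn

def decide_partition(block: np.ndarray, W: dict, alpha: dict):
    """Return a nested quadtree describing the CU partition of `block`.

    W     : {64: W64, 32: W32, 16: W16} learned weight matrices.
    alpha : {64: a64, 32: a32, 16: a16} division thresholds.
    A leaf is reported as its size; a split node as a list of 4 sub-results.
    """
    size = block.shape[0]
    if size == 8:                        # minimum CU size reached, stop
        return 8
    F = dctn(block.astype(np.float64), norm='ortho')
    p = float(np.sum(W[size] * F))       # probability score p_size
    if p < alpha[size]:                  # early termination: keep this CU whole
        return size
    half = size // 2                     # quadtree split into 4 sub-CUs
    return [decide_partition(block[y:y + half, x:x + half], W, alpha)
            for y in (0, half) for x in (0, half)]
```

Because the decisions for the four sub-blocks are independent, the recursive calls could equally be dispatched to parallel workers, as noted above.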
In the test experiment in the embodiment, a CU division depth result similar to that obtained by the conventional coding framework HEVC can be obtained, and the coding time is greatly reduced on the premise of not affecting the video quality and the code rate.
The present embodiment further provides a system for learning CU depth partitioning based on frequency domain distribution, including:
the first dividing module is used for acquiring a video image, dividing the video image into a plurality of first CU blocks with the size of 64x64, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the first CU blocks
Figure BDA0003542307010000081
According to the coefficient distribution matrix of frequency domain
Figure BDA0003542307010000082
Obtaining a probability score
Figure BDA0003542307010000083
A second division module for if
Figure BDA0003542307010000084
The first CU block is divided downwards into 4 second CU blocks with the size of 32x32, and a frequency domain coefficient distribution matrix of DCT transformation of the second CU blocks is obtained
Figure BDA0003542307010000085
According to the coefficient distribution matrix of frequency domain
Figure BDA0003542307010000086
Obtaining a probability score
Figure BDA0003542307010000087
Otherwise, ending the division of the first CU block;
a third division module for if
Figure BDA0003542307010000088
Dividing the second CU block into 4 third CU blocks with the size of 16x16 downwards, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the third CU blocks
Figure BDA0003542307010000089
According to the coefficient distribution matrix of frequency domain
Figure BDA00035423070100000810
Obtaining a probability score
Figure BDA00035423070100000811
Otherwise, ending the division of the second CU block;
a fourth division module for if
Figure BDA00035423070100000812
Divide the third CU block down into 4 fourth CU blocks of size 8x 8; otherwise, ending the division of the third CU block;
wherein ,αN To divide the threshold, N is 64, 32, 16.
The frequency domain distribution learning-based CU depth partitioning system of the present embodiment can execute the frequency domain distribution learning-based CU depth partitioning method provided in the embodiment of the present invention, can execute any combination of the implementation steps of the embodiment of the method, and has corresponding functions and advantageous effects of the method.
The present embodiment further provides a device for learning CU depth partitioning based on frequency domain distribution, including:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method shown in fig. 1.
The device for learning CU depth partitioning based on frequency domain distribution according to the present embodiment can perform the method for learning CU depth partitioning based on frequency domain distribution provided in the embodiments of the method of the present invention, can perform any combination of the implementation steps of the embodiments of the method, and has corresponding functions and advantages of the method.
The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
The present embodiment further provides a storage medium, which stores an instruction or a program capable of executing the frequency domain distribution learning CU depth partitioning method according to the embodiments of the present invention, and when the instruction or the program is executed, the instruction or the program can execute any combination of the implementation steps of the embodiments of the method, and has corresponding functions and advantages of the method.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A method for learning CU depth partitioning based on frequency domain distribution is characterized by comprising the following steps:
obtaining a video image, dividing the video image into a plurality of first CU blocks of size 64x64, obtaining the frequency domain coefficient distribution matrix F_64 of the DCT (discrete cosine transform) of each first CU block, and computing a probability score p_64 from F_64;

if p_64 ≥ α_64, dividing the first CU block downward into 4 second CU blocks of size 32x32, obtaining the frequency domain coefficient distribution matrix F_32 of the DCT of each second CU block, and computing a probability score p_32 from F_32; otherwise, ending the division of the first CU block;

if p_32 ≥ α_32, dividing the second CU block downward into 4 third CU blocks of size 16x16, obtaining the frequency domain coefficient distribution matrix F_16 of the DCT of each third CU block, and computing a probability score p_16 from F_16; otherwise, ending the division of the second CU block;

if p_16 ≥ α_16, dividing the third CU block downward into 4 fourth CU blocks of size 8x8; otherwise, ending the division of the third CU block;

wherein α_N is the division threshold, N = 64, 32, 16.
2. The method for learning CU depth partitioning based on frequency domain distribution as claimed in claim 1, wherein the probability score p_64^k is calculated by the following formula:

p_64^k = Σ_i Σ_j W_64(i,j) · F_64^k(i,j)

wherein W_64 is a preset frequency domain distribution weight matrix, i represents the row coordinate of the matrix, and j represents the column coordinate of the matrix.
3. The method for learning CU depth partitioning based on frequency domain distribution as claimed in claim 2, wherein the frequency domain distribution weight matrix W_64 is obtained by the following method:

obtaining the training set and samples needed by the network, {(F_64^k, L_k)}, where F_64^k denotes the DCT frequency domain coefficient distribution matrix corresponding to the k-th 64x64 CU block, and L_k ∈ {0, 1} indicates whether that CU block is divided downward, 0 meaning no and 1 meaning yes;

training the network according to a preset loss function to obtain the frequency domain distribution weight matrix W_64;

the expression of the preset loss function is given as a formula image in the original publication and is not reproduced here.
4. the method as claimed in claim 3, wherein the partition threshold α is a 64 Obtained by the following method:
selecting label samples with L being 0 in training set
Figure FDA0003542305000000021
According to the selectionCalculating probability scores of the samples
Figure FDA0003542305000000022
From calculated probability scores
Figure FDA0003542305000000023
Obtaining a partition threshold α 64
5. The method for learning CU depth partitioning based on frequency domain distribution as claimed in claim 4, wherein the division threshold α_64 is set according to a specific rule on the computed probability scores, which is given as a formula image in the original publication and is not reproduced here.
6. The method according to claim 1, wherein the dividing the video image into 64x64 sized first CU blocks comprises:
the video image is divided into several first CU blocks of 64x64 size according to the luminance component.
7. The method according to claim 1, wherein dividing the video image into first CU blocks of size 64x64 comprises:
after dividing the video image into a plurality of first CU blocks of size 64x64, performing pixel interpolation on any remaining pixel area smaller than 64x64.
8. A system for learning CU deep partitioning based on frequency domain distribution, comprising:
the first dividing module is used for acquiring a video image, dividing the video image into a plurality of first CU blocks with the size of 64x64, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the first CU blocks
Figure FDA0003542305000000025
According to the coefficient distribution matrix of frequency domain
Figure FDA0003542305000000026
Obtaining a probability score
Figure FDA0003542305000000027
A second division module for if
Figure FDA0003542305000000028
The first CU block is divided downwards into 4 second CU blocks with the size of 32x32, and a frequency domain coefficient distribution matrix of DCT transformation of the second CU blocks is obtained
Figure FDA0003542305000000029
According to the coefficient distribution matrix of frequency domain
Figure FDA00035423050000000210
Obtaining a probability score
Figure FDA00035423050000000211
Otherwise, ending the division of the first CU block;
a third division module for if
Figure FDA00035423050000000212
Dividing the second CU block into 4 third CU blocks with the size of 16x16 downwards, and acquiring a frequency domain coefficient distribution matrix of DCT (discrete cosine transform) transformation of the third CU blocks
Figure FDA00035423050000000213
According to the coefficient distribution matrix of frequency domain
Figure FDA00035423050000000214
Obtaining a probability score
Figure FDA00035423050000000215
Otherwise, ending the division of the second CU block;
a fourth division module for if
Figure FDA00035423050000000216
Divide the third CU block down into 4 fourth CU blocks of size 8x 8;
otherwise, ending the division of the third CU block;
wherein ,αN To divide the threshold, N is 64, 32, 16.
9. An apparatus for learning CU depth partitioning based on frequency domain distribution, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, in which a program executable by a processor is stored, wherein the program executable by the processor is adapted to perform the method according to any one of claims 1 to 7 when executed by the processor.
CN202210241583.1A 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning Active CN114827630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210241583.1A CN114827630B (en) 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210241583.1A CN114827630B (en) 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning

Publications (2)

Publication Number Publication Date
CN114827630A true CN114827630A (en) 2022-07-29
CN114827630B CN114827630B (en) 2023-06-06

Family

ID=82529378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210241583.1A Active CN114827630B (en) 2022-03-11 2022-03-11 CU depth division method, system, device and medium based on frequency domain distribution learning

Country Status (1)

Country Link
CN (1) CN114827630B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130259120A1 (en) * 2012-04-03 2013-10-03 Qualcomm Incorporated Quantization matrix and deblocking filter adjustments for video coding
US20150264403A1 (en) * 2014-03-17 2015-09-17 Qualcomm Incorporated Systems and methods for low complexity forward transforms using zeroed-out coefficients
CN108370441A (en) * 2015-11-12 2018-08-03 Lg 电子株式会社 Method and apparatus in image compiling system for intra prediction caused by coefficient
WO2018199459A1 (en) * 2017-04-26 2018-11-01 강현인 Image restoration machine learning algorithm using compression parameter, and image restoration method using same
US20200074642A1 (en) * 2018-08-29 2020-03-05 Qualcomm Incorporated Motion assisted image segmentation
US20210289205A1 (en) * 2019-12-16 2021-09-16 Panasonic Intellectual Property Corporation Of America Encoder, decoder, encoding method, and decoding method
CN112927202A (en) * 2021-02-25 2021-06-08 华南理工大学 Method and system for detecting Deepfake video with combination of multiple time domains and multiple characteristics
CN113411582A (en) * 2021-05-10 2021-09-17 华南理工大学 Video coding method, system, device and medium based on active contour

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ZHIHENG ZHOU et al.: "Parametric shape prior model used in image segmentation", Journal of Systems Engineering and Electronics *
向友君 et al.: "Diffusion error concealment algorithm based on the human visual system", Journal of South China University of Technology (Natural Science Edition), no. 05
吴国光 et al.: "WVSN video coding method based on low-complexity adaptive frames", China Measurement & Test, no. 03
周智恒, 谢胜利: "Error concealment for video sequences based on texture detection", Computer Engineering and Applications, no. 05

Also Published As

Publication number Publication date
CN114827630B (en) 2023-06-06

Similar Documents

Publication Publication Date Title
Li et al. A deep learning approach for multi-frame in-loop filter of HEVC
US11218695B2 (en) Method and device for encoding or decoding image
US20180124415A1 (en) Encoder pre-analyser
US20180124422A1 (en) Motion compensation using temporal picture interpolation
CN112399176B (en) Video coding method and device, computer equipment and storage medium
CN107005695B (en) Method and apparatus for alternate transforms for video coding
CA3208670A1 (en) Image encoding method, image decoding method, encoder, decoder and storage medium
US11259029B2 (en) Method, device, apparatus for predicting video coding complexity and storage medium
CN108174208B (en) Efficient video coding method based on feature classification
CN111492655A (en) Texture-based partition decision for video compression
CN112929658B (en) Deep reinforcement learning-based quick CU partitioning method for VVC
CN109587491A (en) A kind of intra-frame prediction method, device and storage medium
CN111445424A (en) Image processing method, image processing device, mobile terminal video processing method, mobile terminal video processing device, mobile terminal video processing equipment and mobile terminal video processing medium
CN111988628A (en) VVC fast intra-frame coding method based on reinforcement learning
US20220284632A1 (en) Analysis device and computer-readable recording medium storing analysis program
EP3152901A2 (en) Apparatus and method to support encoding and decoding video data
CN114363632A (en) Intra-frame prediction method, encoding and decoding method, encoder and decoder, system, electronic device and storage medium
CN114827630A (en) Method, system, device and medium for learning CU deep partitioning based on frequency domain distribution
CN111669602B (en) Method and device for dividing coding unit, coder and storage medium
CN111464805B (en) Three-dimensional panoramic video rapid coding method based on panoramic saliency
US20190098303A1 (en) Method, device, and encoder for controlling filtering of intra-frame prediction reference pixel point
CN113518220A (en) Intra-frame division method, device and medium based on oriented filtering and edge detection
CN116634147B (en) HEVC-SCC intra-frame CU rapid partitioning coding method and device based on multi-scale feature fusion
CN113747177B (en) Intra-frame coding speed optimization method, device and medium based on historical information
EP3499890A1 (en) Deep learning based image partitioning for video compression

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant