CN104125466A - GPU (Graphics Processing Unit)-based HEVC (High Efficiency Video Coding) parallel decoding method - Google Patents
- Publication number
- CN104125466A (application CN201410328646.2A)
- Authority
- CN
- China
- Prior art keywords
- gpu
- thread
- image
- hevc
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a GPU-based HEVC parallel decoding method. In the method, a GPU performs entropy decoding, reordering and inverse quantization on a bitstream file that has been read to obtain a transform coefficient matrix, and also parses the bitstream file to obtain motion vectors and reference frames; the GPU processes the transform coefficient matrix with an HEVC inverse-transform parallel algorithm to obtain the residual data of an image, and uses an HEVC motion-compensation parallel algorithm to obtain the predicted pixel values of the image from the reference frame positions to which the motion vectors point; the GPU then sequentially performs summation, deblocking filtering and sample adaptive offset (SAO) processing on the residual data and the predicted pixel values to obtain a reconstructed image, and the pixel values of the reconstructed image are copied to the memory of the CPU. The method effectively improves decoding speed and efficiency and can be widely used in the field of video coding and decoding.
Description
Technical field
The present invention relates to the field of video coding and decoding, and in particular to a GPU-based HEVC parallel decoding method.
Background
With the rapid development of the Internet and mobile communication technology, digital video is moving toward high definition, high frame rates and high compression ratios. Video formats have progressed from 720p to 1080p, and ultra-high-definition digital video at 4Kx2K and 8Kx4K has even appeared in some applications. In video applications, transmission bandwidth and storage space are the most critical resources; storing high-definition video in limited space and transmitting it well over bandwidth-constrained networks is a major challenge. High-definition video offers a better viewing experience, but it necessarily involves a huge amount of data. For example, for 1080p high-definition video with 1920x1080 pixels in 4:2:0 format, one frame amounts to 24.88 Mbit. High-definition video therefore creates a problem: the video bit rate rises sharply. Video coding aims to represent video information with as few bits as possible, and the compression efficiency of the currently widely used H.264 coding standard still cannot fully meet the demands of ultra-high-definition video applications.
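The 24.88 Mbit figure follows from the 4:2:0 sampling structure, in which each of the two chroma planes carries a quarter as many samples as the luma plane. The check below is a minimal sketch that assumes 8 bits per sample, a bit depth the text does not state; `frame_bits` is a hypothetical helper name.

```python
def frame_bits(width, height, bits_per_sample=8):
    """Raw size in bits of one 4:2:0 frame (bit depth is an assumption)."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)   # two quarter-size chroma planes
    return (luma + chroma) * bits_per_sample

bits = frame_bits(1920, 1080)
print(bits / 1e6, "Mbit per frame")
```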
HEVC (High Efficiency Video Coding) is the next-generation video compression standard jointly developed by ISO's MPEG and ITU-T's VCEG. The HEVC standard inherits the coding framework of the existing H.264 video coding scheme, retains some of its coding techniques and improves on others, for example with larger coding unit sizes, more diverse block-based inter/intra prediction modes and more complex interpolation filters. Compared with earlier video coding technologies, HEVC offers higher compression efficiency, better video quality, better robustness and stronger error resilience, and is better suited to transmission over IP networks. Compared with the H.264/AVC coding standard, HEVC doubles the compression efficiency when coding high-definition, high-fidelity video; that is, for the same reconstructed picture quality after decoding, the bit rate of the video stream is reduced by 50%.
However, the lower bit rate comes at the cost of greater codec complexity. With more complex and more flexible coding techniques, the complexity of HEVC codec software increases considerably, so compressing and decompressing high-definition video takes longer and cannot meet the real-time decoding and playback requirements of applications such as video conferencing and video telephony.
With high-definition video becoming mainstream, relying on the CPU alone clearly cannot achieve satisfactory real-time decoding of high-definition video. GPUs have excellent floating-point performance and powerful parallel computing capability; if the computation-heavy, high-complexity modules of the decoding algorithm are moved to the GPU, the real-time decoding problem can be solved effectively. However, no GPU-based HEVC codec scheme has yet appeared in the industry.
Summary of the invention
To solve the above technical problem, the object of the present invention is to provide a GPU-based HEVC parallel decoding method with high decoding speed and high efficiency.
The technical solution adopted by the present invention is a GPU-based HEVC parallel decoding method comprising:
A. The GPU performs entropy decoding, reordering and inverse quantization on the bitstream file that has been read to obtain a transform coefficient matrix; at the same time, the GPU parses the bitstream file to obtain motion vectors and reference frames;
B. The GPU processes the transform coefficient matrix with an HEVC inverse-transform parallel algorithm to obtain the residual data of the image; at the same time, the GPU uses an HEVC motion-compensation parallel algorithm to obtain the predicted pixel values of the image from the reference frame positions to which the motion vectors point;
C. The GPU sequentially performs summation, deblocking filtering and sample adaptive offset (SAO) processing on the residual data and the predicted pixel values of the image to obtain a reconstructed image, and copies the pixel values of the reconstructed image to the memory of the CPU.
Further, in step B, the step in which the GPU processes the transform coefficient matrix with the HEVC inverse-transform parallel algorithm comprises:
B11. Initializing the GPU and allocating device-side global memory on the GPU for storing the transform coefficient matrices and the residual data;
B12. Setting the thread grid size and the thread block size, and allocating to each transform unit a number of threads, with corresponding thread IDs, according to the size of the transform unit;
B13. Reading the transform coefficient matrix of each transform unit from device-side global memory, then, according to the thread IDs, applying to each transform coefficient matrix a column-parallel one-dimensional IDCT followed by a row-parallel one-dimensional IDCT, thereby obtaining the residual data of the whole image block;
B14. Copying the computed residual data of each image block back to CPU memory to obtain the residual data of the whole image, then releasing the device-side global memory.
Further, step B13 comprises:
B131. Reading the transform coefficient matrix of each transform unit from device-side global memory;
B132. According to the thread IDs, performing a one-dimensional IDCT on all columns of each transform coefficient matrix simultaneously to obtain the transformed coefficient matrix, and temporarily storing the result in the shared memory of the thread block;
B133. According to the thread IDs, performing a one-dimensional IDCT on all rows of the transformed coefficient matrix in shared memory simultaneously to obtain the residual data matrix, and computing the residual data of the whole image block from the residual data matrices.
Further, in step B, the step in which the GPU uses the HEVC motion-compensation parallel algorithm to obtain the predicted pixel values of the image from the reference frame positions to which the motion vectors point comprises:
S1. Initializing the GPU and allocating memory on the GPU for storing the motion vector, reference frame and predicted pixel value corresponding to each pixel in inter prediction mode;
S2. Copying the motion vectors and the corresponding reference frame images to the device, and binding the reference frames to texture memory;
S3. Configuring the threads, assigning one thread ID to the processing of each predicted pixel value, and allocating global memory on the device for storing the predicted pixel values;
S4. Each thread simultaneously performs either a direct texture read or interpolation filtering according to its own thread ID and the reference frame position to which the motion vector points, thereby obtaining the predicted pixel value of each thread;
S5. Copying the predicted pixel value of each thread back to CPU memory, then releasing the device-side global memory.
Further, step S4 is specifically:
Each thread simultaneously performs a direct read or interpolation filtering according to its own thread ID and the reference frame position to which the motion vector points: if the thread's motion vector points to an integer-pixel position, the pixel value at that position of the reference frame is read directly from texture memory and used as the thread's predicted pixel value; if the thread's motion vector points to a fractional-pixel position, the corresponding luma or chroma interpolation filter formula is selected according to the fractional position and evaluated to obtain the thread's predicted pixel value.
Further, the luma interpolation filter formula is an 8-tap interpolation filter formula, and the chroma interpolation filter formula is a 4-tap interpolation filter formula.
The beneficial effects of the present invention are as follows: a decoding framework composed of a CPU and a GPU is built; the inverse transform and motion compensation stages, which have high decoding complexity, are moved to the GPU; and GPU-based HEVC inverse-transform and motion-compensation parallel algorithms are designed, effectively improving decoding speed and decoding efficiency.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments.
Fig. 1 is a flow chart of the steps of the GPU-based HEVC parallel decoding method of the present invention;
Fig. 2 is a flow chart of the step in step B in which the GPU processes the transform coefficient matrix with the HEVC inverse-transform parallel algorithm;
Fig. 3 is a flow chart of step B13 of the present invention;
Fig. 4 is a flow chart of the step in step B in which the GPU uses the HEVC motion-compensation parallel algorithm to obtain the predicted pixel values of the image from the reference frame positions to which the motion vectors point;
Fig. 5 is the HEVC decoding framework diagram of Embodiment 1 of the present invention;
Fig. 6 is a schematic diagram of luma fractional-pixel interpolation in the present invention.
Detailed description
With reference to Fig. 1, a GPU-based HEVC parallel decoding method comprises:
A. The GPU performs entropy decoding, reordering and inverse quantization on the bitstream file that has been read to obtain a transform coefficient matrix; at the same time, the GPU parses the bitstream file to obtain motion vectors and reference frames;
B. The GPU processes the transform coefficient matrix with an HEVC inverse-transform parallel algorithm to obtain the residual data of the image; at the same time, the GPU uses an HEVC motion-compensation parallel algorithm to obtain the predicted pixel values of the image from the reference frame positions to which the motion vectors point;
C. The GPU sequentially performs summation, deblocking filtering and sample adaptive offset (SAO) processing on the residual data and the predicted pixel values of the image to obtain a reconstructed image, and copies the pixel values of the reconstructed image to the memory of the CPU.
With reference to Fig. 2, as a further preferred embodiment, the step in step B in which the GPU processes the transform coefficient matrix with the HEVC inverse-transform parallel algorithm comprises:
B11. Initializing the GPU and allocating device-side global memory on the GPU for storing the transform coefficient matrices and the residual data;
B12. Setting the thread grid size and the thread block size, and allocating to each transform unit a number of threads, with corresponding thread IDs, according to the size of the transform unit;
B13. Reading the transform coefficient matrix of each transform unit from device-side global memory, then, according to the thread IDs, applying to each transform coefficient matrix a column-parallel one-dimensional IDCT followed by a row-parallel one-dimensional IDCT, thereby obtaining the residual data of the whole image block;
B14. Copying the computed residual data of each image block back to CPU memory to obtain the residual data of the whole image, then releasing the device-side global memory.
Here, the thread grid size is set to Grid(4,4,1) and the thread block size to Block(16,16,1), so one grid contains 16 blocks and each block contains 256 threads. The number of threads is allocated according to the size of the transform unit. An image block consists of at least one transform unit, and an image consists of at least one image block.
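The block and thread counts quoted above follow directly from the grid and block dimensions. The sketch below reproduces that arithmetic on the CPU and shows one possible way to flatten a block index and thread index into a single linear thread ID; the row-major flattening and the helper names `count` and `global_thread_id` are assumptions for illustration, since the text does not specify how thread IDs are numbered.

```python
GRID = (4, 4, 1)     # blocks per grid, as set in the text
BLOCK = (16, 16, 1)  # threads per block, as set in the text

def count(dim):
    """Number of elements described by an (x, y, z) dimension triple."""
    x, y, z = dim
    return x * y * z

def global_thread_id(block_idx, thread_idx):
    """Flatten 2-D (block, thread) indices to one linear ID (assumed row-major)."""
    bx, by = block_idx
    tx, ty = thread_idx
    block_id = by * GRID[0] + bx
    local_id = ty * BLOCK[0] + tx
    return block_id * count(BLOCK) + local_id

blocks_per_grid = count(GRID)
threads_per_block = count(BLOCK)
total_threads = blocks_per_grid * threads_per_block
```

Under this layout a 32x32 transform unit processed with one thread per column would occupy 32 of the 256 threads of its block during each pass.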
With reference to Fig. 3, as a further preferred embodiment, step B13 comprises:
B131. Reading the transform coefficient matrix of each transform unit from device-side global memory;
B132. According to the thread IDs, performing a one-dimensional IDCT on all columns of each transform coefficient matrix simultaneously to obtain the transformed coefficient matrix, and temporarily storing the result in the shared memory of the thread block;
B133. According to the thread IDs, performing a one-dimensional IDCT on all rows of the transformed coefficient matrix in shared memory simultaneously to obtain the residual data matrix, and computing the residual data of the whole image block from the residual data matrices.
With reference to Fig. 4, as a further preferred embodiment, the step in step B in which the GPU uses the HEVC motion-compensation parallel algorithm to obtain the predicted pixel values of the image from the reference frame positions to which the motion vectors point comprises:
S1. Initializing the GPU and allocating memory on the GPU for storing the motion vector, reference frame and predicted pixel value corresponding to each pixel in inter prediction mode;
S2. Copying the motion vectors and the corresponding reference frame images to the device, and binding the reference frames to texture memory;
S3. Configuring the threads, assigning one thread ID to the processing of each predicted pixel value, and allocating global memory on the device for storing the predicted pixel values;
S4. Each thread simultaneously performs either a direct texture read or interpolation filtering according to its own thread ID and the reference frame position to which the motion vector points, thereby obtaining the predicted pixel value of each thread;
S5. Copying the predicted pixel value of each thread back to CPU memory, then releasing the device-side global memory.
As a further preferred embodiment, step S4 is specifically:
Each thread simultaneously performs a direct read or interpolation filtering according to its own thread ID and the reference frame position to which the motion vector points: if the thread's motion vector points to an integer-pixel position, the pixel value at that position of the reference frame is read directly from texture memory and used as the thread's predicted pixel value; if the thread's motion vector points to a fractional-pixel position, the corresponding luma or chroma interpolation filter formula is selected according to the fractional position and evaluated to obtain the thread's predicted pixel value.
Here, the pixel value at the reference frame position to which the motion vector points is read from texture memory by calling the texture fetch function tex2D().
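The integer-pixel branch of step S4 can be sketched on the CPU as follows. This is an illustration rather than the patented kernel: the row-major mapping from thread ID to pixel, the clamping at the picture border (standing in for a clamped texture addressing mode) and the helper names are all assumptions not taken from the text.

```python
def thread_to_pixel(tid, width):
    """Map a linear thread ID to (x, y), assuming row-major numbering."""
    return tid % width, tid // width

def predict_integer_pel(ref, tid, width, mv_int):
    """Direct read for a motion vector with no fractional part.

    ref is the reference frame as a list of rows; mv_int is (dx, dy) in
    whole pixels. Coordinates are clamped to the frame (an assumption).
    """
    x, y = thread_to_pixel(tid, width)
    rx = min(max(x + mv_int[0], 0), len(ref[0]) - 1)
    ry = min(max(y + mv_int[1], 0), len(ref) - 1)
    return ref[ry][rx]

ref = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
center_shifted = predict_integer_pel(ref, tid=4, width=3, mv_int=(1, -1))
corner_clamped = predict_integer_pel(ref, tid=8, width=3, mv_int=(5, 5))
```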
As a further preferred embodiment, the luma interpolation filter formula is an 8-tap interpolation filter formula, and the chroma interpolation filter formula is a 4-tap interpolation filter formula.
The present invention is described in further detail below with reference to specific embodiments.
Embodiment 1
With reference to Fig. 5, the first embodiment of the present invention is as follows.
The HEVC decoding framework is shown in Fig. 5. HEVC decoding is the inverse of the encoding process. The decoder reads the bitstream file and obtains the bit stream from the NAL (network abstraction layer). Decoding proceeds picture by picture, in order: each frame is divided into a number of largest coding units (LCUs), which are entropy decoded in raster-scan order with the LCU as the basic unit and then reordered to obtain the residual coefficients of the corresponding coding units; the residual coefficients are then inverse quantized and inverse transformed to obtain the image residual data. Meanwhile, the decoder generates a prediction block from the header information decoded from the bitstream: in inter prediction mode, a prediction block is generated from the motion vector and the reference frame; in intra prediction mode, a prediction block is generated from neighboring prediction units. The prediction block data and the residual block data are then summed to obtain the reconstructed image block data, and finally the image block data passes through deblocking filtering and sample adaptive offset (SAO) processing to give the reconstructed output image.
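The summation stage of this framework adds each residual sample to the co-located predicted sample. A minimal sketch is given below; the clipping to the valid sample range and the 8-bit default are assumptions not spelled out in the text, `reconstruct` is a hypothetical name, and deblocking and SAO are left out.

```python
def reconstruct(pred, resid, bit_depth=8):
    """Sum prediction and residual blocks, clipping to [0, 2^bit_depth - 1].

    The clipping and the default bit depth are assumptions for illustration.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]

block = reconstruct([[250, 10], [128, 64]], [[10, -20], [-8, 3]])
```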
Motion compensation describes the difference between adjacent frames in terms of the coding relationship; that is, it describes how each block of the earlier reference frame moves to a position in the current frame. The value of the equally sized prediction block in the reference frame to which the motion vector points is added to the residual value to obtain the reconstructed value. Video codecs commonly use this method to reduce temporal redundancy in video sequences. Motion compensation is used for image reconstruction and is an indispensable key module in video coding and decoding.
In motion compensation, a frame is divided according to image texture into coding units of different sizes, and prediction units are partitioned on the basis of the coding units. A prediction unit comprises one luma block and two chroma blocks, and each inter-coded block is predicted from a block of the same size in a reference picture. The precision of motion compensation is determined by the precision of the motion vector and directly affects both reconstructed image quality and bitstream size. The motion vector is the displacement used during prediction and is derived by motion estimation at the encoder. The higher the precision of the motion vector, the more accurate the motion compensation. Interpolation filtering is a key technique in motion compensation. The H.264 standard uses a six-tap Wiener filter, with a motion compensation precision of 1/4 pixel. HEVC adopts a more advanced and efficient interpolation filter, namely one based on the discrete cosine transform (DCT). By comparison, sub-pixel generation in the HEVC standard is simpler and more efficient: only one filter formula and a single filtering pass are needed. Luma uses a DCT-based 8-tap interpolation filter and chroma a DCT-based 4-tap interpolation filter to interpolate pixels. However, the large amount of interpolation computation raises complexity correspondingly and lowers processing efficiency. Luma and chroma samples at fractional-pixel positions do not actually exist in the reference frame, so they must be obtained by pixel interpolation using the interpolation filter algorithm; this kind of motion compensation is sub-pixel-precision motion compensation.
Embodiment 2
This embodiment describes the HEVC inverse-transform parallel algorithm of the present invention.
The inverse transform module converts the transform coefficient matrix of the current block into a matrix of residual sample values in preparation for the subsequent reconstruction. The inverse transform is performed after inverse quantization, likewise with the transform unit (TU) as the basic unit, and its input is the result of inverse quantization. When the HEVC decoder of the present invention performs a two-dimensional IDCT, it first performs a one-dimensional IDCT in the horizontal direction and then a one-dimensional IDCT in the vertical direction; through these matrix multiplications, the transform coefficient matrix is converted into a residual data matrix of the same size, completing the conversion from the frequency domain to the spatial domain.
Within a frame, the IDCT operations of different transform blocks are independent of one another. When the transform coefficient matrix of a transform unit undergoes the one-dimensional IDCT in the horizontal direction, the columns are independent of one another, so they can be computed in parallel. Similarly, when the one-dimensional IDCT in the vertical direction is performed, there is no data dependence between the rows, so they too can be computed in parallel. The present invention allocates a number of threads according to the size of the transform coefficient matrix: one thread is assigned to each column and all columns undergo the one-dimensional IDCT simultaneously; once that pass is complete, one thread is assigned to each row and all rows undergo the one-dimensional IDCT simultaneously, which completes the parallel two-dimensional IDCT of the transform coefficient matrix.
HEVC transform units are 4x4, 8x8, 16x16 or 32x32 in size. The larger the transform unit, the higher the degree of parallelism and the more pronounced the speed-up. For example, for the transform coefficient matrix of a 32x32 transform unit, the one-dimensional IDCT of all 32 columns can be performed simultaneously; after it completes, the __syncthreads() function is called to synchronize, and then the one-dimensional IDCT of all 32 rows is performed simultaneously. In addition, the transform coefficient matrices of different transform units can be inverse transformed simultaneously. To obtain a better speed-up, the present invention directly uses the transform coefficients left in global memory by the parallel inverse quantization. A transform unit comprises one luma transform block and two chroma transform blocks, so the inverse transform must be performed separately for luma and chroma; the steps are identical for both.
The GPU-based inverse transform algorithm of the present invention comprises:
(1) At the start of decoding, initialize the GPU, allocate global memory on the GPU for storing the residual data produced by the inverse transform, and read the transform coefficients produced by inverse quantization directly from GPU global memory.
(2) Configure the threads: the thread grid size is Grid(4,4,1) and the thread block size is Block(16,16,1), so one grid contains 16 blocks and each block contains 256 threads; the number of threads is then allocated according to the size of the transform unit.
(3) Assign the transform coefficient matrix of each transform unit to one thread block for inverse transformation. First, each thread performs a one-dimensional IDCT on the column of the transform coefficient matrix corresponding to its own thread ID, all columns being processed simultaneously; the __syncthreads() function is called to synchronize, and the results are stored temporarily in the shared memory of the thread block. Then a one-dimensional IDCT is performed simultaneously on every row of the coefficient matrix in shared memory, with one thread per row, which completes the two-dimensional IDCT of the transform coefficients and yields the residual data matrix. The transform coefficient matrices of all transform units are inverse transformed simultaneously, giving the residual data of the whole image block.
(4) Copy the computed residual data of each image block from GPU global memory to CPU memory, giving the residual data of the whole image.
(5) Release the global memory allocated during decoding.
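The column pass followed by the row pass in step (3) can be sketched sequentially on the CPU. Two assumptions apply: the code uses a floating-point orthonormal inverse DCT as a stand-in, whereas HEVC itself specifies an integer approximation of the transform, and the Python loops over columns and rows only mark where the independent GPU threads would run. The function names are hypothetical.

```python
import math

def idct_1d(coeffs):
    """Orthonormal floating-point inverse DCT-II of one vector (stand-in)."""
    n = len(coeffs)
    out = []
    for x in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            acc += scale * c * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
        out.append(acc)
    return out

def idct_2d(block):
    """Separable 2-D inverse transform: all columns first, then all rows."""
    n = len(block)
    # Pass 1: each column is independent (one GPU thread per column).
    cols = [idct_1d([block[r][c] for r in range(n)]) for c in range(n)]
    # Intermediate result, playing the role of the block's shared memory.
    tmp = [[cols[c][r] for c in range(n)] for r in range(n)]
    # A barrier such as __syncthreads() would separate the passes on a GPU.
    # Pass 2: each row is independent (one GPU thread per row).
    return [idct_1d(row) for row in tmp]

dc_only = [[4.0, 0.0, 0.0, 0.0]] + [[0.0] * 4 for _ in range(3)]
residual = idct_2d(dc_only)
```

A block containing only a DC coefficient transforms to a flat residual, which is a convenient sanity check for the two-pass structure.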
Embodiment 3
This embodiment describes the HEVC motion-compensation parallel algorithm of the present invention.
The principle of inter motion compensation is, briefly, as follows: using the motion vector obtained by parsing the bitstream, the predicted value is obtained from the position it points to in the reference frame. If it points to an integer-pixel position of the reference frame, the value is read directly; if it points to a fractional-pixel position, the fractional-pixel predicted value must be obtained by pixel interpolation. The predicted value is then added to the image residual value obtained by inverse quantization and inverse transformation to give the reconstructed image value. Within the motion compensation module, pixel interpolation filtering accounts for roughly 70% of the computation, so the GPU implementation of motion compensation in the present invention concentrates on pixel interpolation. Motion is inherently continuous, but to improve the accuracy of inter prediction during encoding, motion vectors are given fractional-pixel precision when searching for matching blocks: luma motion vectors have 1/4-pixel precision and chroma motion vectors 1/8-pixel precision. Therefore, when a motion vector points to a fractional-pixel position in the reference frame, the pixel value at that position must be interpolated from the neighboring pixel values.
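Deciding between the direct read and interpolation comes down to splitting each motion vector component into an integer part and a fractional part. The sketch below assumes that luma components are stored as integers in quarter-pixel units; `split_luma_mv` is a hypothetical helper name.

```python
def split_luma_mv(mv_quarter_pel):
    """Split a quarter-pel luma MV component into (integer, fraction).

    The arithmetic shift floors toward minus infinity, so the fraction is
    always in 0..3, including for negative motion vectors.
    """
    return mv_quarter_pel >> 2, mv_quarter_pel & 3

def needs_interpolation(mv_x, mv_y):
    """True when either component has a non-zero fractional part."""
    return (mv_x & 3) != 0 or (mv_y & 3) != 0
```

For example, a component of 9 quarter-pixels splits into 2 whole pixels plus a 1/4 offset, while -5 splits into -2 whole pixels plus a 3/4 offset.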
The interpolation filter coefficients used for luma fractional pixels are shown in Table 1.
Table 1 Luma interpolation filter coefficients
Fractional-pixel position | Interpolation filter coefficients |
1/4 pixel | {-1,4,-10,58,17,-5,1,0} |
2/4 pixel | {-1,4,-11,40,40,-11,4,-1} |
3/4 pixel | {0,1,-5,17,58,-10,4,-1} |
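Two properties of the coefficients in Table 1 can be checked directly: each filter sums to 64, so a flat region passes through with a gain of 64 that a later normalization removes, and the 3/4 filter is the mirror image of the 1/4 filter. The dictionary name `LUMA_TAPS` is chosen here for illustration.

```python
# Coefficients copied from Table 1, keyed by the numerator of the n/4 position.
LUMA_TAPS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),
    2: (-1, 4, -11, 40, 40, -11, 4, -1),
    3: (0, 1, -5, 17, 58, -10, 4, -1),
}

gains = {frac: sum(taps) for frac, taps in LUMA_TAPS.items()}
mirrored = LUMA_TAPS[3] == tuple(reversed(LUMA_TAPS[1]))
half_pel_symmetric = LUMA_TAPS[2] == tuple(reversed(LUMA_TAPS[2]))
```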
Shown in Fig. 6, for brightness whole pixel position with by interpolation minute pixel position out, the position of capitalization representative is whole pixel, and what lowercase represented is sub-pix point.
In the HEVC reference software, the parameters xFracL and yFracL determine the pixel position. xFracL is the fractional part of the horizontal component of the motion vector and yFracL is the fractional part of its vertical component; together they identify the pixel position in the HEVC standard. When xFracL and yFracL are both 0 the position is an integer-pixel position; all other combinations are fractional-pixel positions. Once the position is determined, the corresponding interpolation coefficients and the neighboring pixels in the reference frame are selected to interpolate the pixel value at that position. Table 2 shows how xFracL and yFracL correspond to the pixel positions in Fig. 6.
Table 2 Mapping of luma pixel positions
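As a rough illustration of how the integer and fractional parts can be separated, the sketch below assumes that luma motion vector components are stored as integers in units of 1/4 luma sample, so that the two low bits give xFracL or yFracL and the remaining bits give the integer displacement. The function names `split_luma_mv` and `is_integer_position` are hypothetical and do not come from the patent.

```python
def split_luma_mv(mv_x, mv_y):
    """Split a quarter-sample luma motion vector into integer and fractional parts.

    Assumption: mv_x and mv_y are integers in units of 1/4 luma sample.
    """
    # Python's >> floors towards minus infinity and & 3 is always in 0..3,
    # so integer * 4 + fraction reproduces the input for negative values too.
    x_int, x_frac = mv_x >> 2, mv_x & 3
    y_int, y_frac = mv_y >> 2, mv_y & 3
    return (x_int, y_int), (x_frac, y_frac)


def is_integer_position(x_frac, y_frac):
    """True when the vector points at an integer pixel, so no interpolation is needed."""
    return x_frac == 0 and y_frac == 0
```

For example, a vector of (9, -3) quarter samples splits into an integer displacement of (2, -1) and fractional parts (1, 1), which selects a position that needs interpolation in both directions.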
Luma pixel interpolation selects the interpolation coefficients according to the fractional-pixel position. The positions $a_{0,0}$, $b_{0,0}$ and $c_{0,0}$, which lie in the same row as the integer pixel, correspond to the 1/4, 2/4 and 3/4 pixel positions. They are computed from the coefficients in Table 1 and the integer pixels $A_{-3,0}$, $A_{-2,0}$, $A_{-1,0}$, $A_{0,0}$, $A_{1,0}$, $A_{2,0}$, $A_{3,0}$ and $A_{4,0}$. Here the variable shift1 equals (BitDepthY-8), shift2 is set to 6 and shift3 is set to (14-BitDepthY). The formulas are:
$$a_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 58A_{0,0} + 17A_{1,0} - 5A_{2,0} + A_{3,0}) \gg \mathrm{shift1}$$
$$b_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 11A_{-1,0} + 40A_{0,0} + 40A_{1,0} - 11A_{2,0} + 4A_{3,0} - A_{4,0}) \gg \mathrm{shift1}$$
$$c_{0,0} = (A_{-2,0} - 5A_{-1,0} + 17A_{0,0} + 58A_{1,0} - 10A_{2,0} + 4A_{3,0} - A_{4,0}) \gg \mathrm{shift1}$$
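The one-step horizontal case can be sketched as follows. This is a minimal illustration rather than the patent's GPU kernel: it assumes the reference row is a plain Python list that is already padded so that three samples to the left and four to the right of `x` exist, and the names `LUMA_FILTERS` and `interp_luma_row` are hypothetical.

```python
# Table 1 coefficients, keyed by the fractional position in quarter samples.
LUMA_FILTERS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),
    2: (-1, 4, -11, 40, 40, -11, 4, -1),
    3: (0, 1, -5, 17, 58, -10, 4, -1),
}


def interp_luma_row(row, x, frac, bit_depth=8):
    """Intermediate value of a, b or c (frac = 1, 2, 3) next to integer column x.

    Assumption: row[x - 3] .. row[x + 4] all exist (the row is padded).
    """
    shift1 = bit_depth - 8
    taps = LUMA_FILTERS[frac]
    # The eight taps are applied to the samples row[x - 3] .. row[x + 4].
    acc = sum(c * row[x - 3 + k] for k, c in enumerate(taps))
    return acc >> shift1
```

Each coefficient set sums to 64, so for 8-bit input the returned value still carries a gain of 64. On a linear ramp `row[i] = i`, the half-sample value at `x = 5` is 352, which is 5.5 times 64.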
$d_{0,0}$, $h_{0,0}$ and $n_{0,0}$ are the pixels at the 1/4, 2/4 and 3/4 positions in the vertical direction. Their interpolation uses the integer pixels in the same column:
$$d_{0,0} = (-A_{0,-3} + 4A_{0,-2} - 10A_{0,-1} + 58A_{0,0} + 17A_{0,1} - 5A_{0,2} + A_{0,3}) \gg \mathrm{shift1}$$
$$h_{0,0} = (-A_{0,-3} + 4A_{0,-2} - 11A_{0,-1} + 40A_{0,0} + 40A_{0,1} - 11A_{0,2} + 4A_{0,3} - A_{0,4}) \gg \mathrm{shift1}$$
$$n_{0,0} = (A_{0,-2} - 5A_{0,-1} + 17A_{0,0} + 58A_{0,1} - 10A_{0,2} + 4A_{0,3} - A_{0,4}) \gg \mathrm{shift1}$$
The pixel values of $a_{0,0}$, $b_{0,0}$, $c_{0,0}$, $d_{0,0}$, $h_{0,0}$ and $n_{0,0}$ can be derived in a single step from the integer pixels and the interpolation filter coefficients. When a motion vector points to one of these positions, the corresponding pixel value is obtained with the formulas above.
The fractional pixels at the remaining positions require two steps.
For the pixel values of $e_{0,0}$, $i_{0,0}$ and $p_{0,0}$, first compute $a_{0,-3}$, $a_{0,-2}$, $a_{0,-1}$, $a_{0,0}$, $a_{0,1}$, $a_{0,2}$, $a_{0,3}$ and $a_{0,4}$ using the method given above for fractional positions in the same row or column as an integer pixel, and then compute:
$$e_{0,0} = (-a_{0,-3} + 4a_{0,-2} - 10a_{0,-1} + 58a_{0,0} + 17a_{0,1} - 5a_{0,2} + a_{0,3}) \gg \mathrm{shift2}$$
$$i_{0,0} = (-a_{0,-3} + 4a_{0,-2} - 11a_{0,-1} + 40a_{0,0} + 40a_{0,1} - 11a_{0,2} + 4a_{0,3} - a_{0,4}) \gg \mathrm{shift2}$$
$$p_{0,0} = (a_{0,-2} - 5a_{0,-1} + 17a_{0,0} + 58a_{0,1} - 10a_{0,2} + 4a_{0,3} - a_{0,4}) \gg \mathrm{shift2}$$
The pixel values of $f_{0,0}$, $j_{0,0}$ and $q_{0,0}$ likewise require $b_{0,-3}$, $b_{0,-2}$, $b_{0,-1}$, $b_{0,0}$, $b_{0,1}$, $b_{0,2}$, $b_{0,3}$ and $b_{0,4}$ to be computed first; the interpolation filter coefficients are then applied as follows:
$$f_{0,0} = (-b_{0,-3} + 4b_{0,-2} - 10b_{0,-1} + 58b_{0,0} + 17b_{0,1} - 5b_{0,2} + b_{0,3}) \gg \mathrm{shift2}$$
$$j_{0,0} = (-b_{0,-3} + 4b_{0,-2} - 11b_{0,-1} + 40b_{0,0} + 40b_{0,1} - 11b_{0,2} + 4b_{0,3} - b_{0,4}) \gg \mathrm{shift2}$$
$$q_{0,0} = (b_{0,-2} - 5b_{0,-1} + 17b_{0,0} + 58b_{0,1} - 10b_{0,2} + 4b_{0,3} - b_{0,4}) \gg \mathrm{shift2}$$
The pixel values of $g_{0,0}$, $k_{0,0}$ and $r_{0,0}$ require $c_{0,-3}$, $c_{0,-2}$, $c_{0,-1}$, $c_{0,0}$, $c_{0,1}$, $c_{0,2}$, $c_{0,3}$ and $c_{0,4}$ to be computed first, and are then calculated as follows:
$$g_{0,0} = (-c_{0,-3} + 4c_{0,-2} - 10c_{0,-1} + 58c_{0,0} + 17c_{0,1} - 5c_{0,2} + c_{0,3}) \gg \mathrm{shift2}$$
$$k_{0,0} = (-c_{0,-3} + 4c_{0,-2} - 11c_{0,-1} + 40c_{0,0} + 40c_{0,1} - 11c_{0,2} + 4c_{0,3} - c_{0,4}) \gg \mathrm{shift2}$$
$$r_{0,0} = (c_{0,-2} - 5c_{0,-1} + 17c_{0,0} + 58c_{0,1} - 10c_{0,2} + 4c_{0,3} - c_{0,4}) \gg \mathrm{shift2}$$
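The two-step procedure for the positions e through r can be sketched as below. It is an illustrative CPU-side model only: it assumes the reference picture is a list of rows indexed as `ref[row][column]`, padded so that every sample the filters touch exists, and the names `filter8` and `interp_luma_2d` are hypothetical.

```python
# Table 1 coefficients, keyed by the fractional position in quarter samples.
LUMA_FILTERS = {
    1: (-1, 4, -10, 58, 17, -5, 1, 0),
    2: (-1, 4, -11, 40, 40, -11, 4, -1),
    3: (0, 1, -5, 17, 58, -10, 4, -1),
}


def filter8(samples, frac):
    """Weighted sum of eight samples with the Table 1 taps for position frac."""
    return sum(c * s for c, s in zip(LUMA_FILTERS[frac], samples))


def interp_luma_2d(ref, x, y, x_frac, y_frac, bit_depth=8):
    """Intermediate value for a position with non-zero x_frac and y_frac.

    Assumption: ref[y - 3 .. y + 4][x - 3 .. x + 4] all exist (padded picture).
    """
    shift1 = bit_depth - 8
    shift2 = 6
    # Step 1: horizontal filtering on the eight rows y - 3 .. y + 4 gives the
    # column of a, b or c values (chosen by x_frac).
    column = [filter8(ref[y + dy][x - 3:x + 5], x_frac) >> shift1
              for dy in range(-3, 5)]
    # Step 2: vertical filtering of that column (taps chosen by y_frac).
    return filter8(column, y_frac) >> shift2
```

With `x_frac = 2` and `y_frac = 2` this yields the j position; on a flat picture of value 100 every position returns 6400, the input scaled by the remaining gain of 64.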
When the position pointed to by the motion vector MV is exactly an integer pixel of the reference frame, the value of that pixel is the predicted pixel value. In this way the predicted pixel values of an entire inter-predicted block can be computed by the individual GPU threads from the corresponding MVs and the reference frame.
Chroma pixel interpolation follows the same principle as luma, except that chroma uses a 4-tap interpolation filter, so it is not described in detail here. The coefficients used for the interpolation are shown in Table 3.
Table 3 Interpolation filter coefficients for chroma pixels
Fractional pixel position | Interpolation filter coefficients |
1/8 pixel | {-2,58,10,-2} |
2/8 pixel | {-4,54,16,-2} |
3/8 pixel | {-6,46,28,-4} |
4/8 pixel | {-4,36,36,-4} |
5/8 pixel | {-4,28,46,-6} |
6/8 pixel | {-2,16,54,-4} |
7/8 pixel | {-2,10,58,-2} |
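By analogy with the luma case, a one-dimensional chroma interpolation with the Table 3 coefficients might look like the sketch below. The tap alignment (one sample to the left and two to the right of `x`) and the use of a chroma bit depth minus 8 as the shift are assumptions made for this illustration, and `CHROMA_FILTERS` and `interp_chroma_row` are hypothetical names.

```python
# Table 3 coefficients, keyed by the fractional position in eighth samples.
CHROMA_FILTERS = {
    1: (-2, 58, 10, -2),
    2: (-4, 54, 16, -2),
    3: (-6, 46, 28, -4),
    4: (-4, 36, 36, -4),
    5: (-4, 28, 46, -6),
    6: (-2, 16, 54, -4),
    7: (-2, 10, 58, -2),
}


def interp_chroma_row(row, x, frac, bit_depth=8):
    """Intermediate chroma value at fractional position frac/8 right of column x.

    Assumption: row[x - 1] .. row[x + 2] all exist (the row is padded).
    """
    shift1 = bit_depth - 8
    taps = CHROMA_FILTERS[frac]
    # The four taps are applied to the samples row[x - 1] .. row[x + 2].
    acc = sum(c * row[x - 1 + k] for k, c in enumerate(taps))
    return acc >> shift1
```

As with luma, every coefficient set sums to 64, so a flat input of 100 gives 6400 at each of the seven fractional positions.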
In the HEVC codec standard, motion compensation is carried out in units of image blocks, but the basic unit actually processed is the pixel. The motion compensation of each pixel does not depend on that of any other pixel: the predicted pixel value is computed from the reference-frame position pointed to by the MV and is then added to the residual pixel value to obtain the reconstructed pixel value. The basic unit handled by each thread is therefore a pixel of the PU prediction block, with one thread computing one predicted pixel value, so that the inter-predicted pixel values of a whole image block can be computed simultaneously.
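Because no pixel depends on another, the per-pixel work reduces to a pure function of that pixel's own inputs. The sketch below models this on the CPU: each element of the comprehension stands for what one GPU thread would compute. It assumes the prediction values have already been normalized to the sample range and that the sum is clipped to the valid range for the bit depth; `clip_sample` and `reconstruct_block` are hypothetical names.

```python
def clip_sample(value, bit_depth=8):
    """Clamp a value to the valid sample range 0 .. 2**bit_depth - 1."""
    return max(0, min((1 << bit_depth) - 1, value))


def reconstruct_block(pred, resid, bit_depth=8):
    """Add residuals to predictions pixel by pixel; no pixel uses another's result."""
    return [[clip_sample(p + r, bit_depth) for p, r in zip(pred_row, resid_row)]
            for pred_row, resid_row in zip(pred, resid)]
```

For instance, `reconstruct_block([[250, 10]], [[10, -20]])` gives `[[255, 0]]`, since both sums fall outside the 8-bit range and are clipped.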
HEVC motion compensation parallel algorithm of the present invention comprises:
(1) First, at the start of decoding, the GPU is initialized and memory is allocated on the GPU for the motion vector of each pixel in inter prediction mode, the reference frame, and the generated predicted pixel values.
(2) The motion vectors and the corresponding reference frame image are then copied to the device with the cudaMemcpy function, and the cudaBindTexture function is called to bind the reference frame to texture memory. Reads from texture memory are fast enough that their cost is negligible, which further improves efficiency.
(3) Thread configuration is performed: a thread ID is assigned to the computation of each predicted pixel value, and global memory is allocated on the device for the generated predicted pixel values. The thread grid size is set to Grid (4,4,1) and the thread block size to Block (16,16,1), so that one grid contains 16 blocks and each block contains 256 threads.
(4) The predicted pixel value is obtained from the reference-frame position pointed to by the motion vector. If the vector points to an integer-pixel position, the reference-frame value at that position is read directly as the predicted pixel value. If it points to a fractional-pixel position, the interpolation filter formula corresponding to that position is evaluated, and the resulting fractional-pixel value is the predicted pixel value. Each thread executes the same steps, using its own thread ID, to obtain its pixel prediction.
(5) The predicted pixel value is added to the corresponding residual pixel value to obtain the reconstructed pixel value, which is then clipped to the valid range.
(6) The results are copied back to host memory and the device memory is released.
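The thread configuration in step (3) can be modelled on the CPU to see which pixel each thread would handle. The sketch below assumes the common CUDA convention of computing a global index as block index times block dimension plus thread index in each of x and y; the patent does not spell out this mapping, and `thread_to_pixel` and `all_pixels` are hypothetical names.

```python
GRID = (4, 4)     # blocks per grid in x and y, from Grid (4,4,1)
BLOCK = (16, 16)  # threads per block in x and y, from Block (16,16,1)


def thread_to_pixel(block_idx, thread_idx, block_dim=BLOCK):
    """Map (blockIdx, threadIdx) to a pixel coordinate, CUDA style (assumed)."""
    x = block_idx[0] * block_dim[0] + thread_idx[0]
    y = block_idx[1] * block_dim[1] + thread_idx[1]
    return x, y


def all_pixels(grid=GRID, block=BLOCK):
    """Enumerate the pixel handled by every thread of the launch."""
    return [thread_to_pixel((bx, by), (tx, ty), block)
            for by in range(grid[1]) for bx in range(grid[0])
            for ty in range(block[1]) for tx in range(block[0])]
```

Under this assumption the 16 blocks of 256 threads give 4096 threads in total, one for each pixel of a 64x64 region, with no pixel visited twice.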
Compared with the prior art, the present invention builds a decoding framework composed of a CPU and a GPU, moves the inverse transformation and motion compensation processes, which have high decoding complexity, onto the GPU, and designs GPU-based HEVC inverse transformation and HEVC motion compensation parallel algorithms, effectively improving decoding speed and decoding efficiency.
The above describes preferred embodiments of the present invention, but the invention is not limited to these embodiments. Those of ordinary skill in the art may make various equivalent variations or substitutions without departing from the spirit of the invention, and all such equivalent variations or substitutions fall within the scope defined by the claims of this application.
Claims (6)
1. A GPU-based HEVC parallel decoding method, characterized by comprising:
A. the GPU performs entropy decoding, reordering and inverse quantization on the bitstream file that has been read, thereby obtaining a transform coefficient matrix, and at the same time the GPU parses the bitstream file, thereby obtaining motion vectors and reference frames;
B. the GPU processes the transform coefficient matrix with an HEVC inverse transformation parallel algorithm, thereby obtaining the residual data of the image, and at the same time the GPU uses an HEVC motion compensation parallel algorithm to obtain the predicted pixel values of the image from the reference-frame positions pointed to by the motion vectors;
C. the GPU successively performs summation, deblocking filtering and sample adaptive offset processing on the residual data and the predicted pixel values of the image, thereby obtaining the reconstructed image, and copies the pixel values of the reconstructed image into the memory of the CPU.
2. The GPU-based HEVC parallel decoding method according to claim 1, characterized in that, in step B, the step in which the GPU processes the transform coefficient matrix with the HEVC inverse transformation parallel algorithm comprises:
B11. initializing the GPU and allocating, on the GPU, device global memory for storing the transform coefficient matrices and the residual data;
B12. setting the thread grid size and the thread block size, and assigning to each transform unit, according to its size, a corresponding number of threads and the corresponding thread IDs;
B13. reading the transform coefficient matrix of each transform unit from device global memory and then, according to the thread IDs, applying to each transform coefficient matrix a column-parallel one-dimensional IDCT followed by a row-parallel one-dimensional IDCT, thereby obtaining the residual data of the whole image block;
B14. copying the computed residual data of each image block back to CPU memory to obtain the residual data of the whole image, and then releasing the device global memory.
3. The GPU-based HEVC parallel decoding method according to claim 2, characterized in that step B13 comprises:
B131. reading the transform coefficient matrix of each transform unit from device global memory;
B132. according to the thread IDs, applying a one-dimensional IDCT to all columns of each transform coefficient matrix simultaneously to obtain the transformed coefficient matrix, and storing the result temporarily in the shared memory of the thread block;
B133. according to the thread IDs, applying a one-dimensional IDCT to all rows of the transformed coefficient matrix in shared memory simultaneously to obtain the residual data matrix, and computing the residual data of the whole image block from the residual data matrix.
4. The GPU-based HEVC parallel decoding method according to claim 1, characterized in that, in step B, the step in which the GPU uses the HEVC motion compensation parallel algorithm to obtain the predicted pixel values of the image from the reference-frame positions pointed to by the motion vectors comprises:
S1. initializing the GPU and allocating, on the GPU, memory for storing the motion vector of each pixel in inter prediction mode, the reference frame and the predicted pixel values;
S2. copying the motion vectors and the corresponding reference frame image to the device, and binding the reference frame to texture memory;
S3. performing thread configuration, assigning a thread ID to the computation of each predicted pixel value, and allocating global memory on the device for storing the predicted pixel values;
S4. each thread simultaneously performing either a direct texture read or interpolation filtering, according to its own thread ID and the reference-frame position pointed to by the motion vector, thereby obtaining the pixel prediction of each thread;
S5. copying the pixel prediction of each thread back to CPU memory, and then releasing the device global memory.
5. The GPU-based HEVC parallel decoding method according to claim 4, characterized in that step S4 is specifically:
each thread simultaneously performs either a direct read or interpolation filtering according to its own thread ID and the reference-frame position pointed to by the motion vector: if the motion vector of the thread points to an integer-pixel position, the pixel value at the reference-frame position pointed to by the motion vector is read directly from texture memory and used as the pixel prediction of the thread; if the motion vector of the thread points to a fractional-pixel position, the corresponding luma or chroma pixel interpolation filter formula is selected according to the fractional-pixel position and evaluated, thereby obtaining the pixel prediction of the thread.
6. The GPU-based HEVC parallel decoding method according to claim 5, characterized in that the luma pixel interpolation filter formula is an 8-tap interpolation filter formula and the chroma pixel interpolation filter formula is a 4-tap interpolation filter formula.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410328646.2A CN104125466B (en) | 2014-07-10 | 2014-07-10 | A kind of HEVC parallel decoding methods based on GPU |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104125466A true CN104125466A (en) | 2014-10-29 |
CN104125466B CN104125466B (en) | 2017-10-10 |
Family
ID=51770711
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410328646.2A Expired - Fee Related CN104125466B (en) | 2014-07-10 | 2014-07-10 | A kind of HEVC parallel decoding methods based on GPU |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104125466B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20171010; Termination date: 20210710 |