CN105847840A - Parallel structure construction method for high efficiency video coding inverse transform operation - Google Patents


Info

Publication number
CN105847840A
Authority
CN
China
Prior art keywords
inverse transformation
computing
video coding
parallel
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510795422.7A
Other languages
Chinese (zh)
Other versions
CN105847840B (en)
Inventor
刘镇弢
王杏军
蒋林
刘有耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Yuntie Intelligent Technology Co.,Ltd.
Original Assignee
Xian University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Posts and Telecommunications
Priority to CN201510795422.7A
Publication of CN105847840A
Application granted
Publication of CN105847840B
Active
Anticipated expiration

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a parallel structure construction method for the high efficiency video coding (HEVC) inverse transform operation. The inverse transform algorithm in HEVC is analyzed for parallelism. The inverse transform operation on a 4x4 TU (Transform Unit) is processed in parallel using a 2x4 two-dimensional array of processing elements, which completes the vertical and then the horizontal inverse transform in parallel, each based on an odd-even decomposition operation. The method can effectively reduce the computational complexity of the inverse transform, shorten encoding and decoding time, and accelerate the encoding and decoding processes.

Description

A parallel structure construction method for high efficiency video coding inverse transform operation
Technical field
The present invention relates to the technical field of video coding, and specifically to the parallelization of the inverse transform algorithm in the High Efficiency Video Coding (HEVC) standard.
Background
As high-definition and ultra-high-definition video (with resolutions of 4K × 2K and 8K × 4K) becomes increasingly common, the diversification of video applications and the massive volume of video data urgently call for new coding techniques that remove as much redundancy from the video data as possible and reduce the amount of data needed to represent video. To this end, in November 2013 the Joint Collaborative Team on Video Coding (JCT-VC), set up by the Video Coding Experts Group of the International Telecommunication Union (ITU-T VCEG) and the Moving Picture Experts Group of ISO/IEC (ISO/IEC MPEG), formally released the new-generation video coding standard H.265/HEVC.
Compared with earlier video coding standards, H.265/HEVC greatly improves coding performance. However, H.265/HEVC incurs a much higher computational cost than the H.264/AVC standard. Common methods for improving coding efficiency cannot meet the requirements of the H.265/HEVC video coding standard, so a method that increases the computation speed of H.265/HEVC is needed.
The H.265/HEVC reference software also needs to be optimized before it can meet the requirements of real-time applications. This can be achieved by optimizing the software code and the algorithms; in addition, other hardware or software resources can be used to further increase the computation speed.
The transform and inverse transform algorithms in the H.265/HEVC standard operate on the mutually independent TUs within a frame. They are computation-intensive algorithms that process a large number of repetitive operations, and the dependencies between data streams are highly regular, so they offer a large amount of parallelism. Performing the inverse transform quickly and efficiently is therefore very important for reducing its computational complexity.
Summary of the invention
In view of the above problems, the object of the present invention is to propose a parallel structure construction method for the HEVC inverse transform operation. The invention can greatly reduce the complexity of video coding computation without reducing coding efficiency.
To achieve the above object, the technical solution adopted by the invention is as follows: a parallel structure construction method for the HEVC inverse transform operation, characterized in that the inverse transform operation on a 4 × 4 TU is processed in parallel on a 2 × 4 two-dimensional processing element array (PE00-PE13) with adjacent interconnection.
The 4 × 4 dequantized data (X00 and X02, X20 and X22, X10 and X12, X30 and X32, X01 and X03, X21 and X23, X11 and X13, X31 and X33) are loaded from memory into the data registers of the processing element array (PE00-PE13), and the vertical and horizontal inverse transforms based on the even-odd decomposition operation are completed in sequence in a parallel manner;
The vertical inverse transform based on even-odd decomposition is computed as follows: the processing element array (PE00-PE13) loads the dequantized data; once the data have been loaded, the even-odd decomposition operation is performed, in which the input data (X00 and X02, X20 and X22, X10 and X12, X30 and X32, X01 and X03, X21 and X23, X11 and X13, X31 and X33) are shifted and then added and subtracted step by step to obtain the intermediate variables of the vertical inverse transform (M00 and M01, M20 and M21, M10 and M11, M30 and M31, M02 and M03, M22 and M23, M12 and M13, M32 and M33);
Between vertically adjacent processing elements (PE00 and PE10, PE01 and PE11, PE02 and PE12, PE03 and PE13), the intermediate variables (M00-M33) are added and subtracted to obtain the vertical inverse transform results (Z00 and Z01, Z20 and Z21, Z10 and Z11, Z30 and Z31, Z02 and Z03, Z22 and Z23, Z12 and Z13, Z32 and Z33);
The horizontal inverse transform based on even-odd decomposition is computed as follows: between horizontally adjacent processing elements (PE00 and PE01, PE02 and PE03, PE10 and PE11, PE12 and PE13), the vertical inverse transform results (Z00-Z33) are shifted, added, and subtracted to obtain the intermediate variables of the horizontal inverse transform (N00 and N01, N20 and N21, N10 and N11, N30 and N31, N02 and N03, N22 and N23, N12 and N13, N32 and N33);
Between processing elements spaced one apart in the horizontal direction (PE00 and PE02, PE01 and PE03, PE10 and PE12, PE11 and PE13), the intermediate variables (N00-N33) are added and subtracted to obtain the final inverse transform results (Y00-Y33).
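For reference, the even-odd (parity) decomposition named above can be written out for a single 4-point line. The sketch below uses the standard HEVC 4 × 4 core transform matrix with coefficients 64, 83, and 36. The symbols C, x_k, e_i, o_i, and z_i are illustrative assumptions and are not the patent's formulas (1)-(6), which are not reproduced in this text; the intermediate rounding shifts are omitted.

```latex
\begin{align*}
C &= \begin{pmatrix}
64 & 64 & 64 & 64 \\
83 & 36 & -36 & -83 \\
64 & -64 & -64 & 64 \\
36 & -83 & 83 & -36
\end{pmatrix},
\qquad z = C^{\mathsf{T}} x, \\
e_0 &= 64\,x_0 + 64\,x_2, & e_1 &= 64\,x_0 - 64\,x_2, \\
o_0 &= 83\,x_1 + 36\,x_3, & o_1 &= 36\,x_1 - 83\,x_3, \\
z_0 &= e_0 + o_0, & z_3 &= e_0 - o_0, \\
z_1 &= e_1 + o_1, & z_2 &= e_1 - o_1.
\end{align*}
```

The even part depends only on the even-indexed inputs and the odd part only on the odd-indexed inputs, which is why the two parts can be computed independently before a final add/subtract stage combines them.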
The parallel structure construction method proposed by the invention effectively shortens the processing time of the inverse transform algorithm and speeds up encoding and decoding, while allowing the inverse transform to be performed quickly and efficiently with reduced computational complexity.
Brief description of the drawings
Fig. 1 shows the 2 × 4 two-dimensional processing element array with adjacent interconnection.
Fig. 2 shows how the 4 × 4 dequantized coefficients are loaded.
Fig. 3 is a flow chart of the intermediate variables of the vertical inverse transform.
Fig. 4 shows how the intermediate variables of the vertical inverse transform are shared.
Fig. 5 is a flow chart of the intermediate variables of the horizontal inverse transform.
Fig. 6 shows how the intermediate variables of the horizontal inverse transform are shared.
Detailed description
The 2 × 4 two-dimensional processing element array consists of PE00-PE13; these 8 processing elements form a two-dimensional array through nearest-neighbor interconnection.
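As a rough illustration of the array topology, the following Python sketch models the 2 × 4 grid as a plain 4-neighbor mesh without wraparound. The mesh shape, the helper names `pe_name`, `neighbors`, and `hops`, and the reading of PE names as row digit followed by column digit are assumptions made for illustration; the text only states that the elements are interconnected with their nearest neighbors.

```python
ROWS, COLS = 2, 4  # PE00..PE03 on row 0, PE10..PE13 on row 1 (assumed naming)


def pe_name(r, c):
    """Name of the processing element at row r, column c."""
    return f"PE{r}{c}"


def neighbors(r, c):
    """Names of the PEs directly connected to PE(r, c) in a 4-neighbor mesh."""
    out = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            out.append(pe_name(nr, nc))
    return out


def hops(a, b):
    """Manhattan distance between two PEs given as (row, col) tuples."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


if __name__ == "__main__":
    print(neighbors(0, 0))         # direct neighbors of PE00
    print(hops((0, 0), (0, 2)))    # PE00 to PE02, as paired in the final stage
```

Under this model, PE00 and PE02 are two hops apart, which is one reason data placement matters when registers are shared between elements.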
The transform unit (TU) is a basic concept in H.265/HEVC. In general, the TU size can range from 4 × 4 to 64 × 64. During coding, the TU is the basic unit of transform and quantization and of inverse transform and inverse quantization.
The parallel structure construction method for the HEVC inverse transform operation provided by the invention comprises the following implementation steps:
First step: Because the video array processor transfers data through adjacent interconnection, the transmission distance of data shared between PEs is taken into account when the 4 × 4 dequantized coefficients X are loaded into the data registers of PE00 to PE13: X00 and X02 are stored in PE00, X20 and X22 in PE01, X10 and X12 in PE02, X30 and X32 in PE03, X01 and X03 in PE10, X21 and X23 in PE11, X11 and X13 in PE12, and X31 and X33 in PE13.
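The placement can be captured as a small lookup table. In the Python sketch below, the table assumes that PE11 holds X21 and X23 and PE12 holds X11 and X13, which is the pairing implied by the third step (where PE01 and PE11 share M21 and M23) and by the order in which the summary lists the data. Reading `Xab` as `block[a][b]` and the names `PLACEMENT` and `load` are also assumptions for illustration.

```python
# Which two dequantized coefficients each PE holds (assumed reading of the text).
PLACEMENT = {
    "PE00": ("X00", "X02"), "PE01": ("X20", "X22"),
    "PE02": ("X10", "X12"), "PE03": ("X30", "X32"),
    "PE10": ("X01", "X03"), "PE11": ("X21", "X23"),
    "PE12": ("X11", "X13"), "PE13": ("X31", "X33"),
}


def load(block):
    """Distribute a 4x4 block (list of 4 lists) into per-PE register pairs.

    The name "Xab" is read as block[a][b]; this index order is an assumption.
    """
    regs = {}
    for pe, names in PLACEMENT.items():
        regs[pe] = tuple(block[int(n[1])][int(n[2])] for n in names)
    return regs


if __name__ == "__main__":
    block = [[r * 4 + c for c in range(4)] for r in range(4)]
    for pe, pair in load(block).items():
        print(pe, pair)
```

With this layout every PE on row 0 holds an even-indexed pair and every PE on row 1 holds an odd-indexed pair, so the two operands of each partial computation sit in the same element.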
Second step: According to formulas (1) and (2), configure the instructions of PE00, PE01, PE02, PE03 and of PE10, PE11, PE12, PE13 respectively, and compute the intermediate variables M0, M1, M2, and M3 of the vertical inverse transform. PE00 obtains M00 and M01 by applying shift, addition, and subtraction operations to X00 and X02; PE01 obtains M20 and M21 from X20 and X22 in the same way; PE02 obtains M10 and M11 from X10 and X12; and PE03 obtains M30 and M31 from X30 and X32.
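The "shift, addition, and subtraction" operations can be illustrated by replacing the multiplications by the HEVC 4-point coefficients with shifts and adds. The Python sketch below is an assumption about what formulas (1) and (2) express, since the formulas themselves are not reproduced in this text; the function names and the particular binary decompositions (83 = 64 + 16 + 2 + 1, 36 = 32 + 4) are illustrative.

```python
def mul64(x):
    """x * 64 as a single left shift."""
    return x << 6


def mul83(x):
    """x * 83 as 64x + 16x + 2x + x."""
    return (x << 6) + (x << 4) + (x << 1) + x


def mul36(x):
    """x * 36 as 32x + 4x."""
    return (x << 5) + (x << 2)


def even_part(x0, x2):
    """Even half of the decomposition: uses only the even-indexed inputs."""
    return mul64(x0) + mul64(x2), mul64(x0) - mul64(x2)


def odd_part(x1, x3):
    """Odd half of the decomposition: uses only the odd-indexed inputs."""
    return mul83(x1) + mul36(x3), mul36(x1) - mul83(x3)


if __name__ == "__main__":
    print(even_part(10, -3))
    print(odd_part(7, 2))
```

Python's `<<` on negative integers behaves like multiplication by a power of two, so the helpers hold for signed coefficients as well.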
Third step: According to formula (3), vertically adjacent PEs share register data to complete the one-dimensional transform of the inverse transform. PE00 and PE10 apply subtraction and addition to the shared data M01 and M03 to obtain Z00, Z30 and Z10, Z20 respectively; PE01 and PE11 apply subtraction and addition to the shared data M21 and M23 to obtain Z02, Z32 and Z12, Z22 respectively; PE02 and PE12 apply subtraction and addition to the shared data M11 and M13 to obtain Z01, Z31 and Z11, Z21 respectively; PE03 and PE13 apply subtraction and addition to the shared data M31 and M33 to obtain Z03, Z13 and Z23, Z33 respectively.
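The add/subtract stage that combines the even and odd parts can be checked against a direct matrix product. The Python sketch below does this for one 4-point line using the standard HEVC 4 × 4 core matrix; rounding shifts are left out, and the function names are hypothetical. It illustrates the general even-odd butterfly, not the exact register-sharing scheme between PEs.

```python
# Standard HEVC 4x4 core transform matrix (rows are basis vectors).
C = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def inverse4_even_odd(x):
    """4-point inverse transform via even-odd decomposition (no rounding)."""
    e0 = 64 * (x[0] + x[2])
    e1 = 64 * (x[0] - x[2])
    o0 = 83 * x[1] + 36 * x[3]
    o1 = 36 * x[1] - 83 * x[3]
    # Final stage: only additions and subtractions of the partial results.
    return [e0 + o0, e1 + o1, e1 - o1, e0 - o0]


def inverse4_direct(x):
    """Reference: z = C^T x computed as a plain matrix-vector product."""
    return [sum(C[k][i] * x[k] for k in range(4)) for i in range(4)]


if __name__ == "__main__":
    x = [12, -5, 3, 7]
    print(inverse4_even_odd(x))
    print(inverse4_direct(x))
```

The decomposed form needs 6 multiplications (or their shift-add equivalents) and 8 additions per line, compared with 16 multiplications for the direct product.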
Fifth step: According to formulas (4) and (5), configure the instructions of PE00, PE01, PE10, PE11 and of PE02, PE03, PE12, PE13 respectively, and compute the intermediate variables N0, N1, N2, and N3 of the horizontal inverse transform. PE00 and PE01 apply shift, addition, and subtraction operations to the shared data Z02 and Z30 to obtain N00, N01 and N30, N31 respectively; PE02 and PE03 apply shift, addition, and subtraction operations to the shared data Z03 and Z31 to obtain N02, N03 and N32, N33 respectively; PE10 and PE11 apply shift, addition, and subtraction operations to the shared data Z12 and Z20 to obtain N10, N11 and N20, N21 respectively; PE12 and PE13 apply shift, addition, and subtraction operations to the shared data Z13 and Z21 to obtain N12, N13 and N22, N23 respectively.
Sixth step: According to formula (6), PEs in the horizontal direction share register data to obtain the final result of the inverse transform. PE00 and PE02 apply subtraction and addition to the shared data N03 and N01 to obtain Y00, Y03 and Y01, Y02 respectively; PE01 and PE03 apply subtraction and addition to the shared data N33 and N31 to obtain Y30, Y33 and Y31, Y32 respectively; PE10 and PE12 apply subtraction and addition to the shared data N13 and N11 to obtain Y10, Y13 and Y11, Y12 respectively; PE11 and PE13 apply subtraction and addition to the shared data N23 and N21 to obtain Y20, Y23 and Y21, Y22 respectively.
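Putting the vertical and horizontal passes together gives a sequential reference against which a parallel mapping could be checked. The Python sketch below is a minimal model under stated assumptions: it uses the rounding shifts commonly used for HEVC (7 after the first pass, 20 minus the bit depth after the second), omits the 16-bit clipping between passes, and treats the block as `X[row][col]`. The names `inverse4` and `inverse4x4` are hypothetical, and the code does not model the PE-level data sharing described in the steps.

```python
def inverse4(x, shift):
    """4-point even-odd inverse transform of one line with a rounding shift."""
    add = 1 << (shift - 1)
    e0, e1 = 64 * (x[0] + x[2]), 64 * (x[0] - x[2])
    o0, o1 = 83 * x[1] + 36 * x[3], 36 * x[1] - 83 * x[3]
    return [(v + add) >> shift for v in (e0 + o0, e1 + o1, e1 - o1, e0 - o0)]


def inverse4x4(X, bit_depth=8):
    """Two-pass 4x4 inverse transform: columns first, then rows."""
    # Vertical pass: transform each column of the coefficient block.
    cols = [inverse4([X[r][c] for r in range(4)], 7) for c in range(4)]
    Z = [[cols[c][r] for c in range(4)] for r in range(4)]
    # Horizontal pass: transform each row of the intermediate block.
    return [inverse4(Z[r], 20 - bit_depth) for r in range(4)]


if __name__ == "__main__":
    dc_only = [[64, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    for row in inverse4x4(dc_only):
        print(row)
```

A block with only a DC coefficient of 64 yields a flat residual block, which is a quick sanity check for any reimplementation.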
Finally, it should be noted that the above embodiment is merely an example given to illustrate the invention clearly and does not limit the embodiments. For a person of ordinary skill in the art, any simple modification, equivalent change, or adaptation made to the above embodiment in accordance with the technical essence of the invention still falls within the scope of protection of the technical solution of the invention.

Claims (4)

1. A parallel structure construction method for high efficiency video coding inverse transform operation, characterized in that: on a 2 × 4 two-dimensional processing element array (PE00-PE13) with adjacent interconnection, the inverse transform operation on a 4 × 4 TU is processed in parallel.
2. The parallel structure construction method for high efficiency video coding inverse transform operation according to claim 1, characterized in that: the 4 × 4 dequantized data (X00 and X02, X20 and X22, X10 and X12, X30 and X32, X01 and X03, X21 and X23, X11 and X13, X31 and X33) are loaded from memory into the data registers of the processing element array (PE00-PE13), and the vertical and horizontal inverse transforms based on the even-odd decomposition operation are completed in sequence in a parallel manner.
3. The parallel structure construction method for high efficiency video coding inverse transform operation according to claim 2, wherein the vertical inverse transform based on the even-odd decomposition operation is characterized in that: the processing element array (PE00-PE13) loads the dequantized data; once the data have been loaded, the even-odd decomposition operation is performed, in which the input data (X00 and X02, X20 and X22, X10 and X12, X30 and X32, X01 and X03, X21 and X23, X11 and X13, X31 and X33) are shifted and then added and subtracted step by step to obtain the intermediate variables of the vertical inverse transform (M00 and M01, M20 and M21, M10 and M11, M30 and M31, M02 and M03, M22 and M23, M12 and M13, M32 and M33); between vertically adjacent processing elements (PE00 and PE10, PE01 and PE11, PE02 and PE12, PE03 and PE13), the intermediate variables (M00-M33) are added and subtracted to obtain the vertical inverse transform results (Z00 and Z01, Z20 and Z21, Z10 and Z11, Z30 and Z31, Z02 and Z03, Z22 and Z23, Z12 and Z13, Z32 and Z33).
4. The parallel structure construction method for high efficiency video coding inverse transform operation according to claim 2 or 3, wherein the horizontal inverse transform based on the even-odd decomposition operation is characterized in that: between horizontally adjacent processing elements (PE00 and PE01, PE02 and PE03, PE10 and PE11, PE12 and PE13), the vertical inverse transform results (Z00-Z33) are shifted, added, and subtracted to obtain the intermediate variables of the horizontal inverse transform (N00 and N01, N20 and N21, N10 and N11, N30 and N31, N02 and N03, N22 and N23, N12 and N13, N32 and N33); between processing elements spaced one apart in the horizontal direction (PE00 and PE02, PE01 and PE03, PE10 and PE12, PE11 and PE13), the intermediate variables (N00-N33) are added and subtracted to obtain the final inverse transform results (Y00-Y33).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510795422.7A CN105847840B (en) 2015-11-18 2015-11-18 Parallel structure construction method for high efficiency video coding inverse transform operation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510795422.7A CN105847840B (en) 2015-11-18 2015-11-18 Parallel structure construction method for high efficiency video coding inverse transform operation

Publications (2)

Publication Number Publication Date
CN105847840A (en) 2016-08-10
CN105847840B (en) 2018-12-07

Family

ID=56580390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510795422.7A Active CN105847840B (en) 2015-11-18 2015-11-18 Parallel structure construction method for high efficiency video coding inverse transform operation

Country Status (1)

Country Link
CN (1) CN105847840B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102438149A (en) * 2011-10-10 2012-05-02 上海交通大学 Realization method of AVS (Audio Video Standard) inverse transformation based on reconfiguration technology
CN103888782A (en) * 2014-03-04 2014-06-25 上海交通大学 Parallel task partitioning method for HEVC decoder
US20140233652A1 (en) * 2007-02-06 2014-08-21 Microsoft Corporation Scalable multi-thread video decoding
CN104320668A (en) * 2014-10-31 2015-01-28 上海交通大学 SIMD optimization method for DCT and IDCT of HEVC/H.265
CN104683817A (en) * 2015-02-11 2015-06-03 广州柯维新数码科技有限公司 AVS-based methods for parallel transformation and inverse transformation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU ZHEN-TAO,LI TAO,HAN JUN-GANG: "A Novel Reconfigurable Data-Flow Architecture for Real Time Video Processing", 《JOURNAL OF SHANGHAI JIAOTONG UNIVERSITY(SCIENCE)》 *

Also Published As

Publication number Publication date
CN105847840B (en) 2018-12-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220819

Address after: Room 031, Room F901, 9th Floor, Block 4-C, Xixian Financial Port, Fengdong New Town, Xixian New District, Xi'an City, Shaanxi Province 710000

Patentee after: Xi'an Yuntie Intelligent Technology Co.,Ltd.

Address before: 710121 West Chang'an Street, Chang'an District, Xi'an City, Shaanxi Province

Patentee before: XI'AN University OF POSTS & TELECOMMUNICATIONS