CN107730436A - A wavelet transform optimization method based on GPU-accelerated lifting
- Publication number: CN107730436A
- Application number: CN201711057966.9A
- Authority: CN (China)
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/148—Wavelet transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
Abstract
The invention provides a wavelet transform optimization method based on GPU-accelerated lifting, comprising: a first step of copying the image data to be wavelet-transformed along the host-to-device path into main memory 1 on the device; a second step of applying symmetric boundary extension to the data in main memory 1, loading the data into shared memory, and applying prediction and update processing; a third step of again applying symmetric boundary extension to the result of the second step, loading it into shared memory, and applying prediction and update processing; a fourth step of applying symmetric boundary extension to the result of the third step, loading it into shared memory, and scaling the coefficients; and a fifth step of returning the data to host memory along the device-to-host path. The benefit of the invention is that, while the integrity of the image signal is preserved, the core computation is accelerated by a factor of several tens, saving time and reducing energy consumption.
Description
(1) Technical field
The invention relates to the field of parallel computing methods for image processing, and in particular to a wavelet transform optimization method based on GPU-accelerated lifting.
(2) Background
Image compression, as an effective means of data compression, has long been an active research area. Thanks to its good localization and time-frequency properties, the discrete wavelet transform (DWT) has become a core transform algorithm in image compression; it holds an important position in the field and has significant theoretical and practical value. However, the wavelet transform is based on convolution: its computational complexity is high, it occupies a large amount of memory, it accesses memory frequently, and adjacent steps of the transform depend on one another. As a result, when the algorithm is applied to high-definition, large-size images, processing efficiency drops and the achieved speedup is unsatisfactory. A currently active research direction is to accelerate the wavelet transform on hardware platforms in order to obtain a better speedup.
(3) Summary of the invention
To overcome the shortcomings of existing parallel computing methods for image processing, the invention provides a wavelet transform optimization method based on GPU-accelerated lifting.
The technical solution of the invention is realized as follows:
A wavelet transform optimization method based on GPU-accelerated lifting, characterized in that it comprises:
Step 1: copy the image data to be wavelet-transformed along the host-to-device path into main memory 1 on the device;
Step 2: apply symmetric boundary extension to the data placed in main memory 1 in step 1, load the data into shared memory, and then process the data as follows:
1) Prediction 1:
$c_1(2n+1) = x(2n+1) + p'[x(2n) + x(2n+2)]$
2) Update 1:
$d_1(2n) = x(2n) + u'[c_1(2n-1) + c_1(2n+1)]$
The computed data are written back to main memory 1;
Step 3: apply symmetric boundary extension again to the data obtained at the end of step 2, load the data into shared memory, and then process the data as follows:
1) Prediction 2:
$c_2(2n+1) = c_1(2n+1) + p''[d_1(2n) + d_1(2n+2)]$
2) Update 2:
$d_2(2n) = d_1(2n) + u''[c_1(2n-1) + c_1(2n+1)]$
The computed data are written to main memory 2;
Step 4: apply symmetric boundary extension to the data obtained at the end of step 3, load the data into shared memory, and then scale the coefficients:
1) Coefficient scaling 1:
$c_3(2n+1) = (-K)\,c_2(2n+1)$
2) Coefficient scaling 2:
$d_3(2n) = (1/K)\,d_2(2n)$
The processed data are written to main memory 2;
Step 5: transpose the data matrix obtained at the end of step 4, then check whether the row and column transforms are both complete. If they are not, the data return to step 2 for further processing. Once the row-column transform is complete, check whether the 5-level transform is complete. If it is not, the data return to step 2 for further processing; if it is, the data are returned to host memory along the device-to-host path.
Here the original signal is $x(n)$; the prediction operators are $p' = h_4/h_3$ and $p'' = r_1/s_0$; the update operators are $u' = h_3/r_3$ and $u'' = s_0/t_0$; and $K = 1/t_0$.
Assuming the low-pass analysis filter coefficients of the wavelet transform are $h_0, h_1, h_2, h_3, h_4$, we have $r_1 = h_2 - h_4 - h_4 h_1/h_3$, $s_0 = h_1 - h_3 - h_3 r_0/r_1$, $t_0 = r_0 - 2r_1$, and $r_0 = h_0 - 2h_4 h_1/h_3$.
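Steps 2 to 4 can be illustrated on a single row of samples. The following is a minimal CPU sketch in plain Python, not the GPU implementation described here: the function names `mirror`, `lift`, and `lifting_pass_1d` are hypothetical, whole-sample mirroring is assumed as the form of symmetric boundary extension, and the operators are passed in as plain numbers rather than derived from the filter coefficients. Update 2 is coded as the formula is printed above, reading the $c_1$ values.

```python
def mirror(i, n):
    """Assumed whole-sample symmetric extension: fold index i back into [0, n)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def lift(src, start, coef):
    """Return a copy of src where every sample at start, start+2, ... has
    coef * (left neighbour + right neighbour) added to it."""
    n = len(src)
    out = list(src)
    for i in range(start, n, 2):
        out[i] = src[i] + coef * (src[mirror(i - 1, n)] + src[mirror(i + 1, n)])
    return out


def lifting_pass_1d(x, p1, u1, p2, u2, K):
    """One row pass; p1, u1, p2, u2 stand for p', u', p'', u'' in the text."""
    s1 = lift(x, 1, p1)   # Prediction 1: odd samples become c1
    s2 = lift(s1, 0, u1)  # Update 1: even samples become d1, odd samples stay c1
    s3 = lift(s2, 1, p2)  # Prediction 2: odd samples become c2 (reads d1)
    s4 = lift(s2, 0, u2)  # Update 2 as printed: d1 plus c1 neighbours (reads s2)
    d3 = [v / K for v in s4[0::2]]   # Coefficient scaling 2 on even samples
    c3 = [-K * v for v in s3[1::2]]  # Coefficient scaling 1 on odd samples
    return d3, c3
```

For example, `lifting_pass_1d([0.0, 1.0, 2.0, 3.0], -0.5, 0.25, 0.0, 0.0, 1.0)` returns the even-sample outputs `[0.0, 2.25]` and odd-sample outputs equal to `[0.0, -1.0]`; the operator values in this call are arbitrary test numbers, not the ones derived from $h_0 \dots h_4$.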
The benefit of the invention is that, while the integrity of the image signal is preserved, the core computation is accelerated by a factor of several tens, saving time and reducing energy consumption.
(4) Brief description of the drawings
Fig. 1 is a flowchart of the invention.
Fig. 2 is a schematic diagram of the shared-memory mechanism of the wavelet transform algorithm in the invention.
Fig. 3 is a schematic diagram of the program framework of the wavelet transform algorithm in the invention.
(5) Detailed description of the embodiment
An embodiment of the invention is briefly described below with reference to the drawings.
As shown in Fig. 1 to Fig. 3, the embodiment is a wavelet transform optimization method based on GPU-accelerated lifting that carries out the five steps set out in section (3), with the same prediction, update, and coefficient-scaling formulas and the same operator definitions.
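The control flow of the fifth step (row pass, transposition, column pass, repeated for up to 5 levels) can be sketched as follows. This is an illustrative sketch under assumptions the text does not state: `transform_2d` and `transpose` are hypothetical names, `pass_1d` is any function that maps a row to a row of the same length with the low-pass half first, and each further level is assumed to operate on the top-left (low-pass) quadrant of the previous one.

```python
def transpose(m):
    """Matrix transposition, used so that the column pass can reuse the row pass."""
    return [list(col) for col in zip(*m)]


def transform_2d(image, pass_1d, levels=5):
    """Apply pass_1d to rows, then to columns, for up to `levels` levels."""
    out = [list(row) for row in image]
    h, w = len(out), len(out[0])
    for _ in range(levels):
        if h < 2 or w < 2:
            break  # nothing left to split at this level
        block = [row[:w] for row in out[:h]]
        for _ in range(2):
            # First iteration: row transform. Second iteration: the same row
            # transform on the transposed data, which is the column transform.
            block = [pass_1d(row) for row in block]
            block = transpose(block)
        for r in range(h):
            out[r][:w] = block[r]
        # Assumed Mallat-style layout: recurse on the low-pass quadrant only.
        h, w = (h + 1) // 2, (w + 1) // 2
    return out


def split(row):
    """Stand-in for a real 1-D pass: only reorders even samples before odd ones."""
    return row[0::2] + row[1::2]
```

Calling `transform_2d(img, split)` with the stand-in `split` exercises only the data movement. A real row pass would combine the prediction, update, and scaling steps and then place the even-sample outputs before the odd-sample outputs.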
First, the technical solution targets the characteristics of the wavelet transform algorithm, namely that the filter coefficients and the image data are floating-point and the computation is a convolution. After analyzing the key optimization points and the directions for improvement, it adds one prediction step and one update step, which prevents image reconstruction errors from growing and improves the stability of the system.
Experimental results show that the wavelet transform of this solution recovers the original image perfectly and is 3 to 4 times faster than the conventional wavelet transform algorithm, so the lifting scheme achieves good results.
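The recovery property of a lifting scheme can be checked with a small round-trip sketch. The code below is an illustration, not the patented implementation: `forward` and `inverse` are hypothetical names, whole-sample mirroring is assumed at the boundaries, and the constants are the commonly quoted CDF 9/7 lifting values, which this document does not list. It also assumes the conventional four-step structure in which the second update reads the output of the second prediction ($c_2$), because that is what lets each step be undone by running the steps in reverse order with the signs flipped.

```python
def _mirror(i, n):
    """Assumed whole-sample symmetric extension of index i into [0, n)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def _lift(buf, start, coef):
    """In place: add coef * (left + right neighbour) to samples start, start+2, ...
    Neighbours have the opposite parity, so they are not modified by this step."""
    n = len(buf)
    for i in range(start, n, 2):
        buf[i] += coef * (buf[_mirror(i - 1, n)] + buf[_mirror(i + 1, n)])


def forward(x, p1, u1, p2, u2, K):
    buf = list(x)
    _lift(buf, 1, p1)  # prediction 1
    _lift(buf, 0, u1)  # update 1
    _lift(buf, 1, p2)  # prediction 2
    _lift(buf, 0, u2)  # update 2, reading c2 (assumption stated above)
    return [v / K if i % 2 == 0 else -K * v for i, v in enumerate(buf)]


def inverse(y, p1, u1, p2, u2, K):
    # Undo the scaling, then undo the four lifting steps in reverse order.
    buf = [v * K if i % 2 == 0 else v / -K for i, v in enumerate(y)]
    _lift(buf, 0, -u2)
    _lift(buf, 1, -p2)
    _lift(buf, 0, -u1)
    _lift(buf, 1, -p1)
    return buf


# Assumed illustrative constants (commonly quoted CDF 9/7 lifting values).
PARAMS = (-1.586134342, -0.05298011854, 0.8829110762, 0.4435068522, 1.149604398)
```

Running `inverse(forward(x, *PARAMS), *PARAMS)` on any list of at least two floats returns the input up to floating-point rounding, for any nonzero `K` and any operator values, since every step only adds a quantity that the inverse later subtracts.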
Second, the conventional lifting scheme for the wavelet transform computes in place, which saves memory and reduces the number of multiplications, but adjacent steps of the transform depend on one another, which makes the scheme unsuitable for parallel implementation. The research found that the dependence in the lifting scheme exists only between the prediction step and the update step; within a prediction step, or within an update step, there is no mutual dependence. A parallel lifting scheme is proposed accordingly, in which the prediction step and the update step are each parallelized and run as separate stages. Experimental results show that the lifting scheme achieves a speedup of several tens of times while avoiding the parallelization difficulty caused by the interdependence.
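The observation that a prediction step has no internal dependence can be made concrete: each predicted sample reads only input samples, so the per-sample work can be done in any order, which is what allows one GPU thread per sample. The sketch below is a plain-Python illustration with hypothetical names (`predict_one`, `predict_in_order`) and an arbitrary operator value; it models the execution order with a list of indices rather than real threads.

```python
import random


def predict_one(x, i, p):
    """Work of one hypothetical thread: the predicted value at odd index i.
    It reads only the input x, never another thread's output."""
    n = len(x)
    right = i + 1 if i + 1 < n else 2 * (n - 1) - (i + 1)  # mirrored at the end
    return x[i] + p * (x[i - 1] + x[right])


def predict_in_order(x, p, order):
    """Run the prediction step, visiting the odd indices in the given order."""
    out = list(x)
    for i in order:
        out[i] = predict_one(x, i, p)
    return out


x = [float(v) for v in (7, 2, 9, 4, 4, 8, 1, 3)]
odd = list(range(1, len(x), 2))
shuffled = list(odd)
random.Random(0).shuffle(shuffled)

sequential = predict_in_order(x, -0.5, odd)
scrambled = predict_in_order(x, -0.5, shuffled)
backwards = predict_in_order(x, -0.5, odd[::-1])
```

All three results are identical. The same argument applies within an update step, whose outputs read only the finished prediction results, so a synchronization point is needed only between the two steps.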
An embodiment of the invention has been described in detail above, but it is only a preferred embodiment and must not be regarded as limiting the scope of the invention. All equivalent changes and improvements made within the scope of this patent application remain within the scope covered by the patent.
Claims (2)
- 1. A wavelet transform optimization method based on GPU-accelerated lifting, characterized in that it comprises:
  Step 1: copy the image data to be wavelet-transformed along the host-to-device path into main memory 1 on the device;
  Step 2: apply symmetric boundary extension to the data placed in main memory 1 in step 1, load the data into shared memory, and then process the data as follows:
  1) Prediction 1: $c_1(2n+1) = x(2n+1) + p'[x(2n) + x(2n+2)]$
  2) Update 1: $d_1(2n) = x(2n) + u'[c_1(2n-1) + c_1(2n+1)]$
  The computed data are written back to main memory 1;
  Step 3: apply symmetric boundary extension again to the data obtained at the end of step 2, load the data into shared memory, and then process the data as follows:
  1) Prediction 2: $c_2(2n+1) = c_1(2n+1) + p''[d_1(2n) + d_1(2n+2)]$
  2) Update 2: $d_2(2n) = d_1(2n) + u''[c_1(2n-1) + c_1(2n+1)]$
  The computed data are written to main memory 2;
  Step 4: apply symmetric boundary extension to the data obtained at the end of step 3, load the data into shared memory, and then scale the coefficients:
  1) Coefficient scaling 1: $c_3(2n+1) = (-K)\,c_2(2n+1)$
  2) Coefficient scaling 2: $d_3(2n) = (1/K)\,d_2(2n)$
  The processed data are written to main memory 2;
  Step 5: transpose the data matrix obtained at the end of step 4, then check whether the row and column transforms are both complete. If they are not, the data return to step 2 for further processing. Once the row-column transform is complete, check whether the 5-level transform is complete. If it is not, the data return to step 2 for further processing; if it is, the data are returned to host memory along the device-to-host path.
- 2. The wavelet transform optimization method based on GPU-accelerated lifting according to claim 1, characterized in that: the original signal is $x(n)$; the prediction operators are $p' = h_4/h_3$ and $p'' = r_1/s_0$; the update operators are $u' = h_3/r_3$ and $u'' = s_0/t_0$; and $K = 1/t_0$; assuming the low-pass analysis filter coefficients of the wavelet transform are $h_0, h_1, h_2, h_3, h_4$, we have $r_1 = h_2 - h_4 - h_4 h_1/h_3$, $s_0 = h_1 - h_3 - h_3 r_0/r_1$, $t_0 = r_0 - 2r_1$, and $r_0 = h_0 - 2h_4 h_1/h_3$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711057966.9A CN107730436A (en) | 2017-11-01 | 2017-11-01 | A wavelet transform optimization method based on GPU-accelerated lifting |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711057966.9A CN107730436A (en) | 2017-11-01 | 2017-11-01 | A wavelet transform optimization method based on GPU-accelerated lifting |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107730436A (en) | 2018-02-23 |
Family
ID=61221342
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711057966.9A Pending CN107730436A (en) | A wavelet transform optimization method based on GPU-accelerated lifting | 2017-11-01 | 2017-11-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107730436A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU6863998A (en) * | 1997-03-11 | 1998-09-29 | Computer Information And Sciences, Inc. | System and method for image compression and decompression |
US20030046322A1 (en) * | 2001-06-01 | 2003-03-06 | David Guevorkian | Flowgraph representation of discrete wavelet transforms and wavelet packets for their efficient parallel implementation |
EP1282075A2 (en) * | 2001-07-31 | 2003-02-05 | Ricoh Company, Ltd. | Enhancement of compressed images |
CN101059866A (en) * | 2007-05-23 | 2007-10-24 | 华中科技大学 | VLSI structure for promoting in parallel 9/7 wavelet base |
CN101404772A (en) * | 2008-11-19 | 2009-04-08 | 中国科学院光电技术研究所 | VLSI image compression encoder based on wavelet transformation |
CN101867809A (en) * | 2010-04-09 | 2010-10-20 | 中国科学院光电技术研究所 | High-speed image compression VLSI coding method based on systolic array, and encoder |
CN103198451A (en) * | 2013-01-31 | 2013-07-10 | 西安电子科技大学 | Method utilizing graphic processing unit (GPU) for achieving rapid wavelet transformation through segmentation |
Non-Patent Citations (3)
Title |
---|
ENRICO MAGLI 等: "Integer Wavelet Packets and their application to a lossy compression system for SAR images", 《PROCEEDINGS 10TH INTERNATIONAL CONFERENCE ON IMAGE ANALYSIS AND PROCESSING》 * |
成礼智: "《离散与小波变模新型算法及其在图像处理中应用的研究》", 31 July 2007, 长沙:国防科技大学出版社 * |
李玉峰 等: "基于GPGPU的JPEG2000图像压缩方法", 《电子器件》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180223 |