CN101534439A - Low power consumption parallel wavelet transforming VLSI structure - Google Patents

Publication number
CN101534439A
Authority
CN
China
Prior art keywords
vlsi structure
vlsi
wavelet transformation
data stream
data flow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810101834A
Other languages
Chinese (zh)
Inventor
刘鸿瑾
王东辉
张铁军
侯朝焕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS filed Critical Institute of Acoustics CAS
Priority to CN200810101834A priority Critical patent/CN101534439A/en
Publication of CN101534439A publication Critical patent/CN101534439A/en
Pending legal-status Critical Current

Abstract

The invention relates to a low-power parallel wavelet transform VLSI structure. In a VLSI structure that directly implements the discrete wavelet transform, two parallel delay cells and a data stream selector, or two parallel shift registers and a data stream selector, are connected in series before each adder in the data stream. An embedded boundary extension circuit is connected in parallel with each group formed by the two parallel delay cells/shift registers and the data stream selector. The invention can process two rows of data simultaneously, time-multiplexing the main arithmetic units so that hardware utilization reaches 100%, and it reduces the required on-chip cache and the number of accesses to external memory, thereby lowering the power consumption of the whole design. Because the hardware structure is simple, the VLSI structure is easy to implement.

Description

A low-power parallel wavelet transform VLSI structure
Technical field
The present invention relates to the field of VLSI design, specifically to hardware implementation structures for the wavelet transform used in video and image coding standards, and in particular to a low-power parallel wavelet transform VLSI structure.
Background art
In recent years, with the rapid development of computer and digital communication technology, and especially the rise of network and multimedia technology, image coding and compression have received increasing attention. Given the limits on communication bandwidth and storage capacity, coding and compressing images is extremely important. The wavelet transform has good time-frequency characteristics, overcomes the blocking artifacts that traditional DCT coding produces at low bit rates, and can flexibly support multiple functions. It has therefore been widely used in still and moving image compression, has already become the core transform of the new-generation still image compression standard JPEG2000, and shows a trend toward replacing the DCT in future video compression standards. However, its computational load is large and it is difficult to meet real-time processing requirements, so hardware implementation structures for the wavelet transform have become a research focus both in China and abroad.
Early DWT implementations most widely used the Mallat algorithm, which uses filter banks to reduce computational complexity. More recently, many structures based on the lifting algorithm have been proposed: the whole wavelet filtering process is decomposed into several lifting steps, which halves the computational complexity compared with the traditional filter-bank approach. According to performance evaluations, accesses to external memory in the two-dimensional discrete wavelet transform (2-D DWT) account for nearly 80% of the power consumption of the whole design, so reducing external memory access has become a key issue in 2-D DWT hardware implementation. VLSI structures based on line buffers (row caches) effectively reduce external memory access by adding several line buffers, and thereby reduce power consumption. However, the additional line buffers increase both the chip area and the control complexity.
Summary of the invention
The object of the invention is to propose a low-power parallel wavelet transform VLSI structure based on the lifting algorithm. By adding shift registers/delay cells and data stream selectors, the main computing units in the structure are time-division multiplexed so that two rows of data can be processed simultaneously; the arithmetic units are always busy and hardware utilization rises to 100%. By adding an embedded symmetric boundary extension circuit, the number of operations and the required on-chip cache are reduced, and accesses to external memory are reduced as well, which lowers the power consumption of the whole design and improves area efficiency.
To achieve the above object, the invention proposes a low-power parallel wavelet transform VLSI structure based on the lifting algorithm. In a VLSI structure that directly implements the wavelet transform, two parallel delay cells/shift registers and one data stream selector are connected in series before each adder in the data stream. The effect is that the main arithmetic units are time-multiplexed, hardware utilization reaches 100%, and performance is greatly improved.
As an improvement of the low-power parallel wavelet transform VLSI structure, an embedded boundary extension circuit is connected in parallel with each group of two parallel delay cells/shift registers and one data stream selector. The embedded boundary extension circuit consists of a boundary data extension selector and an adder connected in series. The effect is that the number of operations and the amount of on-chip cache are reduced, and accesses to external memory are reduced, which lowers the power consumption of the whole chip.
The invention also proposes a low-power parallel one-dimensional wavelet transform VLSI structure based on the lifting algorithm. In a VLSI structure that directly implements the one-dimensional wavelet transform, two parallel delay cells/shift registers and one data stream selector are connected in series before each adder in the data stream.
As an improvement of the low-power parallel one-dimensional wavelet transform VLSI structure, an embedded boundary extension circuit is connected in parallel with each group of two parallel delay cells/shift registers and one data stream selector. The embedded boundary extension circuit consists of a boundary data extension selector and an adder connected in series.
The invention further proposes a low-power parallel two-dimensional discrete wavelet transform VLSI structure based on the lifting algorithm, comprising:
(1) a row processor, formed by connecting, in a VLSI structure that directly implements the one-dimensional discrete wavelet transform, two parallel delay cells/shift registers and a data stream selector in series before each adder in the data stream;
(2) a column processor, formed by connecting, in a VLSI structure that directly implements the one-dimensional discrete wavelet transform, two parallel shift registers/delay cells and a data stream selector, corresponding to those of the row processor, in series before each adder in the data stream.
The effect is that the main arithmetic units are time-multiplexed, hardware utilization reaches 100%, and performance is greatly improved. The row and column processors work in parallel: the output of the row processor is sent directly to the input of the column processor without passing through an intermediate cache. Once the first group of data has entered the row processor and the lifting steps of the row processor output wavelet coefficients, the column processor starts working. The row and column processors therefore start within a few clock cycles of each other and keep working in parallel for the rest of the processing.
As an improvement of the above low-power parallel two-dimensional discrete wavelet transform VLSI structure, an embedded boundary extension circuit is connected in parallel with each group of two parallel delay cells/shift registers and one data stream selector. The embedded boundary extension circuit consists of a boundary data extension selector and an adder connected in series.
The effect is that the amount of on-chip cache is reduced and accesses to external memory are reduced, which lowers the power consumption of the whole chip.
The advantages of the invention are:
1. By time-division multiplexing the main computing units of the lifting structure, two rows of data can be processed simultaneously, so the arithmetic units are always busy and hardware utilization rises to 100%.
2. The embedded symmetric boundary extension circuit reduces the number of operations and the required on-chip cache, and also reduces accesses to external memory, effectively lowering the power consumption of the whole design.
3. The structure scans in two rows of data simultaneously, and the row and column processors transform in parallel, which improves data throughput.
Description of the drawings
Fig. 1 is a prior-art VLSI structure of the (9/7) wavelet transform based on the lifting algorithm.
Fig. 2 is the VLSI structure of the one-dimensional discrete (9/7) wavelet transform based on the lifting algorithm according to the invention.
Fig. 3 is a schematic diagram of the symmetric boundary extension of an even-length sequence in the wavelet transform of the invention.
Fig. 4 is a schematic diagram of the embedded symmetric boundary extension algorithm for the (9/7) wavelet according to the invention.
Fig. 5 is the VLSI structure of the two-dimensional discrete (9/7) wavelet transform according to the invention.
Embodiment
Specific embodiments of the invention are described below, taking the implementation of the (9/7) wavelet as an example.
As shown in Fig. 1, when the lifting algorithm is implemented directly, the incoming signal is first split into even and odd samples, and each pair (one even-indexed and one odd-indexed sample) then enters the subsequent lifting steps every other cycle. As a result, the whole data path is idle for nearly half of the time, and processing speed and efficiency are low.
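The even/odd split followed by lifting steps can be sketched in software as follows. This is a minimal behavioural sketch, not the circuit of Fig. 1: the lifting constants are the values commonly quoted for the (9/7) wavelet in JPEG2000 implementations and are not given in the patent text, the final scaling (low band multiplied by K, high band divided by K) is only one of several normalizations in use, and the function names `lift` and `dwt97_1d` are hypothetical.

```python
# Commonly quoted lifting constants for the (9/7) wavelet.
# Assumption: these values do not appear in the patent text.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling constant; other normalizations exist


def lift(x, coeff, parity):
    """One lifting step: update every sample of the given parity in place
    from its two neighbours, mirroring the neighbour at the row ends."""
    n = len(x)
    for i in range(parity, n, 2):
        left = x[i - 1] if i > 0 else x[i + 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += coeff * (left + right)


def dwt97_1d(signal):
    """One level of the 1-D (9/7) transform of an even-length signal.
    Returns (low, high) coefficient lists of half the input length."""
    x = [float(v) for v in signal]
    if len(x) < 2 or len(x) % 2:
        raise ValueError("signal length must be even and at least 2")
    lift(x, ALPHA, 1)  # predict: odd samples from even neighbours
    lift(x, BETA, 0)   # update: even samples from odd neighbours
    lift(x, GAMMA, 1)  # second predict
    lift(x, DELTA, 0)  # second update
    low = [v * K for v in x[0::2]]
    high = [v / K for v in x[1::2]]
    return low, high


low, high = dwt97_1d([5.0] * 8)
```

Each of the four calls to `lift` adds two neighbours and multiplies by one constant, which is the adder-plus-multiplier pattern that the adders in the data path implement; a constant input row produces high-band coefficients that are zero up to rounding.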
As shown in Fig. 2, in the low-power parallel one-dimensional discrete wavelet transform VLSI structure proposed by the invention, two parallel delay cells/shift registers and one data stream selector are connected in series before each adder in the data stream of the directly implemented structure. On alternating cycles, the selx signal selects one of the two data paths for computation, which time-multiplexes the processing of the odd-row and even-row data that are input simultaneously. Time-multiplexing the main arithmetic units of the lifting steps keeps the data path busy at all times and effectively improves processing speed, data throughput, and hardware utilization.
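The time-multiplexing idea can be illustrated with a small cycle-level model. This is a simplified sketch under stated assumptions, not the wiring of Fig. 2: it models a single adder that forms the sum of two neighbouring even samples, gives each of the two rows its own delay cell, and uses a select value that toggles every cycle as a stand-in for the selx signal. The function name `shared_adder_sums` is hypothetical.

```python
def shared_adder_sums(row_a, row_b):
    """Feed the even samples of two rows through one shared adder.

    Samples arrive interleaved (a0, b0, a1, b1, ...). Each row has its own
    delay cell holding that row's previous sample; `sel` picks which delay
    cell drives the adder in the current cycle.
    Returns (sums per row, busy adder cycles, total cycles).
    """
    delay = [None, None]      # two parallel delay cells, one per row
    sums = [[], []]
    busy = 0
    stream = [v for pair in zip(row_a, row_b) for v in pair]
    for cycle, sample in enumerate(stream):
        sel = cycle % 2       # hypothetical model of the selx signal
        if delay[sel] is not None:
            sums[sel].append(delay[sel] + sample)  # neighbour sum
            busy += 1
        delay[sel] = sample   # remember this sample for the row's next sum
    return sums, busy, len(stream)


sums, busy, total = shared_adder_sums([1, 2, 3, 4], [10, 20, 30, 40])
```

After the two cycles needed to fill the delay cells, the adder produces a result in every cycle, alternating between the rows. With a single row arriving every other cycle, the same adder would sit idle in the cycles that the second row occupies here.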
In actual image processing, the boundaries need special handling so that the amount of data stays the same as in the original image during signal decomposition. The symmetric extension algorithm effectively overcomes boundary effects in compression transforms, so JPEG2000 adopts symmetric extension as the boundary extension method for the wavelet transform. Fig. 3 shows the symmetric boundary extension when the signal length is even.
To reduce the amount of cache used to store the symmetrically extended boundary data, the invention proposes an embedded boundary extension algorithm, shown in Fig. 4. Connecting an embedded boundary extension circuit in parallel with each group of two parallel delay cells/shift registers and one data stream selector implements the symmetric extension of the boundary data, as shown in Fig. 2. The Ext_enx signal is the control signal of the boundary data extension selector. This embedded symmetric boundary extension circuit reduces the number of operations and the amount of on-chip cache, and reduces accesses to external memory, thereby lowering the power consumption of the whole chip.
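The difference between buffering the extended samples and extending on the fly can be sketched as follows. This is an illustrative model only: it assumes whole-sample symmetric extension (the sample next to the boundary is mirrored, the boundary sample itself is not repeated), it reduces a lifting step to the neighbour sum that feeds the multiplier, and the flag `ext_en` is a hypothetical stand-in for the Ext_enx control signal.

```python
def neighbour_sums_buffered(x):
    """Reference version: build a mirror-extended copy of the row first,
    then add the left and right neighbour of every sample."""
    ext = [x[1]] + list(x) + [x[-2]]   # extra storage for the extension
    return [ext[i] + ext[i + 2] for i in range(len(x))]


def neighbour_sums_embedded(x):
    """Embedded version: no extended copy is stored. At a boundary a
    selector routes the single inner neighbour to both adder inputs."""
    n = len(x)
    out = []
    for i in range(n):
        ext_en = i == 0 or i == n - 1  # hypothetical model of Ext_enx
        if ext_en:
            inner = x[1] if i == 0 else x[n - 2]
            out.append(inner + inner)  # mirrored sample equals inner one
        else:
            out.append(x[i - 1] + x[i + 1])
    return out


row = [3, 1, 4, 1, 5, 9]
buffered = neighbour_sums_buffered(row)
embedded = neighbour_sums_embedded(row)
```

Both functions return the same sums, but the embedded version never materializes the extended samples, which is the storage saving the text attributes to the embedded circuit.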
The low-power parallel two-dimensional discrete wavelet transform VLSI structure based on the lifting algorithm proposed by the invention comprises:
(1) a row processor, formed by connecting, in a VLSI structure that directly implements the one-dimensional discrete wavelet transform, two parallel delay cells/shift registers and a data stream selector in series before each adder in the data stream;
(2) a column processor, formed by connecting, in a VLSI structure that directly implements the one-dimensional discrete wavelet transform, two parallel shift registers/delay cells and a data stream selector, corresponding to those of the row processor, in series before each adder in the data stream.
The effect is that the main arithmetic units are time-multiplexed, hardware utilization reaches 100%, and performance is greatly improved. The row and column processors work in parallel: the output of the row processor is sent directly to the input of the column processor without passing through an intermediate cache. Once the first group of data has entered the row processor and the lifting steps of the row processor output wavelet coefficients, the column processor starts working. The row and column processors therefore start within a few clock cycles of each other and keep working in parallel for the rest of the processing.
As an improvement of the above low-power parallel two-dimensional discrete wavelet transform VLSI structure, an embedded boundary extension circuit is connected in parallel with each group of two parallel delay cells/shift registers and one data stream selector. The embedded boundary extension circuit consists of a boundary data extension selector and an adder connected in series.
The effect is that the number of operations and the amount of on-chip cache are reduced, and accesses to external memory are reduced, which lowers the power consumption of the whole chip.
When computing the 2-D DWT, the traditional algorithm finishes the row direction before computing the column direction. The delay between the row and column transforms is large and limits the speed of the whole system. Fig. 5 shows the (9/7) two-dimensional discrete wavelet transform structure proposed by the invention. The row processor is the one-dimensional discrete wavelet transform structure proposed by the invention; the column processor is derived from the row processor by replacing each delay unit with a shift register whose length is one row of data. The row and column processors work in parallel, and the output of the row processor is sent directly to the input of the column processor without an intermediate cache. Once the first group of data has entered the row processor and its lifting steps output wavelet coefficients, the column processor starts working, so the two processors start within a few clock cycles of each other and keep working in parallel afterwards. The low-frequency output of the column processor passes through a multiplexer (MUX): the LL subband is sent to a memory of size (N²/4) for the next level of the wavelet transform, while the LH subband, together with the HL and HH subbands from the high-frequency output of the column processor, goes on to the next stage of data processing (such as quantization and coding of the wavelet coefficients).
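The subband bookkeeping of one decomposition level, including the N²/4 size of the LL memory, can be checked with a short sketch. To keep it brief, the 1-D transform used here is a Haar-style lifting step, swapped in as a stand-in for the (9/7) lifting steps; the function names are hypothetical. The subband names follow the text above, where LL and LH come from the low-frequency output of the column processor and HL and HH from its high-frequency output.

```python
def haar_lift(x):
    """Stand-in 1-D transform (Haar via lifting), not the (9/7) wavelet."""
    even, odd = x[0::2], x[1::2]
    high = [o - e for e, o in zip(even, odd)]      # predict step
    low = [e + h / 2 for e, h in zip(even, high)]  # update step
    return low, high


def transform_columns(block):
    """Apply the 1-D transform down every column of a block.
    Returns (low, high) blocks with half as many rows each."""
    pairs = [haar_lift(list(col)) for col in zip(*block)]
    low = [list(r) for r in zip(*(lo for lo, _ in pairs))]
    high = [list(r) for r in zip(*(hi for _, hi in pairs))]
    return low, high


def dwt2d_level(image):
    """One level of a separable 2-D transform of an N x N image."""
    pairs = [haar_lift(row) for row in image]  # row direction
    row_low = [lo for lo, _ in pairs]
    row_high = [hi for _, hi in pairs]
    ll, hl = transform_columns(row_low)        # column direction
    lh, hh = transform_columns(row_high)
    return ll, lh, hl, hh


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ll, lh, hl, hh = dwt2d_level(image)
ll_count = sum(len(r) for r in ll)  # N*N/4 coefficients kept for next level
```

This software version runs the two directions one after the other; in the structure described above the column stage consumes row-stage outputs as they are produced, so only the arithmetic and the subband sizes carry over from this sketch, not the timing.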

Claims (5)

1. A low-power parallel wavelet transform VLSI structure, characterized in that, in a VLSI structure that directly implements the wavelet transform, two parallel delay cells and one data stream selector, or two parallel shift registers and one data stream selector, are connected in series before each adder in the data stream.
2. The wavelet transform VLSI structure according to claim 1, characterized in that the VLSI structure is a VLSI structure of the one-dimensional discrete wavelet transform, in which two parallel delay cells and one data stream selector, or two parallel shift registers and one data stream selector, are connected in series before each adder of the odd data stream and the even data stream.
3. The wavelet transform VLSI structure according to claim 1, characterized in that the VLSI structure is a VLSI structure of the two-dimensional discrete wavelet transform, comprising a row processor and a column processor, wherein:
the row processor is formed by connecting, in a VLSI structure that directly implements the one-dimensional discrete wavelet transform, two parallel delay cells/shift registers and one data stream selector in series before each adder of the odd data stream and the even data stream;
the column processor is formed by connecting, in a VLSI structure that directly implements the one-dimensional discrete wavelet transform, two parallel shift registers/delay cells and one data stream selector, corresponding to those of the row processor, in series before each adder of the odd data stream and the even data stream.
4. The wavelet transform VLSI structure according to any one of claims 1 to 3, characterized in that an embedded boundary extension circuit is connected in parallel with the structure formed by the two parallel delay cells/shift registers and the data stream selector, in order to reduce the size of the memory used to hold operand data before it enters the arithmetic unit.
5. The wavelet transform VLSI structure according to claim 4, characterized in that the embedded boundary extension circuit consists of a boundary data extension selector and an adder connected in series.
CN200810101834A 2008-03-13 2008-03-13 Low power consumption parallel wavelet transforming VLSI structure Pending CN101534439A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810101834A CN101534439A (en) 2008-03-13 2008-03-13 Low power consumption parallel wavelet transforming VLSI structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810101834A CN101534439A (en) 2008-03-13 2008-03-13 Low power consumption parallel wavelet transforming VLSI structure

Publications (1)

Publication Number Publication Date
CN101534439A true CN101534439A (en) 2009-09-16

Family

ID=41104788

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810101834A Pending CN101534439A (en) 2008-03-13 2008-03-13 Low power consumption parallel wavelet transforming VLSI structure

Country Status (1)

Country Link
CN (1) CN101534439A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030046322A1 (en) * 2001-06-01 2003-03-06 David Guevorkian Flowgraph representation of discrete wavelet transforms and wavelet packets for their efficient parallel implementation
CN1374692A (en) * 2002-04-17 2002-10-16 西安交通大学 Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure
CN1448871A (en) * 2003-04-07 2003-10-15 西安交通大学 Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
吴平,陈心浩: "应用于JPEG2000的5/3提升小波的VLSI结构设计", 《光电技术应用》 *
崔巍等: "二维提升小波变换的FPGA结构设计", 《计算机工程》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102281437A (en) * 2011-06-02 2011-12-14 东南大学 Lifting structure two-dimensional discrete wavelet transform interlaced scanning method for image compression
CN102333222A (en) * 2011-10-24 2012-01-25 哈尔滨工业大学 Two-dimensional discrete wavelet transform circuit and image compression method using same
CN102333222B (en) * 2011-10-24 2013-06-05 哈尔滨工业大学 Two-dimensional discrete wavelet transform circuit and image compression method using same
CN103067023A (en) * 2012-11-29 2013-04-24 天津大学 Efficient discrete wavelet transform (DWT) encoding method and encoder based on promotion
CN112136128A (en) * 2019-08-30 2020-12-25 深圳市大疆创新科技有限公司 Data processing method and device
WO2021035715A1 (en) * 2019-08-30 2021-03-04 深圳市大疆创新科技有限公司 Data processing method and device
CN113473136A (en) * 2020-03-30 2021-10-01 炬芯科技股份有限公司 Video encoder and code rate control device thereof
CN113473136B (en) * 2020-03-30 2024-02-09 炬芯科技股份有限公司 Video encoder and code rate control device thereof
CN111683258A (en) * 2020-06-12 2020-09-18 上海集成电路研发中心有限公司 Image data compression method and interface circuit
CN111683258B (en) * 2020-06-12 2022-04-22 上海集成电路研发中心有限公司 Image data compression method and interface circuit

Similar Documents

Publication Publication Date Title
Mohanty et al. Memory efficient modular VLSI architecture for highthroughput and low-latency implementation of multilevel lifting 2-D DWT
Lai et al. A high-performance and memory-efficient VLSI architecture with parallel scanning method for 2-D lifting-based discrete wavelet transform
CN101534439A (en) Low power consumption parallel wavelet transforming VLSI structure
CN102724499B (en) Variable-compression ratio image compression system and method based on FPGA
CN101697486A (en) Two-dimensional wavelet transformation integrated circuit structure
CN103414901A (en) Quick JPED 2000 image compression system
CN102572429A (en) Hardware framework for two-dimensional discrete wavelet transformation
CN102223534B (en) All-parallel bit plane coding method for image compression
CN101488225B (en) VLSI system structure of bit plane encoder
Jain et al. Image compression using 2D-discrete wavelet transform on a light weight reconfigurable hardware
CN102333222B (en) Two-dimensional discrete wavelet transform circuit and image compression method using same
Meher et al. Hardware-efficient systolic-like modular design for two-dimensional discrete wavelet transform
De Cea-Dominguez et al. GPU-oriented architecture for an end-to-end image/video codec based on JPEG2000
Nagabushanam et al. FPGA Implementation of 1D and 2D DWT Architecture using modified Lifting Scheme
CN104811738B (en) The one-dimensional discrete cosine converting circuit of low overhead multi-standard 8 × 8 based on resource-sharing
Wu et al. Analysis and architecture design for high performance JPEG2000 coprocessor
Wu et al. An efficient architecture for JPEG2000 coprocessor
CN103067023A (en) Efficient discrete wavelet transform (DWT) encoding method and encoder based on promotion
Liang et al. A full-pipelined 2-D IDCT/IDST VLSI architecture with adaptive block-size for HEVC standard
Wu et al. Memory-efficient architecture for JPEG 2000 coprocessor with large tile image
Hsieh et al. Implementation of an Efficient DWT Using a FPGA on a Real-time Platform
Patil et al. Low Power High Speed VLSI Architecture for 1-D Discrete Wavelet Transform
Seth et al. VLSI Implementation of 2-D DWT/IDWT Cores Using 9/7-Tap Filter Banks Based on the Non-Expansive Symmetric Extension Scheme.
Wu et al. An efficient architecture for two-dimensional inverse discrete wavelet transform
CN201365321Y (en) VLSI system framework of bit-plane encoder

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20090916