CN1374692A - Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure - Google Patents


Publication number
CN1374692A
Authority
CN
China
Prior art keywords
filter
technology
filtering
subexpression
pass filter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 02114522
Other languages
Chinese (zh)
Other versions
CN1215553C (en)
Inventor
郑南宁
王瑞轩
吴勇
张光烈
徐维朴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN 02114522
Publication of CN1374692A
Application granted
Publication of CN1215553C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Complex Calculations (AREA)
  • Image Processing (AREA)

Abstract

A design method for an inherently parallel two-dimensional discrete wavelet transform (2-D DWT) VLSI structure comprises: using the shift-add technique to convert the multiplications in the filter into shift-register and adder operations; using the common-subexpression technique to reduce the number of shift registers and adders needed to implement the filter; using the filter-merging technique to merge the terms shared by the high-pass and low-pass filters, so that high-pass and low-pass filtering run in parallel in one compact hardware structure; and using the filter-bank-merging technique to merge the two filter groups of the 2-D DWT column filtering into one, raising hardware utilization. The invention is an efficient 2-D DWT hardware scheme with very high hardware utilization, among other advantages.

Description

VLSI structure design method for an inherently parallel two-dimensional discrete wavelet transform
One, technical field
The invention belongs to the field of VLSI design. Specifically, it concerns a method for designing an efficient, parallel hardware structure for the two-dimensional discrete wavelet transform (2-D DWT) in hardware implementations of JPEG2000.
Two, background art
In many practical systems, such as digital cameras, video telephones, camcorders and palmtop computers, speed and area requirements mean that the compression system must be implemented on a chip. In most existing 2-D DWT chip structures, the two stages of the fast discrete wavelet transform (row filtering and column filtering) use exactly the same structural design. This reduces control complexity, but it also lowers hardware utilization and raises chip cost. In addition, existing chip structures output the results of the high-pass filter and the low-pass filter one after the other in each stage, which not only lowers hardware utilization but also limits the speed at which the chip can process data.
Three, summary of the invention
In view of the defects and deficiencies of the background art described above, the object of the invention is to provide a VLSI structure design method for an inherently parallel two-dimensional discrete wavelet transform that yields a 2-D DWT with high hardware utilization, low cost, parallelism, low hardware overhead and a simple structure.
To achieve this object, the solution adopted by the invention is a VLSI structure design method for an inherently parallel two-dimensional discrete wavelet transform, carried out as follows:
1) the multiplications in the filter are converted into shift-register and adder operations by the "shift-add" technique;
2) the number of shift registers and adders in the filter implementation is reduced as far as possible by the "common subexpression" technique;
3) the terms shared by the low-pass and high-pass filter operations are merged by the "filter merging" technique, so that low-pass and high-pass filtering run in parallel in one compact hardware structure;
4) the two filter groups of the 2-D DWT column filtering are merged into one by the "filter bank merging" technique, to increase hardware utilization.
The shift-add technique is as follows: multiplying a real number by 2 to the power n is equivalent to shifting that number left by n bits, and any wavelet filter coefficient can be expressed as a sum of a finite number of integer powers of 2. The product of a real number and a wavelet filter coefficient can therefore be expressed as a sum of a finite number of shifted copies of that number, and the convolution of a finite number of input samples with the wavelet filter can be implemented with a finite number of shifts and additions.
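The shift-add idea can be illustrated with a minimal software sketch. It assumes integer samples and a non-negative integer coefficient; the function names `power_of_two_terms` and `shift_add_multiply` are hypothetical and do not come from the patent.

```python
def power_of_two_terms(coeff):
    """Return the bit positions whose powers of two sum to coeff (coeff >= 0)."""
    if coeff < 0:
        raise ValueError("this sketch only handles non-negative coefficients")
    return [bit for bit in range(coeff.bit_length()) if (coeff >> bit) & 1]


def shift_add_multiply(x, coeff):
    """Multiply x by coeff using only left shifts and additions."""
    result = 0
    for bit in power_of_two_terms(coeff):
        result += x << bit  # one shifted copy of x per set bit of coeff
    return result


# Example: 6 = 2 + 4, so 6 * x becomes (x << 1) + (x << 2).
product = shift_add_multiply(7, 6)
```

In hardware, each shifted copy corresponds to a fixed shift and each `+=` to an adder, so a coefficient with few set bits costs few adders.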
The common-subexpression technique is as follows: in a formula that contains only shifts and additions, there may be a unit in which two (or more) terms form a subexpression, and this subexpression appears again in the formula after a certain shift. Such a unit is a common subexpression. Merging common subexpressions reduces the number of adders and shift registers. Common-subexpression matching can be carried out over multiple levels.
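As a hedged illustration of a common subexpression, consider a made-up shift-add formula (not taken from the patent) in which the two-term unit `(x << 1) + (x << 2)` recurs shifted left by 3. The names `direct` and `shared` are hypothetical.

```python
def direct(x):
    """Evaluate 54 * x term by term: 3 additions, 4 shifts."""
    return (x << 1) + (x << 2) + (x << 4) + (x << 5)


def shared(x):
    """Evaluate the same formula with the repeated unit computed once."""
    t = (x << 1) + (x << 2)  # common subexpression: 1 addition, 2 shifts
    return t + (t << 3)      # reuse it shifted by 3: 1 addition, 1 shift


# Both forms compute the same value; the shared form needs one adder
# and one shift fewer than the direct form.
same = all(direct(x) == shared(x) for x in range(-8, 9))
```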
The filter-merging technique is as follows: in the row-filtering (or column-filtering) stage of the wavelet transform, the low-pass filter and the high-pass filter may share common subexpressions. If they do, merging them reduces the number of shift registers and adders. In addition, given the timing relationship between the high-pass and low-pass filters of a biorthogonal wavelet, the high-pass and low-pass filter operations can be carried out simultaneously within one clock cycle, which makes high-pass and low-pass filtering parallel.
The filter-bank-merging technique is as follows: in the column-filtering stage, the hardware utilization of each of the two filter groups is only 50%, so a single filter structure is used to perform the function of both groups, processing the two groups of row-filtering outputs alternately.
The invention is an efficient hardware implementation scheme for the 2-D DWT with several advantages. By optimizing the design of the row filtering and column filtering of the discrete wavelet transform, hardware utilization reaches 100% and two outputs are produced in each working clock cycle, achieving parallelism without increasing hardware overhead. The resulting hardware structure is also very simple and well suited to VLSI implementation.
Four, description of the drawings
Fig. 1 shows the tabular representations of the high-pass and low-pass filters of the embodiment, where (a) is the low-pass filter and (b) is the high-pass filter.
Fig. 2 shows the merged representation of the high-pass and low-pass filters of the embodiment.
Fig. 3 shows the optimized structure of one filter group in the row-filtering stage of the embodiment.
Fig. 4 shows the structure of the column-filtering stage of the embodiment after its two filter groups are merged.
Five, embodiment
The invention is described in more detail below with reference to the drawings and an embodiment, but the invention is not limited to this embodiment.
The inventors give an embodiment in accordance with the technical scheme of the invention. The embodiment uses a biorthogonal wavelet filter pair from the JPEG2000 standard, the Le Gall 5/3 biorthogonal wavelet.
In this embodiment, the multiplications in the filters are first implemented with shift registers and adders by the general "shift-add" technique.
Applying the "shift-add" technique gives the low-pass filter y1[n] and the high-pass filter y2[n] of this embodiment:
y1[n] = (-x[2n-2] + (x[2n-1]<<1) + (x[2n]<<1) + (x[2n]<<2) + (x[2n+1]<<1) - x[2n+2]) >> 3
y2[n] = (-x[2n] + (x[2n+1]<<1) - x[2n+2]) >> 2
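The two formulas can be checked in software with a short sketch. It assumes integer samples held in a Python list, an index n for which 2n-2 and 2n+2 are both valid positions, and Python's `>>`, which rounds toward negative infinity. The function names and the sample values are hypothetical.

```python
def lowpass(x, n):
    """y1[n]: shift-add form of the low-pass filter, taps (-1, 2, 6, 2, -1) / 8."""
    return (-x[2 * n - 2] + (x[2 * n - 1] << 1) + (x[2 * n] << 1)
            + (x[2 * n] << 2) + (x[2 * n + 1] << 1) - x[2 * n + 2]) >> 3


def highpass(x, n):
    """y2[n]: shift-add form of the high-pass filter as written in the text."""
    return (-x[2 * n] + (x[2 * n + 1] << 1) - x[2 * n + 2]) >> 2


def lowpass_reference(x, n):
    """Same low-pass filter written with ordinary multiplications."""
    return (-x[2 * n - 2] + 2 * x[2 * n - 1] + 6 * x[2 * n]
            + 2 * x[2 * n + 1] - x[2 * n + 2]) // 8


samples = [3, 7, 1, 8, 2, 9, 4, 6]  # made-up test row
low_1 = lowpass(samples, 1)
high_1 = highpass(samples, 1)
```

The coefficient 6 of x[2n] is the only one that needs two shifted copies, (x[2n]<<1) + (x[2n]<<2); every other coefficient is a single power of two.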
In Fig. 1, Fig. 1(a) is the tabular representation of the low-pass filter y1[n] of this embodiment before the right shift by 3, and Fig. 1(b) is the tabular representation of the high-pass filter y2[n] before the right shift by 2. In both tables, the rows denote delay and the columns denote shift. The element in the first row and first column represents x[2n-2]<<2, and the element in the last row and last column (fifth row, fifth column) represents x[2n+2]>>2. An entry of 1 or -1 in a table indicates that the corresponding element is added or subtracted, respectively.
The common-subexpression technique is as follows: in a formula that contains only shifts and additions, there may be a unit in which two (or more) terms form a subexpression, and this subexpression appears again in the formula after a certain shift. Such a subexpression is a common subexpression, and merging common subexpressions reduces the number of adders and shift registers. Common-subexpression matching can be carried out over multiple levels. A "common subexpression" matching algorithm is applied to the high-pass filter and the low-pass filter separately to look for shared terms. The result is that neither table contains a common subexpression, which reflects the particular structure of the Le Gall 5/3 biorthogonal wavelet and its inherent advantages.
Fig. 2 shows the merged representation of the high-pass filter and the low-pass filter of this embodiment. The two tables of Fig. 1 are merged and the terms shared by the high-pass filter and the low-pass filter are sought, which yields one group of shared terms, shown as the circled part of the figure. In this embodiment, merging the filters saves one adder and one shift register. Moreover, given the timing relationship between the high-pass and low-pass filters of a biorthogonal wavelet, the table represents the high-pass and low-pass filter operations being carried out simultaneously within one clock cycle. High-pass and low-pass filtering are thus performed in parallel.
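One way to picture the filter merging is the sketch below. The circled term of Fig. 2 cannot be recovered from the text, so the choice of the shared term (x[2n+1]<<1) - x[2n+2] is an assumption, picked because it appears in both y1[n] and y2[n]. The function names are hypothetical.

```python
def separate_filters(x, n):
    """Low-pass and high-pass outputs computed independently (7 additions, 7 shifts)."""
    low = (-x[2 * n - 2] + (x[2 * n - 1] << 1) + (x[2 * n] << 1)
           + (x[2 * n] << 2) + (x[2 * n + 1] << 1) - x[2 * n + 2]) >> 3
    high = (-x[2 * n] + (x[2 * n + 1] << 1) - x[2 * n + 2]) >> 2
    return low, high


def merged_filters(x, n):
    """Both outputs from one structure that computes the assumed shared term once."""
    shared = (x[2 * n + 1] << 1) - x[2 * n + 2]      # 1 addition, 1 shift
    high = (shared - x[2 * n]) >> 2                   # 1 addition, 1 shift
    low = (-x[2 * n - 2] + (x[2 * n - 1] << 1) + (x[2 * n] << 1)
           + (x[2 * n] << 2) + shared) >> 3           # 4 additions, 4 shifts
    return low, high


samples = [3, 7, 1, 8, 2, 9, 4, 6]  # made-up test row
pair = merged_filters(samples, 1)
```

Under this assumption the merged form uses six additions and six shifts, one of each fewer than the separate form, and it returns the low-pass and high-pass outputs together, which mirrors the two-outputs-per-clock behaviour described in the text.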
The invention also uses the filter-bank-merging technique. In the column-filtering stage, the hardware utilization of each of the two filter groups is only 50%. To raise hardware utilization, a single filter structure is used to perform the function of both filter groups. Processing the two groups of row-filtering outputs alternately lets one filter group handle both groups of data. This not only saves the hardware of one filter group but also brings the hardware utilization of the filter (in both space and time) to 100%.
Inherent parallelism here means that real parallel processing is achieved without increasing hardware cost: two groups of data are processed and two outputs are produced in every working clock cycle, whereas current general structures process only one group of data and produce one output per clock cycle.
The structure of the embodiment is shown in Fig. 3 and Fig. 4, which represent the row-filtering stage and the column-filtering stage of the two-dimensional discrete wavelet transform respectively. Fig. 3 shows the optimized structure of one filter group in the row-filtering stage. As the structure shows, the Le Gall 5/3 biorthogonal wavelet filter needs only 6 adders, 5 registers and 6 shift registers. Because a shift can be implemented by wiring alone, the shift registers add essentially no hardware overhead. The conventional method, by contrast, needs 8 multipliers, 9 adders and 8 registers. Compared with the conventional method, this structure greatly reduces hardware overhead and structural complexity. In the optimized structure, two groups of data are in fact processed and two outputs obtained in each working clock cycle, so two data samples must be input per clock cycle. This is not difficult to achieve: with the clock frequency of the core processing module (that is, the row-filtering and column-filtering stages) kept unchanged, only the clock frequency at the chip's input interface needs to be doubled. Seen from outside the chip, the input clock frequency is doubled, which means that the amount of data processed by the chip per unit time is doubled, so the processing speed of the chip is doubled. The structure therefore not only reduces the hardware overhead of the chip but also makes high-pass and low-pass filtering parallel, doubling the chip's processing speed.
Fig. 4 shows the structure of the column-filtering stage of this embodiment after its two filter groups are merged. The filter structure is designed in the same way as in the row-filtering stage. With a line-based filtering technique, only a limited number of rows of intermediate results need to be stored. In this embodiment, four rows each of the low-pass and high-pass outputs of the first stage must be stored, and each row holds M/2 values (where M is the number of pixels per row of the input image), so the total storage required is 4M. The storage units can be implemented with registers or with memory, depending on the circumstances. Registers have simple control logic, convenient access and high speed, but a larger hardware overhead. Memory occupies less area, but its access is slower than that of registers, its power consumption is higher and its control logic is more complex. Whichever storage unit is used, the basic structure of Fig. 4 is unchanged. This embodiment uses registers as the storage units.
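The 4M storage figure follows from simple arithmetic, sketched below. The function name `line_buffer_words` and its parameter names are hypothetical; the defaults (two subbands, four rows each, M/2 values per row) are the numbers stated in the text, and an even row width is assumed.

```python
def line_buffer_words(pixels_per_row, rows_per_subband=4, subbands=2):
    """Number of intermediate values stored between the row and column stages."""
    if pixels_per_row % 2:
        raise ValueError("this sketch assumes an even number of pixels per row")
    values_per_row = pixels_per_row // 2  # each subband row holds M/2 values
    return subbands * rows_per_subband * values_per_row


# For M = 512: 2 subbands * 4 rows * 256 values = 2048 = 4 * 512.
words = line_buffer_words(512)
```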
As in row filtering, each column filter group can process two rows of data at once in one line period (M/2 clock cycles), but it receives only one row (M/2 values) of input per line period. Each filter group therefore has to idle for one line period, and the time utilization of the two filter groups is only 50%. To raise hardware utilization, the two filter groups are merged into one. Processing the two groups of row-filtering outputs alternately is enough to let one filter group handle both groups of data, although this requires one additional buffer to store the results of one of the groups (M/2 values). The resulting structure, shown in Fig. 4, uses only one filter structure. This not only saves the hardware of one filter group but also brings the hardware utilization of the filter (in both space and time) to 100%.
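The utilization argument can be pictured with a toy steady-state schedule. This is only a sketch: which line periods are busy (odd ones here), the stream labels "L" and "H", and the names `schedule` and `utilization` are assumptions made for illustration, not details from the patent.

```python
def schedule(num_periods, merged):
    """Map each filter unit to the stream it handles in every line period (None = idle)."""
    if merged:
        # One unit: low-pass rows in odd periods, buffered high-pass rows in even periods.
        return {"column_filter": ["L" if t % 2 else "H" for t in range(num_periods)]}
    # Two units: each needs two rows per operation but receives one row per period,
    # so each one works only every other line period.
    return {
        "column_filter_L": ["L" if t % 2 else None for t in range(num_periods)],
        "column_filter_H": ["H" if t % 2 else None for t in range(num_periods)],
    }


def utilization(slots):
    """Fraction of line periods in which a unit is busy."""
    return sum(slot is not None for slot in slots) / len(slots)


separate = {name: utilization(s) for name, s in schedule(8, merged=False).items()}
combined = {name: utilization(s) for name, s in schedule(8, merged=True).items()}
```

In this model each of the two separate units is busy half the time, while the single merged unit is busy in every line period; deferring the high-pass rows by one period is what the extra M/2 buffer in the text pays for.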

Claims (1)

1. A VLSI structure design method for an inherently parallel two-dimensional discrete wavelet transform, characterized in that it is carried out as follows:
1) the multiplications in the filter are converted into shift-register and adder operations by the "shift-add" technique;
2) the number of shift registers and adders in the filter implementation is reduced as far as possible by the "common subexpression" technique;
3) the terms shared by the low-pass and high-pass filter operations are merged by the "filter merging" technique, so that low-pass and high-pass filtering run in parallel in one compact hardware structure;
4) the two filter groups of the 2-D DWT column filtering are merged into one by the "filter bank merging" technique, to increase hardware utilization;
the "shift-add" technique is as follows: multiplying a real number by 2 to the power n is equivalent to shifting that number left by n bits, and any wavelet filter coefficient can be expressed as a sum of a finite number of integer powers of 2; the product of a real number and a wavelet filter coefficient can therefore be expressed as a sum of a finite number of shifted copies of that number, and the convolution of a finite number of input samples with the wavelet filter can be implemented with a finite number of shifts and additions;
the common-subexpression technique is as follows: in a formula that contains only shifts and additions, there may be a unit in which two (or more) terms form a subexpression, and this subexpression appears again in the formula after a certain shift, in which case it is a common subexpression; common subexpressions are merged to reduce the number of adders and shift registers; common-subexpression matching can be carried out over multiple levels;
the filter-merging technique is as follows: in the row-filtering (or column-filtering) stage of the wavelet transform, the low-pass filter and the high-pass filter may share common subexpressions, and if they do, merging them reduces the number of shift registers and adders; in addition, given the timing relationship between the high-pass and low-pass filters of a biorthogonal wavelet, the high-pass and low-pass filter operations can be carried out simultaneously within one clock cycle, so that high-pass filtering and low-pass filtering operate in parallel;
the filter-bank-merging technique is as follows: in the column-filtering stage, the hardware utilization of each of the two filter groups is only 50%, so a single filter structure is used to perform the function of both filter groups, processing the two groups of row-filtering outputs alternately.
CN 02114522 2002-04-17 2002-04-17 Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure Expired - Fee Related CN1215553C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 02114522 CN1215553C (en) 2002-04-17 2002-04-17 Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 02114522 CN1215553C (en) 2002-04-17 2002-04-17 Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure

Publications (2)

Publication Number Publication Date
CN1374692A (en) 2002-10-16
CN1215553C (en) 2005-08-17

Family

ID=4743140

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 02114522 Expired - Fee Related CN1215553C (en) 2002-04-17 2002-04-17 Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure

Country Status (1)

Country Link
CN (1) CN1215553C (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101534439A (en) * 2008-03-13 2009-09-16 中国科学院声学研究所 Low power consumption parallel wavelet transforming VLSI structure
CN101488225B (en) * 2009-03-05 2012-03-28 山东大学 VLSI system structure of bit plane encoder
CN106570272A (en) * 2017-01-10 2017-04-19 天津大学 VLSI (Very Large Scale Integration) design method for two-dimensional discrete wavelet transform
CN107430760A (en) * 2015-04-23 2017-12-01 谷歌公司 Two-dimensional shift array for image processor
CN108205700A (en) * 2016-12-20 2018-06-26 上海寒武纪信息科技有限公司 Neural network computing device and method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101534439A (en) * 2008-03-13 2009-09-16 中国科学院声学研究所 Low power consumption parallel wavelet transforming VLSI structure
CN101488225B (en) * 2009-03-05 2012-03-28 山东大学 VLSI system structure of bit plane encoder
CN107430760A (en) * 2015-04-23 2017-12-01 谷歌公司 Two-dimensional shift array for image processor
US11153464B2 (en) 2015-04-23 2021-10-19 Google Llc Two dimensional shift array for image processor
CN108205700A (en) * 2016-12-20 2018-06-26 上海寒武纪信息科技有限公司 Neural network computing device and method
CN106570272A (en) * 2017-01-10 2017-04-19 天津大学 VLSI (Very Large Scale Integration) design method for two-dimensional discrete wavelet transform

Also Published As

Publication number Publication date
CN1215553C (en) 2005-08-17

Similar Documents

Publication Publication Date Title
Lian et al. Lifting based discrete wavelet transform architecture for JPEG2000
CN109189473A (en) Processing with Neural Network device and its method for executing vector exchange instruction
JPH0683857A (en) Two-dimensional fast fourier transform converter
CN1215553C (en) Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure
CN1187698C (en) Design method of built-in parallel two-dimensional discrete wavelet conversion VLSI structure
Tewari et al. High-speed & memory efficient 2-d dwt on xilinx spartan3a dsp using scalable polyphase structure with da for jpeg2000 standard
US6587589B1 (en) Architecture for performing two-dimensional discrete wavelet transform
CN101697486A (en) Two-dimensional wavelet transformation integrated circuit structure
Xiong et al. Efficient high-speed/low-power line-based architecture for two-dimensional discrete wavelet transform using lifting scheme
CN1295653C (en) Circuit for realizing direct two dimension discrete small wave change
CN201111042Y (en) Two-dimension wavelet transform integrate circuit structure
Meher et al. Hardware-efficient systolic-like modular design for two-dimensional discrete wavelet transform
KR101061008B1 (en) Convolution-based Discrete Wavelet Transform
Ahmed et al. VLSI implementation of 16-point DCT for H. 265/HEVC using walsh hadamard transform and lifting scheme
Patil et al. Low Power High Speed VLSI Architecture for 1-D Discrete Wavelet Transform
CN1187688C (en) Memory control method realized by lifting wavelet fast algorithm VLSI
Tan et al. Shift-accumulator ALU centric JPEG2000 5/3 lifting based discrete wavelet transform architecture
Seth et al. VLSI Implementation of 2-D DWT/IDWT Cores Using 9/7-Tap Filter Banks Based on the Non-Expansive Symmetric Extension Scheme.
Chang et al. Design of highly efficient VLSI architectures for 2-D DWT and 2-D IDWT
Albanesi et al. A high speed Haar transform implementation
Guo et al. Enlargement and reduction of image/video via discrete cosine transform pair, Part 1: novel three-dimensional discrete cosine transform and enlargement
CN1137583C (en) Integer biorthogonal wavelet conversion circuit for vedeo and image data compression
CN1668092A (en) A storage space saved storage processing method
Lang et al. Performance/area tradeoffs in tree-based VLSI architectures for the two-dimensional wavelet transform
CN1620108A (en) Bidimonsional digit filter

Legal Events

Date Code Title Description
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C06 Publication
PB01 Publication
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20050817

Termination date: 20110417