CN100388316C - High-precision number cosine converting circuit without multiplier and its conversion - Google Patents


Info

Publication number
CN100388316C
Authority
CN
China
Prior art keywords
register
input
circuit
output
module circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2005100252037A
Other languages
Chinese (zh)
Other versions
CN1855149A (en
Inventor
林豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Ziguang exhibition Rui Technology Co. Ltd.
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spreadtrum Communications Shanghai Co Ltd
Priority to CNB2005100252037A
Publication of CN1855149A
Application granted
Publication of CN100388316C
Legal status: Active
Anticipated expiration

Abstract

The present invention provides a high-precision discrete cosine transform (DCT) and quantization method that needs no multiplier. The method comprises the following steps: step 1, each multiplier is replaced with two shifters and one adder-subtractor to perform a DCT with scale factor s; step 2, the transform output is multiplied by L in the quantizer, where L is s divided by the quantization parameter, to obtain the quantized DCT result. By selecting specific numbers, the hardware circuit avoids the resource-hungry multiplier, so the method is a multiplier-less DCT. The structure is highly parallel and regular, and the hardware computing units can be reused, so the hardware circuit is very simple; DCT and IDCT can be realized in the same hardware circuit, and very high computational accuracy can be reached.

Description

High-precision multiplier-less discrete cosine transform circuit and transform method thereof
Technical field
The present invention relates to a multiplier-less discrete cosine transform (DCT) circuit and its transform method.
Background art
In the encoding process of most still-image compression standards (such as JPEG) and moving-image compression standards (such as MPEG-1, MPEG-2, MPEG-4 and H.263), a discrete cosine transform (DCT) module first converts the raw image data (or the image data after motion estimation) from the time (spatial) domain to the frequency domain. A quantizer then quantizes the frequency-domain signal (the DCT result), and finally the quantized frequency-domain signal is compressed. In this process the quantizer divides the frequency-domain signal by a specific quantization parameter, which is usually implemented by multiplying the signal by the reciprocal of the quantization parameter.
The decoding process is as follows: the compressed bit stream is first decompressed to obtain the quantized frequency-domain signal, which is then inverse-quantized to recover the correct frequency-domain signal (the DCT result); finally, an inverse discrete cosine transform (IDCT) module regenerates the raw image data (or the image data after motion estimation). In this process the quantizer multiplies the quantized frequency-domain signal by the specific quantization parameter to recover the correct frequency-domain signal.
In short, the DCT (and IDCT) and the quantizer are the most important computational steps in image compression (and decompression). Designing a low-complexity DCT (and IDCT) and quantizer that are easy to implement in hardware is therefore of great significance for improving system performance, reducing system power consumption and reducing system cost.
The DCT used in still-image and moving-image compression standards is the 8 × 8 two-dimensional DCT:
$$F(u,v) = \frac{1}{4}\,C(u)\,C(v)\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16}$$
$$u, v, x, y = 0, 1, 2, \ldots, 7$$
where x, y are the spatial-domain coordinates, u, v are the frequency-domain coordinates, and
$$C(u),\ C(v) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } u, v = 0 \\ 1 & \text{otherwise} \end{cases}$$
The IDCT is the 8 × 8 two-dimensional IDCT:
$$f(x,y) = \frac{1}{4}\sum_{u=0}^{7}\sum_{v=0}^{7} C(u)\,C(v)\,F(u,v)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16}$$
In hardware implementations, the 8 × 8 two-dimensional DCT is usually decomposed into 16 one-dimensional 8-point DCTs (as shown in Figure 1). The one-dimensional DCT and IDCT are:
$$F(u) = \frac{1}{2}\,C(u)\sum_{x=0}^{7} f(x)\,\cos\frac{(2x+1)u\pi}{16}$$
$$f(x) = \frac{1}{2}\sum_{u=0}^{7} C(u)\,F(u)\,\cos\frac{(2x+1)u\pi}{16}$$
As can be seen, the DCT and IDCT formulas above contain a large number of multiplications. Existing software algorithms and hardware circuits aim to reduce the number of multiplications, but overly complex algorithms are difficult to implement in hardware. In addition, whether the DCT and IDCT can share the same structure is an important consideration for a hardware implementation.
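The decomposition of the 8 × 8 two-dimensional DCT into 16 one-dimensional DCTs can be checked numerically. The following is a minimal floating-point sketch, not the circuit of the invention; the function names and the test block are illustrative assumptions.

```python
import math

N = 8

def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct_1d(f):
    """8-point 1-D DCT: F(u) = 1/2 C(u) sum_x f(x) cos((2x+1) u pi / 16)."""
    return [0.5 * c(u) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / 16)
                             for x in range(N))
            for u in range(N)]

def dct_2d_direct(block):
    """8x8 2-D DCT evaluated directly from the double sum."""
    return [[0.25 * c(u) * c(v) * sum(
                block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]

def dct_2d_separable(block):
    """Same transform as 8 row DCTs followed by 8 column DCTs (16 in total)."""
    rows = [dct_1d(block[x]) for x in range(N)]                        # y -> v
    cols = [dct_1d([rows[x][v] for x in range(N)]) for v in range(N)]  # x -> u
    return [[cols[v][u] for v in range(N)] for u in range(N)]

# Arbitrary test block (illustrative data only).
block = [[(3 * x + 5 * y) % 11 for y in range(N)] for x in range(N)]
direct = dct_2d_direct(block)
separable = dct_2d_separable(block)
max_diff = max(abs(direct[u][v] - separable[u][v])
               for u in range(N) for v in range(N))
print(f"largest difference between the two methods: {max_diff:.2e}")
```

The two results agree up to floating-point rounding, which is why a hardware design only needs a fast one-dimensional transform.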
For example, existing one-dimensional DCT and IDCT circuits usually adopt the decompositions shown in Figure 2 and Figure 3, referred to below as the Chen scheme and the Loeffler scheme after their proposers. Both schemes reduce the number of multiplications required, have good structural symmetry and are easy to implement in hardware; in particular, the DCT and IDCT can be realized with the same structure. The core computing unit of both schemes is the cross multiply-add unit circuit (Figure 4), which contains four multiplications and two additions (out1 = in1*a + in2*b; out2 = in2*a - in1*b). The inverse operation of this computing unit is the unit itself, so the DCT and IDCT of the Chen and Loeffler schemes can use the same circuit.
Many existing schemes replace the cross multiply-add unit of Figure 4 with several shifts and additions/subtractions connected in series (Figure 5; out1 = in1*p1 + in2; out2 = in1*(1 - p1*p2) - in2*p2), and choose p values whose binary representation has as few digits as possible so that multiplication can be replaced by addition. Because the p values are restricted, the output of Figure 5 must still be multiplied by a specific scale factor s to give the same result as Figure 4. This "multiply by a specific scale factor s" operation can be deferred to the end of the one-dimensional DCT (Figure 6). Further, the two "multiply by a specific scale factor s" operations can be merged and performed at the end of the 8 × 8 two-dimensional DCT (Figure 7). In this way such schemes eliminate most of the multiplications.
In still-image and moving-image compression standards, the DCT result is normally passed to a quantizer, where it is divided by the quantization parameter, that is, multiplied by the reciprocal of the quantization parameter. The "multiply by the overall scale factor" operation can therefore be merged with the quantizer's "multiply by the reciprocal of the quantization parameter" operation into a single multiplication. Such a transform is commonly called a "multiplier-less DCT" or a "DCT with scale factor" (Figure 8).
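The merging of the scale factor into the quantizer can be sketched in a few lines. This is only an illustration in floating point; the values of s and q and the function names are hypothetical and not taken from the patent.

```python
def quantize_two_steps(x, s, q):
    """Scale by s, then divide by the quantization parameter q (two multiplications)."""
    scaled = x * s
    return round(scaled * (1.0 / q))

def quantize_merged(x, L):
    """Single multiplication by the precomputed factor L = s / q."""
    return round(x * L)

s = 1 / 13   # hypothetical scale factor of one transform output
q = 16       # hypothetical quantization parameter
L = s / q    # computed once per coefficient position, not once per sample

samples = [1234, -987, 40, 0]
two_steps = [quantize_two_steps(x, s, q) for x in samples]
merged = [quantize_merged(x, L) for x in samples]
print(two_steps)
print(merged)
```

Because L depends only on the coefficient position and the quantization table, it can be stored in a table, so the transform itself never needs a multiplier.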
However, schemes that replace one multiplication with several serial shifts and additions/subtractions have limited accuracy. To meet the requirements of IEEE Std 1180-1990, the additions and subtractions needed may outnumber the multiplications saved. In addition, the serial shifts and additions/subtractions add computation steps and increase latency, which makes it harder to build a high-speed DCT circuit in hardware.
Summary of the invention
The technical problem to be solved by the invention is to provide a high-precision multiplier-less hardware circuit that replaces the multipliers with only a few shifters and adders while reaching very high computational accuracy.
To solve this problem, the invention adopts the following technical solution:
A high-precision multiplier-less DCT circuit, characterized in that it comprises a first unit circuit, a second unit circuit and registers connected to one another;
wherein the first unit circuit consists of 4 first adder-subtractors and has 4 inputs and 4 outputs;
the second unit circuit consists of 16 shifters and 12 adder-subtractors: every two shifters feed one second adder-subtractor, every two second adder-subtractors feed one third adder-subtractor, every four shifters share one input, and each third adder-subtractor has one output;
the inputs and outputs of the first unit circuit and the second unit circuit are all connected to the registers.
Further, the invention provides a high-precision multiplier-less DCT and quantization method in which h2/h6 ≈ C2/C6, h3/h5 ≈ C3/C5, h1/h7 ≈ C1/C7 and h4/h0 ≈ C4/C0, where Ck = cos(kπ/16) (k = 0, 1, ..., 7). Each h is a natural number whose binary representation needs as few digits as possible and contains as few "1" or "-1" digits as possible.
The computation comprises the following steps:
Step 1: replace each multiplier with two shifters and one adder-subtractor to perform the DCT with scale factor s;
Step 2: in the quantizer, multiply the transform output by L, where L is s divided by the quantization parameter, to obtain the quantized DCT result.
Further, the high-precision multiplier-less inverse discrete cosine transform (IDCT) circuit of the invention has the same structure as the DCT circuit described above. It comprises a first unit circuit, a second unit circuit and registers connected to one another;
wherein the first unit circuit consists of 4 first adder-subtractors and has 4 inputs and 4 outputs;
the second unit circuit consists of 16 shifters and 12 adder-subtractors: every two shifters feed one second adder-subtractor, every two second adder-subtractors feed one third adder-subtractor, every four shifters share one input, and each third adder-subtractor has one output;
the inputs and outputs of the first unit circuit and the second unit circuit are all connected to the registers.
Further, in the high-precision multiplier-less IDCT and inverse quantization method of the invention, a specific value s' is selected first; s' and s differ by a scale factor.
The computation comprises the following steps:
Step 1: multiply the DCT result by L' to obtain the x values, where L' is the product of s' and the quantization parameter;
Step 2: replace each multiplier with two shifters and one adder-subtractor to perform the IDCT with scale factor s', obtaining the correct inverse-quantization and IDCT result.
The advantages of the invention are:
1. By selecting specific numbers, the hardware circuit does not need resource-hungry multipliers; the method is a "multiplier-less DCT";
2. The structure is highly parallel and regular, and the hardware computing units can be reused, so the hardware circuit is very simple;
3. The DCT and IDCT can be realized with the same hardware circuit;
4. Very high computational accuracy can be reached.
Description of drawings
Figure 1 is a block diagram of realizing the 8 × 8 two-dimensional DCT with one-dimensional 8-point DCTs.
Figure 2 is a block diagram of the hardware computing circuit of the existing Chen scheme.
Figure 3 is a block diagram of the hardware computing circuit of the existing Loeffler scheme.
Figure 4 is a schematic diagram of the cross multiply-add unit in the existing Chen and Loeffler schemes.
Figure 5 is a schematic diagram of the existing computing unit built from serial shifts and additions/subtractions.
Figure 6 is a block diagram in which the multiplication is performed at the end of the one-dimensional DCT.
Figure 7 is a block diagram in which the multiplication is performed at the end of the two-dimensional DCT.
Figure 8 is a block diagram of the multiplier-less DCT combined with the quantizer.
Figure 9 is a schematic diagram of the DCT (and IDCT) circuit of the invention.
Figure 10 is a schematic diagram of the decomposition of the computing unit circuit of Figure 9.
Figure 11 is a circuit diagram of the A unit of Figure 10.
Figure 12 is a circuit diagram of the B unit of Figure 10.
Figure 13 is a schematic diagram of the pipeline structure of the DCT of Figure 9.
Figure 14 is a schematic diagram of the pipeline structure of the IDCT of Figure 9.
Figure 15 is a schematic diagram of the DCT circuit of another specific embodiment of the invention.
Embodiment
The invention takes a different approach to the cross multiply-add unit circuit of the Chen scheme and realizes a high-precision "multiplier-less DCT circuit", described in detail below.
As shown in Figure 5, if the multiplication factors a and b of the core computing unit of the Chen and Loeffler schemes are divided by s, the output is also divided by s, so the output must be multiplied by s to obtain the correct result. The core idea of the invention is to vary s and select suitable values A and B such that A ≈ a/s and B ≈ b/s, where A and B are natural numbers whose binary representation needs as few digits as possible and contains as few "1" or "-1" digits as possible. For example, if A and B are restricted to natural numbers not exceeding 24 that contain no more than two "1" or "-1" digits, then a multiplication by A or B can be performed with two shifters and one adder-subtractor instead of a multiplier, where each shifter can shift a binary number left by 0 to 4 bits. The values available for A and B are 1, 2, 3, 4, 5, 6, 7 (7 = 8 - 1), 8, 9, 10, 12, 14 (14 = 16 - 2), 15 (15 = 16 - 1), 16, 17, 18, 20 and 24. The remaining numbers, 11, 13, 19, 21, 22 and 23, cannot be expressed as binary numbers with two "1" or "-1" digits.
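The list of usable constants can be reproduced by enumerating every value of the form 2^i ± 2^j with shifts of 0 to 4 bits. The sketch below is an illustration only; the function name and the returned (i, j, sign) encoding are assumptions made for this example.

```python
def two_term_constants(limit=24, max_shift=4):
    """Map each natural number <= limit of the form 2**i + sign * 2**j
    (0 <= i, j <= max_shift) to one (i, j, sign) realisation."""
    values = {}
    for i in range(max_shift + 1):
        for j in range(max_shift + 1):
            for sign in (+1, -1):
                v = (1 << i) + sign * (1 << j)
                if 1 <= v <= limit and v not in values:
                    values[v] = (i, j, sign)
    return values

consts = two_term_constants()
usable = sorted(consts)
missing = [n for n in range(1, 25) if n not in consts]
print(usable)   # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 17, 18, 20, 24]
print(missing)  # [11, 13, 19, 21, 22, 23]
```

Each entry corresponds to two shifters (the shift amounts i and j) and one adder-subtractor (the sign).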
The invention proposes the one-dimensional multiplier-less DCT circuit shown in Figure 9. It can be shown that the "multiply by a specific scale factor s" operation can be performed at the end of the two-dimensional DCT and merged with the quantizer's "multiply by the reciprocal of the quantization parameter" operation into a single multiplication, as shown in Figure 8.
The computation of the h and s values in Figure 9 is described in detail below with reference to Figure 2.
For the computing unit {v2, v3} -> {x2, x3} in Figure 2, proceed as follows.
Consider
$$c_2 = \cos(2\pi/16) = 0.92387953$$
$$c_6 = \cos(6\pi/16) = 0.38268343$$
$$\frac{c_2}{c_6} = 2.41421356 \approx \frac{12}{5}$$
We can select
$$h_2 = 12$$
$$h_6 = 5$$
This gives approximations with an error of less than 0.6%:
$$\frac{h_2}{\sqrt{h_2^2 + h_6^2}} = 0.92307692 = 0.99913126\,c_2 \approx c_2$$
$$\frac{h_6}{\sqrt{h_2^2 + h_6^2}} = 0.38461538 = 1.00504843\,c_6 \approx c_6$$
For the computing unit {w4, w7} -> {x4, x7} in Figure 2, proceed as follows.
Consider
$$c_1 = \cos(\pi/16) = 0.98078528$$
$$c_7 = \cos(7\pi/16) = 0.19509032$$
$$\frac{c_1}{c_7} = 5.0273394 \approx 5$$
We select
$$h_1 = 5$$
$$h_7 = 1$$
This gives approximations with an error of less than 0.6%:
$$\frac{h_1}{\sqrt{h_1^2 + h_7^2}} = 0.98058067 = 0.99979139\,c_1 \approx c_1$$
$$\frac{h_7}{\sqrt{h_1^2 + h_7^2}} = 0.19611613 = 1.00525814\,c_7 \approx c_7$$
For the computing unit {w5, w6} -> {x5, x6} in Figure 2, proceed as follows.
Consider
$$c_3 = \cos(3\pi/16) = 0.83146961$$
$$c_5 = \cos(5\pi/16) = 0.55557023$$
$$\frac{c_3}{c_5} = 1.49660576 \approx \frac{3}{2}$$
We select
$$h_3 = 3$$
$$h_5 = 2$$
This gives approximations with an error of less than 0.2%:
$$\frac{h_3}{\sqrt{h_3^2 + h_5^2}} = 0.83205029 = 1.00069838\,c_3 \approx c_3$$
$$\frac{h_5}{\sqrt{h_3^2 + h_5^2}} = 0.55470019 = 0.99843397\,c_5 \approx c_5$$
For the computing unit {u4, u5, u6, u7} -> {v4, v5, v6, v7} in Figure 2, proceed as follows.
$$c_4 = \cos(4\pi/16) = 0.70710678 \approx \frac{12}{17}$$
We select
$$h_0 = 17$$
$$h_4 = 12$$
This gives an approximation with an error of less than 0.2%:
$$\frac{h_4}{h_0} = 0.70588235 = 0.99826840\,c_4 \approx c_4$$
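The quality of the selected h values can be checked numerically. The sketch below recomputes each normalized integer pair and its ratio to the exact cosine; the helper names are assumptions made for this example.

```python
import math

def c(k):
    """Exact cosine constant c_k = cos(k * pi / 16)."""
    return math.cos(k * math.pi / 16)

def rotation_ratios(ha, hb, ka, kb):
    """Ratios of ha/sqrt(ha^2+hb^2) to c_ka and of hb/sqrt(ha^2+hb^2) to c_kb."""
    norm = math.hypot(ha, hb)
    return (ha / norm) / c(ka), (hb / norm) / c(kb)

# (h_a, h_b, index a, index b) as selected in the text above.
choices = {"h2,h6": (12, 5, 2, 6), "h1,h7": (5, 1, 1, 7), "h3,h5": (3, 2, 3, 5)}
ratios = {name: rotation_ratios(*args) for name, args in choices.items()}
ratio_c4 = (12 / 17) / c(4)   # h4 / h0 compared with c4

for name, (ra, rb) in ratios.items():
    print(f"{name}: {ra:.8f}  {rb:.8f}")
print(f"h4/h0: {ratio_c4:.8f}")
```

A ratio close to 1 means that the integer pair, after the common scale factor is moved into s, reproduces the exact rotation coefficients closely.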
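The scale factors s0, s1, s2 and s3 are obtained from the following formulas:
$$s_0 = 4c_4$$
$$s_2 = \frac{2}{\sqrt{h_2^2 + h_6^2}}$$
$$s_1 = \frac{1}{2}\cdot\frac{\left(\dfrac{c_0}{h_0} + \dfrac{c_4}{h_4}\right)}{\sqrt{h_1^2 + h_7^2}}$$
$$s_3 = \frac{1}{2}\cdot\frac{\left(\dfrac{c_0}{h_0} + \dfrac{c_4}{h_4}\right)}{\sqrt{h_3^2 + h_5^2}}$$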
Exploiting the symmetry and regularity of the circuit structure, the computing unit of Figure 9 can be decomposed into the 7 parts shown in Figure 10. (The input f, the output F and the intermediate variables u, v, w, x are held in registers, generally in two's-complement representation.) Parts A1, A2, A3 and A4 are identical: each consists of 4 first adder-subtractors 1 and has 4 inputs and 4 outputs, so they can share one first unit circuit, as shown in Figure 11. (Hardware circuit design generally does not distinguish between two's-complement adders and subtractors; both are called two's-complement adder-subtractors.) Parts B1, B2 and B3 can likewise share one second unit circuit. This second unit circuit consists of 16 shifters and 12 adder-subtractors: every two shifters 2 feed one second adder-subtractor 3, every two second adder-subtractors 3 feed one third adder-subtractor 4, every four shifters 2 share one input, and each third adder-subtractor has one output, as shown in Figure 12. The registers are connected to the inputs and outputs of the first unit circuit and the second unit circuit.
In the first unit circuit, the output signals out0 and out1 are the sum or difference of the input signals in0 and in1, and the output signals out2 and out3 are the sum or difference of the input signals in2 and in3. In the second unit circuit, the input signal in0 is shifted to give signals in0_0 and in0_1, which are added (or subtracted) to give the output signal out0_0, so that out0_0 is the product of in0 and h. Likewise, the input signal in1 is shifted to give signals in1_0 and in1_1, which are added (or subtracted) to give the output signal out0_1, so that out0_1 is the product of in1 and h. Finally, the sum or difference of out0_0 and out0_1 gives the output signal out0.
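The behaviour of the second unit circuit can be modelled in software. The sketch below assumes the cross multiply-add form of Figure 4 (out1 = in1*a + in2*b; out2 = in2*a - in1*b) with the integer factors h in place of a and b; the exact wiring and signs of Figure 12 are not given in the text, so they are assumptions here, as are the table and function names.

```python
# h -> (shift_a, shift_b, sign), meaning h = 2**shift_a + sign * 2**shift_b
SHIFT_ADD = {
    1: (1, 0, -1),   # 2 - 1
    2: (0, 0, +1),   # 1 + 1
    3: (1, 0, +1),   # 2 + 1
    5: (2, 0, +1),   # 4 + 1
    12: (3, 2, +1),  # 8 + 4
    17: (4, 0, +1),  # 16 + 1
}

def times_h(x, h):
    """Multiply x by h with two shifts and one addition or subtraction."""
    shift_a, shift_b, sign = SHIFT_ADD[h]
    return (x << shift_a) + sign * (x << shift_b)

def cross_unit(in1, in2, ha, hb):
    """Cross multiply-add unit without multipliers:
    4 shift-add products (8 shifters, 4 adder-subtractors) plus 2 final adder-subtractors."""
    out1 = times_h(in1, ha) + times_h(in2, hb)
    out2 = times_h(in2, ha) - times_h(in1, hb)
    return out1, out2

print(cross_unit(7, -3, 12, 5))   # (69, -71)
```

Two such cross units side by side use 16 shifters and 12 adder-subtractors in total, which matches the component count stated for the second unit circuit.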
The DCT circuit of the invention can use the pipeline shown in Figure 13 and reach a throughput of one one-dimensional DCT every 4 clock cycles. The horizontal axis of Figure 13 represents time, and each A or B part completes in 1 clock cycle. At any moment at most one A part and one B part are being computed, so a single first unit circuit (A unit) and a single second unit circuit (B unit) are enough to implement the invention at this throughput.
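The pipeline claim can be checked with a small scheduling model. The sketch below assumes, following the step-by-step description of Figure 13 in this section, that each one-dimensional transform uses the A unit in its first four cycles and the B unit in its third to fifth cycles, and that a new transform starts every 4 cycles; the names are illustrative.

```python
A_OFFSETS = (0, 1, 2, 3)  # cycles, relative to its start, in which a transform uses unit A
B_OFFSETS = (2, 3, 4)     # cycles in which the same transform uses unit B
INTERVAL = 4              # a new 1-D transform starts every 4 cycles

def build_schedule(num_transforms):
    """Return ({cycle: transform} for A, {cycle: transform} for B, total cycles).
    Raises ValueError if one unit would be needed twice in the same cycle."""
    a_busy, b_busy = {}, {}
    for t in range(num_transforms):
        start = t * INTERVAL
        for busy, offsets in ((a_busy, A_OFFSETS), (b_busy, B_OFFSETS)):
            for off in offsets:
                cycle = start + off
                if cycle in busy:
                    raise ValueError(f"unit conflict in cycle {cycle}")
                busy[cycle] = t
    total = max(max(a_busy), max(b_busy)) + 1
    return a_busy, b_busy, total

a_busy, b_busy, total = build_schedule(16)
print(total)   # 65
```

Under these assumptions, 16 one-dimensional transforms need 65 cycles, that is 4 cycles each plus one cycle to drain the pipeline, with no conflict on either unit.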
As shown in Figure 13, the circuit of the invention performs the DCT and quantization in the following steps (let L be s divided by the quantization parameter):
Step 1: read {f1, f6, f2, f5} from the registers as the input of the first unit circuit; the output of the first unit circuit, {u1, u6, u2, u5}, is stored in the registers;
Step 2: read {f0, f7, f3, f4} from the registers as the input of the first unit circuit; the output of the first unit circuit, {u0, u7, u3, u4}, is stored in the registers;
Step 3: read {u1, u2, u3, u0} from the registers as the input of the first unit circuit; its output, {v1, v2, v3, v0}, is stored in the registers. At the same time, read {u7, u4, u5, u6} from the registers as the input of the second unit circuit; its output, {v7, v4, v5, v6}, is stored in the registers;
Step 4: read {v4, v5, v6, v7} from the registers as the input of the first unit circuit; its output, {w4, w5, w6, w7}, is stored in the registers. At the same time, read {v1, v0, v2, v3} from the registers as the input of the second unit circuit; its output, {x1, x0, x2, x3}, is stored in the registers;
Step 5: read {w4, w7, w5, w6} from the registers as the input of the second unit circuit; its output, {x4, x7, x5, x6}, is stored in the registers. At the same time, read {f1, f6, f2, f5} of the next one-dimensional DCT from the registers as the input of the first unit circuit; its output, {u1, u6, u2, u5} of the next one-dimensional DCT, is stored in the registers;
Step 6: repeat steps 2 to 5 until all 16 one-dimensional DCTs are complete;
Step 7: multiply the resulting x values by L to obtain the result of the 8 × 8 two-dimensional DCT.
IDCT circuit:
The Chen scheme is structurally symmetric: reversing the direction of data flow in Figure 2 gives the IDCT circuit of the Chen scheme. This is because the inverse operation of every cross multiply-add computing unit in Figure 2 is the unit itself.
The invention inherits this property of the Chen scheme: reversing the direction of data flow in Figure 9 gives the IDCT circuit of the invention. The difference is that the inverse operation of the B computing unit differs from the B computing unit by a scale factor. This scale factor can be compensated in s; the values of s' used for the IDCT are given below.
$$s_0' = \frac{1}{s_0}$$
$$s_2' = \frac{1}{s_2\cdot(h_2^2 + h_6^2)}$$
$$s_1' = \frac{1}{s_1\cdot(h_1^2 + h_7^2)\,h_0 h_4}$$
$$s_3' = \frac{1}{s_3\cdot(h_3^2 + h_5^2)\,h_0 h_4}$$
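The appearance of a sum of squares such as h2^2 + h6^2 in the IDCT scale factors can be motivated by a short calculation. The following worked equation assumes that a B computing unit acts on a pair of signals like the cross multiply-add unit of Figure 4 with the integer factors h2 and h6; the matrix form and sign convention are assumptions made for illustration.

```latex
\begin{pmatrix} h_2 & -h_6 \\ h_6 & h_2 \end{pmatrix}
\begin{pmatrix} h_2 & h_6 \\ -h_6 & h_2 \end{pmatrix}
=
\left(h_2^2 + h_6^2\right)
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
h_2 = 12,\ h_6 = 5 \;\Rightarrow\; h_2^2 + h_6^2 = 169 .
```

Running the unit in the reverse direction therefore returns the original pair multiplied by h2^2 + h6^2 rather than the pair itself, and this constant is the kind of factor that the s' values absorb.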
The DCT and IDCT of the invention can therefore be realized with the same circuit. As shown in Figure 14, the circuit of the invention performs inverse quantization and the IDCT in the following steps (let L' be the product of s' and the quantization parameter):
Step 1: multiply the result of the 8 × 8 two-dimensional DCT by L' to obtain the x values;
Step 2: read {x4, x7, x5, x6} from the registers as the input of the second unit circuit; its output, {w4, w7, w5, w6}, is stored in the registers;
Step 3: read {w4, w5, w6, w7} from the registers as the input of the first unit circuit (A unit); its output, {v4, v5, v6, v7}, is stored in the registers. At the same time, read {x1, x0, x2, x3} from the registers as the input of the second unit circuit; its output, {v1, v0, v2, v3}, is stored in the registers;
Step 4: read {v1, v2, v3, v0} from the registers as the input of the first unit circuit; its output, {u1, u2, u3, u0}, is stored in the registers. At the same time, read {v7, v4, v5, v6} from the registers as the input of the second unit circuit; its output, {u7, u4, u5, u6}, is stored in the registers;
Step 5: read {u0, u7, u3, u4} from the registers as the input of the first unit circuit; its output, {f0, f7, f3, f4}, is stored in the registers;
Step 6: read {u1, u6, u2, u5} from the registers as the input of the first unit circuit; its output, {f1, f6, f2, f5}, is stored in the registers. At the same time, read {x4, x7, x5, x6} of the next one-dimensional IDCT from the registers as the input of the second unit circuit; its output, {w4, w7, w5, w6} of the next one-dimensional IDCT, is stored in the registers;
Step 7: repeat steps 3 to 6 until all 16 one-dimensional IDCTs are complete.
As shown in Figure 15, another specific embodiment of the invention (based on the Loeffler scheme) can similarly select:
$$h_2 = 12$$
$$h_6 = 5$$
$$h_1 = 5$$
$$h_7 = 1$$
$$h_3 = 3$$
$$h_5 = 2$$
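The values t0 and t1 can be handled as follows.
Consider
$$r_0 = \frac{1}{\sqrt{h_3^2 + h_5^2}} = 0.27735009$$
$$r_1 = \frac{1}{\sqrt{h_1^2 + h_7^2}} = 0.19611613$$
$$\frac{r_0}{r_1} = \frac{\sqrt{26}}{\sqrt{13}} = \sqrt{2} \approx \frac{17}{12}$$
We select
$$t_0 = 17$$
$$t_1 = 12$$
This gives approximations with an error of less than 0.1%:
$$\frac{t_0}{\frac{1}{2}\left(\dfrac{t_0}{r_0} + \dfrac{t_1}{r_1}\right)} = 0.27759043 = 1.000866551\,r_0 \approx r_0$$
$$\frac{t_1}{\frac{1}{2}\left(\dfrac{t_0}{r_0} + \dfrac{t_1}{r_1}\right)} = 0.19594619 = 0.999133448\,r_1 \approx r_1$$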
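The t0 and t1 approximations can be recomputed directly. The sketch below reads the denominator as the mean of t0/r0 and t1/r1, which is the reading that reproduces the printed values 0.27759043 and 0.19594619; the variable names are illustrative.

```python
import math

h1, h7, h3, h5 = 5, 1, 3, 2
t0, t1 = 17, 12

r0 = 1 / math.hypot(h3, h5)   # 1 / sqrt(13)
r1 = 1 / math.hypot(h1, h7)   # 1 / sqrt(26)

# Common scale: mean of the two individual scales t0/r0 and t1/r1.
mean_scale = 0.5 * (t0 / r0 + t1 / r1)
approx_r0 = t0 / mean_scale
approx_r1 = t1 / mean_scale

print(f"r0 = {r0:.8f}, approximation = {approx_r0:.8f}, ratio = {approx_r0 / r0:.9f}")
print(f"r1 = {r1:.8f}, approximation = {approx_r1:.8f}, ratio = {approx_r1 / r1:.9f}")
```

Averaging the two scales splits the mismatch between 17/12 and the exact ratio of r0 to r1 evenly, so both approximations are off by the same relative amount in opposite directions.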
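In summary, for the DCT:
$$s_0 = c_4$$
$$s_2 = \frac{1}{\sqrt{h_2^2 + h_6^2}}$$
$$s_1 = \frac{c_4}{\frac{1}{2}\left(\dfrac{t_0}{r_0} + \dfrac{t_1}{r_1}\right)}$$
$$s_3 = \frac{1}{\frac{1}{2}\left(\dfrac{t_0}{r_0} + \dfrac{t_1}{r_1}\right)}$$
For the IDCT:
$$s_0' = \frac{1}{s_0}$$
$$s_2' = \frac{1}{s_2\cdot(h_2^2 + h_6^2)}$$
$$s_1' = \frac{1}{s_1\cdot(h_1^2 + h_7^2)\,t_0 t_1}$$
$$s_3' = \frac{1}{s_3\cdot(h_3^2 + h_5^2)\,t_0 t_1}$$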
Similarly, the DCT circuit and IDCT circuit of this embodiment can also be realized with one A unit circuit (Figure 11) and one B unit circuit (Figure 12). However, the throughput reached is only one one-dimensional DCT every 5 clock cycles.
The computational error of each of the above examples is within 1%, which satisfies the requirements of most image compression standards. To reach higher accuracy, the h values can be searched over a larger range so as to approximate C7/C1, C6/C2, C5/C3 and C4 more closely. If h is a natural number not exceeding 224 that contains no more than three "1" or "-1" digits, the requirements of IEEE Std 1180-1990 can be met.
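A search over a larger range of h values can be sketched as a brute-force scan. The code below is only an illustration: the maximum shift of 8 bits, the use of the ratio error as the selection criterion and the function names are assumptions, and the patent does not state which values such a search yields.

```python
import math

def signed_digit_numbers(limit, max_terms, max_shift):
    """Natural numbers <= limit expressible as a sum of at most max_terms
    terms of the form +2**k or -2**k with 0 <= k <= max_shift."""
    values = {0}
    for _ in range(max_terms):
        values |= {v + sign * (1 << k)
                   for v in values
                   for k in range(max_shift + 1)
                   for sign in (+1, -1)}
    return sorted(v for v in values if 1 <= v <= limit)

def best_pair(target, candidates):
    """Pair (a, b) of candidates whose ratio a / b is closest to target."""
    return min(((a, b) for a in candidates for b in candidates),
               key=lambda p: abs(p[0] / p[1] - target))

target = math.cos(2 * math.pi / 16) / math.cos(6 * math.pi / 16)   # c2 / c6
small = signed_digit_numbers(24, 2, 4)     # the two-term constants used above
large = signed_digit_numbers(224, 3, 8)    # wider range, up to three terms
a, b = best_pair(target, large)
print(a, b, abs(a / b - target), abs(12 / 5 - target))
```

Since 12 and 5 are themselves among the wider candidates, the pair found is never worse than 12/5 for this ratio; whether a given choice actually meets IEEE Std 1180-1990 would still have to be verified with the test procedure of that standard.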
The scope of protection of the invention is not limited to the specific embodiments above. Any variation based on techniques known to those skilled in the art falls within the scope of protection of the invention, for example different decompositions and different pipeline structures of the circuit diagrams of Figure 9 and Figure 15, or different specific values of h and different combinations of them in the circuit diagrams of Figure 9 and Figure 15.

Claims (9)

1. A high-precision multiplier-less discrete cosine transform circuit, characterized in that it comprises a first unit circuit, a second unit circuit and registers connected to one another;
wherein the first unit circuit consists of 4 first adder-subtractors and has 4 inputs and 4 outputs;
the second unit circuit consists of 16 shifters and 12 adder-subtractors: every two shifters feed one second adder-subtractor, every two second adder-subtractors feed one third adder-subtractor, every four shifters share one input, and each third adder-subtractor has one output;
the inputs and outputs of the first unit circuit and the second unit circuit are all connected to the registers.
2. A high-precision multiplier-less discrete cosine transform and quantization method, characterized in that a specific s value is selected first such that A ≈ a/s and B ≈ b/s, where a and b are multiplication factors and A and B are natural numbers whose binary representation needs as few digits as possible and contains as few "1" or "-1" digits as possible, the method comprising the following steps:
Step 1: replace each multiplier with shifters and an adder-subtractor to perform the discrete cosine transform with scale factor s;
Step 2: in the quantizer, multiply the transform output by L, where L is s divided by the quantization parameter, to obtain the quantized DCT result;
wherein the s value is selected as follows: h2/h6 ≈ C2/C6, h3/h5 ≈ C3/C5, h1/h7 ≈ C1/C7 and h4/h0 ≈ C4/C0, where Ck = cos(kπ/16) (k = 0, 1, ..., 7), and each h is a natural number whose binary representation needs as few digits as possible and contains as few "1" or "-1" digits as possible.
3. The high-precision multiplier-less discrete cosine transform and quantization method according to claim 2, characterized in that the values available for A, B or h are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 17, 18, 20 and 24, so that two shifters and one adder-subtractor can replace one multiplier to perform the discrete cosine transform.
4. The high-precision multiplier-less discrete cosine transform and quantization method according to claim 3, characterized in that A, B or h is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 17, 18, 20 or 24 divided or multiplied by a power of 2.
5. The high-precision multiplier-less discrete cosine transform and quantization method according to claim 2, characterized in that the discrete cosine transform with scale factor s in step 1 comprises the following steps:
Step 1: read {f1, f6, f2, f5} from the registers as the input of the first unit circuit; the output of the first unit circuit, {u1, u6, u2, u5}, is stored in the registers;
Step 2: read {f0, f7, f3, f4} from the registers as the input of the first unit circuit; the output of the first unit circuit, {u0, u7, u3, u4}, is stored in the registers;
Step 3: read {u1, u2, u3, u0} from the registers as the input of the first unit circuit; its output, {v1, v2, v3, v0}, is stored in the registers; at the same time, read {u7, u4, u5, u6} from the registers as the input of the second unit circuit; its output, {v7, v4, v5, v6}, is stored in the registers;
Step 4: read {v4, v5, v6, v7} from the registers as the input of the first unit circuit; its output, {w4, w5, w6, w7}, is stored in the registers; at the same time, read {v1, v0, v2, v3} from the registers as the input of the second unit circuit; its output, {x1, x0, x2, x3}, is stored in the registers;
Step 5: read {w4, w7, w5, w6} from the registers as the input of the second unit circuit; its output, {x4, x7, x5, x6}, is stored in the registers; at the same time, read {f1, f6, f2, f5} of the next one-dimensional DCT from the registers as the input of the first unit circuit; its output, {u1, u6, u2, u5} of the next one-dimensional DCT, is stored in the registers;
Step 6: repeat steps 2 to 5 until all 16 one-dimensional DCTs are complete.
6. The high-precision multiplier-less discrete cosine transform and quantization method according to claim 5, characterized in that the input/output relation of the first unit circuit is: the output signals out0 and out1 are the sum or difference of the input signals in0 and in1; the output signals out2 and out3 are the sum or difference of the input signals in2 and in3.
7. The high-precision multiplier-less discrete cosine transform and quantization method according to claim 5, characterized in that the input/output relation of the second unit circuit is: the input signal in0 is shifted to give signals in0_0 and in0_1, which are added or subtracted to give the output signal out0_0, so that out0_0 is the product of in0 and h; the input signal in1 is shifted to give signals in1_0 and in1_1, which are added or subtracted to give the output signal out0_1, so that out0_1 is the product of in1 and h; finally, the sum or difference of out0_0 and out0_1 gives the output signal out0.
8. A high-precision multiplierless inverse discrete cosine transform circuit, characterized in that it comprises a first unit circuit, a second unit circuit, and a register that are electrically connected;
Wherein the first unit circuit consists of 4 first adder-subtractors and has 4 input terminals and 4 output terminals;
The second unit circuit consists of 16 shifters and 12 adder-subtractors: every two shifters are followed by one correspondingly connected second adder-subtractor, every two second adder-subtractors are followed by one correspondingly connected third adder-subtractor, every four shifters share one input terminal, and each third adder-subtractor has one output terminal;
The input terminals and output terminals of the first unit circuit and the second unit circuit are all connected to the register.
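The component counts that claim 8 gives for the second unit circuit are mutually consistent, which a short tally can confirm. The variable names are illustrative only.

```python
SHIFTERS = 16

# One second adder-subtractor follows every two shifters.
second_adders = SHIFTERS // 2
# One third adder-subtractor follows every two second adder-subtractors.
third_adders = second_adders // 2
# Every four shifters share one input terminal.
input_terminals = SHIFTERS // 4
# Each third adder-subtractor drives one output terminal.
output_terminals = third_adders

total_adders = second_adders + third_adders
print(total_adders, input_terminals, output_terminals)  # prints "12 4 4"
```

The tally reproduces the 12 adder-subtractors named in the claim (8 second plus 4 third) and gives the second unit circuit 4 inputs and 4 outputs, matching the four-value register groups it exchanges with the register.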
9. A high-precision multiplierless inverse discrete cosine transform and quantization method, characterized in that a particular value s' is first selected, s' differing from s by a proportionality coefficient, and the computation comprises the following steps:
Step 1: multiply the DCT result by L' to obtain the x values, where L' is the product of s' and the quantization parameter;
Step 2: replace each multiplier with two shifters and one adder-subtractor to carry out the inverse discrete cosine transform with scale factor s', obtaining the correct inverse quantization and inverse discrete cosine transform result;
Wherein the inverse discrete cosine transform with scale factor s' in step 2 comprises the following steps:
Step 1: take {x4, x7, x5, x6} from the register as the input of the second unit circuit, and store the second unit circuit's output {w4, w7, w5, w6} in the register;
Step 2: take {w4, w5, w6, w7} from the register as the input of the first unit circuit, and store the first unit circuit's output {v4, v5, v6, v7} in the register; at the same time, take {x1, x0, x2, x3} from the register as the input of the second unit circuit, and store the second unit circuit's output {v1, v0, v2, v3} in the register;
Step 3: take {v1, v2, v3, v0} from the register as the input of the first unit circuit, and store the first unit circuit's output {u1, u2, u3, u0} in the register; at the same time, take {v7, v4, v5, v6} from the register as the input of the second unit circuit, and store the second unit circuit's output {u7, u4, u5, u6} in the register;
Step 4: take {u0, u7, u3, u4} from the register as the input of the first unit circuit, and store the first unit circuit's output {f0, f7, f3, f4} in the register;
Step 5: take {u1, u6, u2, u5} from the register as the input of the first unit circuit, and store the first unit circuit's output {f1, f6, f2, f5} in the register; at the same time, take {x4, x7, x5, x6} of the next one-dimensional IDCT from the register as the input of the second unit circuit, and store the second unit circuit's output {w4, w7, w5, w6} of the next one-dimensional IDCT in the register;
Step 6: repeat steps 2 to 5 until all 16 one-dimensional IDCTs are complete.
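The register schedule of claim 9 can be checked for data-flow consistency with a small table-driven sketch. It assumes that values written in one step become readable only in later steps, and it covers a single one-dimensional transform, so the second half of step 5 (which starts the next transform) is omitted. The names `SCHEDULE` and `check_schedule` are hypothetical.

```python
# Each step lists (unit, registers read, registers written), following claim 9.
SCHEDULE = [
    [("second", ["x4", "x7", "x5", "x6"], ["w4", "w7", "w5", "w6"])],
    [("first", ["w4", "w5", "w6", "w7"], ["v4", "v5", "v6", "v7"]),
     ("second", ["x1", "x0", "x2", "x3"], ["v1", "v0", "v2", "v3"])],
    [("first", ["v1", "v2", "v3", "v0"], ["u1", "u2", "u3", "u0"]),
     ("second", ["v7", "v4", "v5", "v6"], ["u7", "u4", "u5", "u6"])],
    [("first", ["u0", "u7", "u3", "u4"], ["f0", "f7", "f3", "f4"])],
    [("first", ["u1", "u6", "u2", "u5"], ["f1", "f6", "f2", "f5"])],
]


def check_schedule(schedule):
    """Return all registers holding valid values after the last step.

    Raises ValueError if a step reads a register that no earlier step
    wrote, or if one unit circuit is used twice in the same step.
    """
    # x0..x7 are the dequantized inputs produced before the transform starts.
    available = {f"x{i}" for i in range(8)}
    for step_no, step in enumerate(schedule, start=1):
        units = [unit for unit, _, _ in step]
        if len(units) != len(set(units)):
            raise ValueError(f"step {step_no}: a unit circuit is used twice")
        written = set()
        for _, reads, writes in step:
            missing = set(reads) - available
            if missing:
                raise ValueError(f"step {step_no}: {sorted(missing)} not yet written")
            written.update(writes)
        available |= written  # results are readable from the next step on
    return available


registers = check_schedule(SCHEDULE)
```

With the schedule as listed, the check passes and `registers` contains all eight outputs f0 to f7, with each unit circuit used at most once per step. This is consistent with the claim's statement that both unit circuits can operate at the same time in steps 2 and 3.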
CNB2005100252037A 2005-04-19 2005-04-19 High-precision number cosine converting circuit without multiplier and its conversion Active CN100388316C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100252037A CN100388316C (en) 2005-04-19 2005-04-19 High-precision number cosine converting circuit without multiplier and its conversion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100252037A CN100388316C (en) 2005-04-19 2005-04-19 High-precision number cosine converting circuit without multiplier and its conversion

Publications (2)

Publication Number Publication Date
CN1855149A CN1855149A (en) 2006-11-01
CN100388316C true CN100388316C (en) 2008-05-14

Family

ID=37195298

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100252037A Active CN100388316C (en) 2005-04-19 2005-04-19 High-precision number cosine converting circuit without multiplier and its conversion

Country Status (1)

Country Link
CN (1) CN100388316C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8595281B2 (en) * 2006-01-11 2013-11-26 Qualcomm Incorporated Transforms with common factors
CN103237219A (en) * 2013-04-24 2013-08-07 南京龙渊微电子科技有限公司 Two-dimensional discrete cosine transformation (DCT)/inverse DCT circuit and method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040117418A1 (en) * 2002-12-11 2004-06-17 Leonardo Vainsencher Forward discrete cosine transform engine

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Fast Multiplierless Approximations of the DCT With the Lifting Scheme. Jie Liang et al. IEEE Transactions on Signal Processing, Vol. 49, No. 12. 2001 *
Multiplierless lifting ladder algorithm for the binary integer discrete cosine transform (二进制整数离散余弦变换无乘法提升阶梯算法). Chen Li. Journal of Shantou University (Natural Science Edition), Vol. 19, No. 3. 2004 *
Fast multiplierless DCT algorithm based on look-up tables (基于查找表的无乘法DCT快速算法). Du Xiangwen. Computer Engineering, Vol. 30, No. 20. 2004 *

Also Published As

Publication number Publication date
CN1855149A (en) 2006-11-01

Similar Documents

Publication Publication Date Title
CN101796506B (en) Transform design with scaled and non-scaled interfaces
CN101399989B (en) Method for reduced bit-depth quantization
CN100463522C (en) Improved block transform and quantization for image and video coding
US7127482B2 (en) Performance optimized approach for efficient downsampling operations
RU2429531C2 (en) Transformations with common factors
US5659362A (en) VLSI circuit structure for implementing JPEG image compression standard
Di Nola et al. Łukasiewicz transform and its application to compression and reconstruction of digital images
JP2004038451A (en) Hadamard transformation processing method and device
CN102084594B (en) Method for treating digital data
CN104244010B (en) Improve the method and digital signal converting method and device of digital signal conversion performance
CN100388316C (en) High-precision number cosine converting circuit without multiplier and its conversion
CN103237219A (en) Two-dimensional discrete cosine transformation (DCT)/inverse DCT circuit and method
US20110060433A1 (en) Bilinear algorithms and vlsi implementations of forward and inverse mdct with applications to mp3 audio
US6832232B1 (en) Dual-block inverse discrete cosine transform method
CN100452880C (en) Integral discrete cosine transform method in use for encoding video
Mukherjee et al. Hardware efficient architecture for 2D DCT and IDCT using Taylor-series expansion of trigonometric functions
CN203279074U (en) Two-dimensional discrete cosine transform (DCT)/inverse discrete cosine transform (IDCT) circuit
CN102067108A (en) Fast computation of products by dyadic fractions with sign-symmetric rounding errors
Agostini et al. A FPGA based design of a multiplierless and fully pipelined JPEG compressor
Parfieniuk et al. Short‐critical‐path and structurally orthogonal scaled CORDIC‐based approximations of the eight‐point discrete cosine transform
CN102043605B (en) Multimedia transformation multiplier and processing method thereof
CN101765013B (en) Data transform device and control method thereof
CN102395031B (en) Data compression method
Gustafsson On lifting-based fixed-point complex multiplications and rotations
US6742010B1 (en) Dual-block discrete consine transform method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20161213

Address after: 361015 Xiamen Fujian torch hi tech Zone Pioneering Park Cheng Yip Building Room 201

Patentee after: Xiamen Ziguang exhibition Rui Technology Co. Ltd.

Address before: 201203 Shanghai City Songtao road Pudong Zhangjiang hi tech Park No. 696 3-5

Patentee before: Spreadtrum Communications (Shanghai) Co.,Ltd.