WO1997002535A1 - Image processing circuit and method - Google Patents

Info

Publication number
WO1997002535A1
Authority
WO
WIPO (PCT)
Prior art keywords
transform function
control signals
products
circuit
receive
Prior art date
Application number
PCT/EP1996/002816
Other languages
French (fr)
Inventor
Patrick Clement
Fathy Fouad Yassa
Original Assignee
Motorola Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc.
Priority to EP96923959A (patent EP0799453A1)
Priority to JP9504790A (patent JPH10505445A)
Publication of WO1997002535A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/147 Discrete orthonormal transforms, e.g. discrete cosine transform, discrete sine transform, and variations therefrom, e.g. modified discrete cosine transform, integer transforms approximating the discrete cosine transform
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Abstract

An image processing circuit for processing DCT coefficients includes an input latch (20) for receiving a DCT coefficient and associated coordinates. A multiplying stage (30) receives an unsigned value of the DCT coefficient and multiplies it by a number of DCT functions to produce a number of products. A decoder (50) provides control signals based on the coordinates and the sign of the DCT coefficient. A multiplexer (60) selects a number of pairs of the products in dependence upon the control signals. An accumulator (80) is arranged to store partial picture element values and adds or subtracts each element of the pair to a selected one of the partial picture element values in dependence upon the control signals.

Description

IMAGE PROCESSING CIRCUIT AND METHOD
Field of the Invention
This invention relates to image processing and particularly but not exclusively to compression and decompression of video images.
Background of the Invention
Image compression and decompression techniques, which compress images in order to facilitate efficient transmission, are well known. A number of standards exist which define particular methods, for example the JPEG standard for still images, and the MPEG1 and MPEG2 standards for moving images.
As part of the compression method, typically, Discrete Cosine Transform (DCT) functions are used to transform an image into a number of DCT coefficients. The image is typically sub-divided into a number of nominal blocks containing picture elements, and the functions are performed on the picture elements of each block. Corresponding inverse DCT (I-DCT) functions are used for subsequent decompression of the DCT coefficients to reconstruct the picture elements of the blocks and hence the image.
A dedicated processor which performs the decompression process typically handles the coefficients using butterfly computations or a row-column method to reconstruct the picture elements. These methods require a number of cascaded computations, each of which has specific combinations of operands, which are then multiplied by cosine functions. The first computation processes the coefficients together in first specific combinations to provide intermediate results. These are then subject to a second computation, according to a second specific combination, in order to provide further intermediate results, and so on. The intermediate results must be stored after each computation.
A problem with the above method is that the intermediate results must be reordered in order to achieve the different specific combinations required by each computation, giving rise to processing delay. Furthermore, the multiplication by the cosine functions requires the use of at least one full multiplier within the dedicated processor, which reduces the speed of the processing even further.
This invention seeks to provide an image processing circuit and method which mitigate the above mentioned disadvantages.
Summary of the Invention
According to a first aspect of the present invention there is provided an image processing circuit for processing transform function coefficients to generate an image, the circuit comprising: an input latch for receiving a transform function coefficient and an associated set of predetermined co-ordinates; a multiplying stage coupled to receive an unsigned value of the transform function coefficient and for multiplying the unsigned value by a plurality of transform functions to produce a plurality of products; a decoder coupled to receive the predetermined coordinates and the sign of the transform function coefficient from the latch, for providing control signals; a multiplexer coupled to receive the plurality of products, for selecting a number of pairs of products in dependence upon the control signals; and, an accumulator arranged to store partial picture element values and coupled to receive the number of pairs and with respect to each of the number, for adding or subtracting each element of the pair to a selected one of the partial picture element values in dependence upon the control signals.
Preferably the multiplying stage comprises a number of shifters, a number of adder/subtractors and a number of multipliers in order to provide the plurality of products. The decoder preferably further comprises a plurality of cells, each cell providing a partial control signal.
According to a second aspect of the invention there is provided a method for processing a set of transform function coefficients to produce picture elements of an image, comprising the steps of: receiving a transform function coefficient and an associated set of predetermined coordinates; multiplying an unsigned value of the transform function coefficient by a plurality of transform functions to produce a plurality of products; decoding the predetermined coordinates and the sign of the transform function coefficient to provide control signals; selecting a number of pairs from the plurality of products in dependence upon the control signals; adding or subtracting each element of the pair to a selected one of the partial picture element values in dependence upon the control signals, whereby processing the last transform function coefficient of the set produces the picture elements from the partial picture element values.
Preferably the transform is an inverse discrete cosine transform. Alternatively the transform is preferably an inverse discrete sine transform.
In this way the processing delay associated with the prior art is substantially avoided.
Brief Description of the Drawings
An exemplary embodiment of the invention will now be described with reference to the drawings in which:
FIG.1 shows a preferred embodiment of an image processing circuit in accordance with the invention.
FIG.2 shows a multiplying stage forming part of the circuit of FIG.1.
FIG.3 shows a decoder cell forming part of the circuit of FIG.1.
Detailed Description of a Preferred Embodiment
The inverse discrete cosine transform function (I-DCT) may be used to transform DCT coefficients into a two-dimensional image. Such coefficients use less space when stored or transmitted electronically. Typically the transform function is performed on a block of DCT coefficients, according to the following equation:-
$$f(j,k) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\, C(v)\, F(u,v) \cos\frac{(2j+1)u\pi}{2N} \cos\frac{(2k+1)v\pi}{2N}$$ Equation 1
where f(j,k) is a reconstructed picture element for each j and k,
N is the size of the block on which the I-DCT is performed,
u and v are the transform domain coordinates of the DCT coefficients,
j and k are the spatial coordinates of the picture elements,
F(u,v) is a DCT coefficient for each u and v, and
C(u) and C(v) are constants, being:
$$C(u), C(v) = \frac{1}{\sqrt{2}} \text{ for } u, v = 0, \qquad C(u), C(v) = 1 \text{ otherwise.}$$
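As a point of reference for the circuit described later, the transform of equation 1 can be evaluated directly in software. The following is a minimal sketch only, assuming the conventional 2/N scale factor and the conventional constants C(0) = 1/sqrt(2) and C(w) = 1 otherwise; the names `c`, `idct_2d`, `dc_only` and `flat` are illustrative and do not come from the specification.

```python
import math

def c(w):
    """Normalisation constant: 1/sqrt(2) for index 0, otherwise 1 (assumed convention)."""
    return 1.0 / math.sqrt(2.0) if w == 0 else 1.0

def idct_2d(F):
    """Direct evaluation of the inverse DCT for an N x N block F[u][v]."""
    n = len(F)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (c(u) * c(v) * F[u][v]
                              * math.cos((2 * j + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * k + 1) * v * math.pi / (2 * n)))
            out[j][k] = 2.0 / n * total
    return out

# A block holding only a DC coefficient reconstructs to a flat image.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 8.0
flat = idct_2d(dc_only)
```

This direct form needs a full multiplication per term, which is the cost the circuit is intended to avoid.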
Taking N=8 as the block size, according to the MPEG2 standard, and using trigonometric manipulation, equation 1 becomes:-
$$f(j,k) = \frac{1}{8} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v) \left[ \cos\frac{\left((2j+1)u + (2k+1)v\right)\pi}{16} + \cos\frac{\left((2j+1)u - (2k+1)v\right)\pi}{16} \right]$$ Equation 2
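The trigonometric manipulation used here is the product-to-sum identity cos A cos B = (cos(A+B) + cos(A-B))/2, which also accounts for the scale factor changing from 2/N = 1/4 to 1/8. A short numerical check, offered as an illustration with hypothetical function names, is:

```python
import math

def product_form(j, k, u, v):
    """Cosine product appearing in equation 1 with N = 8."""
    return (math.cos((2 * j + 1) * u * math.pi / 16)
            * math.cos((2 * k + 1) * v * math.pi / 16))

def sum_form(j, k, u, v):
    """Half the sum of the two cosines appearing in equation 2."""
    a = (2 * j + 1) * u
    b = (2 * k + 1) * v
    return 0.5 * (math.cos((a + b) * math.pi / 16)
                  + math.cos((a - b) * math.pi / 16))

# Largest disagreement between the two forms over all coordinates of a block.
max_error = max(abs(product_form(j, k, u, v) - sum_form(j, k, u, v))
                for j in range(8) for k in range(8)
                for u in range(8) for v in range(8))
```

The two forms agree to floating-point precision for every combination of j, k, u and v in an 8 x 8 block.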
This may also be written as:-
$$f(j,k) = \frac{1}{8} \sum_{u=0}^{7} \sum_{v=0}^{7} (-1)^{sf} \left[ (-1)^{sa}\, za\, \mathrm{muxa} + (-1)^{sb}\, zb\, \mathrm{muxb} \right]$$ Equation 3
where sf, sa, sb, za, and zb are either 0 or 1, as functions of u, v, j and k,
and where
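$$\begin{aligned} \mathrm{muxa} ={} & \overline{b2a} \cdot \overline{b1a} \cdot \left( \overline{ip}\, |F(u,v)| + ip\, |F(u,v)| \cos\frac{\pi}{16} \right) \\ & + \overline{b2a} \cdot b1a \cdot \left( \overline{ip}\, |F(u,v)| \cos\frac{2\pi}{16} + ip\, |F(u,v)| \cos\frac{3\pi}{16} \right) \\ & + b2a \cdot \overline{b1a} \cdot \left( \overline{ip}\, |F(u,v)| \cos\frac{4\pi}{16} + ip\, |F(u,v)| \cos\frac{5\pi}{16} \right) \\ & + b2a \cdot b1a \cdot \left( \overline{ip}\, |F(u,v)| \cos\frac{6\pi}{16} + ip\, |F(u,v)| \cos\frac{7\pi}{16} \right) \end{aligned}$$ Equation 4
$$\begin{aligned} \mathrm{muxb} ={} & \overline{b2b} \cdot \overline{b1b} \cdot \left( \overline{ip}\, |F(u,v)| + ip\, |F(u,v)| \cos\frac{\pi}{16} \right) \\ & + \overline{b2b} \cdot b1b \cdot \left( \overline{ip}\, |F(u,v)| \cos\frac{2\pi}{16} + ip\, |F(u,v)| \cos\frac{3\pi}{16} \right) \\ & + b2b \cdot \overline{b1b} \cdot \left( \overline{ip}\, |F(u,v)| \cos\frac{4\pi}{16} + ip\, |F(u,v)| \cos\frac{5\pi}{16} \right) \\ & + b2b \cdot b1b \cdot \left( \overline{ip}\, |F(u,v)| \cos\frac{6\pi}{16} + ip\, |F(u,v)| \cos\frac{7\pi}{16} \right) \end{aligned}$$ Equation 5
where b2a, b1a, b2b, b1b = 0 or 1, as functions of u, v, j and k, ip = 1 when u and v have opposite parities and ip = 0 when u and v have like parities, and |F(u,v)| represents the unsigned value of a DCT coefficient.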
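The selection expressed by equations 4 and 5 can be read as a table lookup: two selector bits choose one of four bracketed terms, and ip chooses between the two products inside that term. The sketch below illustrates this reading under the assumption that the selector bits are taken in natural binary order, so that the chosen product is |F(u,v)| cos(m pi/16) with m = 4*b2 + 2*b1 + ip; that ordering and the names `mux_value` and `parity_flag` are assumptions made for illustration.

```python
import math

def mux_value(f_abs, b2, b1, ip):
    """Select |F| * cos(m*pi/16) with m = 4*b2 + 2*b1 + ip.

    Assumption: the selector bits are in natural binary order, so
    (b2, b1) = (0, 0) picks the first bracketed term of the equation.
    """
    even_terms = [f_abs * math.cos(n * math.pi / 16) for n in (0, 2, 4, 6)]
    odd_terms = [f_abs * math.cos(n * math.pi / 16) for n in (1, 3, 5, 7)]
    row = 2 * b2 + b1
    return odd_terms[row] if ip else even_terms[row]

def parity_flag(u, v):
    """ip = 1 when u and v have opposite parities, else 0."""
    return (u ^ v) & 1
```

For example, `mux_value(10.0, 0, 0, 0)` returns the unmodified magnitude, corresponding to the term in which ip = 0 and both selector bits are 0.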
Therefore the I-DCT may be performed with the computation of just seven products:-
$$|F(u,v)| \cdot \cos\frac{n\pi}{16}, \quad \text{with } n = 1 \text{ to } 7.$$
F(u,v) is used in all seven cases and the cosine functions are fixed, therefore the seven multiplications may be replaced by a small number of additions and subtractions.
Expressing the cosine functions with binary weights and limiting to 6 bits, we have :-
$$|F(u,v)| \cdot \cos\frac{\pi}{16} = |F(u,v)| - \frac{|F(u,v)|}{2^6}$$ Equation 6
[Equations 7 to 12: binary-weighted expansions of the remaining six products |F(u,v)| · cos(nπ/16); images not reproduced]
It will be appreciated that more or less than 6 bits may be used, depending on the accuracy required.
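To illustrate how fixed cosine factors can be replaced by shifts, additions and subtractions, the sketch below approximates each product with sums of right-shifted copies of the magnitude. Only the n = 1 entry (the magnitude minus the magnitude divided by 2^6) is taken from the text; the other decompositions are plausible 6-bit choices made up for this example and should not be read as the expansions used in the specification. The names `ADD_SHIFTS`, `approx_product` and `worst_relative_error` are hypothetical.

```python
import math

# For each n, (shifts to add, shifts to subtract); a shift of 0 is |F| itself.
# Only n = 1 comes from the text; the rest are illustrative assumptions.
ADD_SHIFTS = {
    1: ((0,), (6,)),          # 1 - 1/64
    2: ((0,), (4, 6)),        # 1 - 1/16 - 1/64
    3: ((1, 2, 4, 6), ()),    # 1/2 + 1/4 + 1/16 + 1/64
    4: ((1, 3, 4, 6), ()),    # 1/2 + 1/8 + 1/16 + 1/64
    5: ((1, 4), ()),          # 1/2 + 1/16
    6: ((2, 3), ()),          # 1/4 + 1/8
    7: ((3, 4), ()),          # 1/8 + 1/16
}

def approx_product(f_abs, n):
    """Approximate f_abs * cos(n*pi/16) for an unsigned integer f_abs."""
    plus, minus = ADD_SHIFTS[n]
    return sum(f_abs >> s for s in plus) - sum(f_abs >> s for s in minus)

def worst_relative_error(f_abs=1 << 12):
    """Largest error over n = 1..7, as a fraction of f_abs."""
    return max(abs(approx_product(f_abs, n) - f_abs * math.cos(n * math.pi / 16)) / f_abs
               for n in range(1, 8))
```

With these particular weights only five distinct shift amounts (1, 2, 3, 4 and 6) are needed, and the worst-case error stays below one part in 64.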
Referring to FIG.1, there is shown an image processing circuit 10. A latch arrangement 20 of the circuit 10 comprises a first latch 22 coupled to receive first and second coordinate values via input terminals 14 and 16 respectively, and a second latch 24, coupled to receive transform function values via an input terminal 12.
A supermultiplier block 30 is coupled to receive the latched transform function values from the second latch 24, for providing a number of weighted output values for each transform function value |F(u,v)|.
Referring now also to FIG.2, the supermultiplier 30 is shown in greater detail. An input 100 of the supermultiplier 30 is coupled to receive the digital value |F(u,v)|. In digital processing, |F(u,v)| is provided in binary form. Divisions by powers of two are therefore easily performed by shifting the corresponding digital word to the right. Five shifters 102, 104, 106, 108 and 110 are coupled to the input terminal and are arranged to perform the powers of two divisions. The seven required multiplications of equations 6-12 above may then be simply replaced by seven additions and two subtractions, which are performed in the supermultiplier 30 by seven adders 122, 124, 126, 128, 130, 142 and 144, and two subtractors 120 and 140.
A control input terminal 160 provides an ip value corresponding to the ip coefficient of equations 4 and 5. Four multiplexers 150, 152, 154 and 156 are coupled to receive output values from the adders and subtractors 120-144 and also to receive the ip value from the control input terminal 160.
The subtractor 120 is coupled to receive the |F(u,v)| value from the input terminal 100 and a shifted value of |F(u,v)| from the shifter 110, which corresponds to a division by 2^6. The multiplexer 150 receives the |F(u,v)| value from the input terminal 100 and the calculated value from the subtractor 120. Under the control of the ip value, the multiplexer 150 is therefore able to provide the value |F(u,v)| and the value of equation 6, in the combination required by the expression in parentheses on the first lines of equations 4 and 5.
Similarly the multiplexers 152, 154 and 156 provide the appropriate combinations of derived |F(u,v)| values in order to achieve the remaining expressions in parentheses of equations 4 and 5.
A register 40 of FIG. 1 is coupled to receive the combinations of derived |F(u,v)| values from the supermultiplier 30, for providing the values under the control of a clocking signal.
A multiplexer block 60 is coupled to receive the clocked values from the register 40, and to receive control values to be further described below, for multiplexing the values in accordance with the expressions in parentheses with the control values in order to provide muxa and muxb values in accordance with equations 4 and 5. The control values are the calculated values b2b, b1b, b2a and b1a in accordance with equations 4 and 5.
An adder/subtractor block 70 is coupled to receive the muxa and muxb values from the multiplexer 60, and to receive control values from the control block 50, for performing the addition or subtraction of the muxa and the muxb calculations according to the expression in square parentheses of equation 3. The control values received by the adder/subtractor block 70 correspond to the values sf, sa, sb, za and zb.
A control block 50 provides control values to the supermultiplier 30, the multiplexer 60 and the adder/subtractor 70. The u and v values may be expressed as:-
u = u2 u1 u0 and v = v2 v1 v0, where u0 and v0 are the least significant bits of u and v.
Then the ip value may be calculated as:-
ip = u0 XOR v0 Equation 13
A parity and sign register 58 of the control block 50 is coupled to receive the u and v values from the u and v latch 22, and a value from the latch 24 in order to derive the ip values according to equation 13, to be used by the supermultiplier 30.
A cosine decoder 52 is also coupled to receive the u and v values from the latch 22, for providing the control values for the multiplexer 60 and the adder/subtractor 70. The cosine decoder 52 comprises a number of cells 54, to be further described below. The control values for the multiplexer 60 consist of calculated values b2b, b1b, b2a and b1a in accordance with equations 4 and 5. The control values for the adder/subtractor 70 consist of the values sf, sa, sb, za and zb.
The control values b2b, b1b, b2a and b1a are related to C(u), C(v) and to the expressions (2j+1)u + (2k+1)v and (2j+1)u - (2k+1)v. As the cosine function is periodic with a period of 2π, the above expressions have only to be computed modulo 32.
The cosine decoder 52 thus performs the following computations:-
(2j+1)uc + (2k+1)vc for the range j=0 to 7 and k=0 to 7 Equation 14
(2j+1)uc - (2k+1)vc for the range j=0 to 7 and k=0 to 7 Equation 15
where uc = u for u ≠ 0, and uc = 4 for u = 0, and vc = v for v ≠ 0, and vc = 4 for v = 0.
The substitutions of uc and vc for u and v are to take account of the effect of C(u) and C(v).
Due to trigonometric properties, it is necessary only to perform the calculations of equations 14 and 15 in the ranges j = 0, 1, 2 and 3 and k = 0, 1, 2 and 3, which results in 32 expressions :-
uc + vc     3uc + vc     5uc + vc     7uc + vc
uc + 3vc    3uc + 3vc    5uc + 3vc    7uc + 3vc
uc + 5vc    3uc + 5vc    5uc + 5vc    7uc + 5vc
uc + 7vc    3uc + 7vc    5uc + 7vc    7uc + 7vc

uc - vc     3uc - vc     5uc - vc     7uc - vc
uc - 3vc    3uc - 3vc    5uc - 3vc    7uc - 3vc
uc - 5vc    3uc - 5vc    5uc - 5vc    7uc - 5vc
uc - 7vc    3uc - 7vc    5uc - 7vc    7uc - 7vc
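The 32 expressions can be enumerated in a few lines. The sketch below, with hypothetical names `substitute` and `decoder_words`, applies the uc and vc substitution and reduces each sum and difference modulo 32 so that every result fits in a 5 bit word; it is an illustration of the arithmetic, not a description of the hardware.

```python
def substitute(w):
    """uc / vc substitution: 0 is replaced by 4, other values are unchanged."""
    return 4 if w == 0 else w

def decoder_words(u, v):
    """Return the 32 five-bit words for one (u, v) pair.

    Keys are (sign, j, k) with sign '+' or '-' and j, k in 0..3.
    """
    uc, vc = substitute(u), substitute(v)
    words = {}
    for j in range(4):
        for k in range(4):
            a = (2 * j + 1) * uc
            b = (2 * k + 1) * vc
            words[('+', j, k)] = (a + b) % 32
            words[('-', j, k)] = (a - b) % 32   # Python's % keeps this in 0..31
    return words
```

For u = v = 0, for instance, the entry for uc + vc is 8 and the entry for uc - vc is 0.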
Referring now also to FIG.3, a cell 54 of the cosine decoder 52 is shown in more detail. Each cell performs one of the 32 expressions above. Input terminals 202 and 204 are coupled to receive the u and v values from the latch 22. An adder 210 calculates the expression, the result being a 5 bit word. Using the terminology expressed above, the five bit word contains bits b4, b3, b2, b1 and b0, where b0 is the least significant bit.
A two's complement block 220 is coupled to receive the 5 bit word, and calculates the two's complement of that word, which will be referred to as the word composed of bits b4c, b3c, b2c, b1c and b0c.
A multiplexer 230 is coupled to receive the two's complemented word, and the 5 bit word from the adder 210, for providing the following products at outputs 254 and 256 respectively:-
$$(b2c \cdot b3) + (b2 \cdot \overline{b3})$$ Equation 16
$$(b1c \cdot b3) + (b1 \cdot \overline{b3})$$ Equation 17
Logic gates 240 and 242 are arranged to receive the two's complemented word, the 5 bit word from the adder 210, and a further input from input terminal 244, for providing the following products at outputs 250 and 252 respectively:-
sa = b4 XOR b3 Equation 18
za = b3 · b3c Equation 19
In this case input terminal 244 receives a zero value. Other cells in the cosine decoder 52 are arranged to receive values appropriate for the calculations of that cell, and to provide the other values sb, zb.
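The way a 5 bit angle index can be reduced to a sign, a zero flag and a two-bit product selector can be checked in software. The following sketch is an interpretation, not the circuit itself: it assumes that the multiplexer picks the two's complemented bits when b3 = 1 and the original bits otherwise, and it uses a flag that is 1 for a non-zero cosine (so that it can multiply the selected product), whereas b3 AND b3c on its own is 1 exactly for the two zero-cosine words 8 and 24. The function names are hypothetical.

```python
import math

def decode_cell(word):
    """Derive (sign, nonzero, sel2, sel1) from a 5-bit angle index.

    Assumptions: the select terms use the complement of b3 for the
    uncomplemented bits, and 'nonzero' is the inverse of b3 & b3c.
    """
    word &= 0x1F
    comp = (-word) & 0x1F                      # 5-bit two's complement
    b4, b3 = (word >> 4) & 1, (word >> 3) & 1
    b2, b1 = (word >> 2) & 1, (word >> 1) & 1
    b3c = (comp >> 3) & 1
    b2c, b1c = (comp >> 2) & 1, (comp >> 1) & 1
    sel2 = (b2c & b3) | (b2 & (1 - b3))
    sel1 = (b1c & b3) | (b1 & (1 - b3))
    sign = b4 ^ b3                             # 1 when the cosine is negative
    nonzero = 1 - (b3 & b3c)                   # 0 only for words 8 and 24
    return sign, nonzero, sel2, sel1

def rebuilt_cosine(word):
    """Rebuild cos(word*pi/16) from the decoded control values."""
    sign, nonzero, sel2, sel1 = decode_cell(word)
    m = 4 * sel2 + 2 * sel1 + (word & 1)       # index of the product, 0..7
    return (-1) ** sign * nonzero * math.cos(m * math.pi / 16)
```

Under these assumptions `rebuilt_cosine(n)` matches `math.cos(n * math.pi / 16)` for every n from 0 to 31, which shows that only the products for angles 0 to 7π/16 are ever needed.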
A cosine register 56 is arranged to receive and hold the control values output from the cosine decoder 52, for providing the control values to the multiplexer 60 and the adder/subtractor 70 under the control of the clock signal.
An adder/accumulator 80 receives the muxa, muxb values according to the expressions in square parentheses of equation 3 from the adder/subtractor 70. The values output from the adder/subtractor 70 are required to be added and accumulated over the range u = 0 to 7 and v = 0 to 7.
However, this would require the adder/accumulator 80 to be very large. In order to reduce the number of adder stages within the adder/accumulator 80, four partial sums are performed instead:-
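$$f_a(j,k) = \frac{1}{8} \sum_{u\ \mathrm{even}} \sum_{v\ \mathrm{even}} (-1)^{sf} \left[ (-1)^{sa}\, za\, \mathrm{muxa} + (-1)^{sb}\, zb\, \mathrm{muxb} \right]$$ Equation 20
$$f_b(j,k) = \frac{1}{8} \sum_{u\ \mathrm{even}} \sum_{v\ \mathrm{odd}} (-1)^{sf} \left[ (-1)^{sa}\, za\, \mathrm{muxa} + (-1)^{sb}\, zb\, \mathrm{muxb} \right]$$ Equation 21
$$f_c(j,k) = \frac{1}{8} \sum_{u\ \mathrm{odd}} \sum_{v\ \mathrm{even}} (-1)^{sf} \left[ (-1)^{sa}\, za\, \mathrm{muxa} + (-1)^{sb}\, zb\, \mathrm{muxb} \right]$$ Equation 22
$$f_d(j,k) = \frac{1}{8} \sum_{u\ \mathrm{odd}} \sum_{v\ \mathrm{odd}} (-1)^{sf} \left[ (-1)^{sa}\, za\, \mathrm{muxa} + (-1)^{sb}\, zb\, \mathrm{muxb} \right]$$ Equation 23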
These partial calculations are completed once all 64 DCT coefficients, F(u,v) of one block have been processed, the partial results being the final values stored in the elements of the adder/accumulator 80. Then the partial results are output to an output stage 90.
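The accumulation into four partial sums can be sketched as follows. For simplicity the example uses exact cosine terms in place of the muxa and muxb values and assumes the conventional constants C(0) = 1/sqrt(2) and C(w) = 1 otherwise; the names `term`, `partial_sums` and `pixel` are hypothetical. Each coefficient is added into the 4 x 4 store selected by the parities of u and v, and the four stores together reproduce the picture elements for j and k from 0 to 3.

```python
import math

def c(w):
    """Normalisation constant (assumed convention)."""
    return 1.0 / math.sqrt(2.0) if w == 0 else 1.0

def term(F, u, v, j, k):
    """Contribution of coefficient F[u][v] to picture element (j, k), N = 8."""
    return (0.25 * c(u) * c(v) * F[u][v]
            * math.cos((2 * j + 1) * u * math.pi / 16)
            * math.cos((2 * k + 1) * v * math.pi / 16))

def partial_sums(F):
    """Accumulate four 4x4 partial sums keyed by (u parity, v parity)."""
    acc = {(pu, pv): [[0.0] * 4 for _ in range(4)]
           for pu in (0, 1) for pv in (0, 1)}
    for u in range(8):
        for v in range(8):
            for j in range(4):
                for k in range(4):
                    acc[(u & 1, v & 1)][j][k] += term(F, u, v, j, k)
    return acc

def pixel(F, j, k):
    """Full inverse transform for one picture element, for comparison."""
    return sum(term(F, u, v, j, k) for u in range(8) for v in range(8))
```

Grouping by parity is useful because every term in a given group behaves in the same way when j is replaced by 7-j or k by 7-k, so a 4 x 4 store per group is enough to serve the whole 8 x 8 block.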
The output stage 90 comprises a partial sums latch 92 and an output adder 96. The partial sums latch latches the partial results received from the adder/accumulator 80, at which point the adder/accumulator 80 may be reset, ready to receive the muxa, muxb values from the next block to be processed.
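The accumulate, latch and reset behaviour can be modelled in a few lines. This is a behavioural sketch only, not a description of the hardware: it assumes that sf, sa and sb are one-bit sign flags and that za and zb are 0/1 flags that zero a term, which is one reading of the bracketed expression in Equations 20 to 23. The class name, the method names and the `parity_class` index (0 to 3, standing for the a, b, c and d sums) are hypothetical.

```python
class PartialSumAccumulator:
    """Software model of four partial sums for one picture element."""

    def __init__(self):
        self.sums = [0, 0, 0, 0]  # a, b, c, d

    def add_term(self, parity_class, sf, sa, za, muxa, sb, zb, muxb):
        # Bracketed term: each product is optionally zeroed and sign-flipped.
        term = (-1) ** sa * za * muxa + (-1) ** sb * zb * muxb
        # The coefficient sign sf applies to the whole bracket.
        self.sums[parity_class] += (-1) ** sf * term

    def latch_and_reset(self):
        """Return the partial results and clear the sums for the next block."""
        latched = tuple(self.sums)
        self.sums = [0, 0, 0, 0]
        return latched
```

In this model, latching copies the results out before the reset, so the accumulator can start on the next block while the latched values are still being combined.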
The output adder 96 is arranged to combine the partial sums stored in the latch 92, according to their j and k coordinates as follows:-
f(j,k) = fa(j,k) + fb(j,k) + fc(j,k) + fd(j,k) for j=0 to 3 and k=0 to 3
f(j,k) = fa(j,7-k) + fb(j,7-k) + fc(j,7-k) + fd(j,7-k) for j=0 to 3 and k=4 to 7
f(j,k) = fa(7-j,k) + fb(7-j,k) + fc(7-j,k) + fd(7-j,k) for j=4 to 7 and k=0 to 3
f(j,k) = fa(7-j,7-k) + fb(7-j,7-k) + fc(7-j,7-k) + fd(7-j,7-k) for j=4 to 7 and k=4 to 7
In this way the DCT coefficients of the block are processed according to an inverse transform method, to reconstruct the picture elements of the block.
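The reuse of one stored quadrant for all 64 picture elements can be checked numerically. The sketch below again assumes the textbook 8x8 inverse DCT, for which cos((2(7-j)+1)u·pi/16) = (-1)^u cos((2j+1)u·pi/16), so a mirrored position differs from the stored one only in the sign of the odd-indexed partial sums. The expressions above are written with plus signs throughout; applying the signs explicitly at the combining step, as done here, is an assumption of this sketch about where the sign is handled. All function names are hypothetical.

```python
import math

N = 8
HALF = N // 2


def basis(j, u):
    """Textbook IDCT cosine term, including the C(u) scale factor (assumed)."""
    c = 1 / math.sqrt(2) if u == 0 else 1.0
    return c * math.cos((2 * j + 1) * u * math.pi / (2 * N))


def idct_direct(F, j, k):
    """Reference value: full sum over all 64 coefficients."""
    return 0.25 * sum(F[u][v] * basis(j, u) * basis(k, v)
                      for u in range(N) for v in range(N))


def quadrant_partials(F):
    """P[j][k] = [fa, fb, fc, fd], computed for j, k in 0..3 only."""
    P = [[[0.0] * 4 for _ in range(HALF)] for _ in range(HALF)]
    for j in range(HALF):
        for k in range(HALF):
            for u in range(N):
                for v in range(N):
                    idx = 2 * (u % 2) + (v % 2)  # 0: a, 1: b, 2: c, 3: d
                    P[j][k][idx] += 0.25 * F[u][v] * basis(j, u) * basis(k, v)
    return P


def combine(P):
    """Rebuild all 64 picture elements from the 16 stored positions."""
    out = [[0.0] * N for _ in range(N)]
    for j in range(N):
        for k in range(N):
            # Mirrored rows flip the sign of the u-odd sums (fc, fd).
            jj, su = (j, 1) if j < HALF else (N - 1 - j, -1)
            # Mirrored columns flip the sign of the v-odd sums (fb, fd).
            kk, sv = (k, 1) if k < HALF else (N - 1 - k, -1)
            fa, fb, fc, fd = P[jj][kk]
            out[j][k] = fa + sv * fb + su * fc + su * sv * fd
    return out
```

Only 16 positions with four sums each are accumulated, which is the saving in adder stages that the partial-sum arrangement is aiming at.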
The clock signal controls the clocking of the first and second latches 22 and 24, the register 40 and the adder/accumulator 80. The provision of registers and latches in the circuit 10 allows for pipelined operations within the circuit 10, which increases the speed and efficiency of the image processing.
It will be appreciated by a person skilled in the art that alternative embodiments to the one described above are possible. For example, the circuit is not limited to inverse discrete cosine transform functions. It is envisaged that the circuit 10 could be simply adapted to process images according to other functions, such as the inverse discrete sine transform function.
Furthermore, the supermultiplier 30 may be arranged to perform substantially the same function, but using different elements, such as the use of fewer addition and subtraction blocks, in exchange for more mux blocks.
Finally, it will be appreciated that this method may be applied to block sizes other than N=8, provided that N remains a power of 2.

Claims

1. An image processing circuit, for processing transform function coefficients to generate an image, the circuit comprising:
an input latch for receiving a transform function coefficient and an associated set of predetermined coordinates;
a multiplying stage coupled to receive an unsigned value of the transform function coefficient and for multiplying the unsigned value by a plurality of transform functions to produce a plurality of products;
a decoder coupled to receive the predetermined coordinates and the sign of the transform function coefficient from the latch, for providing control signals;
a multiplexer coupled to receive the plurality of products, for selecting a number of pairs of products in dependence upon the control signals; and,
an accumulator arranged to store partial picture element values and coupled to receive the number of pairs and with respect to each of the number, for adding or subtracting each element of the pair to a selected one of the partial picture element values in dependence upon the control signals.
2. A circuit as claimed in claim 1 wherein the multiplying stage comprises a number of shifters, a number of adder/subtractors and a number of multiplexers in order to provide the plurality of products.
3. A circuit as claimed in claim 1 or claim 2 wherein the decoder further comprises a plurality of cells, each cell providing a partial control signal.
4. A circuit as claimed in any preceding claim wherein the transform function is an inverse discrete sine transform function.
5. A circuit as claimed in any of the claims 1 to 3 wherein the transform function is an inverse discrete cosine transform function.
6. A method for processing a set of transform function coefficients to produce picture elements of an image, comprising the steps of:
receiving a transform function coefficient and an associated set of predetermined coordinates;
multiplying an unsigned value of the transform function coefficient by a plurality of transform functions to produce a plurality of products;
decoding the predetermined coordinates and the sign of the transform function coefficient to provide control signals;
selecting a number of pairs from the plurality of products in dependence upon the control signals;
adding or subtracting each element of the pair to a selected one of the partial picture element values in dependence upon the control signals, whereby processing the last transform function coefficient of the set produces the picture elements from the partial picture element values.
7. A method as claimed in claim 6 wherein the transform function is an inverse discrete cosine transform function.
8. A method as claimed in claim 6 wherein the transform function is an inverse discrete sine transform function.
9. An image processing circuit substantially as hereinbefore described and with reference to the drawings.
10. An image processing method substantially as hereinbefore described and with reference to the drawings.
PCT/EP1996/002816 1995-07-01 1996-06-27 Image processing circuit and method WO1997002535A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP96923959A EP0799453A1 (en) 1995-07-01 1996-06-27 Image processing circuit and method
JP9504790A JPH10505445A (en) 1995-07-01 1996-06-27 Image processing circuit and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9513431A GB2303012B (en) 1995-07-01 1995-07-01 Image processing circuit and method
GB9513431.8 1995-07-01

Publications (1)

Publication Number Publication Date
WO1997002535A1 true WO1997002535A1 (en) 1997-01-23

Family

ID=10776982

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1996/002816 WO1997002535A1 (en) 1995-07-01 1996-06-27 Image processing circuit and method

Country Status (5)

Country Link
EP (1) EP0799453A1 (en)
JP (1) JPH10505445A (en)
GB (1) GB2303012B (en)
HK (1) HK1013699A1 (en)
WO (1) WO1997002535A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0447269A2 (en) * 1990-03-16 1991-09-18 Fujitsu Limited An image data processing system
EP0474246A2 (en) * 1990-09-06 1992-03-11 Matsushita Electric Industrial Co., Ltd. Image signal processor
EP0572262A2 (en) * 1992-05-28 1993-12-01 C-Cube Microsystems, Inc. Decoder for compressed video signals
US5276784A (en) * 1990-12-28 1994-01-04 Sony Corporation 2-D discrete cosine transform circuit with reduced number of multipliers
US5426462A (en) * 1993-05-13 1995-06-20 Intel Corporation Apparatus for encoding signals using a configurable transform circuit


Also Published As

Publication number Publication date
GB2303012B (en) 1999-11-03
GB2303012A (en) 1997-02-05
GB9513431D0 (en) 1995-09-06
JPH10505445A (en) 1998-05-26
HK1013699A1 (en) 1999-09-03
EP0799453A1 (en) 1997-10-08


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

ENP Entry into the national phase

Ref country code: US

Ref document number: 1997 750976

Date of ref document: 19970226

Kind code of ref document: A

Format of ref document f/p: F

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1996923959

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1996923959

Country of ref document: EP

WWR Wipo information: refused in national office

Ref document number: 1996923959

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1996923959

Country of ref document: EP