US20210014090A1 - Time Domain Discrete Transform Computation - Google Patents
- Publication number
- US20210014090A1 (application US16/508,796)
- Authority
- US
- United States
- Prior art keywords
- signals
- pulse width
- signal
- increment signal
- counters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L25/00—Baseband systems
- H04L25/38—Synchronous or start-stop systems, e.g. for Baudot code
- H04L25/40—Transmitting circuits; Receiving circuits
- H04L25/49—Transmitting circuits; Receiving circuits using code conversion at the transmitter; using predistortion; using insertion of idle bits for obtaining a desired frequency spectrum; using three or more amplitude levels ; Baseband coding techniques specific to data transmission systems
- H04L25/4902—Pulse width modulation; Pulse position modulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/145—Square transforms, e.g. Hadamard, Walsh, Haar, Hough, Slant transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/14—Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
- G06F17/148—Wavelet transforms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/63—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/70—SSIS architectures; Circuits associated therewith
- H04N25/71—Charge-coupled device [CCD] sensors; Charge-transfer registers specially adapted for CCD sensors
- H04N25/75—Circuitry for providing, modifying or processing image signals from the pixel array
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/70—SSIS architectures; Circuits associated therewith
- H04N25/76—Addressed sensors, e.g. MOS or CMOS sensors
- H04N25/78—Readout circuits for addressed sensors, e.g. output amplifiers or A/D converters
-
- H04N5/378—
Definitions
- the present invention relates generally to a system and method for signal processing designs, and, in particular embodiments, to a transform computation apparatus and method.
- a transform is a mathematical operation which maps signals between two different domains.
- the Discrete Fourier Transform maps sampled time domain signals to the frequency domain signals.
- the changed signal properties after a transform operation can be exploited for various purposes such as analysis of the signals.
- Many types of transforms exist.
- the Discrete Cosine Transform (DCT), which is the basis for Joint Photographic Experts Group (JPEG) compression, exploits the 2D image signal sparsity in the transformed domain.
- JPEG: Joint Photographic Experts Group
- DWT: Discrete Wavelet Transform
- the Haar Wavelet Transform is one form of the DWT.
- a first counter of a plurality of counters of an apparatus receives a plurality of pulse width signals in the time domain.
- the first counter generates a first increment signal in the time domain from the plurality of pulse width signals based on a first row of a Discrete Transform matrix.
- a synchronizer of the apparatus receives the first increment signal.
- the synchronizer generates a first synchronized increment signal in the time domain from the first increment signal.
- a first accumulator of a plurality of accumulators of the apparatus receives the first synchronized increment signal.
- the first accumulator accumulates the first synchronized increment signal over a period of time to generate a first frequency domain signal.
- the plurality of counters may further comprise a second counter.
- the second counter may receive the plurality of pulse width signals in the time domain.
- the second counter may then generate a second increment signal in the time domain from the plurality of pulse width signals based on a second row of the Discrete Transform matrix.
- the synchronizer may receive the second increment signal.
- the synchronizer may then generate a second synchronized increment signal in the time domain from the second increment signal.
- the plurality of accumulators may further comprise a second accumulator.
- the second accumulator may receive the second synchronized increment signal.
- the second accumulator may then accumulate the second synchronized increment signal over the period of time to generate a second frequency domain signal.
- the number of the plurality of pulse width signals may equal a number of the plurality of counters.
- the plurality of counters may include N counters including the first counter, and an i-th counter of the plurality of counters may receive the plurality of pulse width signals in the time domain.
- the i-th counter may generate an i-th increment signal in the time domain from the plurality of pulse width signals based on an i-th row of the Discrete Transform matrix.
- the Discrete Transform matrix may be an N × N Discrete Transform matrix.
- the synchronizer may receive the i-th increment signal.
- the synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal.
- the plurality of accumulators may further comprise N accumulators including the first accumulator.
- An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal.
- the i-th accumulator may then accumulate the i-th synchronized increment signal over the period of time to generate an i-th frequency domain signal.
- the number of the plurality of counters may equal one of 4, 8, or 16. In some embodiments, the plurality of counters may process the plurality of pulse width signals in parallel. In some embodiments, the number of the plurality of counters may equal 4.
- the first row of the Discrete Transform matrix may be [1, 1, 1, 1].
- the plurality of pulse width signals may comprise a first pulse width signal, a second pulse width signal, a third pulse width signal, and a fourth pulse width signal.
- the first increment signal may comprise an addition of the first pulse width signal, the second pulse width signal, the third pulse width signal, and the fourth pulse width signal in the time domain.
- the apparatus may further comprise a clock divider.
- the clock divider may set a first clock rate.
- the first clock rate may be a first fraction of a system clock rate.
- the clock divider may feed the first clock rate to the first accumulator through a first multiplexor.
- the plurality of accumulators may further comprise N accumulators including the first accumulator.
- the clock divider may set an i-th clock rate.
- the i-th clock rate may be an i-th fraction of the system clock rate for an i-th accumulator of the N accumulators.
- the clock divider may feed the i-th clock rate to the i-th accumulator through an i-th multiplexor.
- the Discrete Transform matrix may be one of a Walsh matrix or a Haar matrix.
- a time domain Discrete Transform block of an apparatus receives N pulse width signals.
- the time domain Discrete Transform block generates N frequency domain signals.
- An output module of the apparatus stores or transmits information associated with the N frequency domain signals.
- the information associated with the N frequency domain signals may be the N frequency domain signals.
- the apparatus may further comprise a run length encoder.
- the run length encoder may run length encode the N frequency domain signals to generate run length encoded signals.
- the apparatus may further comprise an entropy encoder.
- the entropy encoder may entropy encode the run length encoded signals to generate entropy encoded signals.
- the information associated with the N frequency domain signals may be the entropy encoded signals.
- the N frequency domain signals may be N quantized frequency domain signals.
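The run length encoding step can be illustrated with a minimal sketch. The source does not specify the run length scheme, so the (value, count) pair format, the function name `run_length_encode`, and the sample coefficients are all assumptions made for illustration. Entropy encoding is not shown.

```python
def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs.

    The (value, count) format is an assumption; the source does not
    specify how the run length encoder represents runs.
    """
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [tuple(r) for r in runs]


# Hypothetical sparse frequency domain signals (many zero coefficients).
coefficients = [14, 0, 0, 0, 4, 0, 0, -2]
encoded = run_length_encode(coefficients)
# encoded -> [(14, 1), (0, 3), (4, 1), (0, 2), (-2, 1)]
```

Runs of zero coefficients collapse to a single pair, which is why a sparse transformed representation tends to suit this kind of encoder.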
- the time domain Discrete Transform block may comprise N counters.
- An i-th counter of the N counters may receive the N pulse width signals in the time domain.
- the i-th counter of the N counters may generate an i-th increment signal in the time domain from the N pulse width signals based on an i-th row of a Discrete Transform matrix.
- the Discrete Transform matrix may be an N × N Discrete Transform matrix.
- the time domain Discrete Transform block may further comprise a synchronizer.
- the synchronizer may receive the i-th increment signal.
- the synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal.
- the time domain Discrete Transform block may further comprise N accumulators.
- An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal.
- the i-th accumulator of the N accumulators may accumulate the i-th synchronized increment signal over a period of time to generate an i-th frequency domain signal.
- N may be one of 4, 8, or 16.
- the apparatus may comprise a plurality of N comparators.
- the plurality of N comparators may receive outputs from N pixels and generate the N pulse width signals from the N pixels.
- the apparatus may be an image sensor readout device.
- FIG. 1 illustrates a conventional system that performs the Walsh Transform
- FIG. 2A illustrates a time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) for improved Discrete Transform computation, according to some embodiments;
- FIG. 2B shows an example waveform diagram of the signals involved in the computation performed by the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block), according to some embodiments;
- FIG. 3 illustrates a time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) with coefficient quantization for compression;
- FIG. 4 illustrates an image readout device using a Walsh Transform block to read out a number of pixels simultaneously, according to some embodiments
- FIG. 5 illustrates a flow chart of a method for performing time domain Discrete Transform, according to some embodiments
- FIG. 6 illustrates a flow chart of a method for performing image sensor readout using the Discrete Transform based compression, according to some embodiments
- FIGS. 7A-7B illustrate block diagrams of the 4-input counters used in this disclosure, according to some embodiments.
- FIGS. 8A-8B illustrate block diagrams of the 8-input counters used in this disclosure, according to some embodiments.
- FIGS. 9A-9C illustrate block diagrams of the 16-input counters used in this disclosure, according to some embodiments.
- FIG. 10 shows a block diagram of one example embodiment accumulator
- FIG. 11 shows a block diagram of one example embodiment clock divider.
- the Walsh Transform is also known as the Hadamard Transform, Walsh-Hadamard Transform, Hadamard-Rademacher-Walsh Transform, or Walsh-Fourier Transform.
- the goal of performing the Walsh Transform is to compress the signal by removing redundant data.
- the Walsh Transform itself is mathematically reversible and lossless.
- ADC: analog-to-digital converter
- TDC: time-to-digital converter
- FIG. 1 shows a conventional system 100 that performs the Walsh Transform using the Fast Walsh Transform algorithm.
- the Fast Walsh Transform module 104 takes digital signals and converts the digital signals to frequency domain signals h 1 , h 2 , h 3 , and h 4 .
- the input vector comprises time domain signals, such as pulse width signals c 1 , c 2 , c 3 , and c 4 .
- Pulse width signals are commonly used to represent, in the time domain, light intensity incident on a pixel in an image sensor or voltage when combined with a voltage controlled delay unit (VCDU). So, a time-to-digital converter TDC 102 is required in the system 100 .
- VCDU: voltage controlled delay unit
- the TDC 102 converts the input signals c 1 , c 2 , c 3 , and c 4 in the time domain to the digital signals d 1 , d 2 , d 3 , and d 4 , respectively.
- the digital signals d 1 , d 2 , d 3 , and d 4 are digital representations of the input signals c 1 , c 2 , c 3 , and c 4 in the time domain, respectively.
- the Fast Walsh Transform module 104 takes the digital signals d 1 , d 2 , d 3 , and d 4 and converts them into frequency domain signals h 1 , h 2 , h 3 , and h 4 (labelled as h<1:4> in FIG. 1 ).
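As a rough illustration of the digital stage in the conventional system, the following sketch applies a butterfly-style Fast Walsh Transform to already digitized values. It is an assumption-laden software model, not the circuit of module 104: the function name `fast_walsh_transform`, the natural (Hadamard) output ordering, and the sample values standing in for d 1 to d 4 are all hypothetical.

```python
def fast_walsh_transform(d):
    """Butterfly Walsh-Hadamard transform in natural (Hadamard) order.

    Returns a new list; the input length must be a power of two.
    """
    h = list(d)
    n = len(h)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                a, b = h[i], h[i + step]
                h[i], h[i + step] = a + b, a - b   # one butterfly
        step *= 2
    return h


# Hypothetical TDC outputs standing in for d1..d4.
digital = [4, 5, 3, 2]
coeffs = fast_walsh_transform(digital)
# coeffs -> [14, 0, 4, -2]
```

The point of the sketch is that this arithmetic only starts after every input has been converted to a digital word, which is the cost the time domain approach aims to avoid.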
- the TDC and ADC typically have high power consumption.
- the conventional system 100 for the Walsh Transform requires a huge amount of processing operations, arithmetic blocks, memory space, die area, and energy consumption.
- Embodiments of this disclosure provide methods and apparatuses for performing the Walsh Transform computation on the pulse width signals in the time domain during the conversion process. In so doing, embodiments of this disclosure provide technical improvement over the conventional Walsh Transform system by reducing the amount of processing operations, arithmetic blocks, memory usage, die area, and power consumption.
- FIG. 2A shows a system 200 as an example time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) for improved Walsh Transform computation, according to some embodiments.
- the system 200 comprises multiple parallel counters 202 , a synchronizer 204 , and multiple accumulators 206 .
- the number of the parallel counters 202 and the number of the accumulators 206 depend on the number of the parallel input signals.
- the system 200 may require N parallel counters 202 and N accumulators 206 for processing N parallel input signals.
- FIG. 2A shows 4 parallel input signals, such as 4 parallel pulse width signals c 1 , c 2 , c 3 , and c 4 .
- the system 200 comprises 4 parallel counters 202 and 4 accumulators 206 .
- FIG. 2A provides a non-limiting example embodiment system 200 for processing 4 parallel input signals. If there are 8 or 16 parallel input signals, the system 200 may comprise 8 parallel counters and 8 accumulators, or 16 parallel counters and 16 accumulators, respectively, and so on.
- the inputs to each of the parallel counters 202 are all the parallel input signals (e.g., c 1 , c 2 , c 3 , and c 4 ).
- Each of the parallel counters 202 takes all the parallel input signals and performs addition/subtraction to the parallel input signals according to the Walsh matrix.
- the Walsh matrix used depends on the number of parallel counters. For a system of N parallel counters for processing N parallel input signals, an N × N Walsh matrix is used.
- the first counter performs addition/subtraction to the parallel input signals according to the first row of the Walsh matrix
- the i-th counter performs addition/subtraction of the parallel input signals according to the i-th row of the Walsh matrix
- the N-th counter performs addition/subtraction to the parallel input signals according to the N-th row of the Walsh matrix.
- the output of each counter is an increment signal that is the digital representation of the result of addition/subtraction of the parallel input according to the corresponding row of the Walsh matrix.
- FIG. 2A shows an example system 200 having 4 parallel counters taking 4 pulse width signals in the time domain (c 1 , c 2 , c 3 , and c 4 ) as the input signals. So, each counter uses a corresponding row of the 4 × 4 Walsh matrix, whose rows are [1 1 1 1], [1 −1 1 −1], [1 1 −1 −1], and [1 −1 −1 1], to perform addition/subtraction of the pulse width signals c 1 , c 2 , c 3 , and c 4 .
- the value 1 in an entry of the Walsh matrix indicates addition of the corresponding input pulse width signal
- the value −1 in an entry of the Walsh matrix indicates subtraction of the corresponding input pulse width signal.
- the first counter of the parallel counters 202 uses the first row of the 4 × 4 Walsh matrix, [1 1 1 1]. Because every entry of this row is 1, the first counter performs addition for all four input pulse width signals. That is, the output of the first counter is a generated first increment signal INCR 1 , which is the digital representation of (c 1 +c 2 +c 3 +c 4 ).
- the second counter of the parallel counters 202 uses the second row of the 4 × 4 Walsh matrix, [1 −1 1 −1]. Because the first entry and the third entry of this row [1 −1 1 −1] are 1, the second counter performs addition for the first and the third input pulse width signals (c 1 and c 3 ). Also, because the second entry and the fourth entry of this row [1 −1 1 −1] are −1, the second counter performs subtraction for the second and the fourth input pulse width signals (c 2 and c 4 ). That is, the output of the second counter is a generated second increment signal INCR 2 , which is the digital representation of (c 1 +c 3 −c 2 −c 4 ).
- the third counter of the parallel counters 202 uses the third row of the 4 × 4 Walsh matrix, [1 1 −1 −1]. Because the first entry and the second entry of this row [1 1 −1 −1] are 1, the third counter performs addition for the first and the second input pulse width signals (c 1 and c 2 ). Also, because the third entry and the fourth entry of this row [1 1 −1 −1] are −1, the third counter performs subtraction for the third and the fourth input pulse width signals (c 3 and c 4 ). That is, the output of the third counter is a generated third increment signal INCR 3 , which is the digital representation of (c 1 +c 2 −c 3 −c 4 ).
- the fourth counter of the parallel counters 202 uses the fourth row of the 4 × 4 Walsh matrix, [1 −1 −1 1]. Because the first entry and the fourth entry of this row [1 −1 −1 1] are 1, the fourth counter performs addition for the first and the fourth input pulse width signals (c 1 and c 4 ). Also, because the second entry and the third entry of this row [1 −1 −1 1] are −1, the fourth counter performs subtraction for the second and the third input pulse width signals (c 2 and c 3 ). That is, the output of the fourth counter is a generated fourth increment signal INCR 4 , which is the digital representation of (c 1 +c 4 −c 2 −c 3 ).
- Embodiment systems using the Walsh matrix of another size can be designed similarly, as understood by people skilled in the art.
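The behavior of the four parallel counters at a single clock sample can be sketched as follows. This is a behavioral model under stated assumptions, not the counter circuit: each pulse width signal is represented as a 0/1 level at the sampling instant, and the names `WALSH_4` and `increments` are hypothetical.

```python
# Rows of the 4 x 4 Walsh matrix used by the first to fourth counters.
WALSH_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def increments(pulse_levels, matrix=WALSH_4):
    """Return one increment per counter for a single clock sample.

    pulse_levels holds the 0/1 level of each pulse width signal
    (an assumed representation). Entry 1 adds the signal, -1 subtracts it.
    """
    return [sum(w * c for w, c in zip(row, pulse_levels)) for row in matrix]


# Hypothetical sample: c1, c2, c3 are still high, c4 has already ended.
sample = [1, 1, 1, 0]
incr = increments(sample)
# incr -> [3, 1, 1, -1]  (INCR1..INCR4 for this sample)
```

When all four pulses are high, every row except the first cancels to zero, so only the first counter produces a non-zero increment.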
- each counter of the 8 parallel counters uses the corresponding row of the 8 × 8 Walsh matrix below.
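A minimal sketch of one way to build Walsh matrices of size 8 or 16, assuming the natural (Sylvester/Hadamard) row ordering. That ordering is an assumption; it is chosen here because it reproduces the 4 × 4 rows [1 1 1 1], [1 −1 1 −1], [1 1 −1 −1], and [1 −1 −1 1] used by the four counters. The function name `sylvester_walsh` is hypothetical.

```python
def sylvester_walsh(n):
    """Build an n x n matrix of +1/-1 entries by Sylvester's doubling rule.

    Each doubling maps M to [[M, M], [M, -M]]. n must be a power of two.
    Row ordering is the natural (Hadamard) order, which is an assumption.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    m = [[1]]
    while len(m) < n:
        top = [row + row for row in m]
        bottom = [row + [-x for x in row] for row in m]
        m = top + bottom
    return m


walsh_8 = sylvester_walsh(8)   # one row per counter in an 8-counter system
```

Every pair of distinct rows in the result is orthogonal, which is the property that makes the transform reversible.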
- the system 200 is not limited to Walsh Transform computation.
- the same system 200 in FIG. 2A may be used to compute other forms of the discrete transform.
- system 200 in FIG. 2A may be used to compute the Haar Wavelet Transform using the following 4 × 4 Haar matrix for a system with 4 parallel counters.
- the value 1 in an entry indicates addition of the corresponding input pulse width signal
- the value −1 in an entry indicates subtraction of the corresponding input pulse width signal
- the value 0 in an entry indicates the corresponding input pulse width signal is not counted.
- the 4 increment signals generated by the 4 parallel counters for the Haar Wavelet Transform, respectively, are shown below.
- each counter of the 8 parallel counters uses the corresponding row of the 8 × 8 Haar matrix below.
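The 4-counter Haar case can be sketched in the same way as the Walsh case. The matrix entries below are an assumption: they are the commonly used unnormalized 4 × 4 Haar matrix with entries 1, −1, and 0, and may differ in row order from the matrix in the source. The names `HAAR_4` and `haar_increments` are hypothetical.

```python
# Assumed unnormalized 4 x 4 Haar matrix (entries 1, -1, 0).
HAAR_4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, 0, 0],
    [0, 0, 1, -1],
]


def haar_increments(pulse_levels, matrix=HAAR_4):
    """One clock sample: 1 adds a signal, -1 subtracts it, 0 ignores it."""
    return [sum(w * c for w, c in zip(row, pulse_levels)) for row in matrix]


# Hypothetical sample: c1, c2, c3 high, c4 low.
incr = haar_increments([1, 1, 1, 0])
# incr -> [3, 1, 0, 1]
```

Rows containing 0 entries look only at a subset of the inputs, so the corresponding counters need fewer active inputs than in the Walsh case.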
- the outputs from the parallel counters 202 are integrated using the accumulators 206 .
- Each of the accumulators 206 performs addition/subtraction of the corresponding increment signal over time to generate a frequency domain signal h.
- the accumulator needs the increment signals to be synchronized.
- a synchronizer 204 may be included between the parallel counters 202 and the accumulators 206 .
- the inputs of the synchronizer 204 are the increment signals (e.g., INCR 1 , INCR 2 , INCR 3 , and INCR 4 ) output by the parallel counters 202 .
- the synchronizer 204 synchronizes the increment signals and outputs the synchronized increment signals.
- Hardware and circuitry for performing signal synchronization are well known to people skilled in the art.
- the inputs of the accumulators 206 are the synchronized increment signals.
- the accumulators 206 perform integration of the increment signals.
- Hardware and circuitry for the accumulator performing signal integration are well known to people skilled in the art.
- the results of the integration are the outputs of the accumulators 206 , which may be represented in two's complement.
- Two's complement is a mathematical operation on binary numbers. The two's complement is calculated by inverting the digits and adding one. For example, for a four-bit binary number 0001, the two's complement is 1111 (inversion of 0001 is 1110, and 1110 plus 1 is 1111).
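The invert-and-add-one rule can be checked with a short sketch. The helper names `twos_complement` and `to_signed` and the 4-bit width are assumptions chosen to match the four-bit example above.

```python
def twos_complement(value, bits):
    """Invert the bits of `value`, add one, and keep the low `bits` bits."""
    mask = (1 << bits) - 1
    return ((~value) + 1) & mask


def to_signed(word, bits):
    """Interpret a `bits`-wide two's complement word as a signed integer."""
    if word & (1 << (bits - 1)):      # sign bit set
        return word - (1 << bits)
    return word


pattern = format(twos_complement(0b0001, 4), "04b")
# pattern -> "1111"
value = to_signed(0b1111, 4)
# value -> -1
```

This is how an up/down accumulator output such as a negative frequency domain signal can be stored in a fixed-width register.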
- the 4 outputs of the 4 corresponding accumulators 206 are the four frequency domain signals, h 1 , h 2 , h 3 , and h 4 , respectively, which are the same as the outputs of the Fast Walsh Transform Module 104 as shown in FIG. 1 (labeled as h<1:4> in both FIG. 1 and FIG. 2A ).
- the main difference between the embodiment system 200 and the conventional system 100 is that the system 200 combines time-to-digital conversion with the discrete transform computation (e.g., Walsh Transform computation or Haar Wavelet Transform computation).
- the Walsh Transform encodes a signal as a sum of scalar components multiplied by corresponding rows of the Walsh Matrix.
- accumulation operations only occur when an increment signal is not 0. For example, if the input signals are correlated, the increment signals are often 0. So, no accumulation is needed in these situations, which leads to further reduction of power consumption.
- the goal of performing the Walsh Transform is to compress the signals by removing redundant data.
- the data redundancy depends on the signal spatial frequency, and the Walsh Transform splits the input signals by spatial frequency. Different frequencies can be quantized with different resolutions. The resolution of certain frequencies can be coarser when the corresponding data is less important.
- FIG. 2A above illustrates a time domain computation system and method of the Walsh Transform taking pulse width signals as the input.
- Pulse width signals are added/subtracted according to rows in the Walsh matrix (using the parallel counters 202 ), and the results are integrated (using up/down accumulators 206 ).
- the output of the system 200 (e.g., h 1 , h 2 , h 3 , and h 4 ) is a domain transformed representation of the original input pulse width signals (e.g., c 1 , c 2 , c 3 , and c 4 ).
- the output can be sparse in the Walsh domain (i.e., the frequency domain), and therefore requires less memory for storing information (lossless).
- the output can be suitable for further compression (e.g., lossy compression).
- FIG. 2B is a waveform diagram of the signals involved in the computation performed by the system 200 over a period of time t 1 to t 6 , according to some embodiments.
- the increment signals are sampled at the rising edges of the clock cycle, t 1 , t 2 , t 3 , t 4 , t 5 , and t 6 .
- the increment signal INCR 1 is sampled with the value 4
- the increment signal INCR 1 is sampled with the value 4
- the increment signal INCR 1 is sampled with the value 2
- the increment signal INCR 1 is sampled with the value 1
- the increment signal INCR 2 is sampled with the values 0, 0, 1, 0, −1, 0, respectively.
- the corresponding frequency domain signal h 2 accumulated by the second accumulator is 0, 0, 1, 1, 0, and 0, respectively.
- the increment signal INCR 3 is sampled with the values 0, 0, 1, 2, 1, 0, respectively.
- the corresponding frequency domain signal h 3 accumulated by the third accumulator is 0, 0, 1, 3, 4, and 4, respectively.
- the increment signal INCR 4 is sampled with the values 0, 0, −1, 0, −1, 0, respectively.
- the corresponding frequency domain signal h 4 accumulated by the fourth accumulator is 0, 0, −1, −1, −2, and −2, respectively.
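The sample-and-accumulate behavior over t 1 to t 6 can be sketched as a small simulation. The pulse widths of 4, 5, 3, and 2 clock samples are hypothetical values, chosen only because they reproduce the INCR 2 , INCR 3 , and INCR 4 sample sequences listed above; the actual waveforms in FIG. 2B are not available here. The increment for the first row is computed under the same assumption.

```python
from itertools import accumulate

WALSH_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]

# Hypothetical pulse widths of c1..c4, in clock samples (an assumption).
widths = [4, 5, 3, 2]
num_samples = 6   # rising edges t1..t6

# 0/1 level of each pulse at each sampled rising edge.
levels = [[1 if t < w else 0 for w in widths] for t in range(num_samples)]

# incr[k][t]: increment of counter k+1 at edge t+1.
incr = [[sum(w * c for w, c in zip(row, lv)) for lv in levels] for row in WALSH_4]

# h[k][t]: running sum held by accumulator k+1 after edge t+1.
h = [list(accumulate(seq)) for seq in incr]

# incr[1] -> [0, 0, 1, 0, -1, 0]    h[1] -> [0, 0, 1, 1, 0, 0]
# incr[2] -> [0, 0, 1, 2, 1, 0]     h[2] -> [0, 0, 1, 3, 4, 4]
# incr[3] -> [0, 0, -1, 0, -1, 0]   h[3] -> [0, 0, -1, -1, -2, -2]
```

The final accumulator values equal the Walsh matrix applied to the pulse widths themselves, which is the sense in which the conversion and the transform are performed together.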
- the Walsh Transform itself is mathematically reversible and lossless.
- in a time domain Walsh Computing block such as the system 200 shown in FIG. 2A , there is no inherent compression aside from the potentially sparser signal representation, but such potentially sparser signal representation is signal dependent. Quantization of the Walsh Transform result would require additional digital circuit(s).
- Walsh coefficient quantization would normally be performed using dedicated arithmetic blocks to implement the following computation:
- h is the transformed Walsh coefficient (i.e., the frequency domain signal)
- q is a quantization factor
- Q result is the final quantized coefficient.
- this disclosure utilizes the time domain Discrete Transform block (e.g., the system 200 ) described above.
- the accumulators integrate the increment signals at a certain given system clock rate.
- Each accumulator corresponds to one frequency component (e.g., one frequency domain signal component). Knowing that the different frequency components have a different level of importance in reconstructing the original signals, less important signals can be quantized more coarsely. Such improvement can be achieved by clocking less important accumulators at lower system clock rates. Further, an individual accumulator can even be turned off (effectively clocked at the 0 system clock rate) for not capturing the information in the corresponding input signals at all.
- FIG. 3 shows a system 300 as the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) with coefficient quantization for compression, according to some embodiments.
- the system 300 comprises multiple parallel counters 302 , a synchronizer 304 , multiple accumulators 306 , multiple multiplexors 308 , a system clock 310 , and a clock divider 312 .
- the number of the parallel counters 302 , the number of the accumulators 306 , and the number of the multiplexors 308 depend on the number of the parallel input signals.
- the system 300 for processing N parallel input signals requires N parallel counters 302 , N accumulators 306 , and up to N multiplexors 308 .
- the number of multiplexors 308 can be less than N when two or more multiplexors share the same clock rate.
- FIG. 3 shows 4 parallel input signals, such as 4 parallel pulse width signals c 1 , c 2 , c 3 , and c 4 (the same as FIG. 2A ). So, the system 300 comprises 4 parallel counters 302 , 4 accumulators 306 , and 4 multiplexors 308 .
- FIG. 3 provides a non-limiting example embodiment system 300 for processing 4 parallel input signals. If there are 8 (or 16) parallel input signals, the system 300 may comprise 8 (or 16) parallel counters, 8 (or 16) accumulators, and 8 (or 16) multiplexors, and so on.
- the parallel counters 302 may be designed the same and perform the same functions as the parallel counters 202 as described with respect to FIG. 2A .
- the synchronizer 304 may be designed the same and perform the same functions as the synchronizer 204 as described with respect to FIG. 2A .
- the accumulators 306 may be designed the same and perform the same functions (e.g., integration) as the accumulators 206 as described with respect to FIG. 2A .
- each of the accumulators 306 may further quantize the corresponding one of the frequency domain signals h 1 , h 2 , h 3 , and h 4 with different resolution/accuracy/quality by selecting a different clock rate supplied to the accumulator. That is, each of the accumulators 306 may take a separate clock rate as an additional input fed by a corresponding multiplexor of the multiplexors 308 .
- the system clock 310 may have the system clock rate of CLK.
- the clock rate divider 312 may set the clock rate for the frequency domain signals h 1 , h 2 , h 3 , and h 4 with CLK, CLK/2, CLK/4, and CLK/8, respectively.
- the different clock rates are supplied to each corresponding accumulator of the accumulators 306 through a corresponding multiplexor of the multiplexors 308 for quantization.
- Hardware and circuitry for the clock rate divider and the multiplexor are well known to the people skilled in the art.
- the clock rate CLK may be fed to the first accumulator of the accumulators 306
- the clock rate CLK/2 may be fed to the second accumulator of the accumulators 306
- the clock rate CLK/4 may be fed to the third accumulator of the accumulators 306
- the clock rate CLK/8 may be fed to the fourth accumulator of the accumulators 306 .
- the accumulators 306 may effectively quantize the frequency domain signals h 1 , h 2 , h 3 , and h 4 into the quantized frequency domain signals h 1 ,
- clock rates do not need to be limited to the factors of 2.
- FIG. 3 illustrates the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) with coefficient quantization for compression.
- the disclosed technique selectively controls the quantization accuracy of individual time domain computed Walsh Transform coefficients by reducing the clock rates of corresponding accumulators.
- the quantization level can be flexibly selected based on the clocking frequency (i.e., clock rate). Lower clocking frequency also results in lower power consumption. Compression severity and quality can also be controlled by setting different clock rates.
- the reconstructed signal quality can track the energy required to acquire the signal, which allows for energy and quality scaling.
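- The effect of the per-accumulator clock rate can be illustrated with a minimal sketch. It assumes, purely for illustration, that an accumulator clocked at CLK/d samples its increment signal only on every d-th system clock tick; the names `accumulate` and `estimate` are hypothetical and do not come from the disclosure.

```python
def accumulate(increments, divisor=1):
    """Sum an increment signal, sampling it only on every `divisor`-th tick.

    Assumption: an accumulator clocked at CLK/divisor sees one sample per
    `divisor` system clock ticks.
    """
    return sum(increments[::divisor])


def estimate(increments, divisor=1):
    """Rescale the coarse count back to system clock ticks."""
    return accumulate(increments, divisor) * divisor


# One pulse width signal that is high for 13 of 16 system clock ticks.
incr = [1] * 13 + [0] * 3

for divisor in (1, 2, 4, 8):
    # The quantization step of the estimate equals the divisor.
    print(divisor, accumulate(incr, divisor), estimate(incr, divisor))
```

- In this sketch a slower clock yields fewer accumulation events (lower switching activity) and a coarser estimate of the same pulse width, which is the energy and quality trade-off described above.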
- the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) as illustrated in FIGS. 2 and 3 can be used for image compression. Images of real scenes are characteristically sparse when represented in the frequency domain, containing large areas of low frequency signals (smooth gradients) and local high frequency signals (edges). Such characteristics allow for image compression if the signals associated with an image are transformed, for example, using the Discrete Cosine Transform (DCT) as in the JPEG standard.
- An image sensor can produce a large amount of data which is costly to store or transmit in terms of storage space or transmission resources.
- the sooner the image data is compressed, the higher the overall savings in storage space or transmission resources. Accordingly, compressing the image at the readout stage could significantly save power/die area for all subsequent stages of the image processing/transmission.
- Image compression can be a computationally and energetically intensive operation. But, image compression is still desirable where the storage space (e.g., memory or hard drive) or the available signal transmission bandwidth is limited.
- the current popular image compression techniques are designed to reduce the image file size as much as possible while preserving the quality of the images. These current compression techniques often compromise the energy used during compression process. Thus, an improved image compression solution with lower energy consumption is desirable.
- performing image compression as close to signal generation as possible is preferred to reduce power consumption during later stages (e.g., signal storage or signal transmission). Image compression also works best when performed on two-dimensional (2D) areas.
- a rolling shutter sensor needs to expose multiple rows before the compression computations can start, which requires that the first rows be stored somewhere.
- pixels can be designed to output pulse width signals which are dependent on the light level.
- the image sensor readout embodiments are compatible with a time domain Discrete Transform based compressor, such as the systems 200 and 300 as described with respect to FIGS. 2 and 3 .
- the Walsh Matrix can be expanded to its 2D version to exploit the spatial frequency properties of real images in both directions.
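- One way to read the 2D expansion is sketched below. It assumes the Sylvester (natural) ordering of the Walsh matrix, which agrees with the row examples given for the 4-input counters, and checks that transforming a 4×4 block along both directions equals applying the 16×16 matrix to the 16 flattened pixels. The helper names `sylvester`, `matmul`, and `transpose` are illustrative only.

```python
def sylvester(n):
    """Build an n x n matrix of +1/-1 entries by Sylvester's construction."""
    h = [[1]]
    while len(h) < n:
        h = [r + r for r in h] + [r + [-x for x in r] for r in h]
    return h


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


H4 = sylvester(4)
H16 = sylvester(16)

# Arbitrary 4x4 block of pixel values, used only as test data.
block = [[r * 4 + c for c in range(4)] for r in range(4)]

# 2D transform: apply the 4x4 matrix along rows and along columns.
two_d = matmul(matmul(H4, block), transpose(H4))

# 1D transform of the 16 pixels flattened in row-major order.
flat = [v for row in block for v in row]
one_d = [sum(h * v for h, v in zip(row, flat)) for row in H16]

assert [v for row in two_d for v in row] == one_d
```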
- FIG. 4 shows a system 400 using a time domain Discrete Transform compression block (e.g., Walsh compression block) to read out a number of pixels simultaneously, according to some embodiments.
- the pixel array is divided into blocks of 4×4 pixels and a row of blocks is read out simultaneously.
- the pixel blocks are connected to a column level Walsh compression block in a rolling shutter manner.
- the end result is a block row readout, with 4 rows of pixels read simultaneously, and this process is repeated to read the full pixel array of the image.
- Using the 4×4 pixel block is for illustration purposes only.
- the system 400 can be modified to process a 2×2 pixel block, or an 8×8 pixel block. In general, system 400 may be modified to process an N×N pixel block.
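- The block-row readout order can be sketched as follows. The function name `block_rows` is hypothetical, and the sketch assumes the array height and width are multiples of the block size.

```python
def block_rows(image, n=4):
    """Yield, per block row, the n*n-pixel blocks that are read out together.

    Assumption: len(image) and len(image[0]) are multiples of n.
    Each block is flattened row by row into n*n values (p1 ... p16 for n=4).
    """
    height, width = len(image), len(image[0])
    for top in range(0, height, n):
        row_of_blocks = []
        for left in range(0, width, n):
            block = [image[r][c]
                     for r in range(top, top + n)
                     for c in range(left, left + n)]
            row_of_blocks.append(block)
        yield row_of_blocks


# An 8x8 test array gives two block rows of two 4x4 blocks each.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
rows = list(block_rows(image))
print(len(rows), len(rows[0]), rows[0][0])
```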
- the system 400 comprises a row decoder 402 and a readout module 404 .
- the readout module 404 comprises a Walsh compression block 420 .
- the system 400 may additionally comprise a run length encoder 408 and an entropy encoder 410 known in the art.
- the pixel array 414 comprises individual pixels of the image sensing element known in the art.
- the row decoder 402 known in the art may decode a row of 4×4 pixel blocks 412 from the pixel array 414 .
- the 4×4 pixel block 412 includes 16 pixels, p 1 , p 2 , p 3 , . . . , and p 16 .
- the 16 pixels are input into the Walsh compression block 420 .
- the Walsh compression block 420 comprises multiple parallel counters 424 , a synchronizer 426 , multiple accumulators 428 , multiple multiplexors 430 , and a clock divider 432 .
- the number of the parallel counters 424 , the number of the multiple clock gating blocks 426 , and the number of the accumulators 428 depend on the number of pixels in the input pixel block. If there are N pixels in the input pixel block, the Walsh compression block 420 may comprise N parallel counters 424 , N accumulators 428 , and up to N multiplexors.
- FIG. 4 shows a 4×4 pixel block for illustration purposes.
- Comparators known in the art may be used to convert the pixel outputs from the pixel block 412 to pulse width signals.
- When there are 16 pixels in the pixel block 412 , the comparators generate 16 corresponding pulse width signals, c 1 , c 2 , c 3 , . . . , and c 16 (labeled as c<1:16> in FIG. 4 ) in the time domain.
- the parallel counters 424 may be designed the same and perform the same functions as parallel counters 202 and 302 as described with respect to FIGS. 2A and 3 . When there are 16 parallel pulse width signals as shown in FIG.
- the Walsh compression block 420 comprises 16 parallel counters 424 for generating 16 increment signals (labeled as INCR<1:16> in FIG. 4 ), using the 16×16 Walsh matrix.
- the accumulators 428 may be designed the same and perform the same function as the accumulators 206 in FIG. 2A , or as the accumulators 306 in FIG. 3 .
- the Walsh compression block 420 comprises 16 accumulators for generating 16 frequency domain signals h 1 , h 2 , h 3 , . . . , h 16 (labeled as h<1:16> in FIG. 4 ).
- the Walsh compression block 420 may support coefficient quantization.
- the accumulators 428 can generate the quantized frequency domain signals. If the system 400 comprises the run length encoder 408 and the entropy encoder 410 , the output of the Walsh compression block 420 (frequency domain signals or quantized frequency domain signals) may be further compressed by these two encoders.
- the output of the system 400 is the compressed image signal output to be saved or transmitted by an output module (not shown in FIG. 4 ).
- the output module may be one known in the art for storing information locally on the device (on a hard drive or memory of the device).
- the output module may be one known in the art for transmitting information remotely over a network.
- the compressed image signal may be encoded frequency domain signals h 1 , h 2 , h 3 , . . .
- the compressed image signal may also be frequency domain signals h 1 , h 2 , h 3 , . . . , h 16 without being encoded by the run length encoder 408 and the entropy encoder 410 .
- FIG. 4 shows a readout for an image sensor array, implementing a time domain based Walsh Transform block 420 on 2D sub-areas of the image.
- 4×4, 2×2, or 8×8 pixels are read out in parallel and compressed at the same time, reducing the amount of data to be transferred to a storage medium (e.g., a memory or a hard drive) or to be transmitted to another device. In so doing, a smaller and more efficient representation of the image is achieved at a stage as close to the signal generation stage as possible.
- a compressed image representation is available directly after readout, which would require less processing before storage or transmission. Less processing before storage or transmission results in lower power consumption and/or lower signal bandwidth requirement.
- the block-row readout technique disclosed with respect to FIG. 4 means that the image does not need to be stored in memory first and later retrieved for compression.
- FIG. 5 shows a flow chart of a method 500 for performing time domain Discrete Transform, according to some embodiments.
- the method 500 may be performed by a hardware device, such as the system 200 or the system 300 described above.
- the method 500 starts at the operation 502 , where a first counter of a plurality of counters of an apparatus receives a plurality of pulse width signals in the time domain.
- the first counter generates a first increment signal in the time domain from the plurality of pulse width signals based on a first row of a Discrete Transform matrix.
- a synchronizer of the apparatus receives the first increment signal.
- the synchronizer generates a first synchronized increment signal in the time domain from the first increment signal.
- a first accumulator of a plurality of accumulators of the apparatus receives the first synchronized increment signal.
- the first accumulator accumulates the first synchronized increment signal over a period of time to generate a first frequency domain signal.
- the plurality of counters may further comprise a second counter.
- the second counter may receive the plurality of pulse width signals in the time domain.
- the second counter may then generate a second increment signal in the time domain from the plurality of pulse width signals based on a second row of the Discrete Transform matrix.
- the synchronizer may receive the second increment signal.
- the synchronizer may then generate a second synchronized increment signal in the time domain from the second increment signal.
- the plurality of accumulators may further comprise a second accumulator.
- the second accumulator may receive the second synchronized increment signal.
- the second accumulator may then accumulate the second synchronized increment signal over the period of time to generate a second frequency domain signal.
- the number of the plurality of pulse width signals may equal a number of the plurality of counters.
- the plurality of counters may include N counters including the first counter, and an i-th counter of the plurality of counters may receive the plurality of pulse width signals in the time domain.
- the i-th counter may generate an i-th increment signal in the time domain from the plurality of pulse width signals based on an i-th row of the Discrete Transform matrix.
- the Discrete Transform matrix may be an N×N Discrete Transform matrix.
- the synchronizer may receive the i-th increment signal.
- the synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal.
- the plurality of accumulators may further comprise N accumulators including the first accumulator.
- An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal.
- the i-th accumulator may then accumulate the i-th synchronized increment signal over the period of time to generate an i-th frequency domain signal.
- the number of the plurality of counters may equal one of 4, 8, or 16. In some embodiments, the plurality of counters may process the plurality of pulse width signals in parallel. In some embodiments, the number of the plurality of counters may equal 4.
- the first row of the Discrete Transform matrix may be [1, 1, 1, 1].
- the plurality of pulse width signals may comprise a first pulse width signal, a second pulse width signal, a third pulse width signal, and a fourth pulse width signal.
- the first increment signal may comprise an addition of the first pulse width signal, the second pulse width signal, the third pulse width signal, and the fourth pulse width signal in the time domain.
- the apparatus may further comprise a clock divider.
- the divider may set a first clock rate.
- the first clock rate may be a first fraction of a system clock rate.
- the clock divider may feed the first clock rate to the first accumulator through a first multiplexor.
- the plurality of accumulators may further comprise N accumulators including the first accumulator.
- the clock divider may set an i-th clock rate.
- the i-th clock rate may be an i-th fraction of the system clock rate for an i-th accumulator of the N accumulators.
- the clock divider may feed the i-th clock rate to the i-th accumulator through an i-th multiplexor.
- the Discrete Transform matrix may be one of a Walsh matrix, or a Haar matrix.
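- The steps of method 500 can be sketched as a small behavioural simulation. It is only a model under stated assumptions: each pulse width signal is taken to be high for `width` clock ticks starting at tick 0, the synchronizer is modelled as sampling all counter inputs on the same tick, and the function names are hypothetical.

```python
def sylvester(n):
    """n x n Walsh matrix in Sylvester (natural) ordering."""
    h = [[1]]
    while len(h) < n:
        h = [r + r for r in h] + [r + [-x for x in r] for r in h]
    return h


def pulse(width, period):
    """Pulse width signal: high for `width` ticks, then low."""
    return [1 if t < width else 0 for t in range(period)]


def time_domain_transform(widths, matrix, period):
    signals = [pulse(w, period) for w in widths]
    acc = [0] * len(matrix)                      # one accumulator per row
    for t in range(period):
        sample = [s[t] for s in signals]         # synchronized sample at tick t
        for i, row in enumerate(matrix):
            # Counter i: add inputs with a "1" entry, subtract those with "-1".
            incr = sum(m * c for m, c in zip(row, sample))
            acc[i] += incr                       # accumulator i integrates
    return acc


widths = [5, 3, 7, 2]                            # pulse widths in clock ticks
H = sylvester(4)
result = time_domain_transform(widths, H, period=8)
direct = [sum(m * w for m, w in zip(row, widths)) for row in H]
assert result == direct
print(result)
```

- Accumulating the per-tick increments over the period gives the same values as multiplying the pulse widths by the matrix directly, which is why no multiplier is needed in this model.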
- FIG. 6 shows a flow chart of a method 600 for performing image sensor readout using the Discrete Transform based compression, according to some embodiments.
- the method 600 may be performed by a hardware device, such as the system 400 described above.
- the method 600 starts at the operation 602 , where a time domain Discrete Transform block of an apparatus receives N pulse width signals.
- the time domain Discrete Transform block generates N frequency domain signals.
- an output module of the apparatus stores or transmits information associated with the N frequency domain signals.
- the information associated with the N frequency domain signals may be the N frequency domain signals.
- the apparatus may further comprise a run length encoder.
- the run length encoder may run length encode the N frequency domain signals to generate run length encoded signals.
- the apparatus may further comprise an entropy encoder.
- the entropy encoder may entropy encode the run length encoded signals to generate entropy encoded signals.
- the information associated with the N frequency domain signals may be the entropy encoded signals.
- the N frequency domain signals may be N quantized frequency domain signals.
- the time domain Discrete Transform block may comprise N counters.
- An i-th counter of the N counters may receive the N pulse width signals in the time domain.
- the i-th counter of the N counters may generate an i-th increment signal in the time domain from the N pulse width signals based on an i-th row of a Discrete Transform matrix.
- the Discrete Transform matrix may be an N×N Discrete Transform matrix.
- the time domain Discrete Transform block may further comprise a synchronizer.
- the synchronizer may receive the i-th increment signal.
- the synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal.
- the time domain Discrete Transform block may further comprise N accumulators.
- An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal.
- the i-th accumulator of the N accumulators may accumulate the i-th synchronized increment signal over a period of time to generate an i-th frequency domain signal.
- N may be one of 4, 8, or 16.
- the apparatus may comprise a plurality of N comparators.
- the plurality of N comparators may receive outputs from N pixels and generate the N pulse width signals from the N pixels.
- the apparatus may be an image sensor readout device.
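- Two of the supporting stages of method 600 can be sketched in a few lines. The comparator model below assumes a rising reference ramp (one common way to turn a level into a pulse width; the disclosure only says comparators known in the art may be used), and the run length encoder uses a simple (value, run) pair format chosen for illustration. Entropy encoding is omitted.

```python
def comparator_pulse_width(pixel_level, ramp):
    """Count the ticks during which the ramp is still below the pixel level.

    Assumption: the comparator output is high while ramp < pixel_level.
    """
    return sum(1 for r in ramp if r < pixel_level)


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


# A brighter pixel keeps the comparator high for more ticks.
print(comparator_pulse_width(5, range(8)))

# Quantized coefficients of a smooth block tend to contain runs of zeros.
print(run_length_encode([12, 0, 0, 0, -1, 0, 0]))
```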
- FIGS. 7A-7B show more detailed block diagrams of the 4-input counters used in this disclosure, according to some embodiments.
- FIG. 7A shows one example 4-input counter 700 used as the first counter corresponding to the first row of the 4×4 Walsh matrix [1 1 1 1], as described with respect to FIGS. 2A, 3, and 4 .
- the 4-input counter 700 receives 4 input signals i 0 , i 1 , i 2 , and i 3 (e.g., pulse width signals c 1 , c 2 , c 3 , and c 4 , respectively).
- the 4-input counter 700 generates the increment signal INCR 1 as a 3-bit (O 0 , O 1 , and O 2 ) output, which represents i 0 +i 1 +i 2 +i 3 .
- the 4-input counter 700 includes the full adder 702 .
- the 4-input counter 700 further includes the half adders 704 and 706 . These adders are connected as shown in FIG. 7A . Hardware implementations of the full adder and the half adder are known in the art.
- a half adder such as the half adder 704 and the half adder 706 , adds two binary numbers A and B to produce a sum S and a carry output C.
- the truth table of the half adder is shown below.
- a full adder, such as the full adder 702 , is a logical circuit that performs an addition operation on three one-bit binary numbers (A, B and the carry input Cin).
- the outputs of the full adder are a sum S and a carry output C.
- the truth table of the full adder is shown in the table below.
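- The standard half adder and full adder behaviour described above can be reproduced with a short sketch that prints both truth tables. The gate-level expressions are the textbook ones; the function names are illustrative.

```python
def half_adder(a, b):
    """Return (S, C) for two one-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (S, C) for two one-bit inputs and a carry input."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2


print("A B | S C")
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, "|", s, c)

print("A B Cin | S C")
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, c = full_adder(a, b, cin)
            print(a, b, cin, "  |", s, c)
```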
- FIG. 7B shows one example 4-input counter 750 used as a subsequent counter corresponding to a subsequent row of the 4×4 Walsh matrix, as described with respect to FIGS. 2A, 3, and 4 .
- the 4-input counter 750 receives 4 input signals i 0 , i 1 , i 2 , and i 3 .
- the 4-input counter 750 generates the increment signal as a 3-bit (O 0 , O 1 , and O 2 ) output, which represents i 0 +i 1 −i 2 −i 3 .
- the 4-input counter 750 includes the half adders 752 and 754 .
- the 4-input counter 750 further includes a 2-bit subtractor 756 known in the art. These components are connected as shown in FIG. 7B .
- the 2-bit subtractor 756 receives A and B as the inputs, and the output of the 2-bit subtractor is A-B, represented by the 3-bit (O 0 , O 1 , and O 2 ) output.
- the 4-input counter 750 may be implemented for any of the subsequent counters (i.e., the counters corresponding to the second, the third, or the fourth row of the 4×4 Walsh matrix).
- the pulse width signals that correspond to “1” entries may be the input signals i 0 and i 1 , respectively.
- the pulse width signals that correspond to “−1” entries may be the input signals i 2 and i 3 , respectively.
- c 1 and c 3 may be the input signals i 0 and i 1 , respectively.
- c 2 and c 4 may be the input signals i 2 and i 3 , respectively, for generating the increment signal INCR 2 (c 1 +c 3 −c 2 −c 4 ).
- c 1 and c 2 may be the input signals i 0 and i 1 , respectively, and c 3 and c 4 may be the input signals i 2 and i 3 , respectively, for generating the increment signal INCR 3 (c 1 +c 2 −c 3 −c 4 ).
- c 1 and c 4 may be the input signals i 0 and i 1 , respectively, and c 2 and c 3 may be the input signals i 2 and i 3 , respectively, for generating the increment signal INCR 4 (c 1 +c 4 −c 2 −c 3 ).
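- A behavioural sketch of the two 4-input counters follows. The exact wiring is defined by FIGS. 7A and 7B, which are not reproduced here, so the interconnection below (one full adder feeding two half adders, and two half adders feeding a subtractor) is an assumption that merely yields the stated outputs. The helper `wire_inputs` is hypothetical and shows how a Walsh row selects which pulse width signals go to i 0 , i 1 and which go to i 2 , i 3 .

```python
from itertools import product


def half_adder(a, b):
    return a ^ b, a & b


def full_adder(a, b, cin):
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


def counter_700(i0, i1, i2, i3):
    """3-bit value of i0 + i1 + i2 + i3 (assumed wiring)."""
    s, c = full_adder(i0, i1, i2)
    o0, c2 = half_adder(s, i3)
    o1, o2 = half_adder(c, c2)
    return o0 | (o1 << 1) | (o2 << 2)


def counter_750(i0, i1, i2, i3):
    """3-bit two's complement value of i0 + i1 - i2 - i3 (assumed wiring)."""
    a0, a1 = half_adder(i0, i1)
    b0, b1 = half_adder(i2, i3)
    a = a0 | (a1 << 1)
    b = b0 | (b1 << 1)
    return (a - b) & 0b111          # the 2-bit subtractor is modelled arithmetically


def to_signed(value, bits=3):
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def wire_inputs(row, signals):
    """Order signals so "+1" entries come first (i0, i1) and "-1" entries last."""
    plus = [s for m, s in zip(row, signals) if m == 1]
    minus = [s for m, s in zip(row, signals) if m == -1]
    return plus + minus


# Exhaustive check over all 16 input combinations.
for bits in product((0, 1), repeat=4):
    assert counter_700(*bits) == sum(bits)
    assert to_signed(counter_750(*bits)) == bits[0] + bits[1] - bits[2] - bits[3]

print(wire_inputs([1, -1, 1, -1], ["c1", "c2", "c3", "c4"]))
```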
- FIGS. 8A-8B show more detailed block diagrams of the 8-input counters used in this disclosure, according to some embodiments.
- FIG. 8A shows one example 8-input counter 800 used as the first counter corresponding to the first row of the 8×8 Walsh matrix [1 1 1 1 1 1 1 1], as described with respect to FIGS. 2A, 3, and 4 .
- the 8-input counter 800 receives 8 input signals i 0 , i 1 , i 2 , i 3 , i 4 , i 5 , i 6 , and i 7 (e.g., pulse width signals c 1 , c 2 , c 3 , c 4 , c 5 , c 6 , c 7 , and c 8 , respectively).
- the 8-input counter 800 generates the increment signal INCR 1 as the 4-bit output (O 0 , O 1 , O 2 , and O 3 ).
- the 8-input counter 800 includes the full adders 802 , 804 , 806 , and 808 .
- the 8-input counter 800 further includes the half adders 810 , 812 , and 814 . These adders are connected as shown in FIG. 8A .
- FIG. 8B shows one example 8-input counter 850 used as a subsequent counter corresponding to a subsequent row of the 8×8 Walsh matrix, as described with respect to FIGS. 2A, 3, and 4 .
- the 8-input counter 850 receives 8 input signals i 0 , i 1 , i 2 , i 3 , i 4 , i 5 , i 6 , and i 7 .
- the 8-input counter 850 generates the increment signal as a 4-bit (O 0 , O 1 , O 2 , and O 3 ) output, which represents i 0 +i 1 +i 2 +i 3 −i 4 −i 5 −i 6 −i 7 .
- the 8-input counter 850 includes the 4-input counters 852 and 854 .
- the 4-input counters 852 and 854 may be implemented the same as the 4-input counter 700 , as described with respect to FIG. 7A .
- the 8-input counter 850 further includes a 3-bit subtractor 856 known in the art. These components are connected as shown in FIG. 8B .
- the 3-bit subtractor 856 receives A and B as the inputs, and the output of the 3-bit subtractor is A-B, represented by the 4-bit (O 0 , O 1 , O 2 , and O 3 ) output.
- the 8-input counter 850 may be implemented for any of the subsequent counters (i.e., the counters corresponding to the second, the third, the fourth, . . . , or the 8th row of the 8×8 Walsh matrix).
- the pulse width signals that correspond to “1” entries may be the input signals i 0 , i 1 , i 2 , and i 3 , respectively.
- the pulse width signals that correspond to “−1” entries may be the input signals i 4 , i 5 , i 6 , and i 7 , respectively.
- c 1 , c 3 , c 5 , and c 7 may be the input signals i 0 , i 1 , i 2 , and i 3 , respectively.
- c 2 , c 4 , c 6 , and c 8 may be the input signals i 4 , i 5 , i 6 , and i 7 , respectively, for generating the increment signal INCR 2 (c 1 +c 3 +c 5 +c 7 −c 2 −c 4 −c 6 −c 8 ).
- FIGS. 9A-9C show more detailed block diagrams of the 16-input counters used in this disclosure, according to some embodiments.
- FIG. 9A shows one example 16-input counter 900 used as the first counter corresponding to the first row of the 16×16 Walsh matrix [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1], as described with respect to FIGS. 2A, 3, and 4 .
- the 16-input counter 900 receives 16 input signals i 0 , i 1 , i 2 , i 3 , i 4 , i 5 , i 6 , i 7 , i 8 , i 9 , i 10 , i 11 , i 12 , i 13 , i 14 , and i 15 (e.g., pulse width signals c 1 , c 2 , c 3 , c 4 , c 5 , c 6 , c 7 , c 8 , c 9 , c 10 , c 11 , c 12 , c 13 , c 14 , c 15 , and c 16 , respectively).
- the 16-input counter 900 generates the increment signal INCR 1 as the 5-bit output (O 0 , O 1 , O 2 , O 3 , and O 4 ).
- the 16-input counter 900 includes the full adders 902 , 904 , 906 , 908 , 910 , 912 , 914 , 916 , 918 , 920 , and 922 .
- the 16-input counter 900 further includes the half adders 924 , 926 , 928 , and 930 . These adders are connected as shown in FIG. 9A .
- FIG. 9B shows another example 16-input counter 930 used as the first counter corresponding to the first row of the 16×16 Walsh matrix [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1], as described with respect to FIGS. 2A, 3, and 4 .
- the 16-input counter 930 includes two 8-input counters 932 and 934 .
- the 16-input counter 930 further includes a 4-bit adder 936 .
- Each of the two 8-input counters 932 and 934 may be designed the same as the 8-input counter as described with respect to FIG. 8A .
- the first 8 input signals (e.g., c 1 , c 2 , c 3 , c 4 , c 5 , c 6 , c 7 , and c 8 ) of the 16 input signals are the inputs of the 8-input counter 932 .
- the second 8 input signals (e.g., c 9 , c 10 , c 11 , c 12 , c 13 , c 14 , c 15 , and c 16 ) of the 16 input signals are the inputs of the 8-input counter 934 .
- the 4-bit output of the 8-input counter 934 is the 4-bit input A of the 4-bit adder 936 .
- the 4-bit output of the 8-input counter 932 is the 4-bit input B of the 4-bit adder 936 .
- the 4-bit adder 936 performs an addition operation on the two inputs A and B and generates a 5-bit output representing the sum of A and B.
- Hardware design and implementation of the 4-bit adder are known in the art.
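- The composition in FIG. 9B can be sketched behaviourally. The 8-input counters are modelled here simply as population counts with a 4-bit output, and the 4-bit adder as a ripple-carry chain of full adders; both are assumptions made for illustration, since the disclosure leaves the adder implementation to the known art. Bit lists are least significant bit first.

```python
def full_adder(a, b, cin):
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


def counter_8(bits):
    """Behavioural 8-input counter: 4-bit count of inputs that are high."""
    n = sum(bits)
    return [(n >> k) & 1 for k in range(4)]


def adder_4bit(a, b):
    """Ripple-carry addition of two 4-bit values, giving a 5-bit result."""
    out, carry = [], 0
    for x, y in zip(a, b):
        s, carry = full_adder(x, y, carry)
        out.append(s)
    return out + [carry]


def counter_930(bits16):
    """16-input counter built from two 8-input counters and a 4-bit adder."""
    b_in = counter_8(bits16[:8])     # counter 932, adder input B
    a_in = counter_8(bits16[8:])     # counter 934, adder input A
    return adder_4bit(a_in, b_in)


def to_int(bits):
    return sum(b << k for k, b in enumerate(bits))


print(to_int(counter_930([1] * 16)))
```

- The fifth output bit is needed because all 16 inputs can be high at once, and 16 does not fit in 4 bits.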
- FIG. 9C shows one example 16-input counter 950 used as a subsequent counter corresponding to a subsequent row of the 16×16 Walsh matrix, as described with respect to FIGS. 2A, 3, and 4 .
- the 16-input counter 950 receives 16 input signals i 0 , i 1 , i 2 , i 3 , i 4 , . . . , and i 15 .
- the 16-input counter 950 generates the increment signal as a 5-bit (O 0 , O 1 , O 2 , O 3 , and O 4 ) output, which represents i 0 +i 1 +i 2 +i 3 +i 4 +i 5 +i 6 +i 7 −(i 8 +i 9 +i 10 +i 11 +i 12 +i 13 +i 14 +i 15 ).
- the 16-input counter 950 includes the 8-input counters 952 and 954 .
- the 8-input counters 952 and 954 may be implemented the same as the 8-input counter 800 , as described with respect to FIG. 8A .
- the 16-input counter 950 further includes a 4-bit subtractor 956 known in the art. These components are connected as shown in FIG.
- the 4-bit subtractor 956 receives A and B as the inputs, and the output of the 4-bit subtractor is A-B, represented by the 5-bit (O 0 , O 1 , O 2 , O 3 , and O 4 ) output.
- the 16-input counter 950 may be implemented for any of the subsequent counters (i.e., the counters corresponding to the second, the third, the fourth, . . . , or the sixteenth row of the 16×16 Walsh matrix).
- the pulse width signals that correspond to “1” entries may be the input signals i 0 , i 1 , i 2 , i 3 , . . . , and i 7 , respectively.
- the pulse width signals that correspond to “−1” entries may be the input signals i 8 , i 9 , i 10 , i 11 , . . . , and i 15 , respectively.
- c 1 , c 3 , c 5 , c 7 , c 9 , c 11 , c 13 , and c 15 may be the input signals i 0 , i 1 , i 2 , i 3 , i 4 , i 5 , i 6 , and i 7 , respectively.
- c 2 , c 4 , c 6 , c 8 , c 10 , c 12 , c 14 , and c 16 may be the input signals i 8 , i 9 , i 10 , i 11 , i 12 , i 13 , i 14 , and i 15 , respectively, for generating the increment signal INCR 2 (c 1 +c 3 +c 5 +c 7 +c 9 +c 11 +c 13 +c 15 −c 2 −c 4 −c 6 −c 8 −c 10 −c 12 −c 14 −c 16 ).
- the counters described with respect to FIGS. 7A, 8A, and 9A may be used as the first counter for the Haar Wavelet Transform, corresponding to the first row having all “1” entries.
- the 8-input counter 850 as described with respect to FIG. 8B may be used.
- the 8-input counter 850 may be used for the second counter corresponding to the second row of the 8×8 Haar matrix to generate the increment signal INCR 2 (c 1 +c 2 +c 3 +c 4 −c 5 −c 6 −c 7 −c 8 ).
- the 4-input counter 750 as described with respect to FIG. 7B may be used.
- the 4-input counter 750 may be used for the third counter corresponding to the third row of the 8×8 Haar matrix to generate the increment signal INCR 3 (c 1 +c 2 −c 3 −c 4 ).
- a 1-bit subtractor known in the art may be used.
- the 1-bit subtractor may be used for the fifth counter corresponding to the fifth row of the 8×8 Haar matrix to generate the increment signal INCR 5 (c 1 −c 2 ).
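- The counter choices above follow from how many non-zero entries each Haar row has. The sketch below builds an unnormalized 8×8 Haar matrix with entries 1, −1, and 0 (an assumption consistent with the increment signals listed above; scaling factors are not modelled) and reports the number of inputs each row's counter needs. The function name `haar` is illustrative.

```python
def haar(n):
    """Unnormalized n x n Haar matrix; n is assumed to be a power of two."""
    rows = [[1] * n]                      # first row: all "1" entries
    width = n
    while width >= 2:
        half = width // 2
        for start in range(0, n, width):
            row = [0] * n
            for k in range(start, start + half):
                row[k] = 1
            for k in range(start + half, start + width):
                row[k] = -1
            rows.append(row)
        width = half
    return rows


H = haar(8)
for index, row in enumerate(H, start=1):
    inputs = sum(1 for m in row if m != 0)
    print(index, row, "counter inputs:", inputs)
```

- Rows with fewer non-zero entries need smaller counters, which is why the 8-input counter, the 4-input counter, and a 1-bit subtractor each suffice for different rows.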
- FIG. 10 shows the block diagram of one example embodiment accumulator 1000 that may be implemented for the accumulators 206 in FIG. 2A , the accumulators 306 in FIG. 3 , and the accumulators 428 in FIG. 4 .
- the accumulator 1000 comprises the inverters 1002 A-D, the multiplexers 1004 A-D, the full adders 1006 A-D, and the flip-flops 1008 A-D known in the art.
- the flip-flops 1008 A-D may be D flip-flops.
- a D flip-flop is an edge-triggered memory circuit.
- the D flip-flop has three inputs: a data input (D) that defines the next state, a timing control input (CLK) that tells the flip-flop exactly when to “memorize” the data input, and a reset input (RST) that can cause the memory to be reset to 0 regardless of the other two inputs (usually referred to as asynchronous reset).
- the output of a D flip-flop is Q.
- Accumulator 1000 is a 4-bit accumulator receiving an increment signal represented as 4 bits (I 0 , I 1 , I 2 , and I 3 ). The accumulator 1000 counts up or down based on signal D, on each rising clock edge, representing the resulting number in 2's complement binary notation. The output of the accumulator is the accumulated signal D.
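- A behavioural sketch of a 4-bit two's complement accumulator is given below. It models only the register contents after each rising clock edge, not the inverter, multiplexer, and full adder structure of FIG. 10; the class and method names are hypothetical.

```python
class Accumulator:
    """n-bit register that adds a signed increment on each rising clock edge."""

    def __init__(self, bits=4):
        self.bits = bits
        self.q = 0                                  # flip-flop outputs

    def reset(self):
        self.q = 0                                  # models the RST input

    def clock(self, increment):
        mask = (1 << self.bits) - 1
        self.q = (self.q + increment) & mask        # wraps like real hardware

    def value(self):
        """Interpret the stored bits as a two's complement number."""
        sign = 1 << (self.bits - 1)
        return (self.q ^ sign) - sign


acc = Accumulator(bits=4)
for increment in [1, 1, -1, -1, -1]:
    acc.clock(increment)
print(format(acc.q, "04b"), acc.value())
```

- With 4 bits the representable range is −8 to 7, so a real design would size the accumulator for the largest sum expected over the accumulation period.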
- FIG. 11 shows the block diagram of one example embodiment clock divider 1100 that may be implemented for the clock divider 312 in FIG. 3 and the clock divider 432 in FIG. 4 .
- the clock divider 1100 is a D type flip-flop clock divider.
- the clock divider 1100 comprises inverters 1102 A-E and flip-flops 1104 A-E, connected as shown in FIG. 11 .
- the clock rate to each accumulator does not have to be a factor of 2 of the fastest clock. So, other clocking circuits may be used.
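- A divide-by-two chain of D flip-flops can be sketched as follows. The wiring assumed here (each flip-flop's inverted output fed back to its D input, each stage clocked by the previous stage's output) is the common ripple divider and may differ from the connections in FIG. 11; three stages are simulated to obtain CLK/2, CLK/4, and CLK/8.

```python
def divide_clock(clk, stages):
    """Return one waveform per stage; stage k runs at 1/2**(k+1) of the input rate."""
    outputs = []
    source = clk
    for _ in range(stages):
        q, prev, wave = 0, 0, []
        for level in source:
            if prev == 0 and level == 1:    # rising edge on this stage's clock
                q = 1 - q                   # D = NOT Q through the inverter
            wave.append(q)
            prev = level
        outputs.append(wave)
        source = wave                       # next stage is clocked by this output
    return outputs


def rising_edges(wave):
    return sum(1 for a, b in zip([0] + wave, wave) if a == 0 and b == 1)


clk = [0, 1] * 16                           # 16 system clock cycles
divided = divide_clock(clk, stages=3)
print(rising_edges(clk), [rising_edges(w) for w in divided])
```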
Abstract
Description
- The present invention relates generally to a system and method for signal processing designs, and, in particular embodiments, to a transform computation apparatus and method.
- Generally, a transform is a mathematical operation which maps signals between two different domains. For example, the Discrete Fourier Transform (DFT) maps sampled time domain signals to the frequency domain signals. The changed signal properties after a transform operation can be exploited for various purposes such as analysis of the signals. Many types of transforms exist. In another example, the Discrete Cosine Transform (DCT), which is the basis for Joint Photographic Experts Group (JPEG) compression, exploits the 2D image signal sparsity in the transformed domain. In yet another example, the Discrete Wavelet Transform (DWT), which is the basis for the JPEG2000 compressed image file format, discretely samples the wavelets. The Haar Wavelet Transform is one form of the DWT.
- Calculating a transform is a computationally expensive task, often requiring a huge amount of processing operations, arithmetic blocks such as adders and/or multipliers, memory space, die area, and energy consumption. Thus, systems and methods that improve the calculation of transforms are desired. This disclosure relates to improvement of apparatus and method for calculation of various Discrete Transforms.
- In accordance with embodiments, a first counter of a plurality of counters of an apparatus receives a plurality of pulse width signals in the time domain. The first counter generates a first increment signal in the time domain from the plurality of pulse width signals based on a first row of a Discrete Transform matrix. A synchronizer of the apparatus receives the first increment signal. The synchronizer generates a first synchronized increment signal in the time domain from the first increment signal. A first accumulator of a plurality of accumulators of the apparatus receives the first synchronized increment signal. The first accumulator accumulates the first synchronized increment signal over a period of time to generate a first frequency domain signal.
- In some embodiments, the plurality of counters may further comprise a second counter. The second counter may receive the plurality of pulse width signals in the time domain. The second counter may then generate a second increment signal in the time domain from the plurality of pulse width signals based on a second row of the Discrete Transform matrix. The synchronizer may receive the second increment signal. The synchronizer may then generate a second synchronized increment signal in the time domain from the second increment signal. The plurality of accumulators may further comprise a second accumulator. The second accumulator may receive the second synchronized increment signal. The second accumulator may then accumulate the second synchronized increment signal over the period of time to generate a second frequency domain signal.
- In some embodiments, the number of the plurality of pulse width signals may equal a number of the plurality of counters. The plurality of counters may include N counters including the first counter, and an i-th counter of the plurality of counters may receive the plurality of pulse width signals in the time domain. The i-th counter may generate an i-th increment signal in the time domain from the plurality of pulse width signals based on an i-th row of the Discrete Transform matrix. The Discrete Transform matrix may be an N×N Discrete Transform matrix. The synchronizer may receive the i-th increment signal. The synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal. The plurality of accumulators may further comprise N accumulators including the first accumulator. An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal. The i-th accumulator may then accumulate the i-th synchronized increment signal over the period of time to generate an i-th frequency domain signal.
- In some embodiments, the number of the plurality of counters may equal one of 4, 8, or 16. In some embodiments, the plurality of counters may process the plurality of pulse width signals in parallel. In some embodiments, the number of the plurality of counters may equal 4. The first row of the Discrete Transform matrix may be [1, 1, 1, 1]. The plurality of pulse width signals may comprise a first pulse width signal, a second pulse width signal, a third pulse width signal, and a fourth pulse width signal. The first increment signal may comprise an addition of the first pulse width signal, the second pulse width signal, the third pulse width signal, and the fourth pulse width signal in the time domain.
- In some embodiments, the apparatus may further comprise a clock divider. The divider may set a first clock rate. The first clock rate may be a first fraction of a system clock rate. The clock divider may feed the first clock rate to the first accumulator through a first multiplexor. In some embodiments, the plurality of accumulators may further comprise N accumulators including the first accumulator. The clock divider may set an i-th clock rate. The i-th clock rate may be an i-th fraction of the system clock rate for an i-th accumulator of the N accumulators. The clock divider may feed the i-th clock rate to the i-th accumulator through an i-th multiplexor.
- In some embodiments, the Discrete Transform matrix may be one of a Walsh matrix, or a Haar matrix.
- In accordance with embodiments, a time domain Discrete Transform block of an apparatus receives N pulse width signals. The time domain Discrete Transform block generates N frequency domain signals. An output module of the apparatus stores or transmits information associated with the N frequency domain signals.
- In some embodiments, the information associated with the N frequency domain signals may be the N frequency domain signals.
- In some embodiments, the apparatus may further comprise a run length encoder. The run length encoder may run length encode the N frequency domain signals to generate run length encoded signals. The apparatus may further comprise an entropy encoder. The entropy encoder may entropy encode the run length encoded signals to generate entropy encoded signals. The information associated with the N frequency domain signals may be the entropy encoded signals.
- In some embodiments, the N frequency domain signals may be N quantized frequency domain signals.
- In some embodiments, the time domain Discrete Transform block may comprise N counters. An i-th counter of the N counters may receive the N pulse width signals in the time domain. The i-th counter of the N counters may generate an i-th increment signal in the time domain from the N pulse width signals based on an i-th row of a Discrete Transform matrix. The Discrete Transform matrix may be an N×N Discrete Transform matrix.
- In some embodiments, the time domain Discrete Transform block may further comprise a synchronizer. The synchronizer may receive the i-th increment signal. The synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal. The time domain Discrete Transform block may further comprise N accumulators. An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal. The i-th accumulator of the N accumulators may accumulate the i-th synchronized increment signal over a period of time to generate an i-th frequency domain signal.
- In some embodiments, N may be one of 4, 8, or 16. In some embodiments, the apparatus may comprise a plurality of N comparators. The plurality of N comparators may receive outputs from N pixels and generate the N pulse width signals from the N pixels.
- In some embodiments, the apparatus may be an image sensor readout device.
- The foregoing has outlined rather broadly the features of an embodiment of the present disclosure in order that the detailed description of the disclosure that follows may be better understood. Additional features and advantages of embodiments of the disclosure will be described hereinafter, which form the subject of the claims of the disclosure. It should be appreciated by those skilled in the art that the conception and specific embodiments disclosed may be readily utilized as a basis for modifying or designing other structures or processes for carrying out the same purposes of the present disclosure. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the disclosure as set forth in the appended claims.
- For a more complete understanding of the present disclosure, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
-
FIG. 1 illustrates a conventional system that performs the Walsh Transform; -
FIG. 2A illustrates a time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) for improved Discrete Transform computation, according to some embodiments; -
FIG. 2B shows an example waveform diagram of the signals involved in the computation performed by the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block), according to some embodiments; -
FIG. 3 illustrates a time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) with coefficient quantization for compression; -
FIG. 4 illustrates an image readout device using a Walsh Transform block to read out a number of pixels simultaneously, according to some embodiments; -
FIG. 5 illustrates a flow chart of a method for performing time domain Discrete Transform, according to some embodiments; -
FIG. 6 illustrates a flow chart of a method for performing image sensor readout using the Discrete Transform based compression, according to some embodiments; -
FIGS. 7A-7B illustrate block diagrams of the 4-input counters used in this disclosure, according to some embodiments; -
FIGS. 8A-8B illustrate block diagrams of the 8-input counters used in this disclosure, according to some embodiments; -
FIGS. 9A-9C illustrate block diagrams of the 16-input counters used in this disclosure, according to some embodiments; -
FIG. 10 shows a block diagram of one example embodiment accumulator; and -
FIG. 11 shows a block diagram of one example embodiment clock divider. - Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated. The figures are drawn to clearly illustrate the relevant aspects of the embodiments and are not necessarily drawn to scale.
- This application relates to improvement of apparatus and method for discrete transform, such as the Walsh Transform. The Walsh Transform is also known as the Hadamard Transform, Walsh-Hadamard Transform, Hadamard-Rademacher-Walsh Transform, or Walsh-Fourier Transform. The goal of performing the Walsh Transform is to compress the signal by removing redundant data. The Walsh Transform itself is mathematically reversible and lossless.
- Conventional methods for the Walsh Transform require the input to be already digitized. Such requirement assumes some form of analog-to-digital converter (ADC), or time-to-digital converter (TDC). The ADC and TDC typically have high power consumption.
-
FIG. 1 shows a conventional system 100 that performs the Walsh Transform using the Fast Walsh Transform algorithm. In the system 100, the Fast Walsh Transform module 104 takes digital signals and converts the digital signals to frequency domain signals h1, h2, h3, and h4. However, oftentimes, the input vector comprises time domain signals, such as pulse width signals c1, c2, c3, and c4. Pulse width signals are commonly used to represent, in the time domain, light intensity incident on a pixel in an image sensor or voltage when combined with a voltage controlled delay unit (VCDU). So, a time-to-digital converter TDC 102 is required in the system 100. The TDC 102 converts the input signals c1, c2, c3, and c4 in the time domain to the digital signals d1, d2, d3, and d4, respectively. The digital signals d1, d2, d3, and d4 are digital representations of the input signals c1, c2, c3, and c4 in the time domain, respectively. Then, the Fast Walsh Transform module 104 takes the digital signals d1, d2, d3, and d4 and converts them into frequency domain signals h1, h2, h3, and h4 (labelled as h<1:4> in FIG. 1). - As explained above, the TDC and ADC typically have high power consumption. Also, like systems for many other types of transforms, the
conventional system 100 for the Walsh Transform requires a huge amount of processing operations, arithmetic blocks, memory space, die area, and energy consumption. Embodiments of this disclosure provide methods and apparatuses for performing the Walsh Transform computation on the pulse width signals in the time domain during the conversion process. In so doing, embodiments of this disclosure provide technical improvement over the conventional Walsh Transform system by reducing the amount of processing operations, arithmetic blocks, memory usage, die area, and power consumption. -
FIG. 2A shows a system 200 as an example time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) for improved Walsh Transform computation, according to some embodiments. The system 200 comprises multiple parallel counters 202, a synchronizer 204, and multiple accumulators 206. The number of the parallel counters 202 and the number of the accumulators 206 depend on the number of the parallel input signals. For example, the system 200 may require N parallel counters 202 and N accumulators 206 for processing N parallel input signals. - For illustration purposes,
FIG. 2A shows 4 parallel input signals, such as 4 parallel pulse width signals c1, c2, c3, and c4. So, the system 200 comprises 4 parallel counters 202 and 4 accumulators 206. FIG. 2A provides a non-limiting example embodiment system 200 for processing 4 parallel input signals. If there are 8 or 16 parallel input signals, the system 200 may comprise 8 parallel counters and 8 accumulators, or 16 parallel counters and 16 accumulators, respectively, and so on. - The input of each of the
parallel counters 202 includes all the parallel input signals (e.g., c1, c2, c3, and c4). Each of the parallel counters 202 takes all the parallel input signals and performs addition/subtraction of the parallel input signals according to the Walsh matrix. The Walsh matrix used depends on the number of parallel counters. For a system of N parallel counters for processing N parallel input signals, an N×N Walsh matrix is used. Specifically, the first counter performs addition/subtraction of the parallel input signals according to the first row of the Walsh matrix, the i-th counter performs addition/subtraction of the parallel input signals according to the i-th row of the Walsh matrix, and the N-th counter performs addition/subtraction of the parallel input signals according to the N-th row of the Walsh matrix. The output of each counter is an increment signal that is the digital representation of the result of addition/subtraction of the parallel input signals according to the corresponding row of the Walsh matrix. -
FIG. 2A shows an example system 200 having 4 parallel counters taking 4 pulse width signals in the time domain (c1, c2, c3, and c4) as the input signals. So, each counter uses a corresponding row of the 4×4 Walsh matrix below to perform addition/subtraction of the pulse width signals c1, c2, c3, and c4. -
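The matrix itself is not reproduced in this text. Written out from the four rows that the description of the individual counters cites ([1 1 1 1], [1 −1 1 −1], [1 1 −1 −1], and [1 −1 −1 1]), it would read as follows; the symbol W_4 is notation introduced here for convenience and does not appear in the original.

```latex
W_4 =
\begin{bmatrix}
1 &  1 &  1 &  1 \\
1 & -1 &  1 & -1 \\
1 &  1 & -1 & -1 \\
1 & -1 & -1 &  1
\end{bmatrix}
```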
- The
value 1 in an entry of the Walsh matrix indicates addition of the corresponding input pulse width signal, and the value −1 in an entry of the Walsh matrix indicates subtraction of the corresponding input pulse width signal. For example, the first counter of the parallel counters 202 uses the first row of the 4×4 Walsh matrix, [1 1 1 1]. So the output of the first counter is a generated first increment signal INCR1, which is a digital representation of (c1+c2+c3+c4). - The second counter of the
parallel counters 202 uses the second row of the 4×4 Walsh matrix, [1 −1 1 −1]. Because the first entry and the third entry of this row [1 −1 1 −1] are 1, the second counter performs addition for the first and the third input pulse width signals (c1 and c3). Also, because the second entry and the fourth entry of this row [1 −1 1 −1] are −1, the second counter performs subtraction for the second and the fourth input pulse width signals (c2 and c4). That is, the output of the second counter is a generated second increment signal INCR2, which is the digital representation of (c1+c3−c2−c4). - The third counter of the
parallel counters 202 uses the third row of the 4×4 Walsh matrix, [1 1 −1 −1]. Because the first entry and the second entry of this row [1 1 −1 −1] are 1, the third counter performs addition for the first and the second input pulse width signals (c1 and c2). Also, because the third entry and the fourth entry of this row [1 1 −1 −1] are −1, the third counter performs subtraction for the third and the fourth input pulse width signals (c3 and c4). That is, the output of the third counter is a generated third increment signal INCR3, which is the digital representation of (c1+c2−c3−c4). - The fourth counter of the
parallel counters 202 uses the fourth row of the 4×4 Walsh matrix, [1 −1 −1 1]. Because the first entry and the fourth entry of this row [1 −1 −1 1] are 1, the fourth counter performs addition for the first and the fourth input pulse width signals (c1 and c4). Also, because the second entry and the third entry of this row [1 −1 −1 1] are −1, the fourth counter performs subtraction for the second and the third input pulse width signals (c2 and c3). That is, the output of the fourth counter is a generated fourth increment signal INCR4, which is the digital representation of (c1+c4−c2−c3).
- Embodiment systems using the Walsh matrix of another size can be designed similarly, as understood by people skilled in the art. For example, for an embodiment system of 8 parallel counters processing 8 parallel input pulse width signals (e.g., pulse width signals c1, c2, c3, c4, c5, c6, c7, and c8), each counter of the 8 parallel counters uses the corresponding row of the 8×8 Walsh matrix below.
-
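The 8×8 Walsh matrix is not reproduced in this text. Written out from the eight increment signals listed in this section (+1 for an added signal, −1 for a subtracted one), it would read as follows; the symbol W_8 is notation introduced here.

```latex
W_8 =
\begin{bmatrix}
1 &  1 &  1 &  1 &  1 &  1 &  1 &  1 \\
1 & -1 &  1 & -1 &  1 & -1 &  1 & -1 \\
1 &  1 & -1 & -1 &  1 &  1 & -1 & -1 \\
1 & -1 & -1 &  1 &  1 & -1 & -1 &  1 \\
1 &  1 &  1 &  1 & -1 & -1 & -1 & -1 \\
1 & -1 &  1 & -1 & -1 &  1 & -1 &  1 \\
1 &  1 & -1 & -1 & -1 & -1 &  1 &  1 \\
1 & -1 & -1 &  1 & -1 &  1 &  1 & -1
\end{bmatrix}
```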
- The 8 increment signals generated by the 8 parallel counters, respectively, are shown below.
- INCR1=c1+c2+c3+c4+c5+c6+c7+c8
- INCR2=c1+c3+c5+c7−c2−c4−c6−c8
- INCR3=c1+c2+c5+c6−c3−c4−c7−c8
- INCR4=c1+c4+c5+c8−c2−c3−c6−c7
- INCR5=c1+c2+c3+c4−c5−c6−c7−c8
- INCR6=c1+c3+c6+c8−c2−c4−c5−c7
- INCR7=c1+c2+c7+c8−c3−c4−c5−c6
- INCR8=c1+c4+c6+c7−c2−c3−c5−c8
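The row ordering used for the 4-input and 8-input cases above matches the standard Sylvester (natural-order) construction of the Hadamard matrix, so the increment expressions for larger N can be generated mechanically. The following Python sketch illustrates this; the function names `hadamard` and `increment_expression` are hypothetical helpers introduced here, and the sketch models only the arithmetic, not the counter hardware.

```python
def hadamard(n):
    """Return the n x n Walsh (Hadamard) matrix in natural order; n must be a power of 2."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    rows = [[1]]
    while len(rows) < n:
        # Sylvester step: H_2k = [[H_k, H_k], [H_k, -H_k]]
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def increment_expression(row):
    """Format one matrix row as an increment expression, added terms first."""
    plus = "".join(f"+c{j + 1}" for j, w in enumerate(row) if w == 1)
    minus = "".join(f"-c{j + 1}" for j, w in enumerate(row) if w == -1)
    return (plus + minus).lstrip("+")


if __name__ == "__main__":
    # Print the increment signal computed by each of the 8 parallel counters.
    for i, row in enumerate(hadamard(8), start=1):
        print(f"INCR{i}={increment_expression(row)}")
```

Calling `hadamard(16)` in the same way would give the rows for a 16-counter embodiment, under the assumption that the same natural ordering is kept.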
- The
system 200 is not limited to Walsh Transform computation. The same system 200 in FIG. 2A may be used to compute other forms of the discrete transform. For example, system 200 in FIG. 2A may be used to compute the Haar Wavelet Transform using the following 4×4 Haar matrix for a system with 4 parallel counters. -
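The 4×4 Haar matrix is not reproduced in this text. Written out from the four Haar increment signals listed in this section (+1 for an added signal, −1 for a subtracted one, 0 for a signal that is not counted), it would read as follows; the symbol H_4 is notation introduced here.

```latex
H_4 =
\begin{bmatrix}
1 &  1 &  1 &  1 \\
1 &  1 & -1 & -1 \\
1 & -1 &  0 &  0 \\
0 &  0 &  1 & -1
\end{bmatrix}
```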
- The value 1 in an entry indicates addition of the corresponding input pulse width signal, the value −1 in an entry indicates subtraction of the corresponding input pulse width signal, and the
value 0 in an entry indicates the corresponding input pulse width signal is not counted. The 4 increment signals generated by the 4 parallel counters for the Haar Wavelet Transform, respectively, are shown below.
- INCR1=c1+c2+c3+c4
- INCR2=c1+c2−c3−c4
- INCR3=c1−c2
- INCR4=c3−c4
- For an embodiment system of 8 parallel counters processing 8 parallel input pulse width signals (e.g., pulse width signals c1, c2, c3, c4, c5, c6, c7, and c8), each counter of the 8 parallel counters uses the corresponding row of the 8×8 Haar matrix below.
-
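The 8×8 Haar matrix is not reproduced in this text. Written out from the eight Haar increment signals listed in this section, it would read as follows; the symbol H_8 is notation introduced here.

```latex
H_8 =
\begin{bmatrix}
1 &  1 &  1 &  1 &  1 &  1 &  1 &  1 \\
1 &  1 &  1 &  1 & -1 & -1 & -1 & -1 \\
1 &  1 & -1 & -1 &  0 &  0 &  0 &  0 \\
0 &  0 &  0 &  0 &  1 &  1 & -1 & -1 \\
1 & -1 &  0 &  0 &  0 &  0 &  0 &  0 \\
0 &  0 &  1 & -1 &  0 &  0 &  0 &  0 \\
0 &  0 &  0 &  0 &  1 & -1 &  0 &  0 \\
0 &  0 &  0 &  0 &  0 &  0 &  1 & -1
\end{bmatrix}
```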
- The 8 increment signals generated by the 8 parallel counters, respectively, are shown below.
- INCR1=c1+c2+c3+c4+c5+c6+c7+c8
- INCR2=c1+c2+c3+c4−c5−c6−c7−c8
- INCR3=c1+c2−c3−c4
- INCR4=c5+c6−c7−c8
- INCR5=c1−c2
- INCR6=c3−c4
- INCR7=c5−c6
- INCR8=c7−c8
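The Haar rows listed above follow a regular pattern: one all-ones row, then rows that split progressively smaller groups of inputs into an added half and a subtracted half. A minimal Python sketch of that pattern is shown below. The function names `haar_rows` and `increment_expression` are hypothetical, and the unnormalized {1, −1, 0} entries are taken from the increment signals above rather than from any normalized Haar definition.

```python
def haar_rows(n):
    """Return the n x n unnormalized Haar matrix with entries in {1, -1, 0}."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of 2 and at least 2")
    rows = [[1] * n]          # first row: all inputs are added
    span = n
    while span >= 2:
        half = span // 2
        for start in range(0, n, span):
            row = [0] * n     # inputs outside the group are not counted
            for j in range(start, start + half):
                row[j] = 1    # first half of the group is added
            for j in range(start + half, start + span):
                row[j] = -1   # second half of the group is subtracted
            rows.append(row)
        span = half
    return rows


def increment_expression(row):
    """Format one matrix row as an increment expression, skipping 0 entries."""
    plus = "".join(f"+c{j + 1}" for j, w in enumerate(row) if w == 1)
    minus = "".join(f"-c{j + 1}" for j, w in enumerate(row) if w == -1)
    return (plus + minus).lstrip("+")


if __name__ == "__main__":
    for i, row in enumerate(haar_rows(8), start=1):
        print(f"INCR{i}={increment_expression(row)}")
```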
- The outputs from the parallel counters 202 (i.e., increment signals) are integrated using the
accumulators 206. Each of the accumulators 206 performs addition/subtraction of the corresponding increment signal over time to generate a frequency domain signal h. - The accumulator needs the increment signals to be synchronized. To reduce the complexity of the accumulators, a
synchronizer 204 may be included between the parallel counters 202 and the accumulators 206. The inputs of the synchronizer 204 are the increment signals (e.g., INCR1, INCR2, INCR3, and INCR4) output by the parallel counters 202. The synchronizer 204 synchronizes the increment signals and outputs the synchronized increment signals. Hardware and circuitry for performing signal synchronization are well known to people skilled in the art. - The input of the
accumulators 206 are the synchronized increment signals. The accumulators 206 perform integration of the increment signals. Hardware and circuitry for the accumulator performing signal integration are well known to people skilled in the art. - The result of the integration is the outputs of the
accumulators 206, which may be represented in two's complement. Two's complement is a mathematical operation on binary numbers. The two's complement is calculated by inverting the digits and adding one. For example, for a four-bit binary number 0001, the two's complement is 1111 (inversion of 0001 is 1110, and 1110 plus 1 is 1111). In FIG. 2A, the 4 outputs of the 4 corresponding accumulators 206 are the four frequency domain signals, h1, h2, h3, and h4, respectively, which are the same as the outputs of the Fast Walsh Transform module 104 as shown in FIG. 1 (labeled as h<1:4> in both FIG. 1 and FIG. 2A). - As shown in
FIG. 2A, the main difference between the embodiment system 200 and the conventional system 100 is that the system 200 combines time-to-digital conversion with the discrete transform computation (e.g., Walsh Transform computation or Haar Wavelet Transform computation). With the embodiment system and method described above, the increment signal generation and integration can be performed in the time domain. - As shown in
FIG. 2A, there is no need for a TDC prior to the embodiment Walsh Transform block of the system 200. Such embodiment technique reduces power consumption of the system over the conventional systems. The Walsh Transform encodes a signal as a sum of scalar components multiplied by corresponding rows of the Walsh Matrix. In some embodiments, accumulation operations only occur when an increment signal is not 0. For example, if the input signals are correlated, the increment signals are often 0. So, no accumulation is needed in these situations, which leads to further reduction of power consumption.
- As described above, the goal of performing the Walsh Transform is to compress the signals by removing redundant data. In some cases (e.g., when the input signals relate to an image), the data redundancy depends on the signal spatial frequency, and the Walsh Transform splits the input signals by spatial frequency. Different frequencies can be quantized with different resolutions. The resolution of certain frequencies can be coarser when the corresponding data is less important.
-
FIG. 2A above illustrates a time domain computation system and method of the Walsh Transform taking pulse width signals as the input. Pulse width signals are added/subtracted according to rows in the Walsh matrix (using the parallel counters 202), and the results are integrated (using up/down accumulators 206). The output of the system 200 (e.g., h1, h2, h3, and h4) is a domain transformed representation of the original input pulse width signals (e.g., c1, c2, c3, and c4). If the system 200 is applied to appropriate input signals, the output can be sparse in the Walsh domain (i.e., the frequency domain), therefore requiring less memory for storing information (lossless). In addition, the output can be suitable for further compression (e.g., lossy compression). -
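The counter and accumulator behavior summarized above can be illustrated with a small tick-by-tick model. The Python sketch below is a behavioral illustration only: the function name `simulate`, the assumption that every pulse starts at the first clock edge, and the pulse widths of 5, 4, 3, and 2 clock cycles are all hypothetical choices made here (the widths were picked so that INCR1 takes the values 4, 4, 3, 2, 1, 0 used in the FIG. 2B description). The synchronizer is not modeled.

```python
WALSH_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def simulate(matrix, widths, n_ticks):
    """Model the counters and accumulators over n_ticks clock edges.

    widths[j] is the number of clock cycles for which pulse width
    signal c(j+1) stays high (assumed to start at the first edge).
    Returns the final accumulator values and the per-tick increments.
    """
    accumulators = [0] * len(matrix)
    trace = []
    for tick in range(1, n_ticks + 1):
        # Level of each pulse width signal at this rising clock edge.
        levels = [1 if tick <= w else 0 for w in widths]
        # Each counter adds/subtracts the levels per its matrix row.
        increments = [sum(m * lv for m, lv in zip(row, levels)) for row in matrix]
        # Each accumulator integrates its increment signal over time.
        accumulators = [h + inc for h, inc in zip(accumulators, increments)]
        trace.append(increments)
    return accumulators, trace


if __name__ == "__main__":
    widths = [5, 4, 3, 2]  # hypothetical pulse widths in clock cycles
    h, trace = simulate(WALSH_4, widths, n_ticks=6)
    print("INCR1 samples:", [inc[0] for inc in trace])
    print("h<1:4>:", h)
    # The accumulated result equals the matrix-vector product W * widths.
    direct = [sum(m * w for m, w in zip(row, widths)) for row in WALSH_4]
    print("W * widths:", direct)
```

Under these assumptions the accumulated values coincide with the matrix-vector product that a TDC followed by a digital transform would produce, which is the equivalence the paragraph above describes.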
FIG. 2B is a waveform diagram of the signals involved in the computation performed by the system 200 over a period of time t1 to t6, according to some embodiments. The increment signals are sampled at the rising edges of the clock cycle, t1, t2, t3, t4, t5, and t6. For the first accumulator, at time t1, the increment signal INCR1 is sampled with the value 4, and the frequency domain signal h1 accumulated by the first accumulator is 4 (0+4=4). At time t2, the increment signal INCR1 is sampled with the value 4, and the frequency domain signal h1 accumulated by the first accumulator is 8 (4+4=8). At time t3, the increment signal INCR1 is sampled with the value 3, and the frequency domain signal h1 accumulated by the first accumulator is 11 (8+3=11). At time t4, the increment signal INCR1 is sampled with the value 2, and the frequency domain signal h1 accumulated by the first accumulator is 13 (11+2=13). At time t5, the increment signal INCR1 is sampled with the value 1, and the frequency domain signal h1 accumulated by the first accumulator is 14 (13+1=14). At time t6, the increment signal INCR1 is sampled with the value 0, and the frequency domain signal h1 accumulated by the first accumulator is 14 (14+0=14). - Similarly, for the second accumulator, at time t1, t2, t3, t4, t5, and t6, the increment signal INCR2 is sampled with the
values values values - As explained above, the Walsh Transform itself is mathematically reversible and lossless. With a time domain Walsh Computing block (such as the
system 200 shown in FIG. 2A), there is no inherent compression aside from the potentially sparser signal representation. But such potentially sparser signal representation is signal dependent. Quantization of the Walsh Transform result would require additional digital circuit(s). - In conventional systems, Walsh coefficient quantization would normally be performed using dedicated arithmetic blocks to implement the following computation:
-
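The equation is not reproduced in this text. Based on the symbol definitions and the division and rounding operations mentioned in the paragraph that follows, it presumably has the form below; the exact notation is an assumption.

```latex
Q_{result} = \operatorname{round}\!\left(\frac{h}{q}\right)
```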
- Here, h is the transformed Walsh coefficient (i.e., the frequency domain signal), q is a quantization factor, and Qresult is the final quantized coefficient. There are drawbacks of the conventional systems. The conventional systems require initially acquiring the input signals at high quality. The conventional systems also require the division operation and the rounding operation shown in the equation above, which significantly impact the system complexity and the system performance.
- To further improve the Walsh Transform coefficient quantization for compression, this disclosure utilizes the time domain Discrete Transform block (e.g., the system 200) described above. Within the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block), the accumulators integrate the increment signals at a certain given system clock rate. Each accumulator corresponds to one frequency component (e.g., one frequency domain signal component). Knowing that the different frequency components have a different level of importance in reconstructing the original signals, less important signals can be quantized more coarsely. Such improvement can be achieved by clocking less important accumulators at lower system clock rates. Further, an individual accumulator can even be turned off (effectively clocked at the 0 system clock rate) for not capturing the information in the corresponding input signals at all.
-
FIG. 3 shows a system 300 as the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) with coefficient quantization for compression, according to some embodiments. The system 300 comprises multiple parallel counters 302, a synchronizer 304, multiple accumulators 306, multiple multiplexors 308, a system clock 310, and a clock divider 312. The number of the parallel counters 302, the number of the accumulators 306, and the number of the multiplexors 308 depend on the number of the parallel input signals. For example, the system 300 for processing N parallel input signals may require N parallel counters 302, N accumulators 306, and up to N multiplexors 308. The number of multiplexors 308 can be less than N when two or more multiplexors share the same clock rate. -
FIG. 3 shows 4 parallel input signals, such as 4 parallel pulse width signals c1, c2, c3, and c4 (the same as FIG. 2A). So, the system 300 comprises 4 parallel counters 302, 4 accumulators 306, and 4 multiplexors 308. FIG. 3 provides a non-limiting example embodiment system 300 for processing 4 parallel input signals. If there are 8 (or 16) parallel input signals, the system 300 may comprise 8 (or 16) parallel counters, 8 (or 16) accumulators, and 8 (or 16) multiplexors, and so on. - In
FIG. 3, the parallel counters 302 may be designed the same and perform the same functions as the parallel counters 202 as described with respect to FIG. 2A. The synchronizer 304 may be designed the same and perform the same functions as the synchronizer 204 as described with respect to FIG. 2A. The accumulators 306 may be designed the same and perform the same functions (e.g., integration) as the accumulators 206 as described with respect to FIG. 2A. In addition, each of the accumulators 306 may further quantize the corresponding one of the frequency domain signals h1, h2, h3, and h4 with different resolution/accuracy/quality by selecting a different clock rate supplied to the accumulator. That is, each of the accumulators 306 may take a separate clock rate as an additional input fed by a corresponding multiplexor of the multiplexors 308. - For example, the
system clock 310 may have the system clock rate of CLK. The clock rate divider 312 may set the clock rates for the frequency domain signals h1, h2, h3, and h4 to CLK, CLK/2, CLK/4, and CLK/8, respectively. The different clock rates are supplied to each corresponding accumulator of the accumulators 306 through a corresponding multiplexor of the multiplexors 308 for quantization. Hardware and circuitry for the clock rate divider and the multiplexor are well known to people skilled in the art. - For instance, the clock rate CLK may be fed to the first accumulator of the
accumulators 306, the clock rate CLK/2 may be fed to the second accumulator of the accumulators 306, the clock rate CLK/4 may be fed to the third accumulator of the accumulators 306, and the clock rate CLK/8 may be fed to the fourth accumulator of the accumulators 306. In turn, the accumulators 306 may effectively quantize the frequency domain signals h1, h2, h3, and h4 into the quantized frequency domain signals h1, h2/2, h3/4, and h4/8, respectively. In general, the clock rates do not need to be limited to the factors of 2.
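The effect of clocking an accumulator at a divided rate can be sketched as follows. This Python model is an illustration under stated assumptions: the function names `accumulate` and `conventional_quantize` are hypothetical, a divided clock is modeled as sampling every `divider`-th increment value starting at the first edge (the phase alignment is an assumption), and a divider of 0 is used here to represent an accumulator that is switched off. The sample values are the INCR1 values from the FIG. 2B description.

```python
def accumulate(increments, divider):
    """Sum the increment samples seen by an accumulator clocked at CLK/divider.

    divider == 1 is the full system clock rate; divider == 0 models an
    accumulator that is turned off and captures nothing.
    """
    if divider == 0:
        return 0
    # A slower clock only sees every divider-th sample of the increment signal.
    return sum(increments[::divider])


def conventional_quantize(h, q):
    """Quantization by explicit division and rounding, for comparison."""
    return round(h / q)


if __name__ == "__main__":
    incr1 = [4, 4, 3, 2, 1, 0]  # INCR1 samples at t1..t6
    full = accumulate(incr1, 1)
    for divider in (1, 2, 4):
        print(f"CLK/{divider}: accumulated {accumulate(incr1, divider)}, "
              f"round(h/q) = {conventional_quantize(full, divider)}")
```

For a sequence this short the subsampled sum only roughly tracks h divided by the clock division factor; the point of the sketch is that the coarser value is obtained without a divider or rounding circuit.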
-
FIG. 3 illustrates the time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) with coefficient quantization for compression. The disclosed technique selectively controls the quantization accuracy of individual time domain computed Walsh Transform coefficients by reducing the clock rates of corresponding accumulators. By appropriately selecting the frequency components for which to reduce the clock rate, which is signal property dependent, the original signals can be faithfully reconstructed from a fewer total number of bits. In so doing, signal compression can be performed immediately during capture time, which leads to lower storage requirement and lower transmission requirement. The quantization level can be flexibly selected based on the clocking frequency (i.e., clock rate). Lower clocking frequency also results in lower power consumption. Compression severity and quality can also be controlled by setting different clock rates. The reconstructed signal quality can track the energy required to acquire the signal, which allows for energy and quality scaling. - The time domain Discrete Transform block (e.g., time domain Walsh Transform block or time domain Haar Wavelet Transform block) as illustrated in
FIGS. 2 and 3 can be used for image compression. Images of real scenes are characteristically sparse when represented in the frequency domain, containing large areas of low frequency signals (smooth gradients) and local high frequency signals (edges). Such characteristics allow for image compression if the signals associated with an image are transformed, for example, using the Discrete Cosine Transform (DCT) as in the JPEG standard. - An image sensor can produce a large amount of data which is costly to store or transmit in terms of storage space or transmission resources. In general, the sooner the image data is compressed, the higher the overall savings for the storage space or the transmission resources. Accordingly, compressing the image at readout stage could significantly save power/die area for all subsequent stages of the image processing/transmission.
- Image compression can be a computationally and energetically intensive operation. But, image compression is still desirable where the storage space (e.g., memory or hard drive) or the available signal transmission bandwidth is limited. The current popular image compression techniques are designed to reduce the image file size as much as possible while preserving the quality of the images. These current compression techniques often compromise the energy used during compression process. Thus, an improved image compression solution with lower energy consumption is desirable.
- To save power at the system level, image compression is preferably performed as close to signal generation as possible, which reduces power consumption during later stages (e.g., signal storage or signal transmission). Image compression also works best when performed on two-dimensional (2D) areas. To generate the data for the image, a rolling shutter sensor needs to expose multiple rows before the compression computations can start, which requires that the first rows be stored somewhere.
- This disclosure also provides an image sensor readout device implementing the Walsh Transform based compression. According to embodiments, pixels can be designed to output pulse width signals which are dependent on the light level. The image sensor readout embodiments are compatible with a time domain Discrete Transform based compressor, such as the
systems 200 and 300 in FIGS. 2 and 3. The Walsh Matrix can be expanded to its 2D version to exploit the spatial frequency properties of real images in both directions. -
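One common way to form a 2D version of the Walsh matrix is the Kronecker product of the 1D matrix with itself, which turns a 4×4 matrix into a 16×16 matrix acting on the 16 flattened inputs of a 4×4 block. The text does not state which construction or pixel ordering is used, so the Python sketch below is an assumption: the helper names, the row-major flattening, and the sample block values are all hypothetical.

```python
WALSH_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def walsh_2d(block):
    """Separable 2D transform: W * block * W^T."""
    w_t = [list(col) for col in zip(*WALSH_4)]
    return matmul(matmul(WALSH_4, block), w_t)


def walsh_2d_flat(block):
    """Same transform as one 16x16 matrix applied to the row-major flattened block."""
    w16 = kron(WALSH_4, WALSH_4)
    flat = [p for row in block for p in row]
    return [sum(w * p for w, p in zip(row, flat)) for row in w16]


if __name__ == "__main__":
    # Hypothetical pulse widths for a 4x4 pixel block with a smooth gradient.
    block = [[3, 3, 4, 4], [3, 3, 4, 4], [3, 3, 4, 4], [3, 3, 4, 4]]
    print(walsh_2d_flat(block))
```

In this formulation each of the 16 parallel counters would use one row of the 16×16 matrix, and a smooth block concentrates its energy in a few coefficients, which is the sparsity the compression relies on.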
FIG. 4 shows a system 400 using a time domain Discrete Transform compression block (e.g., Walsh compression block) to read out a number of pixels simultaneously, according to some embodiments. In FIG. 4, the pixel array is divided into blocks of 4×4 pixels and a row of blocks is read out simultaneously. The pixel blocks are connected to a column level Walsh compression block in a rolling shutter manner. The end result is a block row readout, with 4 rows of pixels read simultaneously, and this process is repeated to read the full pixel array of the image. Using the 4×4 pixel block is for illustration purposes only. The system 400 can be modified to process a 2×2 pixel block, or an 8×8 pixel block. In general, system 400 may be modified to process an N×N pixel block. - The
system 400 comprises arow decoder 402 and areadout module 404. Thereadout module 404 comprises aWalsh compression block 420. Thesystem 400 may additionally comprise arun length encoder 408 and anentropy encoder 410 known in the art. - The
pixel array 414 comprises individual pixels of the image sensing element known in the art. Therow decoder 402 known in the art may decode a row of 4×4 pixel blocks 412 from thepixel array 414. The 4×4pixel block 412 includes 16 pixels, p1, p2, p3, . . . , and p16. The 16 pixels are input into theWalsh compression block 420. TheWalsh compression block 420 comprises, multipleparallel counters 424, asynchronizer 426,multiple accumulators 428,multiple multiplexors 430, and aclock divider 432. The number of theparallel counters 424, the number of the multiple clock gating blocks 426, and the number of theaccumulators 428 depend on the number of pixels in the input pixel block. If there are N pixels in the input pixel block, theWalsh compression block 420 may comprise N parallel counters 424,N accumulators 428, and up to N multiplexors. -
FIG. 4 shows a 4×4 pixel block for illustration purpose. Comparators known in the art (not shown inFIG. 4 ) may be used to covert the pixel outputs from thepixel block 412 to pulse width signals. When there are 16 pixels in thepixel block 412, the comparators generate corresponding 16 pulse width signals, c1, c2, c3, . . . , and c16 (labeled as c<1:16> inFIG. 4 ) in the time domain. The parallel counters 424 may be designed the same and perform the same functions asparallel counters FIGS. 2A and 3 . When there are 16 parallel pulse width signals as shown inFIG. 4 , theWalsh compression block 420 comprises 16parallel counters 424 for generating 16 increment signals (labeled as INCR<1:16> inFIG. 4 ), using the 16×16 Walsh matrix. Theaccumulators 428 may be designed the same and perform the same function as theaccumulators 206 inFIG. 2A , or as theaccumulators 306 inFIG. 3 . When there are 16 increment signals (labeled as INCR<1:16> inFIG. 4 ), theWalsh compression block 420 comprises 16 accumulators for generating 16 frequency domain signals h1, h2, h3, . . . , h16 (labeled as h<1:16> inFIG. 4 ). TheWalsh compression block 420 may support coefficient quantization. Theaccumulators 428 can generate the quantized frequency domain signals. If thesystem 400 comprises therun length encoder 408 and theentropy encoder 410, the output of the Walsh compression block 420 (frequency domain signals or quantized frequency domain signals) may be further compressed by these two encoders. The output of thesystem 400 is the compressed image signal output to be saved or transmitted by an output module (not shown inFIG. 4 ). The output module may be one known in the art for storing information locally on the device (on a hard drive or memory of the device). The output module may be one known in the art for transmitting information remotely over a network. The compressed image signal may be encoded frequency domain signals h1, h2, h3, . . . 
, h16 with therun length encoder 428 and theentropy encoder 410. The compressed image signal may also be frequency domain signals h1, h2, h3, . . . , h16 without being encoded by therun length encoder 408 and theentropy encoder 410. -
FIG. 4 shows a readout for an image sensor array, implementing a time domain based, Walsh Transform block 420 on 2D sub-areas of the image. 4×4 (or 2×2, or 8×8) pixels are read out in parallel and compressed at the same time, reducing the amount of data to be transferred to a storage medium (e.g., a memory or a hard drive) or to be transmitted to another device. In so doing, a smaller and more efficient representation of the image is achieved at a stage as close to signal generation stage as possible. - With the disclosed technique as shown in
FIG. 4 , a compressed image representation is available directly after readout, which would require less processing before storage or transmission. Less processing before storage or transmission results in lower power consumption and/or lower signal bandwidth requirement. The block-row readout technique disclosed with respect toFIG. 4 means that the image does not need to be stored in memory first and later retrieved for compression. -
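By way of illustration only, the following Python sketch models the arithmetic that a Walsh compression block would perform on one 4×4 pixel block. It assumes that the 16×16 matrix applied to the 16 pulse widths is the Kronecker product of the 4×4 Walsh matrix with itself, which is one common way of forming a 2D version; the disclosure does not state the construction, and the function names (`walsh_matrix`, `kron`, `transform_block`) are hypothetical. Pulse widths are expressed as integer clock-tick counts.

```python
def walsh_matrix(n):
    """Return the n x n Walsh matrix in natural (Sylvester) order; n is a power of 2."""
    w = [[1]]
    while len(w) < n:
        # Doubling step: [[W, W], [W, -W]]
        w = [row + row for row in w] + [row + [-v for v in row] for row in w]
    return w


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def transform_block(block):
    """Apply the assumed 2D Walsh matrix to a square pixel block (row-major order)."""
    size = len(block)
    w2d = kron(walsh_matrix(size), walsh_matrix(size))
    flat = [p for row in block for p in row]
    return [sum(c * p for c, p in zip(row, flat)) for row in w2d]


# A uniformly lit block: every pixel produces a pulse 5 clock ticks wide.
uniform = [[5] * 4 for _ in range(4)]
coeffs = transform_block(uniform)  # h1..h16
```

For the uniform block, only the first coefficient (the sum of all 16 pulse widths) is non-zero, which is the property that makes the transformed block compress well.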
FIG. 5 shows a flow chart of a method 500 for performing a time domain Discrete Transform, according to some embodiments. The method 500 may be performed by a hardware device, such as the system 200 or the system 300 described above. The method 500 starts at operation 502, where a first counter of a plurality of counters of an apparatus receives a plurality of pulse width signals in the time domain. At operation 504, the first counter generates a first increment signal in the time domain from the plurality of pulse width signals based on a first row of a Discrete Transform matrix. At operation 506, a synchronizer of the apparatus receives the first increment signal. At operation 508, the synchronizer generates a first synchronized increment signal in the time domain from the first increment signal. At operation 510, a first accumulator of a plurality of accumulators of the apparatus receives the first synchronized increment signal. At operation 512, the first accumulator accumulates the first synchronized increment signal over a period of time to generate a first frequency domain signal.
- In some embodiments, the plurality of counters may further comprise a second counter. The second counter may receive the plurality of pulse width signals in the time domain. The second counter may then generate a second increment signal in the time domain from the plurality of pulse width signals based on a second row of the Discrete Transform matrix. The synchronizer may receive the second increment signal. The synchronizer may then generate a second synchronized increment signal in the time domain from the second increment signal. The plurality of accumulators may further comprise a second accumulator. The second accumulator may receive the second synchronized increment signal. The second accumulator may then accumulate the second synchronized increment signal over the period of time to generate a second frequency domain signal.
- In some embodiments, the number of the plurality of pulse width signals may equal a number of the plurality of counters. The plurality of counters may include N counters including the first counter, and an i-th counter of the plurality of counters may receive the plurality of pulse width signals in the time domain. The i-th counter may generate an i-th increment signal in the time domain from the plurality of pulse width signals based on an i-th row of the Discrete Transform matrix. The Discrete Transform matrix may be an N×N Discrete Transform matrix. The synchronizer may receive the i-th increment signal. The synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal. The plurality of accumulators may further comprise N accumulators including the first accumulator. An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal. The i-th accumulator may then accumulate the i-th synchronized increment signal over the period of time to generate an i-th frequency domain signal.
- In some embodiments, the number of the plurality of counters may equal one of 4, 8, or 16. In some embodiments, the plurality of counters may process the plurality of pulse width signals in parallel. In some embodiments, the number of the plurality of counters may equal 4. The first row of the Discrete Transform matrix may be [1, 1, 1, 1]. The plurality of pulse width signals may comprise a first pulse width signal, a second pulse width signal, a third pulse width signal, and a fourth pulse width signal. The first increment signal may comprise an addition of the first pulse width signal, the second pulse width signal, the third pulse width signal, and the fourth pulse width signal in the time domain.
- In some embodiments, the apparatus may further comprise a clock divider. The divider may set a first clock rate. The first clock rate may be a first fraction of a system clock rate. The clock divider may feed the first clock rate to the first accumulator through a first multiplexor. In some embodiments, the plurality of accumulators may further comprise N accumulators including the first accumulator. The clock divider may set an i-th clock rate. The i-th clock rate may be an i-th fraction of the system clock rate for an i-th accumulator of the N accumulators. The clock divider may feed the i-th clock rate to the i-th accumulator through an i-th multiplexor.
- In some embodiments, the Discrete Transform matrix may be one of a Walsh matrix, or a Haar matrix.
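As a non-limiting illustration of the method 500, the following Python sketch steps through the counter and accumulator operations one clock tick at a time for four pulse width signals and the 4×4 Walsh matrix. It is a behavioral model under stated assumptions, not the disclosed circuit: all pulses are assumed to start at tick 0, sampling the pulse levels once per tick stands in for the synchronizer, and the names `WALSH_4` and `time_domain_transform` are hypothetical.

```python
WALSH_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def time_domain_transform(widths, matrix, period):
    """Accumulate per-tick increment signals into frequency domain signals."""
    accumulators = [0] * len(matrix)
    for tick in range(period):
        # Level of each pulse width signal at this tick (1 while the pulse is high).
        levels = [1 if tick < w else 0 for w in widths]
        # Each counter forms the signed count of high inputs for its matrix row.
        increments = [sum(c * lv for c, lv in zip(row, levels)) for row in matrix]
        # Each accumulator adds its increment on every tick.
        for i, inc in enumerate(increments):
            accumulators[i] += inc
    return accumulators


widths = [3, 5, 2, 7]  # pulse widths of c1..c4, in clock ticks
result = time_domain_transform(widths, WALSH_4, period=8)
# Reference: the ordinary matrix-vector product of the Walsh matrix and the widths.
direct = [sum(c * w for c, w in zip(row, widths)) for row in WALSH_4]
```

In this model the accumulated values equal the matrix-vector product computed directly from the pulse widths, provided the accumulation period is at least as long as the widest pulse.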
-
FIG. 6 shows a flow chart of a method 600 for performing image sensor readout using Discrete Transform based compression, according to some embodiments. The method 600 may be performed by a hardware device, such as the system 400 described above. The method 600 starts at operation 602, where a time domain Discrete Transform block of an apparatus receives N pulse width signals. At operation 604, the time domain Discrete Transform block generates N frequency domain signals. At operation 606, an output module of the apparatus stores or transmits information associated with the N frequency domain signals.
- In some embodiments, the information associated with the N frequency domain signals may be the N frequency domain signals.
- In some embodiments, the apparatus may further comprise a run length encoder. The run length encoder may run length encode the N frequency domain signals to generate run length encoded signals. The apparatus may further comprise an entropy encoder. The entropy encoder may entropy encode the run length encoded signals to generate entropy encoded signals. The information associated with the N frequency domain signals may be the entropy encoded signals.
- In some embodiments, the N frequency domain signals may be N quantized frequency domain signals.
- In some embodiments, the time domain Discrete Transform block may comprise N counters. An i-th counter of the N counters may receive the N pulse width signals in the time domain. The i-th counter of the N counters may generate an i-th increment signal in the time domain from the N pulse width signals based on an i-th row of a Discrete Transform matrix. The Discrete Transform matrix may be an N×N Discrete Transform matrix.
- In some embodiments, the time domain Discrete Transform block may further comprise a synchronizer. The synchronizer may receive the i-th increment signal. The synchronizer may generate an i-th synchronized increment signal in the time domain from the i-th increment signal. The time domain Discrete Transform block may further comprise N accumulators. An i-th accumulator of the N accumulators may receive the i-th synchronized increment signal. The i-th accumulator of the N accumulators may accumulate the i-th synchronized increment signal over a period of time to generate an i-th frequency domain signal.
- In some embodiments, N may be one of 4, 8, or 16. In some embodiments, the apparatus may comprise a plurality of N comparators. The plurality of N comparators may receive outputs from N pixels and generate the N pulse width signals from the N pixels.
- In some embodiments, the apparatus may be an image sensor readout device.
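To illustrate the optional encoding stage of the method 600, the following Python sketch quantizes a set of frequency domain signals and run length encodes the result. It is a sketch under assumptions: quantization is modeled as division by a fixed step with truncation toward zero, the coefficient values are made up for the example, the entropy encoder is omitted, and the function names are hypothetical.

```python
def quantize(coeffs, step):
    """Divide each coefficient by the step, truncating toward zero."""
    return [int(c / step) for c in coeffs]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1] = (v, encoded[-1][1] + 1)
        else:
            encoded.append((v, 1))
    return encoded


# Hypothetical frequency domain signals h1..h8 for one pixel block.
coeffs = [80, 12, -9, 3, 2, -1, 1, 0]
quantized = quantize(coeffs, step=4)
encoded = run_length_encode(quantized)
```

Small coefficients quantize to zero, so the trailing run of zeros collapses into a single pair; an entropy encoder could then assign short codes to the most frequent pairs.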
-
FIGS. 7A-7B show more detailed block diagrams of the 4-input counters used in this disclosure, according to some embodiments. FIG. 7A shows one example 4-input counter 700 used as the first counter corresponding to the first row of the 4×4 Walsh matrix, [1 1 1 1], as described with respect to FIGS. 2A, 3, and 4. The 4-input counter 700 receives 4 input signals i0, i1, i2, and i3 (e.g., pulse width signals c1, c2, c3, and c4, respectively). The 4-input counter 700 generates the increment signal INCR1 as a 3-bit (O0, O1, and O2) output, which represents i0+i1+i2+i3. The 4-input counter 700 includes the full adder 702. The 4-input counter 700 further includes the half adders 704 and 706. These components are connected as shown in FIG. 7A. Hardware implementations of the full adder and the half adder are known in the art.
- A half adder, such as the half adder 704 and the half adder 706, adds two one-bit binary numbers A and B to produce a sum S and a carry output C. The truth table of the half adder is shown below.

Half Adder Truth Table
A | B | S | C |
---|---|---|---|
0 | 0 | 0 | 0 |
0 | 1 | 1 | 0 |
1 | 0 | 1 | 0 |
1 | 1 | 0 | 1 |

- A full adder, such as the full adder 702, is a logic circuit that performs an addition operation on three one-bit binary numbers (A, B, and the carry input Cin). The outputs of the full adder are a sum S and a carry output C. The truth table of the full adder is shown in the table below.

Full Adder Truth Table
A | B | Cin | C | S |
---|---|---|---|---|
0 | 0 | 0 | 0 | 0 |
0 | 0 | 1 | 0 | 1 |
0 | 1 | 0 | 0 | 1 |
0 | 1 | 1 | 1 | 0 |
1 | 0 | 0 | 0 | 1 |
1 | 0 | 1 | 1 | 0 |
1 | 1 | 0 | 1 | 0 |
1 | 1 | 1 | 1 | 1 |

- FIG. 7B shows one example 4-input counter 750 used as a subsequent counter corresponding to a subsequent row of the 4×4 Walsh matrix, as described with respect to FIGS. 2A, 3, and 4. The 4-input counter 750 receives 4 input signals i0, i1, i2, and i3. The 4-input counter 750 generates the increment signal as a 3-bit (O0, O1, and O2) output, which represents i0+i1−i2−i3. The 4-input counter 750 includes half adders. The 4-input counter 750 further includes a 2-bit subtractor 756 known in the art. These components are connected as shown in FIG. 7B. The 2-bit subtractor 756 receives A and B as the inputs, and the output of the 2-bit subtractor is A−B, represented by the 3-bit (O0, O1, and O2) output. The 4-input counter 750 may be implemented for any of the subsequent counters (i.e., the counters corresponding to the second, the third, or the fourth row of the 4×4 Walsh matrix).
- The pulse width signals that correspond to “1” entries may be the input signals i0 and i1, respectively. The pulse width signals that correspond to “−1” entries may be the input signals i2 and i3, respectively. For example, for the second counter corresponding to the second row of the 4×4 Walsh matrix, [1 −1 1 −1], c1 and c3 may be the input signals i0 and i1, respectively, and c2 and c4 may be the input signals i2 and i3, respectively, for generating the increment signal INCR2 (c1+c3−c2−c4). For the third counter corresponding to the third row of the 4×4 Walsh matrix, [1 1 −1 −1], c1 and c2 may be the input signals i0 and i1, respectively, and c3 and c4 may be the input signals i2 and i3, respectively, for generating the increment signal INCR3 (c1+c2−c3−c4). For the fourth counter corresponding to the fourth row of the 4×4 Walsh matrix, [1 −1 −1 1], c1 and c4 may be the input signals i0 and i1, respectively, and c2 and c3 may be the input signals i2 and i3, respectively, for generating the increment signal INCR4 (c1+c4−c2−c3).
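For illustration, the following Python sketch models the half adder, the full adder, and the two 4-input counters at the bit level. The wiring inside `counter_700` (one full adder feeding two half adders) and inside `counter_750` (two half adders feeding a subtractor modeled arithmetically) is an assumption that reproduces the stated outputs; the actual connections are those of FIGS. 7A and 7B, which are not reproduced here. The helper names `route` and `to_signed` are hypothetical.

```python
def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (sum, carry) for three one-bit inputs."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))


def counter_700(i0, i1, i2, i3):
    """3-bit count i0+i1+i2+i3 (assumed wiring: one full adder, two half adders)."""
    s1, c1 = full_adder(i0, i1, i2)
    o0, c2 = half_adder(s1, i3)
    o1, o2 = half_adder(c1, c2)
    return o0 | (o1 << 1) | (o2 << 2)


def counter_750(i0, i1, i2, i3):
    """3-bit two's complement value of i0+i1-i2-i3."""
    s_a, c_a = half_adder(i0, i1)   # A = i0 + i1 as a 2-bit number
    s_b, c_b = half_adder(i2, i3)   # B = i2 + i3 as a 2-bit number
    a = s_a | (c_a << 1)
    b = s_b | (c_b << 1)
    return (a - b) & 0b111          # behavioral model of the 2-bit subtractor


def to_signed(value, bits):
    """Interpret a two's complement bit pattern as a signed integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def route(row, signals):
    """Order signals so that '1' entries come first and '-1' entries last."""
    plus = [s for k, s in zip(row, signals) if k == 1]
    minus = [s for k, s in zip(row, signals) if k == -1]
    return plus + minus


# Fourth Walsh row with c1 and c4 high, c2 and c3 low, at one clock tick.
row4 = [1, -1, -1, 1]
levels = [1, 0, 0, 1]
incr4 = to_signed(counter_750(*route(row4, levels)), 3)
```

Because only the routing of the pulse width signals changes from row to row, the same subtracting counter can serve the second, third, and fourth rows.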
-
FIGS. 8A-8B show more detailed block diagrams of the 8-input counters used in this disclosure, according to some embodiments. FIG. 8A shows one example 8-input counter 800 used as the first counter corresponding to the first row of the 8×8 Walsh matrix, [1 1 1 1 1 1 1 1], as described with respect to FIGS. 2A, 3, and 4. The 8-input counter 800 receives 8 input signals i0, i1, i2, i3, i4, i5, i6, and i7 (e.g., pulse width signals c1, c2, c3, c4, c5, c6, c7, and c8, respectively). The 8-input counter 800 generates the increment signal INCR1 as the 4-bit output (O0, O1, O2, and O3). The 8-input counter 800 includes full adders and half adders, connected as shown in FIG. 8A.
- FIG. 8B shows one example 8-input counter 850 used as a subsequent counter corresponding to a subsequent row of the 8×8 Walsh matrix, as described with respect to FIGS. 2A, 3, and 4. The 8-input counter 850 receives 8 input signals i0, i1, i2, i3, i4, i5, i6, and i7. The 8-input counter 850 generates the increment signal as a 4-bit (O0, O1, O2, and O3) output, which represents i0+i1+i2+i3−i4−i5−i6−i7. The 8-input counter 850 includes the 4-input counters 852 and 854. The 4-input counters 852 and 854 may be implemented the same as the 4-input counter 700, as described with respect to FIG. 7A. The 8-input counter 850 further includes a 3-bit subtractor 856 known in the art. These components are connected as shown in FIG. 8B. The 3-bit subtractor 856 receives A and B as the inputs, and the output of the 3-bit subtractor is A−B, represented by the 4-bit (O0, O1, O2, and O3) output. The 8-input counter 850 may be implemented for any of the subsequent counters (i.e., the counters corresponding to the second, the third, the fourth, . . . , or the 8th row of the 8×8 Walsh matrix).
- The pulse width signals that correspond to “1” entries may be the input signals i0, i1, i2, and i3, respectively. The pulse width signals that correspond to “−1” entries may be the input signals i4, i5, i6, and i7, respectively. For example, for the second counter corresponding to the second row of the 8×8 Walsh matrix, [1 −1 1 −1 1 −1 1 −1], c1, c3, c5, and c7 may be the input signals i0, i1, i2, and i3, respectively, and c2, c4, c6, and c8 may be the input signals i4, i5, i6, and i7, respectively, for generating the increment signal INCR2 (c1+c3+c5+c7−c2−c4−c6−c8).
-
FIGS. 9A-9C show more detailed block diagrams of the 16-input counters used in this disclosure, according to some embodiments. FIG. 9A shows one example 16-input counter 900 used as the first counter corresponding to the first row of the 16×16 Walsh matrix, [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1], as described with respect to FIGS. 2A, 3, and 4. The 16-input counter 900 receives 16 input signals i0, i1, i2, i3, i4, i5, i6, i7, i8, i9, i10, i11, i12, i13, i14, and i15 (e.g., pulse width signals c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14, c15, and c16, respectively). The 16-input counter 900 generates the increment signal INCR1 as the 5-bit output (O0, O1, O2, O3, and O4). The 16-input counter 900 includes full adders. The 16-input counter 900 further includes half adders. These components are connected as shown in FIG. 9A.
- FIG. 9B shows another example 16-input counter 930 used as the first counter corresponding to the first row of the 16×16 Walsh matrix, [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1], as described with respect to FIGS. 2A, 3, and 4. The 16-input counter 930 includes two 8-input counters 932 and 934. The 16-input counter 930 further includes a 4-bit adder 936. Each of the two 8-input counters 932 and 934 may be designed the same as the 8-input counter 800 described with respect to FIG. 8A. The first 8 input signals (e.g., c1, c2, c3, c4, c5, c6, c7, and c8) of the 16 input signals are the inputs of the 8-input counter 932. The second 8 input signals (e.g., c9, c10, c11, c12, c13, c14, c15, and c16) of the 16 input signals are the inputs of the 8-input counter 934. The 4-bit output of the 8-input counter 934 is the 4-bit input A of the 4-bit adder 936. The 4-bit output of the 8-input counter 932 is the 4-bit input B of the 4-bit adder 936. The 4-bit adder 936 performs an addition operation on the two inputs A and B and generates a 5-bit output representing the sum of A and B. Hardware design and implementation of the 4-bit adder are known in the art.
- FIG. 9C shows one example 16-input counter 950 used as a subsequent counter corresponding to a subsequent row of the 16×16 Walsh matrix, as described with respect to FIGS. 2A, 3, and 4. The 16-input counter 950 receives 16 input signals i0, i1, i2, i3, i4, . . . , and i15. The 16-input counter 950 generates the increment signal as a 5-bit (O0, O1, O2, O3, and O4) output, which represents i0+i1+i2+i3+i4+i5+i6+i7−(i8+i9+i10+i11+i12+i13+i14+i15). The 16-input counter 950 includes the 8-input counters 952 and 954. The 8-input counters 952 and 954 may be implemented the same as the 8-input counter 800, as described with respect to FIG. 8A. The 16-input counter 950 further includes a 4-bit subtractor 956 known in the art. These components are connected as shown in FIG. 9C. The 4-bit subtractor 956 receives A and B as the inputs, and the output of the 4-bit subtractor is A−B, represented by the 5-bit (O0, O1, O2, O3, and O4) output. The 16-input counter 950 may be implemented for any of the subsequent counters (i.e., the counters corresponding to the second, the third, the fourth, . . . , or the sixteenth row of the 16×16 Walsh matrix).
- Similar to the description with respect to FIGS. 7B and 8B, the pulse width signals that correspond to “1” entries may be the input signals i0, i1, i2, i3, . . . , and i7, respectively. The pulse width signals that correspond to “−1” entries may be the input signals i8, i9, i10, i11, . . . , and i15, respectively. For example, for the second counter corresponding to the second row of the 16×16 Walsh matrix, [1 −1 1 −1 1 −1 1 −1 1 −1 1 −1 1 −1 1 −1], c1, c3, c5, c7, c9, c11, c13, and c15 may be the input signals i0, i1, i2, i3, i4, i5, i6, and i7, respectively, and c2, c4, c6, c8, c10, c12, c14, and c16 may be the input signals i8, i9, i10, i11, i12, i13, i14, and i15, respectively, for generating the increment signal INCR2 below:
- INCR2 = c1+c3+c5+c7+c9+c11+c13+c15−(c2+c4+c6+c8+c10+c12+c14+c16)
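The hierarchical construction of the 8-input and 16-input counters can be sketched behaviorally as follows. This Python model is illustrative only: the half-size counters are replaced by a simple count of high inputs (`popcount`), the adder and subtractor are modeled arithmetically with the output widths given above, and all function names are hypothetical.

```python
def popcount(bits, width):
    """Behavioral stand-in for an all-ones-row counter: count of high inputs."""
    return sum(bits) & ((1 << width) - 1)


def additive_counter_16(inputs):
    """FIG. 9B style: two 8-input counts combined by a 4-bit adder (5-bit output)."""
    a = popcount(inputs[8:], 4)
    b = popcount(inputs[:8], 4)
    return (a + b) & 0b11111


def subtractive_counter(inputs):
    """FIG. 8B / 9C style: first-half count minus second-half count.

    Returns (bit_pattern, output_width) in two's complement.
    """
    half = len(inputs) // 2
    width = half.bit_length()            # 4 inputs -> 3 bits, 8 inputs -> 4 bits
    a = popcount(inputs[:half], width)
    b = popcount(inputs[half:], width)
    out_bits = width + 1
    return (a - b) & ((1 << out_bits) - 1), out_bits


def to_signed(value, bits):
    """Interpret a two's complement bit pattern as a signed integer."""
    return value - (1 << bits) if value >> (bits - 1) else value


def route(row, signals):
    """Place signals with '1' entries first and signals with '-1' entries last."""
    plus = [s for k, s in zip(row, signals) if k == 1]
    minus = [s for k, s in zip(row, signals) if k == -1]
    return plus + minus


# Second row of the 16x16 Walsh matrix and one sample of the levels c1..c16.
ROW2 = [1, -1] * 8
c = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
pattern, bits = subtractive_counter(route(ROW2, c))
incr2 = to_signed(pattern, bits)   # odd-numbered signals minus even-numbered signals
incr1 = additive_counter_16(c)     # first row: plain count of high inputs
```

The same `subtractive_counter` model covers the 8-input case when given 8 routed inputs, in which case the output is 4 bits wide.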
- Based on the description of the counters above, a person skilled in the art would understand that a similar design and implementation may apply to the counters for the Haar Wavelet Transform. For example, the counters described with respect to FIGS. 7A, 8A, and 9A may be used as the first counter for the Haar Wavelet Transform, corresponding to the first row having all “1” entries. In another example, for a subsequent counter counting 8 input signals, the 8-input counter 850 as described with respect to FIG. 8B may be used. For instance, the 8-input counter 850 may be used for the second counter corresponding to the second row of the 8×8 Haar matrix to generate the increment signal INCR2 (c1+c2+c3+c4−c5−c6−c7−c8). In yet another example, for a subsequent counter counting 4 input signals, the 4-input counter 750 as described with respect to FIG. 7B may be used. For instance, the 4-input counter 750 may be used for the third counter corresponding to the third row of the 8×8 Haar matrix to generate the increment signal INCR3 (c1+c2−c3−c4). For a subsequent counter counting 2 signals, a 1-bit subtractor known in the art may be used. For instance, the 1-bit subtractor may be used for the fifth counter corresponding to the fifth row of the 8×8 Haar matrix to generate the increment signal INCR5 (c1−c2).
- To provide a more detailed context of the counter design and implementation described above, this disclosure incorporates the following article by reference in its entirety.
-
- L. Dadda, Composite Parallel Counters, IEEE Transactions on Computers, v.29 n.10, p.942-946, October 1980.
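Returning to the Haar counter assignments described above, the following Python sketch builds an unnormalized 8×8 Haar matrix and lists which pulse width signals feed the adding and subtracting sides of each counter. The row ordering is an assumption chosen to agree with the second, third, and fifth rows cited in the text, and the function names are hypothetical.

```python
def haar_matrix(n):
    """Return an unnormalized n x n Haar matrix with entries 1, -1, and 0."""
    rows = [[1] * n]
    span = n
    while span >= 2:
        half = span // 2
        for start in range(0, n, span):
            row = [0] * n
            for k in range(start, start + half):
                row[k] = 1
            for k in range(start + half, start + span):
                row[k] = -1
            rows.append(row)
        span = half
    return rows


def counter_inputs(row):
    """Return the signal names on the '+' side and on the '-' side of a counter."""
    plus = [f"c{j + 1}" for j, k in enumerate(row) if k == 1]
    minus = [f"c{j + 1}" for j, k in enumerate(row) if k == -1]
    return plus, minus


HAAR_8 = haar_matrix(8)
second = counter_inputs(HAAR_8[1])  # 8 non-zero entries: an 8-input counter
third = counter_inputs(HAAR_8[2])   # 4 non-zero entries: a 4-input counter
fifth = counter_inputs(HAAR_8[4])   # 2 non-zero entries: a 1-bit subtractor
```

The number of non-zero entries in a row determines the counter size, which is why the Haar rows can reuse the 8-input, 4-input, and 2-input designs.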
- As described above, the accumulator is a standard block known in the art. Hardware implementations of the accumulator may vary. FIG. 10 shows the block diagram of one example embodiment accumulator 1000 that may be implemented for the accumulators 206 in FIG. 2A, the accumulators 306 in FIG. 3, and the accumulators 428 in FIG. 4. The accumulator 1000 comprises the inverters 1002A-D, the multiplexers 1004A-D, the full adders 1006A-D, and the flip-flops 1008A-D known in the art. The flip-flops 1008A-D may be D flip-flops. A D flip-flop is an edge-triggered memory circuit. The D flip-flop has three inputs: a data input (D) that defines the next state, a timing control input (CLK) that tells the flip-flop exactly when to “memorize” the data input, and a reset input (RST) that can cause the memory to be reset to 0 regardless of the other two inputs (usually referred to as an asynchronous reset). The output of a D flip-flop is Q.
- These components of the accumulator 1000 are connected as shown in FIG. 10. The accumulator 1000 is a 4-bit accumulator receiving an increment signal represented as 4 bits (I0, I1, I2, and I3). The accumulator 1000 counts up or down based on signal D, on each rising clock edge, representing the resulting number in 2's complement binary notation. The output of the accumulator is the accumulated signal D.
- As described above, the clock divider is known in the art. FIG. 11 shows the block diagram of one example embodiment clock divider 1100 that may be implemented for the clock divider 310 in FIG. 3 and the clock divider 432 in FIG. 4. The clock divider 1100 is a D type flip-flop clock divider. The clock divider 1100 comprises inverters 1102A-E and flip-flops 1104A-E, connected as shown in FIG. 11. As described above, the clock rate to each accumulator does not have to be a factor of 2 of the fastest clock. So, other clocking circuits may be used.
- While this disclosure has been described with reference to illustrative embodiments, this description is not intended to be construed in a limiting sense. Various modifications and combinations of the illustrative embodiments, as well as other embodiments of the disclosure, will be apparent to persons skilled in the art upon reference to the description. It is therefore intended that the appended claims encompass any such modifications or embodiments.
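By way of illustration only, the accumulator 1000 and the clock divider 1100 described above may be modeled behaviorally as in the following Python sketch. The model assumes a 4-bit register that wraps in 2's complement and a ripple chain of toggling flip-flops in which each stage advances when the previous stage falls; the actual connections are those of FIGS. 10 and 11, and the class and function names are hypothetical.

```python
class Accumulator4:
    """4-bit accumulator holding its state in two's complement."""

    def __init__(self):
        self.q = 0

    def reset(self):
        """Model of the asynchronous reset: clear the register."""
        self.q = 0

    def clock(self, increment):
        """Add a 4-bit two's complement increment on a rising clock edge."""
        self.q = (self.q + increment) & 0xF
        return self.q

    def signed(self):
        """Return the stored value as a signed integer in the range -8..7."""
        return self.q - 16 if self.q & 0x8 else self.q


def divided_clocks(stages, edges):
    """Return the flip-flop outputs after each rising edge of the system clock."""
    q = [0] * stages
    history = []
    for _ in range(edges):
        k = 0
        while k < stages:
            q[k] ^= 1          # this stage toggles
            if q[k] == 1:      # it rose, so the next stage is not clocked
                break
            k += 1             # it fell, so the next stage toggles as well
        history.append(tuple(q))
    return history


acc = Accumulator4()
acc.clock(0b0011)              # +3
acc.clock(0b1110)              # -2 in two's complement
after_two = acc.signed()
acc.clock(0b1101)              # -3 in two's complement
after_three = acc.signed()

history = divided_clocks(stages=2, edges=4)
```

In this model, stage k of the divider completes one cycle every 2^(k+1) system clock edges, so selecting a later stage as an accumulator clock slows that accumulator by a power of two.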
Claims (25)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/508,796 US10904049B1 (en) | 2019-07-11 | 2019-07-11 | Time domain discrete transform computation |
EP20184613.6A EP3764251B1 (en) | 2019-07-11 | 2020-07-08 | Time domain discrete transform computation |
CN202010662623.0A CN112214724B (en) | 2019-07-11 | 2020-07-10 | Device for calculating time domain discrete transformation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/508,796 US10904049B1 (en) | 2019-07-11 | 2019-07-11 | Time domain discrete transform computation |
Publications (2)
Publication Number | Publication Date |
---|---|
US20210014090A1 true US20210014090A1 (en) | 2021-01-14 |
US10904049B1 US10904049B1 (en) | 2021-01-26 |
Family
ID=71527606
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/508,796 Active US10904049B1 (en) | 2019-07-11 | 2019-07-11 | Time domain discrete transform computation |
Country Status (3)
Country | Link |
---|---|
US (1) | US10904049B1 (en) |
EP (1) | EP3764251B1 (en) |
CN (1) | CN112214724B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110770722B (en) * | 2017-06-29 | 2023-08-18 | 北京清影机器视觉技术有限公司 | Two-dimensional data matching method, device and logic circuit |
CN113805160B (en) * | 2021-08-04 | 2024-05-28 | 杭州电子科技大学 | Active sonar interference fringe feature extraction method based on curvature sum |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050122238A1 (en) * | 2003-10-16 | 2005-06-09 | Canon Kabushiki Kaisha | Operation circuit and operation control method thereof |
US8928775B2 (en) * | 2010-05-12 | 2015-01-06 | Samsung Electronics Co., Ltd. | Apparatus and method for processing image by using characteristic of light source |
Family Cites Families (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3981443A (en) | 1975-09-10 | 1976-09-21 | Northrop Corporation | Class of transform digital processors for compression of multidimensional data |
US4038539A (en) | 1976-02-23 | 1977-07-26 | American Electronic Laboratories, Inc. | Adaptive pulse processing means and method |
DE2625973C3 (en) * | 1976-06-10 | 1981-12-24 | Philips Patentverwaltung Gmbh, 2000 Hamburg | Method and arrangement for the redundancy-reducing transformation of images |
US4261043A (en) | 1979-08-24 | 1981-04-07 | Northrop Corporation | Coefficient extrapolator for the Haar, Walsh, and Hadamard domains |
US4590608A (en) | 1980-05-30 | 1986-05-20 | The United States Of America As Represented By The Secretary Of The Army | Topographic feature extraction using sensor array system |
JPS5737925A (en) | 1980-08-14 | 1982-03-02 | Matsushita Electric Ind Co Ltd | High-speed hadamard converter |
US4553165A (en) | 1983-08-11 | 1985-11-12 | Eastman Kodak Company | Transform processing method for reducing noise in an image |
US4982353A (en) | 1989-09-28 | 1991-01-01 | General Electric Company | Subsampling time-domain digital filter using sparsely clocked output latch |
US5262871A (en) | 1989-11-13 | 1993-11-16 | Rutgers, The State University | Multiple resolution image sensor |
US5293628A (en) * | 1991-11-04 | 1994-03-08 | Motorola, Inc. | Data processing system which generates a waveform with improved pulse width resolution |
JP2809954B2 (en) | 1992-03-25 | 1998-10-15 | 三菱電機株式会社 | Apparatus and method for image sensing and processing |
JP2868955B2 (en) * | 1992-07-06 | 1999-03-10 | 株式会社東芝 | Pulse generation circuit |
WO1995014350A1 (en) | 1993-11-15 | 1995-05-26 | National Semiconductor Corporation | Quadtree-structured walsh transform coding |
US5805933A (en) | 1994-12-28 | 1998-09-08 | Canon Kabushiki Kaisha | Image processing apparatus, image processing method and network system |
FR2746243B1 (en) | 1996-03-15 | 1998-06-05 | METHOD FOR PROVIDING A REPRESENTATION OF AN OPTICAL SCENE BY WALSH-HADAMARD TRANSFORMATION AND IMAGE SENSOR USING THE SAME | |
US6215910B1 (en) | 1996-03-28 | 2001-04-10 | Microsoft Corporation | Table-based compression with embedded coding |
US5850622A (en) | 1996-11-08 | 1998-12-15 | Amoco Corporation | Time-frequency processing and analysis of seismic data using very short-time fourier transforms |
US6157740A (en) | 1997-11-17 | 2000-12-05 | International Business Machines Corporation | Compression/decompression engine for enhanced memory storage in MPEG decoder |
US6233060B1 (en) | 1998-09-23 | 2001-05-15 | Seiko Epson Corporation | Reduction of moiré in screened images using hierarchical edge detection and adaptive-length averaging filters |
US8385387B2 (en) | 2010-05-20 | 2013-02-26 | Harris Corporation | Time dependent equalization of frequency domain spread orthogonal frequency division multiplexing using decision feedback equalization |
US9100085B2 (en) | 2011-09-21 | 2015-08-04 | Spatial Digital Systems, Inc. | High speed multi-mode fiber transmissions via orthogonal wavefronts |
US8687086B1 (en) * | 2012-03-30 | 2014-04-01 | Gopro, Inc. | On-chip image sensor data compression |
US9083977B2 (en) * | 2012-11-27 | 2015-07-14 | Omnivision Technologies, Inc. | System and method for randomly accessing compressed data from memory |
US8958166B2 (en) * | 2013-05-15 | 2015-02-17 | Lsi Corporation | Method and system for sliding-window based phase, gain, frequency and DC offset estimation for servo channel |
CN103546695B (en) * | 2013-10-18 | 2016-05-25 | 天津大学 | Be applied to time domain accumulation method and the accumulator of TDI-CIS |
EP3203695A1 (en) * | 2016-02-04 | 2017-08-09 | ABB Schweiz AG | Matrix equalizer for cmfb transmission in dispersive channels |
US10158375B1 (en) | 2018-03-21 | 2018-12-18 | Nxp Usa, Inc. | PDM bitstream to PCM data converter using Walsh-Hadamard transform |
Also Published As
Publication number | Publication date |
---|---|
EP3764251A2 (en) | 2021-01-13 |
US10904049B1 (en) | 2021-01-26 |
EP3764251A3 (en) | 2021-01-27 |
CN112214724B (en) | 2024-07-30 |
EP3764251B1 (en) | 2024-01-10 |
CN112214724A (en) | 2021-01-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR100218650B1 (en) | Data compression system and method | |
US7003168B1 (en) | Image compression and decompression based on an integer wavelet transform using a lifting scheme and a correction method | |
JP2005539467A5 (en) | | |
JPH088691B2 (en) | Image data processing method | |
CN88103191A (en) | The television transmission system that contains pyramid number encoder/decoding circuit | |
US8170334B2 (en) | Image processing systems employing image compression and accelerated image decompression | |
JPH11501420A (en) | VLSI circuit structure that implements the JPEG image compression standard | |
EP3764251B1 (en) | Time domain discrete transform computation | |
US8170333B2 (en) | Image processing systems employing image compression | |
US7236997B2 (en) | Filter processing apparatus and method | |
US8170335B2 (en) | Image processing systems employing image compression and accelerated decompression | |
US5822457A (en) | Pre-coding method and apparatus for multiple source or time-shifted single source data and corresponding inverse post-decoding method and apparatus | |
Narayanan et al. | VLSI architecture for 2D-discrete wavelet transform (DWT) based lifting method | |
WO2000055757A1 (en) | A fast multiplierless transform | |
Wiewel | FPGA implementation of an energy-efficient real-time image compression algorithm for the EIVE satellite mission | |
RU2628122C1 (en) | Method of hardware compressing digital image for shooting equipment of scanning type | |
TWI524778B (en) | H.264 video encoding technology | |
Yusof et al. | Field programmable gate array (FPGA) based baseline JPEG decoder | |
US20050232349A1 (en) | Compressing video frames | |
US7352906B2 (en) | Continuous transform method for wavelets | |
Rajeshwari et al. | DWT based Multimedia Compression | |
Satyendra et al. | Discrete Wavelet transform using Vedic multiplier for image compression | |
Hegde et al. | An efficient hybrid integer coefficient-DCT architecture using quantization module for HEVC standard | |
Boulgouris et al. | Optimal progressive lossless image coding using reduced pyramids with variable decimation ratios | |
EP2146508A1 (en) | Image encoding/decoding devices and image band decomposing/composing devices | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: STMICROELECTRONICS (RESEARCH & DEVELOPMENT) LIMITED, UNITED KINGDOM; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: KAKLIN, FILIP; REEL/FRAME: 049727/0334; Effective date: 20190711 |
 | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 4 |