WO2001090927A1 - Method and device in a convolution process - Google Patents
Method and device in a convolution process
- Publication number
- WO2001090927A1 (PCT/SE2001/001074)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- impulse response
- samples
- convolution
- input signal
- register
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/15—Correlation function computation including computation of convolution operations
Definitions
- the present invention relates to a method in convolution.
- The method may be used in so-called auralisation of a space.
- Auralisation means calculating how sounds are perceived by both ears of the listener, so as to recreate the natural experience of space.
- Stereo sound is part of the sound experience. Instead of two separate microphones, small microphones may be put in the ears of a person being in the recording studio. A person listening to the recorded sound in headphones will experience the same sound image as at the recording occasion. This is called artificial head stereo and is a foundation for auralising, where the headphone sound is created artificially.
- the invention also relates to a device for performing the method.
- the invention particularly relates to real time auralisation in large premises, for example concert halls.
- An impulse response for each ear, for a specific position and orientation of sound source and listener, is calculated in advance. It is also possible to measure the impulse response approximately by firing a pistol where the sound source is going to be and recording the sound with microphones in the auditory canals of a so-called artificial head at the location of the listener. The time until the impulse response has died away is called the reverberation time.
- The sound for each ear is calculated by an operation called convolution. It constitutes a filtering of the original sound that depends on both the reflections (reverberation) in the premises and the direction-dependent influence of the ear on the sound before it reaches the eardrum.
- The impulse responses as well as the non-reverberant original sound may be represented as series of numbers, and all calculations may take place digitally, for example in a computer. However, performing the calculation in real time requires great computing power. With a sample rate of 50 kHz (full CD quality) and a room reverberation time of 2 seconds, 2×10^10 (20 billion) multiplications and as many additions per second are required.
- x constitutes the input
- h constitutes the convolution core (filter)
- y constitutes the output
- Modern signal processing is based on Fourier transforming current signals in the time plane to transformed signals in the frequency plane.
- The signal processing then takes place in the frequency plane.
- The convolution to be performed in the time plane corresponds to multiplication in the frequency plane.
- Because multiplication is a simpler operation, signal processing has previously been implemented by multiplication in fast hardware in this context as well.
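The correspondence stated above is the convolution theorem. As a brief sketch, writing X, H and Y for the Fourier transforms of x, h and y (the capital symbols and the frequency variable f are notation assumed here, not taken from the source):

```latex
y(t) = (h * x)(t)
\quad\Longleftrightarrow\quad
Y(f) = H(f)\,X(f)
```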
- FFT: Fast Fourier Transform
- DSP: digital signal processor
- One object of the invention is to provide a method for convolution of digital signals. This object is obtained by the invention having the features of claims 1 and 6, respectively.
- The invention solves the problem of how to perform long convolutions of sound in real time using a reasonable amount of hardware.
- The fundamental operations in convolution are performed in parallel in an efficient manner. Sound samples from an input signal and terms, or filter coefficients, from the impulse response are stored in registers. Each sound sample and its filter coefficient are multiplied with each other separately and in parallel. Thereafter the products are added.
- The additions of the products may take place in a so-called adder tree, wherein the included terms are first added in pairs. The sums are again added in pairs in a repeated sequence until a final sum is calculated. Because addition is commutative and associative (the order is immaterial), this procedure gives exactly the same result as if the original numbers had been added in turn.
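The parallel multiplication followed by pairwise addition can be illustrated in software. The following is a minimal Python sketch, not the patented hardware: the function names `adder_tree` and `multiply_and_add` are hypothetical, and each pass of the `while` loop stands in for one level of the tree, which in hardware would run in parallel.

```python
def adder_tree(values):
    """Sum values by repeatedly adding neighbours in pairs (one tree level per pass)."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            next_level.append(level[i] + level[i + 1])
        if len(level) % 2:
            # An odd element has no partner and passes through to the next level.
            next_level.append(level[-1])
        level = next_level
    return level[0]


def multiply_and_add(samples, coefficients):
    """Multiply each sample with its coefficient, then reduce in the adder tree."""
    products = [s * c for s, c in zip(samples, coefficients)]
    return adder_tree(products)
```

For integer inputs the result is identical to a sequential sum, which is the property the text relies on.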
- a key question for providing an efficient calculation is how to process the data included in the impulse response.
- An impulse response may comprise on the order of 100,000 samples.
- the impulse response is, according to the invention, divided into segments. Required hardware is thereby decreased dramatically because it may be used in a process using time multiplexing.
- each segment of the impulse response is efficiently used because the convolution operations are performed with the segment together with a plurality of samples from the input signal.
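Why segmenting works can be checked numerically: convolution is linear in the impulse response, so the partial results of the individual segments, each offset by the segment's starting position, add up to the full result. The sketch below is a plain software illustration under that assumption; the function names are hypothetical and it does not model the time-multiplexed hardware.

```python
def direct_convolution(x, h):
    """Full discrete convolution of two non-empty sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for t, x_value in enumerate(x):
        for v, h_value in enumerate(h):
            y[t + v] += h_value * x_value
    return y


def segmented_convolution(x, h, segment_length):
    """Convolve x with h one impulse-response segment at a time."""
    y = [0] * (len(x) + len(h) - 1)
    for start in range(0, len(h), segment_length):
        segment = h[start:start + segment_length]
        partial = direct_convolution(x, segment)
        # Each partial result is delayed by the position of its segment.
        for i, value in enumerate(partial):
            y[start + i] += value
    return y
```

The same small multiply-and-add routine is reused for every segment, which is the software counterpart of reusing one set of hardware multipliers over time.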
- Fig. 1 schematically shows an implementation for discrete convolution in the time plane
- Fig. 2 schematically shows one embodiment of an implementation for discrete convolution in the time plane according to the invention
- Fig. 3 shows how registers in the embodiment according to Fig. 2 are co-operating during different phases of the convolution
- Fig. 4 shows how contents of registers are changing during different phases of the convolution
- Fig. 5 schematically shows an implementation of an output buffer being used in the embodiment according to Fig. 2, and
- Fig. 6 shows the function of a calculating unit in the embodiment according to Fig. 2.
- a discrete convolution in the time plane may take place according to Formula 2 below.
- a practical example of implementation is shown in Fig. 1.
- $y(t) = \sum_{v=0}^{n} h(v)\, x(t-v)$ (2)
- x constitutes input
- h constitutes a convolution core (filter)
- y constitutes output
- The basic operations in convolution may be parallelised very efficiently.
- Input samples are fed into a first shift register 10.
- a corresponding first filter register 18 holds the impulse response.
- Sound samples and impulse response terms are stored in registers 10 and 18 respectively, and each stored value has its own direct output.
- A separate multiplication unit is provided for each pair of values, i.e. all multiplications are performed in parallel.
- Fig. 1 shows all the units for multiplication combined in a multiplier unit 12. All of the results of these multiplications are then to be added, and this may also be performed in a single step in an adder unit 13.
- This problem may be solved by placing a number of registers along the way and dividing the calculation over a number of clock cycles (pipelining).
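The structure of Fig. 1 described above (a shift register for input samples, a register holding the impulse response, one multiplication per register position, and a final addition) can be mimicked in software. This is a minimal sketch under the assumption that the newest sample sits at position 0 of the shift register; the class name `DirectFormConvolver` is hypothetical, and the sequential Python loop only imitates what the hardware does in parallel.

```python
from collections import deque


class DirectFormConvolver:
    """Software stand-in for shift register 10, filter register 18 and units 12 and 13."""

    def __init__(self, impulse_response):
        self.h = list(impulse_response)
        # Shift register initialised to zeros; the oldest sample drops off the end.
        self.register = deque([0] * len(self.h), maxlen=len(self.h))

    def step(self, sample):
        """Shift in one input sample and return one output sample."""
        self.register.appendleft(sample)
        # One product per register position (parallel multipliers in hardware).
        products = [h_v * x_v for h_v, x_v in zip(self.h, self.register)]
        # Adder unit: sum of all products, as in Formula 2.
        return sum(products)
```

Feeding a unit impulse into the convolver returns the impulse response itself, one term per step.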
- The starting point is an example having a data transfer rate of 50 kHz for the sound, a clock frequency of 50 MHz for the digital electronics (1000 times faster), a reverberation time of 2 seconds and a 16 bit data width for both sound and impulse response.
- An impulse response of 100000 samples is used.
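The figures of this example can be related by simple arithmetic. The sketch below is an inferred reading, not a calculation given in the source: it assumes that every clock cycle between two input samples is used for one segment, so the number of parallel multipliers equals the impulse response length divided by the number of clock cycles per sample. The function name `multipliers_needed` is hypothetical.

```python
def multipliers_needed(sample_rate_hz, clock_hz, reverberation_s):
    """Return (impulse response length, clock cycles per sample, parallel multipliers)."""
    taps = sample_rate_hz * reverberation_s
    cycles_per_sample = clock_hz // sample_rate_hz
    # Ceiling division: every tap must be covered within one sample period.
    multipliers = -(-taps // cycles_per_sample)
    return taps, cycles_per_sample, multipliers


print(multipliers_needed(50_000, 50_000_000, 2))
```

With the example values this gives 100,000 taps, 1000 clock cycles per sample and 100 multipliers, which agrees with the 1000 segments of 100 samples each used in the embodiment.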
- A practical embodiment of a convolution unit according to the invention is schematically shown in Fig. 2.
- the input is introduced through an input buffer 14, suitably arranged as a shift register.
- The input buffer 14 is connected to a first register 10 through a latch 15.
- The first register 10 is operatively connected to a second register 11.
- The first register 10 and the second register 11 are suitably arranged as shift registers.
- The first register 10 and the second register 11 are fed back through a feedback loop 25 and the latch 15, so that it is possible to shift data back and forth between the registers in a circulating process. The process is described in detail below.
- a memory 16 is arranged for storing the impulse response.
- the impulse response is divided into segments and operates as a filter on the input signal.
- the memory 16 is designed for storing 1000 segments, each of 100 samples, or terms.
- a segment of the impulse response is processed together with a corresponding segment of input.
- A first calculation step takes place in a multiplier unit 12.
- the multiplier unit 12 comprises a plurality of multiplication means for parallel multiplication of specific samples from the input signal and from the impulse response.
- the segment of the impulse response to be processed is transferred through a multiplexer 17 to a first filter register 18.
- the memory 16 is also connected to a second filter register 19 through the multiplexer 17.
- the embodiment shown in Fig. 2 is particularly suitable for pipelining.
- The included components, e.g. registers and calculation units, comprise a plurality of logical blocks, wherein each block performs a part of an operation. Several operations in series are thereby apparently performed at the same time.
- One filter register is sufficient if the head of the listener is completely still. If the listener is turning the head the same echogram may be used, but new impulse responses must be calculated. This may be performed on a modern PC in a few tens of milliseconds, fast enough to create an apparently continuously active sound image following the head turning in real time. Also when the head is turning a correct reproduction may be provided by doubling the memory for impulse responses in a similar manner as corresponding buffers in the convolution unit. While one memory is used for convolution the other is filled with new contents.
- Alternation between the memories may take place instantaneously. Accordingly, while data from the first filter register 18 is processed, new filter data from the impulse response may be transferred to the second filter register 19. Data is alternately used and loaded in both filter registers 18 and 19, so that the processing may take place without any delay for loading of registers.
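The alternation between the two filter registers is a form of double buffering. The following Python sketch illustrates the idea only; the class name `DoubleBufferedFilterRegister` and its methods are hypothetical and not part of the source.

```python
class DoubleBufferedFilterRegister:
    """Two banks: one is read for convolution while the other is loaded."""

    def __init__(self, length):
        self._banks = [[0] * length, [0] * length]
        self._active = 0

    @property
    def active(self):
        """Bank currently used by the multipliers (compare filter register 18 or 19)."""
        return self._banks[self._active]

    def load_standby(self, segment):
        """Load the next segment into the idle bank without disturbing the active one."""
        standby = self._banks[1 - self._active]
        if len(segment) != len(standby):
            raise ValueError("segment length must match register length")
        standby[:] = segment

    def swap(self):
        """Switch banks; in hardware this happens between two clock cycles."""
        self._active = 1 - self._active
```

Loading never touches the bank that is being read, so the convolution does not have to wait for a register to fill.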
- all hardware will be used all the time in continuous operation.
- One way to make the solution more efficient is to adapt the different calculation units so that no more bits than necessary are used in each case. In this manner it is possible to increase the rate and decrease the amount of hardware.
- Each cell of the second shift register 11 and of the filter registers is connected to a specific multiplication means of the multiplier 12, so that the multiplication may take place in parallel.
- The result of each multiplication is 32 bits long if the factors have 16 bits each. However, normally only the 11 most significant bits of the result need to be used. It may be even more efficient to adapt the multiplication means so that they only calculate the bits needed. Then 11+11 bits are used in a first step of the adder unit 13, which in turn gives a result of 12 bits. Consequently the number of bits increases by one for each step in the adder tree. Depending on the number of steps and the number of segments, only as many bits as needed are included in the final result.
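The growth of one bit per adder step can be turned into a small calculation. The sketch below assumes a balanced tree in which the number of steps is the base-2 logarithm of the number of terms, rounded up; applying it to 100 terms is an inference from the register length used in the embodiment, not a figure stated in the source. The function name `adder_tree_output_bits` is hypothetical.

```python
def adder_tree_output_bits(input_bits, num_terms):
    """Return (output width in bits, number of tree steps) for a balanced adder tree."""
    # (num_terms - 1).bit_length() equals ceil(log2(num_terms)) for num_terms >= 1.
    steps = (num_terms - 1).bit_length()
    return input_bits + steps, steps


print(adder_tree_output_bits(11, 2))    # two 11 bit terms
print(adder_tree_output_bits(11, 100))  # 100 terms of 11 bits each
```

Under these assumptions, 100 products of 11 bits each pass through 7 adder steps and yield an 18 bit sum per segment.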
- An output 20 of the adder unit 13 is operatively connected to a calculation unit 21.
- a control unit 22 is operatively connected to the calculation unit 21 and an output buffer 23.
- The control unit 22 ensures that the partial results from the convolution operations available on the output 20 are added to the associated previously calculated partial result stored in the output buffer 23.
- The control unit 22 also controls the remaining components, e.g. the shift registers, the calculation units and the multiplexer. Both the multiplier unit 12 and the adder unit 13 are suitably arranged for pipelining.
- the function of the circuit in Fig. 2 may be described schematically in the following way. Sound samples are shifted into the registers, and the different parts of the impulse response are stored in the memory 16, so as to be loaded into one of the filter registers, one part at a time.
- The filter register is doubled: one register is loaded while the other is used for convolution, and the roles are then alternated (the actual switch takes place between two clock cycles and takes no time). Since output is then no longer produced at the correct rate, an output buffer 23 in the form of a memory with a dedicated calculation unit is introduced, see the description of Fig. 6 below. Since the filter segment is held in the register anyway, convolution operations for a few points ahead in time (on input already registered) are performed with it as well.
- Fig. 3A-3D show how an input may be used in an efficient manner.
- new input samples are shifted in through an input 24 of the input buffer 14.
- the content of the input buffer 14 is then shifted further into the first shift register 10 and the second shift register 11.
- the first shift register 10 and the second shift register 11 will contain different generations of input data.
- the shift registers are connected to each other in the manner shown in Fig. 3B.
- the input buffer 14 is separated from the shift registers and is not allowed to change any register content.
- the partial results originated in each position are added to the associated positions of the output buffer through the calculation unit 21 and are eventually fed out in the correct rate.
- all buffers/registers are 100 positions, or samples, long and 16 bits wide and the memory 16 for the impulse response contains 1000 segments. Accordingly, the convolution unit is occupied every clock cycle when in a running operation.
- the input buffer 14 contains a completely new set, or generation of input data, designated G(n).
- the first shift register 10 contains the previously used data corresponding to a generation G(n-1) and the second shift register contains even earlier used data, corresponding to a generation G(n-2).
- the first shift register 10 contains the generation G(n) data.
- the input buffer is set to zero and prepared for introduction of new input samples simultaneously.
- The previously used data is simultaneously shifted from the first shift register 10 to the second shift register 11, which then contains the data of generation G(n-1).
- Data of the second shift register 11 is not fed into the first register 10, since the feedback loop 25 between the second shift register 11 and the first shift register 10 is broken. This may be accomplished by having the latch 15 break the connection.
- Each convolution results in one output point and it has been calculated based on a specific position of the buffer circulating back and forth and a specific segment of the impulse response.
- The starting point is that 100 new data samples have just been shifted in, see Fig. 4. This position is called B(0).
- the next clock impulse gives the position B(1), etc. until B(99).
- Data is first shifted to the right through the registers until all data from the first register have been shifted into the second register 11. In each position a convolution operation is taking place. After that the same data is used once again by shifting data in the reversed direction. Due to the "swinging" the next position is B(98).
- The very first convolution result derives from l(0) and B(0). This value is referred to as l(0)B(0).
- the next value produced is then l(0)B(1) etc. Then, after l(0)B(99) comes l(1)B(98).
- n: the number of segments in the impulse response memory
- O(j): element in the output buffer
- the output buffer 23 may be designed as a regular RAM memory, which is organised as a ring buffer in accordance with Fig. 5.
- the ring buffer comprises an address pointer 26 for start and one address pointer 27 for end. Between start and end the buffer is set to zero.
- a control unit generates the required addresses so that updating in each moment will take place in the correct position and that the outputting of data takes place in the correct manner.
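The accumulate-then-output behaviour of the ring buffer can be sketched in software. This is a simplified illustration, not the circuit of Fig. 5: it uses a single read pointer instead of the two address pointers 26 and 27, and it clears each slot as it is output so that unused slots are always zero. The class name `OutputRingBuffer` and its methods are hypothetical.

```python
class OutputRingBuffer:
    """Ring buffer that accumulates partial results and emits finished output samples."""

    def __init__(self, size):
        self._data = [0] * size
        self._read = 0  # address of the next output sample

    def accumulate(self, offset, partial_result):
        """Add a partial result to the slot `offset` samples ahead of the read pointer."""
        if not 0 <= offset < len(self._data):
            raise IndexError("offset outside buffer")
        address = (self._read + offset) % len(self._data)
        self._data[address] += partial_result

    def pop_output(self):
        """Return the finished sample, clear its slot for reuse and advance the pointer."""
        value = self._data[self._read]
        self._data[self._read] = 0
        self._read = (self._read + 1) % len(self._data)
        return value
```

Each call to `accumulate` corresponds to the read, add and write-back sequence that the calculation unit performs on the output buffer.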
- Since the convolution unit produces one result every clock cycle, it is necessary to read out an old value, add a new value and write the result back to the output buffer during one clock cycle. This may be solved, for example, by designing the RAM circuit so that two values can be accessed at a time. Consequently a calculation unit that reads, accumulates and writes must be present between the convolution unit and the memory. Four additions are performed during a process step. As the four additions are not evenly distributed over the four clock cycles, at least one buffer register must be present for intermediate storage of the convolution results from one clock cycle to another. A practical embodiment of such a calculation unit 21 is disclosed in Fig. 6. The calculation unit 21 comprises two parallel pipelines 28 and 29. During a period of four clock cycles, the two parallel pipelines 28 and 29 read and write, respectively, in two of the cycles and add in four.
- A possible extension for handling head turning is to double the memory 16 for impulse responses in a similar manner as the corresponding filter registers 18 and 19 in the convolution unit. While one memory is used for convolution, the other is loaded with new content. Alternation between the memories may take place instantaneously, so that the convolution does not need to be interrupted. A minor modification of the control of the output buffer is required, so that the buffer is set to zero only at the start of a new sound and not when new impulse responses are loaded. In this way a "sliding transition" between filters for two discrete directions is obtained. It is not necessary to change filters more often than corresponds to about half of the reverberation time, i.e. once per second in the described example.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Complex Calculations (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU58989/01A AU5898901A (en) | 2000-05-19 | 2001-05-16 | Method and device in a convolution process |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| SE0001909A SE516467C2 (sv) | 2000-05-19 | 2000-05-19 | Metod och anordning vid faltning |
| SE0001909-1 | 2000-05-19 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2001090927A1 (en) | 2001-11-29 |
Family
ID=20279792
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/SE2001/001074 Ceased WO2001090927A1 (en) | 2000-05-19 | 2001-05-16 | Method and device in a convolution process |
Country Status (3)
| Country | Link |
|---|---|
| AU (1) | AU5898901A (en) |
| SE (1) | SE516467C2 (en) |
| WO (1) | WO2001090927A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US4862402A (en) * | 1986-07-24 | 1989-08-29 | North American Philips Corporation | Fast multiplierless architecture for general purpose VLSI FIR digital filters with minimized hardware |
| WO1993004529A1 (en) * | 1991-08-12 | 1993-03-04 | Jiri Klokocka | A digital filtering method and apparatus |
| US5814750A (en) * | 1995-11-09 | 1998-09-29 | Chromatic Research, Inc. | Method for varying the pitch of a musical tone produced through playback of a stored waveform |
| US6000834A (en) * | 1997-08-06 | 1999-12-14 | Ati Technologies | Audio sampling rate conversion filter |
-
2000
- 2000-05-19 SE SE0001909A patent/SE516467C2/sv not_active IP Right Cessation
-
2001
- 2001-05-16 WO PCT/SE2001/001074 patent/WO2001090927A1/en not_active Ceased
- 2001-05-16 AU AU58989/01A patent/AU5898901A/en not_active Abandoned
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2003048968A3 (en) * | 2001-11-30 | 2004-12-23 | Apple Computer | Single-channel convolution in a vector processing computer system |
| US7107304B2 (en) | 2001-11-30 | 2006-09-12 | Apple Computer, Inc. | Single-channel convolution in a vector processing computer system |
| US7895252B2 (en) | 2001-11-30 | 2011-02-22 | Apple Inc. | Single-channel convolution in a vector processing computer system |
| CN106250103A (zh) * | 2016-08-04 | 2016-12-21 | 东南大学 | 一种卷积神经网络循环卷积计算数据重用的系统 |
Also Published As
| Publication number | Publication date |
|---|---|
| SE0001909L (en) | 2001-11-20 |
| SE516467C2 (sv) | 2002-01-15 |
| SE0001909D0 (sv) | 2000-05-19 |
| AU5898901A (en) | 2001-12-03 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US4998281A (en) | Effect addition apparatus | |
| US6208687B1 (en) | Filter switching method | |
| JP2976429B2 (ja) | アドレス制御回路 | |
| JPS59202499A (ja) | 残響装置 | |
| WO2001090927A1 (en) | Method and device in a convolution process | |
| JP2003271165A (ja) | 音場再生装置、プログラム及び記録媒体 | |
| KR100233284B1 (ko) | 어드레스 발생장치 | |
| JPH04330561A (ja) | デジタル信号処理装置 | |
| JP2634561B2 (ja) | 可変遅延回路 | |
| JP2556560B2 (ja) | 楽音生成方式 | |
| JPH06177676A (ja) | 信号処理装置 | |
| KR950002074B1 (ko) | 파이프라인 구조의 잔향부가 시스템 | |
| JP2000308199A (ja) | 信号処理装置および信号処理装置の製造方法 | |
| KR100294919B1 (ko) | 입체적오디오신호재생장치및그방법 | |
| JP2542616Y2 (ja) | 残響付加装置 | |
| JPS6073694A (ja) | 残響付加装置 | |
| Kot | Digital sound effects echo and reverb based on non-exponentially decaying comb filter | |
| JPS62123820A (ja) | デジタル・グラフイツク・イコライザ | |
| JPH08292764A (ja) | 信号切換装置 | |
| JPH0544040B2 (en) | ||
| JP2611406B2 (ja) | デジタル音声信号発生装置 | |
| JPH03201900A (ja) | 音場補正装置 | |
| JP3991475B2 (ja) | 音声データ処理装置およびコンピュータシステム | |
| JP2905904B2 (ja) | 電子楽器の信号処理装置 | |
| JP2527465Y2 (ja) | デジタル・オ−ディオ・ト−ン・コントロ−ル装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| | 122 | Ep: pct application non-entry in european phase | |
| | NENP | Non-entry into the national phase | Ref country code: JP |