WO2017106898A1

WO2017106898A1 - Improved sound projection

Info

Publication number: WO2017106898A1
Application number: PCT/AU2016/000403
Authority: WO
Inventors: Joseph Hayes
Original assignee: Acoustic 3D Holdings Ltd
Priority date: 2015-12-22
Filing date: 2016-12-21
Publication date: 2017-06-29

Abstract

A digital sound processing system which modifies the output signal by subjecting the signal to a Gabor or Morlet wavelet generator. In one aspect, a Gabor, Morlet, or perceptual wave generator is inserted in the signal path prior to the reception of the signal by a loudspeaker. This solves the problem of contamination by reflection caused by conventional loudspeakers.

Description

IMPROVED SOUND PROJECTION

This invention provides a means of enhancing auditory perception by improved digital signal processing.

Background to the invention

USA patent 5764782 by the present inventor disclosed an acoustic reflector facing the sound source. The reflector had an odd prime number of wells having depths that varied according to a quadratic residue sequence. This invention is predicated on an understanding of the physiology of hearing and that the generation of diffuse waves would improve the listening experience.

WO2012015650 discloses a reflector and other arrangements for generating diffuse waves within a fluid space to clarify energy and heighten specific information in the space which carries a sound signal. The diffuse waves may also be generated by digital signal processing using suitable algorithms to modify the signal.

It is an object of this invention to provide an alternative method of enhancing the experience of sound.

Brief description of the invention

To this end the present invention provides a digital sound processing system which modifies the output signal by subjecting the signal to a correction convolution to achieve an output of at least a Gabor or Morlet wavelet generator. In one aspect, a correction convolution to Gabor or Morlet wave generator is inserted in the signal path prior to the reception of the signal by a loudspeaker.

This invention is predicated on the insight that the problem with normal loudspeakers is that they radiate their acoustic energy in a form that is strongly reflected around inside the listening room. This reflection causes confusion when the brain is trying to resolve what it is hearing. Loudspeakers operated according to this invention radiate their acoustic energy in a form that does not produce audible reflections inside a room. This is done by diffusing the sound at the source, which solves the contamination by reflection problem caused by conventional loudspeakers. However, diffuse sound may not be clearly heard, so to turn it back into something that can be clearly heard, the diffuse sound of one channel is allowed to mix with the diffuse sound from other channels. Through a unique property of the diffusion produced by this invention, when two or more channels mix they turn back into the original audio signal acoustics.

The Gabor transform is a type of short-time Fourier Transform. Morlet adapted this to produce Morlet wavelets to assist in analysis of seismic signals. The Morlet wavelet is closely related to human perception for hearing and vision. Morlet wavelets are used in signal processing to detect edge and change patterns in signals, and have an ability to time stamp the finite of edge information. The Morlet wavelet analysis is used in electrocardiogram (ECG) to discriminate abnormal heart beat behaviour. The Morlet wave transform is used in music transcription as each note has a clear start and end time in a Morlet transform.

A continuous wavelet transform using a Morlet wavelet has the effect of drawing out underlying patterns of correlation that are contained in the signal. These underlying patterns often remain constant in time width (period) even at different scales of wavelet.

One way to achieve an effect similar to that achieved in the systems described in USA patent 5764782 and WO2012015650, is to insert a correctional convolution into the electronics signal path (audio) and thereafter drive this signal through a conventional loudspeaker or headphone device.

The convolution of the closed device output measurement with the correctional convolution aims at producing a modified closed device output that mimics a Morlet, Gabor, or other perceptual wavelet form. The correctional convolution can be achieved using an appropriately designed correctional filter. Typically an all pass filter can be used to achieve these sorts of time response manipulations.

This code may be part of any digital signal processing system. The output signal is modified by subjecting the signal to a correctional filter to achieve an output resembling a Gaussian envelope generator.

The scale of the resultant wavelet may be altered to achieve different subjective perceptual effects.

The style of resultant wavelet may be altered to achieve different subjective perceptual effects.

During the process of creating a wavelet convolution of the audio signal in the DSP, it is simultaneously possible, and indeed desirable, to effectively 'correct' for any desired envelope of the target loudspeaker driver. Thus treated, the resultant acoustic signal will closely resemble a wavelet encoded acoustic created by either of the applicant's reflector or manifold technologies as disclosed in Australian patents 2011318232 and 2016210715 and PCT/AU2016/000154.

The signal processing system of this invention is particularly useful in improving the perceived quality of small transducers of the kind used in cellphones, smart phones and computing devices.

Detailed Description of the invention

Preferred embodiments of the invention will be described with reference to the drawings in which:

Figure 1 illustrates a typical wavelet of this invention.

Figure 2 is a schematic view of the signal processing system of this invention;

Figure 3 illustrates a testing apparatus;

Figure 4 shows the application of the test setup used in Figure 3 for use on a loudspeaker and reflector combination as described in the applicant's patent WO2012015650;

Figure 5 shows the results of a measurement taken on a prior art direct radiating loudspeaker;

Figure 6 shows the same results for a UUT in accordance with this invention;

Figure 7 illustrates a fixed Morlet wavelet;

Figure 8 shows the sinc function at various stages of phase;

Figure 9A shows the close up detail of the captured impulse response of a loudspeaker of this invention;

Figure 9B shows a captured impulse response of a speaker in accordance with WO2012015650;

Figure 10 shows the test system used in figure 1 being used with a different test tone;

Figure 11 shows the captured response within a room acoustic environment;

Figure 12 shows the Ricker's Criterion;

Figure 13 shows the design formulae for a lowpass sinc wavelet;

Figure 14 shows a typical EQ curve for an active loudspeaker;

Figure 15 shows the high frequency satellite curve;

Figure 16 shows the Fast Fourier Transform (FFT) of a manifold as described in AU 2015901657;

Figure 17 shows an isometric view of a manifold loudspeaker with a wavelet transient ring radiation;

Figure 18 shows a listener in relation to a single manifold loudspeaker that radiates wavelet rings;

Figure 19 shows a stereo pair of manifold loudspeakers that radiate individually different wavelet ring patterns;

Figure 20 shows a complete surround sound system using three manifold speakers;

Figure 21 shows an extended virtual reality environment in which five manifold loudspeakers are used;

Figure 22 - is a graphical view of a tone and its Fast Fourier Transform;

Figure 23 - is a graphical view of a tone and its Fast Fourier Transform;

Figure 24 - is a graphical view of a tone and its Fast Fourier Transform;

Figure 25 - is a schematic diagram of a system of sudden phase signal injection based on bass energy in the stop band of a loudspeaker;

Figure 26 depicts a left channel, a right channel and a convolved interaction;

Figure 27 is a detailed depiction of figure 26;

Figure 28 is a detailed view of the figures 26 and 27;

Figure 29 depicts the real and imaginary parts of a Morlet wavelet;

Figure 30 shows a detailed view of the central zero phase convolution wavelet;

Figure 31 is a coefficient phase plot of the continuous wavelet transform with a Morlet wavelet;

Figure 32 is a time alignment plot of the continuous wavelet transform with a Morlet wavelet;

Figure 33 is a time response plot of a smartphone driver output and a modified smartphone driver output.

The Morlet wave generator may be a processor using the Matlab Morlet function. In one embodiment an audio recording is convolved with a Morlet wavelet of scale = 3 (at a sample rate of 44,100 samples per second) to produce a perceptual recording that has been encoded with perceptual wavelets allowing for greater cognition of the temporal changes in the recording. This audio file is then played back through conventional loudspeakers or headphone devices. Another embodiment is to pass the audio signal within a device via a Morlet wavelet of scale = 3 (at a sample rate of 44,100 samples per second) such that the playback device itself does the temporal encoding and plays that back through a conventional loudspeaker or headphone unit.

An equation for generating a Morlet wavelet is:

morl(x) = exp(-x²/2) · cos(5x)
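By way of illustration only, the encoding step of the embodiment above (convolving a recording with a Morlet wavelet of scale = 3) may be sketched as follows. The sketch is not the Matlab implementation referred to above. The helper names, the truncation of the wavelet at ±4 scale units, the 1/sqrt(scale) normalisation and the single-sample test signal are all assumptions made for this example.

```python
import math

def morlet(x):
    """Real Morlet wavelet: exp(-x^2 / 2) * cos(5x)."""
    return math.exp(-x * x / 2.0) * math.cos(5.0 * x)

def morlet_kernel(scale, half_width=4.0):
    """Sample morl(n / scale) at integer sample offsets n.

    Assumptions of this sketch: support of +/- half_width * scale samples
    and a 1/sqrt(scale) amplitude normalisation.
    """
    n_max = int(math.ceil(half_width * scale))
    norm = 1.0 / math.sqrt(scale)
    return [norm * morlet(n / scale) for n in range(-n_max, n_max + 1)]

def convolve_same(signal, kernel):
    """Direct-form convolution, trimmed to the length of the input."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += signal[j] * w
        out.append(acc)
    return out

# Hypothetical input: a single-sample impulse standing in for an audio recording.
kernel = morlet_kernel(scale=3)
impulse = [0.0] * 64
impulse[32] = 1.0
encoded = convolve_same(impulse, kernel)
```

With an impulse as input, the encoded output is simply the wavelet itself centred on the impulse position, which gives a convenient check on the convolution before real audio samples are substituted.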

Figure 3 shows a test setup, well known in the art, used to measure a loudspeaker's 'impulse' response.

The test system emits a very finite burst of energy, usually one sample long and at full scale output, that causes the complete system to react exposing its 'transient' behaviour at the recording microphone location. This is often referred to as the delta function.

Typically, an 'on axis' measurement is performed at a nominal distance of 1 metre between the unit under test (UUT- the loudspeaker) and the microphone.

The location of the loudspeaker should be chosen to be useful for measurement purposes (a large room, with the loudspeaker preferably located mid room both horizontally and vertically). This gives a maximum path length before early reflections, thus allowing the longest view of the direct sound of the loudspeaker.

The test setup of figure 3 is used to measure conventional direct radiating loudspeakers.

Figure 4 shows the application of the test setup used in Figure 3 for use on a loudspeaker and reflector combination as described in the applicant's patent WO2012015650. Every part of the system is exactly the same. It is only the behaviour of the UUT that changes.

Figure 5 shows the results of a measurement taken on a prior art direct radiating loudspeaker in a standard living room (typical of one used for either television viewing or listening to music via audio files).

The results are shown initially at the top of the diagram as an 'impulse' measurement (time domain). The Y axis is the amplitude sensed for sound pressure level by the recording microphone. The X axis is time but expressed in centimetres. Sound travels at 343.2 metres per second at 20 degrees Celsius and 70% humidity. This is equivalent to 34,320 centimetres per second. The scale on the results page goes from -20 cm to +200 cm. The 0 cm time is allocated to the maximum y result recorded.
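The conversion between time and the centimetre scale of the X axis may be sketched as below, using the speed of sound stated above. The function names are hypothetical, and the sample rate is passed in as a parameter because it depends on the measurement.

```python
SPEED_OF_SOUND_CM_PER_S = 34_320.0  # 343.2 m/s, as stated in the text

def samples_to_cm(n_samples, sample_rate_hz):
    """Distance travelled by sound during n_samples at the given sample rate."""
    return n_samples / sample_rate_hz * SPEED_OF_SOUND_CM_PER_S

def cm_to_ms(distance_cm):
    """Time in milliseconds that corresponds to a distance on the X axis."""
    return distance_cm / SPEED_OF_SOUND_CM_PER_S * 1000.0

# One sample at 96,000 samples per second expressed as a distance.
one_sample_cm = samples_to_cm(1, 96_000)
# Arrival time that corresponds to a reflection plotted at 90 cm.
first_reflection_ms = cm_to_ms(90.0)
```

On these figures a reflection plotted at 90 cm arrives a little over 2.6 milliseconds after the direct sound.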

The direct sound from the UUT arrives at the recording microphone at a nominal time = 0cm. The first early reflection 102 arrives at around 90 cm. A second reflection arrives at the recording microphone at around 155 cm.

The middle portion of the results of Figure 5 shows a Continuous Wavelet Transform (CWT). The Y axis is the 'scale' of the wavelet. The X axis is the same as for the impulse response, being time expressed in centimetres.

The intensity of 'white' on the CWT plot is a representation of a 'correlation' score between the portion of the recorded UUT signal against a fixed Morlet wavelet as shown in figure 7.

So where there is a high intensity white region on the plot this indicates that the signal is highly correlated to the comparison signal of the Morlet wavelet of a scale equivalent to that shown on the Y axis at the location of the white area.
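The correlation score described above may be sketched as follows. The sketch correlates a captured signal with a Morlet wavelet at each scale and reports where the largest magnitude occurs. It is a simplified stand-in for the analysis software used for the figures, and the function names, the wavelet truncation and the 1/sqrt(scale) normalisation are assumptions of this example.

```python
import math

def morlet(x):
    """Real Morlet wavelet: exp(-x^2 / 2) * cos(5x)."""
    return math.exp(-x * x / 2.0) * math.cos(5.0 * x)

def cwt_row(signal, scale, half_width=4.0):
    """Correlate `signal` with a Morlet wavelet dilated by `scale` samples."""
    n_max = int(math.ceil(half_width * scale))
    taps = [morlet(n / scale) / math.sqrt(scale)
            for n in range(-n_max, n_max + 1)]
    row = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(taps):
            j = i + k - n_max
            if 0 <= j < len(signal):
                acc += signal[j] * w
        row.append(acc)
    return row

def strongest_correlation(signal, scales):
    """Return (magnitude, scale, sample index) of the largest CWT coefficient."""
    best = (0.0, None, None)
    for s in scales:
        for i, c in enumerate(cwt_row(signal, s)):
            if abs(c) > best[0]:
                best = (abs(c), s, i)
    return best
```

Each row returned by `cwt_row` corresponds to one horizontal line of the CWT plot, and a bright region on the plot corresponds to a large magnitude in that row.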

The lower portion of the results figure shows some readout of the result computed from the results table.

The first reflection 102 of the incident sound 101 arrives at around 90 cm and visually appears to be similar to a Gabor wavelet in the time domain (impulse response). It also stands out in the CWT 105 and has characteristics not that dissimilar to the direct sound CWT 104.

The second reflection 103 of the incident sound 101 arrives at around 155 cm and visually appears to be similar to a Gabor wavelet in the time domain (impulse response). It also stands out in the CWT 106 and has characteristics not that dissimilar to the direct sound CWT 104 at least at a higher scale wavelet value. Morlet wavelets are typically used for edge detection in geotechnical circles. Thus the results of figure 5 are suggesting an almost similar correlation at the arrival of the direct sound as well as the first and second reflections.

A perceived edge in psychoacoustics is a temporal cue used within the human auditory anatomy. Thus the prior art direct radiating loudspeaker is providing the temporal cues captured within the audio signal but is also providing strong temporal cues from the listening room reflections.

Figure 6 shows the same results for a UUT built on the technology disclosed in this application. The immediate difference is that the direct sound 201 visually looks similar to a Morlet or Gabor wavelet, or a sinc function, or a hybrid of all these: a Hayes wavelet or a perceptual wavelet.

On the CWT plot of figure 6 it is only the direct sound that has a strong correlation to a Morlet wavelet. The first and second reflections no longer appear as 'edges' as defined by the Morlet wavelets known in the geotechnical art.

It appears that the first and second reflection of the prior art UUT do take on the appearance of a Gabor wavelet.

Figure 8 shows the sinc function at various stages of phase. The first and second reflections of the prior art UUT appear to be Gabor wavelets in nature at around a 45 degree phase lag.

The acoustic reflection process normally involves a phase delay caused by 'dilation' between the pressure and velocity waves. This 'dilation' manifests as a phase delay related to the amount of energy absorbed between the incoming source sound wave and the emitted reflection.

It is plausible that natural reflections have Gabor wavelet properties and phase behavior, consistent with absorption of energy upon reflection off an absorbent surface.

Consequently the reflections of the prior art UUT appear to be Gabor wavelets with an imposed phase delay indicative of the absorptive properties of the reflecting surface. The UUT in figure 6 has only one point of 'edge', at time = 0 cm. The strongly correlated edges of figure 5 at 90 cm and 155 cm are substantially missing in figure 6.

The technology described by this invention therefore can be considered to be one that significantly reduces the temporal importance of the listening room reflections whilst strengthening the direct sound edge qualities being temporal cues present in the captured audio signal.

Thus loudspeakers as described by this invention, when exciting a listening room, deliver a sound field dominated by the captured cues in the audio signal, and the listener therefore predominantly perceives the temporal acoustics of the audio signal space.

In Figure 5 the maximum correlation occurs at -1.1 cm at a Morlet wavelet scale = 7 (sampling frequency is 96 kHz) and contains 11% of the total energy present over the 220 cm sampling window. The maximum correlation value against the Morlet wavelet is 113.

In Figure 6 the maximum correlation occurs at -1.4 cm at a Morlet wavelet scale = 7 (sampling frequency is 96 kHz) and contains 13% of the total energy present over the 220 cm sampling window. The maximum correlation value against the Morlet wavelet is 202.

Thus the 'edge' properties, as defined by the measured correlation against the Morlet wavelet known in the prior art, are almost twice as strong in figure 6 as they are in figure 5.

Figure 7 shows the shape of a Morlet wavelet. Figure 8 shows the shape of various sinc functions under various phase changes.

Figure 9A shows the close up detail of the captured impulse response of a loudspeaker of this invention. This is the window of sound from 20 cm before the maximum recorded value to 20 cm after the maximum recorded value. The central section around t = 0 cm is shaped like a Gabor wavelet at a 90 degree phase angle (see Figure 8). But there is a lot of signal present in the measurement before and after the middle portion that looks like a Gabor wavelet. This could be the 'ringing' from a poorly damped spectral enclosure.

Figure 9B shows a captured impulse response of an earlier form of A3D speaker. In Figure 9B the central element is more disconnected from the pre signal and the post signal. It appears very much like a classic Morlet wavelet and/or a sinc filter transient.

A sinc function is the transient of what are known as 'stone wall' filters; that is, it is the impulse response of a stone wall low pass frequency filter. A stone wall filter has a very steep stop band attenuation and has a perfect phase response in the pass band.

Figure 10 shows the test system used in figure 4 being used with a different test tone. Instead of emitting an 'impulse' test signal the emission is a carefully constructed test tone.

For the purpose of explanation a test tone was created using a 500 Hz carrier. At every 3 milliseconds the phase of this 500 Hz carrier is shifted forward 90 degrees. So this test tone contains only one frequency, 500 Hz, but has a regular sudden phase change at every 3 milliseconds. The frequency of the encoded phase changes is 333 Hz (1 / (3 × 10⁻³ s)).

This test tone is referred to as a Temporal Encoded Pure Tone (TEPT). Its purpose is to induce a temporal pattern of sudden phase changes onto a carrier signal (500 Hz).
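A test tone of this kind may be generated as sketched below. The carrier frequency, phase step and interval are those given above. The function name, the 96,000 samples per second rate and the 30 millisecond duration are assumptions chosen for this example.

```python
import math

def tept_tone(carrier_hz=500.0, jump_interval_s=0.003, jump_deg=90.0,
              duration_s=0.03, sample_rate_hz=96_000):
    """Sine carrier whose phase steps forward by jump_deg every jump_interval_s."""
    jump_rad = math.radians(jump_deg)
    n_total = int(round(duration_s * sample_rate_hz))
    samples_per_jump = int(round(jump_interval_s * sample_rate_hz))
    samples = []
    for n in range(n_total):
        jumps_so_far = n // samples_per_jump  # completed phase steps at sample n
        phase = (2.0 * math.pi * carrier_hz * n / sample_rate_hz
                 + jumps_so_far * jump_rad)
        samples.append(math.sin(phase))
    return samples

tone = tept_tone()
```

At the assumed sample rate the first phase step falls at sample 288, where the waveform jumps abruptly instead of continuing its smooth sinusoidal path.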

Figure 11 shows the captured response within a room acoustic environment (listening room) of this test tone played through an A3D speaker. There is a considerable amount of background noise present that has the effect of making the 500 Hz carrier look slightly contaminated. However, at regular 3 millisecond intervals (~100 cm) the CWT shows a high correlation score at a scale = 7 wavelet. The areas in between the ~100 cm intervals include the room acoustics' temporal contribution. They never encroach on the 'edge' marking ability of the sudden phase changes encoded into the test signal. Thus the recorded audio is now dominant in the temporal response in the listening room when played through an A3D speaker.

Figure 12 shows Ricker's Criterion, as known in the seismic art, wherein a flat spot in the middle of the wavelet can be tolerated without affecting the temporal resolution of these wavelets.

Figure 13 shows the design formulae for a bandpass sinc filter. This will produce a wavelet transient very close to those measured on A3D loudspeakers.

The similarity with a sinc function transient suggests the A3D loudspeaker could possess a brick wall filter property, namely little spectral content above the cut-off frequency.

Both measurements of figures 9A and 9B are considered to be 'zero phase' wavelets. A 'zero phase' wavelet is defined as: "The minimum phase wavelet correspond to front loaded energy i.e. at time zero minimum energy and elsewhere maximum. While zero phase wavelet has maximum energy at time zero."

A zero phase wavelet when convolved leaves a zero phase result. Again this is excellent for acoustics.

One could argue that the sound caught by the left ear is convolved by the human auditory system with that caught by the right ear. Therefore zero phase wavelets leave a zero phase image inside one's auditory system, inside the listener's brain.

Figure 14 shows a typical EQ curve for an active loudspeaker. Curve 801 shows the frequency response of the lower bass portion. Curve 802 shows the frequency response of the 'satellite' portion of the loudspeaker. Curve 803 shows one of the filters used to 'cross over' between the bass unit and the satellite. This curve is a classic 2nd order Butterworth filter with a corner frequency around 150 Hz. Local equalisation shapes are contained at 804 and 805. These are used to correct local issues in the spectral response of the loudspeaker.

It is thought, through the prior art knowledge of psychoacoustic behaviour, that the spectral portion of the hearing response lies between 20Hz to 5,000 Hz. The temporal response is thought to be 5,000 Hz to 20,000 Hz.

Thus in Figure 15 the high frequency satellite curve 901 includes a particular equalisation 902 that seeks to boost or cut the temporal perception region such that the overall balance between spectral and temporal perception can be achieved.

Figure 16 shows the Fast Fourier Transform (FFT) of an A3D manifold developed for a Cobra manifold. This embodiment is suitable for use on small consumer electronics devices such as smart phones. It has the ability to temporally mark (wavelet encode) into the listening space the sudden phase changes in the audio signal. The prior art criteria of FFT show little distortion through the addition of this wavelet encoding manifold. This spectral curve can be equalised by the host smart phone electronics.

There is little spectral bass below 500 Hz in this device. It is plausible to convert bass sounds into their equivalent sudden phase jumps on a carrier frequency above 500 Hz, such that these bass sounds become perceptible via the temporal information channel to the brain rather than the spectral information channel to the brain, which would require FFT energy below 500 Hz. Spectral energy below 500 Hz is simply not supported by the physics of these small speaker drivers.

A benefit of the increased sound pressure levels due to increased volume velocities is an increased radiated sound pressure level into the listening space.

Figure 17 shows an A3D loudspeaker 101 radiating a sound field in which there is a phase anomaly (temporal activity) at a radius 102. The transient of this phase anomaly is nominally a spatial wavelet 103, and this wavelet exists as a circular ring 104 at radius 102 around the A3D loudspeaker 101.

Figure 18 shows the same A3D loudspeaker z01 radiating two phase anomalies causing two spatial wavelet rings z06. A human standing in this radiated field will hear these temporal rings z06 via both ears z03 and z02 so as to cause a zero phase image z05 inside one's perception system.

Figure 19 shows a stereo pair of A3D loudspeakers y01 and y02 that radiate spatial wavelets y03 and y04 based on each channel's phase anomalies. Monaural information in the stereo mix will manifest as coherent acoustic energy along the centre line y06 of these speakers. A listener y07 will hear direct energy from both A3D speakers y01 and y02. They will also experience a phantom sound field formed by interactions between the left and right stereo signals, causing both a spectral and temporal sound field. Phase congruency will exist in this sound field. Minute differences between left and right channels will build a virtual reality acoustic within the sound field.

This image will exhibit depth of field as well as specular imaging between channels.

Figure 20 shows three A3D loudspeakers k01, k02 and k03 placed around a listener k07. These three A3D loudspeakers k01, k02 and k03 will create three direct sound fields from the monaural content and three phantom zero phase sound fields from the interactive sound fields. This will provide a laterally immersive listening space.

Figure 21 shows a complete virtual reality sound space created with five A3D loudspeakers y01, y02, y03, y04 and y05. A3D speakers y01, y02, y03 and y04 are in a quadraphonic arrangement placed laterally around the listener y06. A3D speaker y05 is placed above the listener y06. These A3D loudspeakers y01, y02, y03, y04 and y05 will create five direct zero phase monaural perceptions from the monaural content from each source. They will create 6 lateral stereo zero phase sound fields from the interactions of lateral sound sources, i.e.:

y01 and y02

y02 and y03

y03 and y04

y04 and y01

y01 and y03

y02 and y04

and they will create four overhead zero phase sound fields from the stereo interaction between:

y01 and y05

y02 and y05

y03 and y05

y04 and y05

This will provide an immersive reality acoustic through the recording and manipulation of a 5 channel audio signal. Encoding 5 channels of audio in a digital file is known in the art. A zero phase zone such as described is a simulation of a 'live' acoustic sound field.

6 lateral stereo zero phase sound fields

4 vertical zero phase sound fields

5 direct mono zero phase sound fields
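The counts listed above follow from pairing the sound sources, as the short sketch below illustrates. The variable names are chosen for this example only, and the speaker labels are those used for Figure 21.

```python
from itertools import combinations

lateral = ["y01", "y02", "y03", "y04"]  # quadraphonic speakers around the listener
overhead = "y05"                        # speaker above the listener

# Every unordered pair of lateral speakers forms one lateral stereo sound field.
lateral_pairs = list(combinations(lateral, 2))
# Each lateral speaker paired with the overhead speaker forms one overhead field.
overhead_pairs = [(speaker, overhead) for speaker in lateral]
# Each speaker on its own contributes one direct mono sound field.
direct_fields = lateral + [overhead]

total_fields = len(lateral_pairs) + len(overhead_pairs) + len(direct_fields)
```

This reproduces the 6 lateral, 4 overhead and 5 direct sound fields, 15 in total.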

Figure 22 shows a constructed 'tone' consisting of a 500 Hz carrier. However, at every 3 milliseconds a sudden 90 degree phase change occurs 21. The Fast Fourier Transform shows that spectrally this is seen as a combination of approximately 410 Hz and 750 Hz components. However, in this tone the 333 Hz, being the rate of the 3 millisecond interval at which the sudden phase jumps occur, is dominant.

Figure 24 shows a constructed 'tone' consisting of an 800 Hz carrier 24. The Fast Fourier Transform shows that spectrally this is seen as 800 Hz only 25.

Figure 23 shows a constructed 'tone' consisting of an 800 Hz carrier and a small phase change (15 degrees) at 10 millisecond intervals 26. The Fast Fourier Transform shows that spectrally this is still seen as 800 Hz only; however, audibly a tone of 100 Hz can be heard due to the 10 millisecond phase changes.

Smart phones are known to have little energy below 500 Hz to 700 Hz. The physical speaker driver cannot support tones below this region.

Figure 25 shows a system to insert bass into the pass band of a smart phone (700 Hz and above) by first splitting the audio signal into components below (3232) and above (3229) 700 Hz. The higher pass region 3229 is fed to the smartphone speaker 3231 after it passes through a phase modifier 3230. The lower portion of the audio signal 3232 is passed through a filter that extracts bass information and converts it into phase change on the pass band 3229 signal. In this method bass is encoded as phase change into the above 700 Hz audio signal and becomes perceptible as bass through the human temporal perception systems.

Figures 26 to 32 relate to validation of the method for creating a hologram. As the QRD has an autocorrelation equal to zero it is diffuse. But since any autocorrelation is equal to 1 at time = 0, when something of zero autocorrelation convolves with itself it leaves a perfect 1 at time = 0.

If something is diffuse it has arguably no coherent sound. But when this diffusion mixes with the other channel, it leaves a coherent copy of the encoding sound/image.

A 'mist' is the equivalent of 'uncorrelated energy'. By 'uncorrelated' is meant that there is energy there, but it has no form (correlation). But when this 'mist' crosses (convolves with) a 'mist' coming from the other direction, they interact to leave a perfect correlation only at time = 0, with a value of 1. It is like two aerosol cans, one facing left and one facing right. If you spray the cans at each other then their interaction creates an image. The image is the by-product of the inherent autocorrelation being equal to zero. The only information will be at time = 0 and it will be a value of 1 (perfect correlation). This is the math of autocorrelation.
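The autocorrelation property relied on here may be illustrated with an idealised quadratic residue sequence. In the sketch below the sequence is modelled as unit-magnitude phase terms exp(2πi·n²/N) for an odd prime N, which is an assumption of this example and not a model of the physical well depths of a reflector. The function names are hypothetical.

```python
import cmath

def quadratic_residue_sequence(prime):
    """Unit-magnitude sequence exp(2*pi*i*n^2 / prime) for n = 0 .. prime - 1."""
    return [cmath.exp(2j * cmath.pi * ((n * n) % prime) / prime)
            for n in range(prime)]

def periodic_autocorrelation(seq):
    """Circular autocorrelation, normalised so that lag 0 equals 1."""
    n = len(seq)
    return [
        sum(seq[i] * seq[(i + lag) % n].conjugate() for i in range(n)) / n
        for lag in range(n)
    ]

# Seven terms, matching an odd prime number of wells.
acf = periodic_autocorrelation(quadratic_residue_sequence(7))
```

For this idealised sequence the periodic autocorrelation is 1 at lag 0 and vanishes, to within rounding error, at every other lag.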

Figure 26 shows three plots. The top plot is the captured 7 metre sample of an AS8, being an A3D loudspeaker. This was taken in an open workshop space.

For analysis purposes this plot has been assigned the role of the left-hand speaker acoustical output.

The second plot is of a notional right-hand speaker acoustical output. It is a time inverted version of the left (top) signal (traveling in the opposite direction it would appear 'mirrored' around a nominal centre position).

The third plot is of the convolution of the left and right speaker outputs at that nominal 'centre position'. The convolution is what you would get as the left and right speaker signals pass each other in a convolution system. If the human auditory system can convolve the signals from the left and right ears then such an output would result.

This case assumes that the left and the right loudspeakers have identical outputs.
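The construction of the three plots may be sketched as follows. A noise-like sequence stands in for the captured left-hand output, the right-hand output is its time-inverted copy, and the two are convolved. The random test signal, its length, the fixed seed and the function name are assumptions of this example and are not the measured data of Figure 26.

```python
import random

def convolve_full(a, b):
    """Full linear convolution of two real sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

random.seed(0)  # fixed seed so the sketch is repeatable
left = [random.gauss(0.0, 1.0) for _ in range(256)]  # stand-in for the capture
right = left[::-1]                                   # time-inverted copy
centre = convolve_full(left, right)

# Index of the largest magnitude in the convolved result.
peak_index = max(range(len(centre)), key=lambda k: abs(centre[k]))
```

Convolving a signal with its own time-inverted copy is the same operation as computing its autocorrelation, so the result is symmetrical about a single central peak whose height equals the energy of the signal.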

Figure 27 shows a closer look at these 3 plots again. This time it is a 1.8 meter window. The simplified 'zero phase result' of the convolution of left and right is clear to see.

Figure 28 is a detailed view of the plots of figures 26 and 27. A complex wave packet transient from either the left or right speaker convolves to a simpler zero phase result. This means that almost all of the noise disappears. It is almost characterized as a single sample, 20 kHz wide. Each marker represents 3.7 cm, so the central spike is about 1 cm wide. That is three samples at 96,000 Hz. It is a classic Gabor or Klauder wavelet and perfectly symmetrical. It could be argued it is a facsimile of a delta function.

Figure 29 shows the real and imaginary parts of a Gabor/Morlet wavelet. There is a lot of similarity between the plots of figure 29 and figure 9A. The imaginary part of the Morlet wavelet in figure 29 is very similar to the individual left or right channel acoustical signals. The real part of the Morlet wavelet is very similar to the convolved left and right output.

Figure 30 shows a close up view of the central zero phase convolution wavelet. This is the resultant convolution at a point equally distant from the two speakers. It features a central peak at 2044 samples. The next negative peaks are between 2040 and 2041 (say 2040.5 samples) on the left and between 2047 and 2048 (say 2047.5 samples) on the right of the major peak. The distance between these peaks is 2047.5 - 2040.5 = 7 samples.

The distance between the first zero crossings is slightly less than 5 samples, say 4.5 samples.

Referring to formulae taken from 'Seismic Resolution of Zero-Phase Wavelets', R. S. Kallweit and L. C. Wood, Amoco Houston Division DGTS, January 12, 1977, we can apply seismic formulae to validate the performance of the resultant zero phase wavelet of the convolved left and right channels of Figure 29.

The distance between the negative peaks is the wavelet's breadth (Tb). It is assigned the value τ and is referred to as the "Predominant Period". The term "predominant frequency" is defined in the literature as the reciprocal of the wavelet's breadth (Tb).

7 samples at 96,000 samples per second is:

Predominant Period τ = 7 / 96,000 = 73 microseconds

The "Predominant Frequency" is therefore 1 / τ:

Predominant Frequency = 1 / 73 microseconds = 96,000 / 7 = 13,714 Hz

The term "temporal resolution of a zero-phase wavelet" may be defined as the time interval between the wavelet's primary lobe inflection points. In figure 30 the wavelet's primary lobe inflection points are at 2042 and 2047 samples. The time interval between the wavelet's primary lobe inflection points is 5 samples.

This is the minimum two-way time through a reverberation sound field as heard directly by the listener. A wavelet's inflection points are found by setting equal to zero the second derivative of the wavelet itself.

Wavelet breadth: Tb = 1 / (1.3 × f1)

Thus f1 (peak frequency) = 1 / (1.3 × Tb) = 1 / (1.3 × 73 microseconds) = 1 / 95 microseconds = 10,537 Hz

Temporal resolution: TR = 1 / (3.0 × f1)

Thus TR = 1 / (3.0 × f1) = 1 / (3.0 × 10,537 Hz) = 1 / 31,612 Hz = 31.6 microseconds. The A3D wavelet has a temporal resolution of 31.6 microseconds. At 96,000 samples per second the length of a sample is 10.4 microseconds.

The zero crossings occur 4.5 samples apart. 4.5 samples at 96,000 samples per second is:

T0 = 4.5/96,000 s = 47 microseconds

Therefore the bandwidth (f4) = 1/T0 = 1/(47 microseconds) = 21.3 kHz.

A final check using the formula TR = 0.43 Tb = 0.86 Tb / 2 shows:

TR = 0.43 Tb = 0.43 x 7 samples = 3 samples. This validates against the calculated temporal resolution of 3 samples (31.6 microseconds).
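The arithmetic above can be reproduced with a short script. This is a minimal sketch, assuming only the sample counts and the Kallweit and Wood relations quoted in this description; the function and variable names are hypothetical and are not taken from the cited work.

```python
SAMPLE_RATE_HZ = 96_000.0  # sampling rate stated in the text

# Sample counts read from the plot, as quoted in the text.
BREADTH_SAMPLES = 2047.5 - 2040.5  # distance between the side peaks
ZERO_CROSSING_SAMPLES = 4.5        # distance between the first zero crossings


def samples_to_seconds(n_samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a length in samples to a duration in seconds."""
    return n_samples / sample_rate_hz


def kallweit_wood_summary(breadth_s, zero_crossing_s):
    """Apply the relations quoted in the text to measured wavelet timings."""
    peak_frequency_hz = 1.0 / (1.3 * breadth_s)  # from Tb = 1/(1.3 f1)
    return {
        "predominant_period_s": breadth_s,                         # tau = Tb
        "predominant_frequency_hz": 1.0 / breadth_s,               # 1/tau
        "peak_frequency_hz": peak_frequency_hz,                    # f1
        "temporal_resolution_s": 1.0 / (3.0 * peak_frequency_hz),  # TR = 1/(3.0 f1)
        "bandwidth_hz": 1.0 / zero_crossing_s,                     # f4 = 1/T0
        "cross_check_s": 0.43 * breadth_s,                         # TR = 0.43 Tb
    }


summary = kallweit_wood_summary(
    samples_to_seconds(BREADTH_SAMPLES),
    samples_to_seconds(ZERO_CROSSING_SAMPLES),
)
for name, value in summary.items():
    print(f"{name}: {value:.6g}")
```

Because the script carries unrounded values through every step, the peak frequency comes out at about 10,549 Hz rather than the 10,537 Hz obtained above from the rounded 73 microseconds; the temporal resolution still rounds to 31.6 microseconds.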

It would appear that the design formulae of Kallweit and Wood given in Figure 13 substantially validate the performance of the loudspeaker of this invention. The speaker, in this case, has produced a convolved temporal sound field with a resolution of just over 3 samples at a sampling frequency of 96,000 samples per second. Expressed as a frequency, the temporal resolution is 31.6 kHz.

This invention convolves the left and right audio signals into wavelet-convolved acoustical waves. When the left and right acoustical waves convolve with each other, they leave a Morlet wavelet type result. The temporal resolution of the convolved left and right acoustical channels is 31.6 kHz in this example. Thus, by using a diffuse carrier, via wavelet convolution through reflecting it off a QRD, the left and right acoustical channels are rendered impervious to the listening room's acoustics. In turn, when the left and right sound fields convolve, they leave a perfect copy of the audio, including its temporal reverberant sound field, with a temporal resolution above 30 kHz. This zero-phase sound field is perfect in phase, time alignment, and temporal clarity.

Figure 31 shows the analytic results displayed as a coefficient phase plot of the continuous wavelet transform with a Morlet wavelet. The phase plot shows the 'zero phase result' (convolved left and right channels) with a maximum and standalone phase excursion of 1.4 degrees at 7090 Hz. Even at 19,500 Hz the phase is 0 degrees. At all other frequencies the phase alignment is less than 0.1 degrees.

Figure 32 shows the analytic results displayed as a coefficient time alignment phase plot of the continuous wavelet transform with a Morlet wavelet.

The time alignment plot shows the 'zero phase result' with a maximum and standalone distance excursion of 2 mm at 7090 Hz. At all other frequencies the time alignment is less than 0.1 mm.

A valid option for implementing this invention is a software-only solution for cases where cost or size restrictions prevail, such as on handheld devices like smartphones.

A typical smartphone loudspeaker may have a unique response in the time domain. The top curve in Figure 33 shows such a time response for a known smartphone driver. By using a correctional filter the output can be transformed to that of the lower plot in Figure 33.

Such a correctional filter can be used as a correctional convolution, implemented in software, that is inserted into the signal path on a smartphone device. Once it is inserted, the smartphone device that previously showed an impulse response equal to the top curve of Figure 33 will show an impulse response equal to the bottom curve of Figure 33. The output will now have a quasi-Gaussian envelope, and the definite time maximum will be in the middle of the envelope at its peak, as opposed to at the beginning of the envelope as in the uncorrected device.

An all-pass filter with maximum-phase properties, as is known in the art of signal processing, can achieve such a transformation of the time response shape.
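The defining property of an all-pass filter, namely that it reshapes the time response while leaving the magnitude response untouched, can be checked with a generic textbook example. The sketch below is not the correctional filter of Figure 33: it is a first-order all-pass with a hypothetical coefficient a = 0.5, chosen only for illustration. Its zero lies at z = 1/a, outside the unit circle, which is the maximum-phase position.

```python
import cmath


def allpass_first_order(x, a):
    """Apply y[n] = -a*x[n] + x[n-1] + a*y[n-1].

    This realises H(z) = (z^-1 - a) / (1 - a*z^-1), with its pole at z = a
    and its zero at z = 1/a.
    """
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for sample in x:
        out = -a * sample + x_prev + a * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y


def magnitude_at(h, omega):
    """|H(e^{j*omega})| computed directly from an impulse response h."""
    return abs(sum(v * cmath.exp(-1j * omega * n) for n, v in enumerate(h)))


A = 0.5  # hypothetical coefficient, |A| < 1 for stability
impulse = [1.0] + [0.0] * 63
h = allpass_first_order(impulse, A)

energy = sum(v * v for v in h)
magnitudes = [magnitude_at(h, w) for w in (0.1, 1.0, 2.0, 3.0)]

print("first taps:", h[:4])
print("energy:", energy)
print("magnitudes:", magnitudes)
```

The single input sample is spread over many output samples, yet the total energy and the magnitude at each test frequency stay at unity to within rounding. A practical correction filter would be of higher order and fitted to the measured driver response.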

The process of this invention may act as a form of apodization for sonic sound fields. Apodization is well understood in the optics world. It is characterised by a Gaussian spread of the envelope of energy. This is found to help 'focus' a spatial optical image. Apodization is used in photography to create a 'bokeh' effect.

'Bokeh' is an effect where objects in the near field are clearly focused whereas far field objects are blurred into a smooth average. Similarly, this invention may have a 'bokeh' effect on the resultant sound field, where the far field is averaged. This would appear to replicate a natural process in human audiology.

Measurements of a loudspeaker according to this invention appear to indicate a soliton-like behaviour of the travelling wave shape. Specifically, the shape holds its form at different distances from the source. Thus convolving a signal with a mechanical or digital quadratic residue diffusor (QRD) as described in the WO2012015650 document may excite sound at the source in a completely different manner from a conventional loudspeaker. The difference is such that a QRD may act as a soliton wave source. This would be useful in any wave motion science. The benefits of solitons are well understood in the art.

Gabor wavelets are important in that, when convolved with themselves, they produce a Gabor wavelet as an output (as demonstrated in Figures 26, 27 and 28). Thus, if the natural world of acoustics is defined, even in part, by the presence of Gabor wavelets, then these will be captured in an audio signal. When this audio is in turn played through a system as described in WO2012015650 or the processing system of this invention, the output will be a Gabor wavelet. If acoustical reflections are Gabor wavelets, then these will be preserved in playback through a system as described in WO2012015650 or the processing system of this invention.

The phase of the resultant Gabor wavelet produced from the convolution of two Gabor wavelets will be a product of the individual phases of the input wavelets.
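The closure of Gabor wavelets under convolution can be checked numerically. The sketch below rests on stated assumptions: it uses an idealised complex Gabor wavelet (a Gaussian envelope multiplied by a complex carrier), hypothetical values for the width, carrier frequency and phase offsets, and a plain Riemann sum in place of the convolution integral. It compares the computed result with the closed form, which is again a Gabor wavelet with the same carrier and an envelope wider by a factor of the square root of two.

```python
import cmath
import math


def gabor(t, sigma, omega, phase=0.0):
    """Complex Gabor wavelet: Gaussian envelope times a complex carrier."""
    envelope = math.exp(-t * t / (2.0 * sigma * sigma))
    return envelope * cmath.exp(1j * (omega * t + phase))


def convolve_at(f_samples, g_samples, m, dt):
    """Riemann-sum approximation of (f * g) at time m*dt (centred indexing)."""
    half = len(f_samples) // 2
    total = 0j
    for k in range(-half, half + 1):
        j = m - k
        if -half <= j <= half:
            total += f_samples[k + half] * g_samples[j + half]
    return total * dt


# Hypothetical parameters chosen for illustration only.
SIGMA, OMEGA, DT, HALF = 1.0, 4.0, 0.05, 160
PHASE_A, PHASE_B = 0.3, 0.5

times = [k * DT for k in range(-HALF, HALF + 1)]
wave_a = [gabor(t, SIGMA, OMEGA, PHASE_A) for t in times]
wave_b = [gabor(t, SIGMA, OMEGA, PHASE_B) for t in times]


def expected(t):
    """Closed form: a Gabor wavelet of width sigma*sqrt(2), phases summed."""
    envelope = SIGMA * math.sqrt(math.pi) * math.exp(-t * t / (4.0 * SIGMA ** 2))
    return envelope * cmath.exp(1j * (OMEGA * t + PHASE_A + PHASE_B))


max_error = max(
    abs(convolve_at(wave_a, wave_b, m, DT) - expected(m * DT))
    for m in range(-40, 41)
)
print("maximum deviation from the closed form:", max_error)
```

In this idealised complex form the phase factors of the two inputs multiply, which means that their phase angles add. A real-valued Gabor wavelet (cosine carrier) behaves in the same way up to a correction term that becomes negligible when the envelope spans several carrier cycles.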

Those skilled in the art will appreciate that this invention provides a unique and simple means to improve the quality of the output from audio speakers.

Those skilled in the art will also realise that this invention may be implemented in embodiments other than those described without departing from the core teachings of this invention. An immediate observation is that the third convolved plot is a cleaner simpler version than the left or right plots.

Claims

1. A digital sound processing system for audio speakers which modifies the output signal by subjecting the signal to a Gabor or Morlet wavelet generator.

2. A digital processing system as claimed in claim 1 wherein a Morlet or Gabor wavelet is inserted into the electronics signal path and thereafter the signal is driven through a conventional loudspeaker or headphone device.

3. A computing device or cellphone having a signal processing system as claimed in claim 1 or 2.

4. A digital sound processing system as claimed in claim 1 wherein an audio recording is convolved with a Gabor or Morlet wavelet to produce a perceptual recording that has been encoded with perceptual wavelets and then played through conventional loudspeakers or headphone devices.

5. A digital sound processing system as claimed in claim 1 in which the audio signal is passed within a device via a Gabor or Morlet wavelet such that the playback device itself does the temporal encoding and is then played through a conventional loudspeaker or headphone unit.