GB2466286A - Combining frequency coefficients based on at least two mixing coefficients which are determined on statistical characteristics of the audio signal - Google Patents


Info

Publication number
GB2466286A
Authority
GB
United Kingdom
Prior art keywords
coefficients
frequency
audio signal
frequency coefficients
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0823164A
Other versions
GB0823164D0 (en)
Inventor
Sunil Sivadas
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Oyj
Original Assignee
Nokia Oyj
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Oyj
Priority to GB0823164A
Publication of GB0823164D0
Priority to PCT/EP2009/066270
Publication of GB2466286A
Status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition
    • G10L19/0208 Subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band

Abstract

An apparatus configured to determine a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; determine a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; determine at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and combine at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.

Description

AN APPARATUS
Field of the Invention
The present invention relates to apparatus for the processing of audio signals. The invention further relates to, but is not limited to, apparatus for processing audio signals in mobile devices.
Background of the Invention
Filter banks are used for analysing and processing audio signals and may be found in many applications ranging from noise reduction to spectral analysis and audio coding. Filter banks perform a time frequency analysis on the incoming signal in order to extract a set of parameters which are more suited to further processing.
Filter banks operate by dividing the signal spectrum into a number of frequency sub bands and generating a time indexed series of coefficients representing the frequency localized signal power within each sub band. By providing explicit information about the distribution of signal and hence masking power over the time frequency plane, the filter bank may facilitate the identification of perceptual irrelevancies. Additionally, the time frequency parameters generated by the filter bank may provide a signal mapping which is conveniently manipulated to shape any subsequent coding distortion. Further, by decomposing the signal into its constituent frequency components, the filter bank may also assist in the reduction of statistical redundancies.
Filter banks may be deployed on a fixed time frequency resolution basis, which can be realised in the form of a Short Time Fourier Transform (STFT). The utilisation of the STFT thereby enables filter banks of this type to be implemented both effectively and efficiently by means of a Fast Fourier Transform (FFT).
However, fixed time frequency resolution filter banks can result in undesirable effects in both the time and frequency domain whilst processing the signal.
From psychoacoustic principles it is understood that the frequency resolution of human hearing is not uniform or linear in nature. Rather, the resolution of human hearing is logarithmic and may be represented in the form of a mel or bark scale.
However, the STFT based filter bank is both uniform and linear and does not naturally lend itself to analysing the input audio signal in the manner of the human hearing system. This mismatch between a linear uniform analysis filter bank and the logarithmic frequency characteristics of human hearing can result in situations where the filter bank is inadequate for the purpose of analysing the input signal.
For example, one such inadequacy arises during the analysis of a transient or rapidly changing audio signal. In this example the time resolution of a uniform filter bank is typically not small enough to analyse the audio signal with sufficient accuracy. This shortcoming can result in artefacts in the form of pre-echoes whereby the energy of the audio signal is spread in time, resulting in a smearing of the transient event. Conversely, the uniform filter bank may have insufficient frequency resolution in order to model stationary parts of the audio signal. This can readily manifest itself as a smearing effect in the frequency domain, especially towards the lower frequencies where the energy of the audio signal is somewhat erroneously distributed among a wider range of frequency bins. This smearing effect may prevent further processing stages from having the ability to distinguish between closely spaced tones.
One approach to overcoming these disadvantages is to use a wavelet transform comprising appropriate wavelet basis functions which are variable in terms of both time and scale. By using such an approach it is possible to model concurrently within the same transform both high frequency transients with sufficient time resolution and low frequency components with sufficient frequency resolution.
Further approaches to overcoming these disadvantages may be to adaptively vary the time resolution of an STFT filter bank in order to reduce any specific artefacts whilst retaining adequate frequency resolution. The multiple frequency bank mixing scheme approach relies on passing the signal through a number of filter banks in parallel, whereby each filter bank filters the input signal to a different fixed time frequency resolution. The resulting time frequency coefficients may then be adaptively mixed in order to select desired coefficients in each area of the time frequency plane.
The process of mixing the time frequency coefficients can be controlled according to the heuristic properties of the signal. For example, the mechanism to control the mixing of the various time frequency coefficients may incorporate properties of the frequency response of the human hearing system.
Further schemes may adopt signal analysis mechanisms to detect transients or stationary regions in the input signal, thereby adapting the mixing profile of the time frequency coefficients accordingly.
Such a system adopting a heuristic model as a basis for determining the mixing profile in a frequency bank mixing scheme will typically require a priori information of the signal type in order to obtain optimum performance. Therefore, a system for analysing audio type signals will require a different set of heuristic model characteristics from a system required to analyse speech based signals.
However, the methods used for mixing the coefficients are limited to combining the various time frequency representations of the signal.
Furthermore, once a heuristic model based approach has been tuned for a specific purpose such as combining signal components from multiple time frequency resolution filter banks it is no longer suited for other purposes such as combining signals from multiple sources.
Summary of the Invention
This invention proceeds from the consideration that whilst heuristic based frequency bank mixing schemes may be tuned for a specific application, they are inherently inflexible and cannot be used for purposes other than those for which they were originally tuned. This is despite the fact that different applications use essentially the same generic architecture. For example, if a heuristic based scheme were designed for use as a multiple time resolution analyser for audio based signals then it would be unsuitable for use as a multiple signal combiner.
There is provided according to an aspect of the invention a method comprising: determining a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; determining a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and combining at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
Determining the first plurality of frequency coefficients and the second plurality of frequency coefficients may comprise: determining the first plurality of frequency coefficients with a first time resolution; and determining the second plurality of frequency coefficients with a second time resolution.
The first time resolution is preferably determined by windowing the audio signal with a window function of a first length, and wherein the second time resolution is determined by windowing the audio signal with a window function of a second length.
The method may further comprise changing at least one of: the length of the audio signal windowed with the first length window function to be equal to the length of the audio signal windowed with the second length window; and the length of the audio signal windowed with the second length window function to be equal to the length of the audio signal windowed with the first length window.
Determining the at least two mixing coefficients may comprise: calculating a statistical measure between the first plurality of frequency coefficients and the second plurality of frequency coefficients; determining at least two uncorrelated basis vectors from the statistical measure; calculating a basis vector rating measure for each of the at least two uncorrelated basis vectors; and selecting a principal basis vector from the at least two uncorrelated basis vectors, wherein the elements of the principal basis vector form the at least two mixing coefficients.
Selecting the principal basis vector may comprise: determining a maximum basis vector rating measure; and selecting the uncorrelated basis vector associated with the maximum basis vector rating measure.
The statistical measure is preferably a covariance matrix, the uncorrelated basis vector is preferably an eigenvector, and the basis vector rating measure is preferably an eigenvalue associated with the eigenvector.
The statistical measure is preferably a correlation matrix, wherein the uncorrelated basis vector is preferably an eigenvector, and wherein the basis vector rating measure is preferably an eigenvalue associated with the eigenvector.
Combining the at least one frequency coefficient from the first plurality of frequency coefficients with the at least one frequency coefficient from the second plurality of frequency coefficients may comprise: weighting the at least one frequency coefficient from the first plurality of frequency coefficients; weighting the at least one frequency coefficient from the second plurality of frequency coefficients; and adding the at least one frequency coefficient from the first plurality of frequency coefficients to the at least one frequency coefficient from the second plurality of frequency coefficients.
Weighting the at least one frequency coefficient from the first plurality of frequency coefficients may comprise: multiplying the at least one frequency coefficient from the first plurality of frequency coefficients with one of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
Weighting the at least one frequency coefficient from the second plurality of frequency coefficients may comprise: multiplying the at least one frequency coefficient from the second plurality of frequency coefficients with a further of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
Each of the at least two pluralities of frequency coefficients may comprise at least two groups of frequency coefficients, and wherein at least one frequency coefficient from a first of at least two groups of frequency coefficients is preferably the same frequency coefficient as at least one frequency coefficient from a second of the at least two groups of frequency coefficients.
The audio signal may comprise at least two audio channel signals, wherein the determining of the first plurality of frequency coefficients from the first of at least two filter banks may comprise transforming a first of the audio channel signals; and the determining of the second plurality of frequency coefficients from the second of the at least two filter banks may comprise transforming a second of the audio channel signals.
Determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal may comprise: applying principal component analysis to the first plurality of frequency coefficients and the second plurality of frequency coefficients.
According to a second aspect of the invention there is provided an apparatus comprising a processor configured to: determine a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; determine a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; determine at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and combine at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
The apparatus may be further configured to determine the first plurality of frequency coefficients and the second plurality of frequency coefficients by: determining the first plurality of frequency coefficients with a first time resolution; and determining the second plurality of frequency coefficients with a second time resolution.
The apparatus may be further configured to window the audio signal with a window function of a first length to determine the first time resolution, and window the audio signal with a window function of a second length to determine the second time resolution.
The apparatus may be further configured to change at least one of: the length of the audio signal windowed with the first length window function to be equal to the length of the audio signal windowed with the second length window; and the length of the audio signal windowed with the second length window function to be equal to the length of the audio signal windowed with the first length window.
The apparatus may be further configured to determine the at least two mixing coefficients by being preferably configured to: calculate a statistical measure between the first plurality of frequency coefficients and the second plurality of frequency coefficients; determine at least two uncorrelated basis vectors from the statistical measure; calculate a basis vector rating measure for each of the at least two uncorrelated basis vectors; and select a principal basis vector from the at least two uncorrelated basis vectors, wherein the elements of the principal basis vector form the at least two mixing coefficients.
The apparatus may be further configured to select the principal basis vector by being preferably configured to: determine a maximum basis vector rating measure; and select the uncorrelated basis vector associated with the maximum basis vector rating measure.
The statistical measure is preferably a covariance matrix, wherein the uncorrelated basis vector is preferably an eigenvector, and wherein the basis vector rating measure is preferably an eigenvalue associated with the eigenvector.
The statistical measure is preferably a correlation matrix, wherein the uncorrelated basis vector is preferably an eigenvector, and wherein the basis vector rating measure is preferably an eigenvalue associated with the eigenvector.
The apparatus may be further configured to combine the at least one frequency coefficient from the first plurality of frequency coefficients with the at least one frequency coefficient from the second plurality of frequency coefficients by preferably being configured to: weight the at least one frequency coefficient from the first plurality of frequency coefficients; weight the at least one frequency coefficient from the second plurality of frequency coefficients; and add the at least one frequency coefficient from the first plurality of frequency coefficients to the at least one frequency coefficient from the second plurality of frequency coefficients.
The apparatus may further be configured to weight the at least one frequency coefficient from the first plurality of frequency coefficients by preferably being configured to: multiply the at least one frequency coefficient from the first plurality of frequency coefficients with one of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
The apparatus may further be configured to weight the at least one frequency coefficient from the second plurality of frequency coefficients by preferably being configured to: multiply the at least one frequency coefficient from the second plurality of frequency coefficients with a further of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
Each of the at least two pluralities of frequency coefficients may comprise at least two groups of frequency coefficients, and wherein at least one frequency coefficient from a first of at least two groups of frequency coefficients is preferably the same frequency coefficient as at least one frequency coefficient from a second of the at least two groups of frequency coefficients.
The audio signal may comprise at least two audio channel signals, wherein the apparatus may be configured to determine the first plurality of frequency coefficients from the first of at least two filter banks by transforming a first of the audio channel signals; and determine the second plurality of frequency coefficients from the second of the at least two filter banks by transforming a second of the audio channel signals.
The apparatus may be further configured to determine at least two mixing coefficients dependent on the statistical characteristics of the audio signal by preferably being configured to: apply principal component analysis to the first plurality of frequency coefficients and the second plurality of frequency coefficients.
According to a third aspect of the invention there is provided a computer-readable medium encoded with instructions that, when executed by a computer, perform: determining a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; determining a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and combining at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
According to a fourth aspect of the invention there is provided an apparatus comprising: means for determining a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; means for determining a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; means for determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and means for combining at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
The apparatus as described above may comprise an encoder.
An electronic device may comprise the apparatus as described above.
A chipset may comprise the apparatus described above.
Embodiments of the present invention aim to address the above problem.
Brief Description of Drawings
For better understanding of the present invention, reference will now be made by way of example to the accompanying drawings in which:
Figure 1 shows schematically an electronic device employing embodiments of the invention;
Figure 2 shows schematically an audio processing system employing embodiments of the present invention;
Figure 3 shows schematically an adaptive mixer deploying a first embodiment of the invention;
Figure 4 shows a flow diagram illustrating the operation of the adaptive mixer according to embodiments of the invention;
Figure 5 shows schematically a plurality of filter banks according to embodiments of the invention;
Figure 6 shows a flow diagram illustrating the operation of the plurality of filter banks as shown in figure 5 according to embodiments of the invention;
Figure 7 shows a flow diagram illustrating the operation of the mixing analyser as shown in figure 3 according to embodiments of the invention;
Figure 8 shows schematically spectrograms depicting an embodiment of the invention; and
Figure 9 shows a further embodiment of the invention.
Description of Preferred Embodiments of the Invention
The following describes apparatus and methods for the provision of adaptively mixing audio signals. In this regard reference is first made to Figure 1, which shows a schematic block diagram of an exemplary electronic device 10 or apparatus, which may incorporate an adaptive mixer according to an embodiment of the invention.
The electronic device 10 may for example be a mobile terminal or user equipment of a wireless communication system.
The electronic device 10 comprises a microphone 11, which is linked via an analogue-to-digital converter 14 to a processor 21. The processor 21 is further linked via a digital-to-analogue converter 32 to loudspeakers 33. The processor 21 is further linked to a transceiver (TX/RX) 13, to a user interface (UI) 15 and to a memory 22.
The processor 21 may be configured to execute various program codes. The implemented program codes comprise an adaptive audio mixing code for processing the audio signal. The implemented program codes 23 may further comprise additional code for further processing of the audio signal. The implemented program codes 23 may be stored for example in the memory 22 for retrieval by the processor 21 whenever needed. The memory 22 could further provide a section 24 for storing data, for example data that has been encoded in accordance with the invention.
The adaptive audio mixing code may in embodiments of the invention be implemented in hardware or firmware.
The user interface 15 enables a user to input commands to the electronic device 10, for example via a keypad, and/or to obtain information from the electronic device 10, for example via a display. The transceiver 13 enables a communication with other electronic devices, for example via a wireless communication network.
It is to be understood again that the structure of the electronic device 10 could be supplemented and varied in many ways.
A user of the electronic device 10 may use the microphone 11 for inputting speech that is to be transmitted to some other electronic device or that is to be stored in the data section 24 of the memory 22. A corresponding application has been activated to this end by the user via the user interface 15. This application, which may be run by the processor 21, causes the processor 21 to execute the encoding code stored in the memory 22.
The analogue-to-digital converter 14 converts the input analogue audio signal into a digital audio signal and provides the digital audio signal to the processor 21.
The processor 21 may then process the digital audio signal in the same way as described with reference to Figures 2 and 3.
The resulting bit stream is provided to the transceiver 13 for transmission to another electronic device. Alternatively, the coded data could be stored in the data section 24 of the memory 22, for instance for a later transmission or for a later presentation by the same electronic device 10.
The electronic device 10 could also receive a bit stream with correspondingly processed data from another electronic device via its transceiver 13. In this case, the processor 21 may execute the decoding program code stored in the memory 22. The processor 21 decodes the received data, and provides the decoded data to the digital-to-analogue converter 32. The digital-to-analogue converter 32 converts the digital decoded data into analogue audio data and outputs them via the loudspeakers 33. Execution of the decoding program code could be triggered as well by an application that has been called by the user via the user interface 15.
Instead of being presented immediately via the loudspeakers 33, the received processed data could also be stored in the data section 24 of the memory 22, for instance for enabling a later presentation or a forwarding to still another electronic device.
It would be appreciated that the schematic structures described in figures 2, 3, 5 and 9 and the method steps in figures 4, 6 and 7 represent only a part of the operation of a complete system comprising embodiments of the invention as exemplarily shown implemented in the electronic device shown in figure 1.
The general operation of the adaptive mixer as employed by embodiments of the invention is shown in figure 2. A general adaptive mixing system may consist of an adaptive mixer, as illustrated schematically in figure 2. Illustrated is a system 102 with an adaptive mixer 104, and a subsequent signal processing or coding stage 106.
The adaptive mixer 104 processes an input audio signal 110 producing an adaptively mixed audio signal 112, which may either be stored or passed to a subsequent stage 106 for further processing or coding.
Figure 3 shows schematically an adaptive mixer 104 according to a first embodiment of the invention. The adaptive mixer 104 is depicted as comprising an input 302 which may be connected to M filter banks 303. It would be appreciated that the number of filter banks M may be any reasonable number, the actual number of filter banks being dependent on the embodiment of the invention. The input 302 may be arranged to receive an audio signal which may then be distributed to each filter bank of the M filter banks 303 simultaneously. The output from each filter bank of the M filter banks 303 may in turn comprise K filter bank coefficients. The output from the M filter banks 303 may be depicted as M signals (each having K coefficient values) 304_A to 304_M in figure 3.
The mixing analyser 305 may be configured to accept the filter bank signal 304 from each of the M filter banks 303 and generate as an output a mixing coefficient signal 306.
The mixer 307 may also be configured to accept each of the filter bank signals 304 from each of the M filter banks 303.
In addition to receiving the M filter bank signals 304 from the filter banks 303, the mixer 307 may be further arranged to receive as an additional input the mixing coefficient signal 306 as output from the mixing analyser 305. The mixer 307 may then be configured to utilise the mixing coefficient signal 306 in order to combine (or mix) the M filter bank signals 304 into an output combined filter bank signal 308.
It is to be understood that the output combined filter bank signal 308 comprises an individually weighted sum of the M filter bank signals 304.
The adaptive mixer 104 may then be configured to output the combined or mixed filter bank signal 308 as the signal stream 112.
In a first embodiment of the invention the output combined filter bank signal 308 may be arranged as an individual signal.
In further embodiments of the invention the output combined filter bank signal 308 may be arranged as a plurality of combined filter bank signals.
In some embodiments of the invention the mixing analyser 305 may be configured to accept the input audio signal 302 rather than the filter bank signals 304 from each of the M filter banks 303. In this particular embodiment of the invention the mixing analyser 305 may generate the mixing coefficient signal 306 dependent on the characteristics of the input signal 302.
The operation of these components is described in more detail with reference to the flow chart in Figure 4 showing the operation of the adaptive mixer.
The audio signal is received by the adaptive mixer 104 via the input 302. In a first embodiment of the invention the audio signal from each channel is a digitally sampled signal. In other embodiments of the present invention the audio input may comprise a plurality of analogue audio signal sources, for example from a plurality of microphones distributed within the audio space, which are analogue-to-digital (A/D) converted. In further embodiments of the invention the multichannel audio input may be converted from a pulse code modulation digital signal to an amplitude modulation digital signal.
The receiving of the audio signal is shown in Figure 4 by processing step 401.
The audio signal received via the input 302 is first distributed to the M filter banks 303.
Figure 5 shows a block diagram depicting a generic filter bank structure 303 comprising M filter banks which may be used to generate the M filter bank output signals from the input audio signal 302 according to embodiments of the invention.
The filter bank structure in figure 5 is shown as having M individual filter banks 502 capable of filtering the input audio signal 302 represented as x(n) for a time analysis frame instance n. However it would be appreciated that in other embodiments of the invention the operation of M filter banks may be carried out using less than or more than M filter bank structures. For example in one embodiment of the invention a single filter bank structure may be divided so that the input audio signal is input to each part of the divided structure to produce the M filter bank result.
To further assist the understanding of the invention the process of determining the filter bank coefficients 304 by one of the M filter banks 303 as depicted in figure 5 is described in more detail with reference to the flow chart on figure 6.
The step of receiving the audio signal at the filter banks 502, from the processing step 401 from figure 4, is depicted as processing step 601 in figure 6.
In embodiments of the invention each filter bank 502 may convert the time domain input x(n) into a set of K sub bands, otherwise known as filter bank coefficients.
The set of filter bank coefficients for a particular filter bank i, where i is a value between 0 and M-1, and time frame instance n may be denoted as

X_i = [x_i(n,0), x_i(n,1), ..., x_i(n,k), ..., x_i(n,K-1)]

where x_i(n,k) represents the individual filter bank coefficient k for an analysis frame n. In total there may be M sets of K filter bank coefficients, one set for each of the M filter banks. The M sets of K filter bank coefficients may be represented as [X_0, X_1, ..., X_(M-1)].
It is to be appreciated that the filter bank coefficients from a filter bank structure such as that depicted in figure 5 may be calculated for an input time domain signal on a contiguous frame by frame basis.
It is to be understood in embodiments of the invention that each filter bank 502 may be realised as a block transform, in which the resultant transform coefficients are in effect filter bank coefficients.
In embodiments of the invention each member filter bank 502 may comprise a transform filter bank in which an analysis window function may be applied to the input audio signal x(n) as a first processing or analysis stage.
The step of windowing the input audio signal by a filter bank is shown as processing step 603 in figure 6.
In embodiments of the invention each member filter bank may utilise a different length analysis window as a first processing stage. This allows the plurality of filter banks to process the input audio signal 302 according to different time resolutions simultaneously. It is to be understood in embodiments of the invention that a particular time resolution of a filter bank is determined by the length of the analysis window.
It is to be understood in embodiments of the invention that the number of transform coefficients may be determined by the length of the analysis windowing stage 603 when utilising the block transform approach for filter bank analysis. Thus, in the transform approach as described above each filter bank 502 would produce a different number of frequency coefficients.
In embodiments of the invention it is desirable that the number of transform coefficients (or transform length) may be determined to be the same for each filter bank 502. This ensures that a set of filter coefficients produced from one filter bank has an identical frequency spacing to that produced by another bank. This identical coefficient spacing from one filter bank to the next ensures that corresponding filter coefficients may be combined across multiple banks on a per coefficient basis.
Consequently, in embodiments of the invention the length of the block transform for a filter bank may be adjusted to a length which is uniform for all filter banks. This length may take the value of the longest analysis window deployed and hence the longest transform length applied across the filter bank structure 303.
The process of adjusting the length of the block transform may take the conceptual form of padding the input analysis window with additional zeroes such that the desired transform length is achieved. This process may be visualised as inserting a number of zeroes between each time sample before the input frame is windowed and analysed by the transform filter bank.
In embodiments of the invention the effect of the block transform approach is to interpolate between each valid time sample thereby producing a set of filter bank coefficients whose frequency resolution is increased to that of the longest transform size within the filter bank structure.
The step of adjusting the length of an input window function for a filter bank 502 is depicted as processing step 605 in figure 6.
In a first embodiment of the invention each member filter bank 502 may be deployed using a short time Fourier transform (STFT) where the filter bank coefficients correspond to the Fourier coefficients from the transform.
Typically, in first embodiments of the invention the STFT filter bank may be implemented in a more efficient form as a Fast Fourier Transform (FFT).
The step of applying the block transform to the windowed audio data for a filter bank 502 is shown as processing step 607 in figure 6.
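By way of illustration only, the windowing, length adjustment and block transform stages 603 to 607 may be sketched as follows. This is a minimal NumPy sketch; the function name, the choice of a Hann analysis window and the use of a real-valued FFT are assumptions made for the example rather than features of the embodiments.

```python
import numpy as np

def filter_bank_coefficients(frame, window_lengths, n_fft=None):
    """Sketch of M parallel transform filter banks (processing steps 603 to 607).

    frame          : one time analysis frame x(n) of the audio signal
    window_lengths : one analysis window length per filter bank (different time resolutions)
    n_fft          : common transform length; defaults to the longest window so that every
                     bank yields the same number K of frequency coefficients
    """
    n_fft = n_fft or max(window_lengths)
    rows = []
    for length in window_lengths:
        windowed = frame[:length] * np.hanning(length)  # bank-specific analysis window
        padded = np.zeros(n_fft)
        padded[:length] = windowed                      # adjust to the common transform length
        rows.append(np.fft.rfft(padded))                # STFT realised as an FFT
    return np.vstack(rows)                              # shape (M, K), with K = n_fft // 2 + 1
```

Each row of the returned array then plays the role of one of the filter bank output signals 304_A to 304_M.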
The processing step of filtering the input audio signal with the plurality of M filter banks is shown as processing step 403 in figure 4.
The mixing analyser 305 may receive as input the sets of filter bank coefficients as generated from each of the M filter banks 303. The mixing analyser 305 may then analyse the coefficients from all banks in order to extract the most salient features from the data. This data may then be utilised in order to determine the factors by which coefficients from one bank are combined with corresponding coefficients from another bank.
To further assist the understanding of the invention the process of determining the mixing coefficient signal 306 by the mixing analyser 305 is described in more detail with reference to the flow chart in figure 7.
The step of receiving the filter bank coefficients from the M filter banks 303 at the mixing analyser 305, from processing step 403 from figure 4, is depicted as processing step 701 in figure 7.
In embodiments of the invention the K filter bank coefficients, which may be denoted as X_i = [x_i(n,0), x_i(n,1), ..., x_i(n,k), ..., x_i(n,K-1)] where x_i(n,k) represents the individual filter bank coefficient k from a filter bank 502 and for a time analysis frame n, may be arranged as a K dimensional row vector as part of an observation matrix F. The number of rows in the matrix may be determined by the number of filter banks. For example if the filter bank system 303 deploys M filter banks, then the number of rows in the matrix would be M. This feature matrix may be expressed as

F = [ x_1(n,0)  x_1(n,1)  ...  x_1(n,k)  ...  x_1(n,K-1)
      ...
      x_M(n,0)  x_M(n,1)  ...  x_M(n,k)  ...  x_M(n,K-1) ]

It is to be understood in embodiments of the invention that the observation matrix F provides a convenient way of collating and representing the filter bank coefficients from each of the M filter banks, and that its formation is not necessary for any subsequent processing stage.
The step of collating the K filter bank coefficients into row vectors from each of the M filter banks may be shown as processing step 703 in figure 7.
Each row of the observation matrix F may be processed using Principal Component Analysis (PCA) in order to exploit any correlative relationship between coefficients from one filter bank to another.
PCA may be considered as an orthogonal linear transformation which transforms data to a new coordinate system. The new coordinate system allows the data to be represented as vectors within the new coordinate system, whereby projection along a coordinate axis is determined by the variance of the data.
In embodiments of the invention PCA may be applied to the rows of matrix F in order to exploit any correlative behaviour or entropy between the coefficients from one filter bank and another. This may comprise as a first step the determination of statistical similarities between the filter bank coefficients from one bank to the filter bank coefficients from another. This measure may be calculated in the form of an M by M symmetrical square matrix whose elements represent the correlative relationships between respective coefficients from different pairs of filter banks.
In a first embodiment of the invention the M by M symmetrical square matrix may be found by calculating the relative correlation between corresponding coefficients drawn from all possible combinations of pairs of filter banks. This may be expressed in the form of an M by M correlation matrix in which each element of the matrix represents the correlation between corresponding coefficients from two of the M individual filter banks. The correlation matrix may be expressed for M filter banks as

C = [ cor(X_1,X_1)  cor(X_1,X_2)  ...  cor(X_1,X_M)
      cor(X_2,X_1)  cor(X_2,X_2)  ...  cor(X_2,X_M)
      ...
      cor(X_M,X_1)  cor(X_M,X_2)  ...  cor(X_M,X_M) ]

where cor(X_i,X_j) represents the correlation between the coefficients from a filter bank i and a filter bank j and may be expressed as

cor(X_i,X_j) = Σ_k x_i(n,k) · x_j(n,k)

where x_i(n,k) represents the filter bank coefficient k from a filter bank i, and an analysis frame n. In other words each element of the matrix C is determined by finding the correlation between coefficients from filter bank i, and their counterpart coefficients from bank j.
In a second embodiment of the invention the M by M symmetrical matrix may be determined by calculating the relative covariance between corresponding coefficients from all possible combinations of pairs of filter banks. This may be expressed in the form of a covariance matrix in which each element of the matrix represents the covariance between the respective coefficients from two filter banks. The covariance matrix may be expressed for M filter banks as

C = [ cov(X_1,X_1)  cov(X_1,X_2)  ...  cov(X_1,X_M)
      cov(X_2,X_1)  cov(X_2,X_2)  ...  cov(X_2,X_M)
      ...
      cov(X_M,X_1)  cov(X_M,X_2)  ...  cov(X_M,X_M) ]

where cov(X_i,X_j) represents the covariance between the coefficients from a general filter bank i and a general filter bank j and may be expressed as

cov(X_i,X_j) = Σ_k (x_i(n,k) - m_i) · (x_j(n,k) - m_j)

where m_i represents the mean of the K filter coefficients for filter bank i, and m_j represents the mean of the K filter coefficients for filter bank j. The means m_i and m_j may be expressed respectively as

m_i = (1/K) Σ_k x_i(n,k)  and  m_j = (1/K) Σ_k x_j(n,k)

In other words each element of the matrix C is determined by finding the covariance between coefficients from filter bank i, and their counterpart coefficients from bank j.
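As an illustrative sketch of processing steps 703 and 705 (continuing the NumPy sketch above), the observation matrix F can be reduced to the M by M measure C in a few lines. Operating on coefficient magnitudes so that C is real is an assumption; the embodiments do not fix this detail.

```python
def statistical_measure(F, use_covariance=True):
    """Form the M by M matrix C from the observation matrix F (processing steps 703 and 705).

    F : array of shape (M, K); row i holds the K coefficients from filter bank i.
    """
    X = np.abs(F)                                # assumption: use coefficient magnitudes
    if use_covariance:
        X = X - X.mean(axis=1, keepdims=True)    # subtract the per-bank means m_i
    return X @ X.T                               # element (i, j) sums x_i(n,k) * x_j(n,k) over k
```

Passing use_covariance=False yields the correlation form of the first embodiment, while the default yields the covariance form of the second embodiment.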
It is to be understood in embodiments of the invention deploying just two filter banks the matrix C will be a two by two matrix.
The step of determining the matrix C is depicted by processing step 705 in figure 7.
The set of orthogonal basis vectors defining a vector space for the M sets of filter bank coefficients may be determined by diagonalizing the matrix C. Diagonalization of the matrix C may be expressed mathematically as

V^(-1) C V = D

where V is an M by M matrix comprising the eigenvectors of C, whereby the eigenvectors may form the columns of the matrix V. An eigenvector of C may be denoted by v_i and the matrix V may be represented as V = [v_1 v_2 ... v_M]. V^(-1) denotes the inverse of the matrix V. The matrix D denotes the diagonalization of the matrix C, whose leading diagonal comprises the eigenvalues of C. It is to be understood in embodiments of the invention that each eigenvector may be of dimension M. In embodiments of the invention an eigenvalue λ of C may be calculated by solving the determinant of the matrix (C - λI) for λ, otherwise known as the characteristic polynomial

det(C - λI) = 0

where det denotes the determinant of the matrix, and I is the identity matrix.
In further embodiments of the invention alternative iterative measures may be deployed in order to determine the eigenvalues of the matrix C. These alternative iterative measures may be more suitable for determining the eigenvalues for symmetrical matrices whose orders are greater than three. In other words iterative methods may be more appropriate for embodiments of the invention deploying four or more filter banks. Examples of such iterative techniques include the methods of Tridiagonalization, QR-Factorization and the Power Method.
It is to be understood in embodiments of the invention that the number of eigenvalues produced by the above outlined technique may generally correspond to the number of filter banks M. The eigenvectors of C may be found by considering the vectors v which satisfy the following vector equation

C v = λ v

The above vector equation may be solved for an eigenvalue λ in order to determine an eigenvector. In other words the above equation will yield an eigenvector for each eigenvalue λ.
It is to be understood in embodiments of the invention that an M by M matrix C may yield M eigenvalues which in turn will result in M eigenvectors, each vector being of dimension M. The step of determining the eigenvalues and corresponding eigenvectors of the covariance matrix C is shown as processing step 707 in figure 7.
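Processing step 707 may be sketched using NumPy's symmetric eigensolver, one of several possible routines; the iterative methods mentioned above would serve equally well.

```python
def eigen_decomposition(C):
    """Diagonalize the symmetric matrix C (processing step 707).

    Returns the M eigenvalues and the matrix V whose columns are the corresponding
    eigenvectors, so that V^(-1) C V = D holds with the eigenvalues of C on the
    leading diagonal of D.
    """
    eigenvalues, V = np.linalg.eigh(C)   # eigh is suited to real symmetric matrices
    return eigenvalues, V
```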
In embodiments of the invention the eigenvectors of the covariance matrix C denoted by the column vectors [v_1 v_2 ... v_M] of the matrix V may be considered as the orthogonal basis vectors of a linear orthogonal transformation system for the representation of the filter bank coefficients. In other words the eigenvectors may be considered as a new set of coordinate axes in a vector space, and the filter bank coefficients may be represented as projections along these new axes.
It is to be understood that the projection of the data along a particular axis of the new set of coordinate axes may be determined by its variance when projected along that axis. Further, the order of the axes is determined by the relative variance of the data when projected along the axis. For instance the greatest variance by any projection of the data will lie on the first coordinate, otherwise known as the principal component. This particular coordinate axis may be determined to be the eigenvector corresponding to the largest eigenvalue. The order of subsequent coordinate axes may also be determined by the relative magnitude of their respective eigenvalues, and therefore the relative variance of the data when projected along the axis.
In embodiments of the invention the lower variance by any projection of the data may lie along the higher coordinates. Therefore these higher coordinate axes may be viewed as being less important for characterising the data.
The mixing analyser 305 may then select a sub set of eigenvectors in order to formulate the mixing coefficients.
In embodiments of the invention the eigenvector associated with the largest eigenvalue may be selected. This selected eigenvector may be otherwise known in the art as the principal component or principal eigenvector of the matrix V. It is to be understood in embodiments of the invention that the elements of this principal eigenvector may constitute the mixing coefficients for any subsequent processing stages.
In embodiments of the invention the principal eigenvector together with the corresponding mixing coefficients may be expressed as

v_A = [v_1 v_2 ... v_M]^T

where v_A denotes the principal eigenvector, in other words the eigenvector from V corresponding to the largest eigenvalue. The symbol v_i denotes the elements of the principal eigenvector which are used to form the mixing coefficient signal 306.
The step of selecting the principal eigenvector to form the mixing coefficients is shown as processing step 709 in figure 7.
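Processing step 709 may then be sketched as picking the column of V associated with the largest eigenvalue, continuing the illustrative sketches above.

```python
def select_principal(eigenvalues, V):
    """Select the principal eigenvector as the mixing coefficients (processing step 709)."""
    return V[:, np.argmax(eigenvalues)]   # elements v_1 ... v_M form the mixing coefficient signal 306
```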
Once the mixing coefficients have been determined by the mixing analyser 305, the mixing coefficients may be passed to the mixer 307 via the mixing coefficient signal 306.
The step depicting the overall determination of the mixing coefficients from the M filter bank coefficients by the mixing analyser 305 is shown as processing step 405 in figure 4.
The mixing coefficients may be used by the mixer 307 to combine the filter bank coefficients from each of the M filter banks in order to produce the combined filter bank signal 308.
In embodiments of the invention the combining operation in the mixer 307 may be performed on a coefficient by coefficient basis where coefficients from one filter bank may be combined with their counterpart coefficients from the other banks.
In embodiments of the invention the combined filter bank signal 308 may be first achieved by weighting a filter bank coefficient from one bank with a particular element of the principal eigenvector, weighting a counterpart filter bank coefficient from another bank with a different element of the principal eigenvector, and then summing the correspondingly weighted filter bank coefficients. This process of weighting the coefficient from a filter bank and then summing with other correspondingly weighted coefficients from other banks may be repeated across all filter banks to provide for a combined filter bank coefficient.
In embodiments of the invention a combined filter bank coefficient may be determined as

z(k) = v_1 · x_1(k) + v_2 · x_2(k) + ... + v_M · x_M(k)

where z(k) represents the kth combined filter bank coefficient. It can be seen from the above equation that the kth combined filter bank coefficient may be determined to be the linear combination of the weighted kth filter bank coefficient from each of the M filter banks.
It is to be understood in embodiments of the invention that all filter bank coefficients from a particular bank may be weighted using the same element from the principal eigenvector. For example, by reference to the above expression, all coefficients from the first filter bank may be weighted using the weighting coefficient (or principal eigenvector element) v_1, and all coefficients from the second filter bank may be weighted using the weighting coefficient v_2, and so on. Further, it may be ascertained from the above equation that each filter bank coefficient may be weighted by its respective mixing coefficient (or eigenvector element) before being combined with the weighted coefficients from the other banks.
The step depicting the overall determination of the combined (or mixed) filter bank signal by the mixer 307 is shown as processing step 407 in figure 4.
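The weighting and summing performed by the mixer 307 then reduces to a single weighted sum over the banks, as in the following sketch (again assuming the NumPy arrays used above).

```python
def combine(F, v):
    """Combine counterpart coefficients from all banks (processing step 407).

    Each row of F is weighted by its bank's mixing coefficient and the weighted rows
    are summed, giving z(k) = v_1 * x_1(k) + ... + v_M * x_M(k) for every index k.
    """
    return v @ F   # shape (K,): one combined filter bank coefficient per frequency bin
```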
It is to be understood in embodiments of the invention that generally the set of K filter bank coefficients from each filter bank may be treated as a single vector, whose elements comprise the filter coefficients of the respective filter bank. In these embodiments of the invention the above described process may be applied to the complete set of filter bank coefficients as a whole. In other words the above described process may be applied to the entire length of the vector.
In further embodiments of the invention the K dimensional vector X_i, whose elements comprise the coefficients of a filter bank, may be partitioned into a number of consecutive sub vectors. Each sub vector may comprise a contiguous sub set of elements (or filter bank coefficients) of the original vector.
In yet further embodiments of the invention the K dimensional filter bank coefficient vector X_i may be partitioned into a number of overlapping consecutive sub vectors. Each sub vector may comprise a sub set of filter bank coefficients, in which the coefficients from one sub vector may overlap with the coefficients from a neighbouring sub vector. In other words neighbouring sub vectors may share common filter bank coefficients.
For example in embodiments of the invention, each K dimensional filter bank coefficient vector X_i, where x_i(n,k) represents the individual filter bank coefficient k, may be divided into Q sub vectors. Each sub vector X_i,q may comprise a sub set of consecutive filter bank coefficients drawn from the vector. For instance, a sub vector q, X_i,q, of the filter bank coefficient vector X_i may be comprised of the following contiguous filter bank coefficients

X_i,q = [x_i(n,k-3), x_i(n,k-2), x_i(n,k-1), x_i(n,k), x_i(n,k+1), x_i(n,k+2)]

In an example of embodiments of the invention in which the coefficients of a filter bank may be partitioned into a number of non overlapping consecutive sub vectors, the subsequent sub vector q+1, X_i,q+1, of the filter bank coefficient vector X_i may be comprised of the following contiguous filter bank coefficients

X_i,q+1 = [x_i(n,k+3), x_i(n,k+4), x_i(n,k+5), x_i(n,k+6), x_i(n,k+7), x_i(n,k+8)]

Further, in an example of embodiments of the invention whereby the coefficients of a filter bank may be partitioned into a number of overlapping consecutive sub vectors, the subsequent sub vector q+1, X_i,q+1, of the filter bank coefficient vector may be comprised of the following contiguous filter bank coefficients

X_i,q+1 = [x_i(n,k), x_i(n,k+1), x_i(n,k+2), x_i(n,k+3), x_i(n,k+4), x_i(n,k+5)]

It is to be understood in the above example that the partition of the filter bank coefficient vector into a number of overlapping consecutive sub vectors has been depicted using a 50% overlap.
Further embodiments of the invention may deploy a 33% overlap.
It is to be appreciated in embodiments of the invention deploying an overlapping scheme for the partitioning of a filter bank coefficient vector that any number of overlapping ratios may be used, and that generally the performance of the adaptive combiner is improved by deploying a greater ratio of overlap.
It is to be further appreciated in embodiments of the invention that the overlapping ratio may be determined experimentally in order to balance processing needs with the performance requirements of the adaptive combiner.
It is to be understood that partitioning the filter bank coefficient vector into a number of sub vectors has the effect of dividing each filter bank spectrum into sub regions, where each sub region is represented by a sub vector comprising a contiguous set of filter bank coefficients.
In embodiments of the invention the methods of determining the mixing coefficients and combining the filter bank coefficients as described above may be applied at a sub vector level. In other words coefficients from a sub vector from a particular filter bank may be combined with the corresponding coefficients from counterpart sub vectors from the other filter banks.
It is to be understood that the entire spectrum may be processed by repeating the processing steps 405 and 407 for each sub vector individually. The processing steps 405 and 407 may be repeated until all sub vectors within the filter bank coefficient vector are processed.
Embodiments of the invention which divide the filter bank coefficient vector into a number of sub vectors may be viewed as a piecewise approach to creating the combined filter bank spectrum, whereby the final combined spectrum is formed by an additional processing step of concatenating the combined sub vectors.
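The piecewise, sub vector based variant described above may be sketched as follows. The block length, the 50% overlap and the averaging of overlapped samples during concatenation are illustrative choices made for this sketch; the embodiments leave these details open.

```python
def combine_piecewise(F, block=6, overlap=0.5):
    """Derive mixing coefficients per sub vector and concatenate the combined sub vectors."""
    M, K = F.shape
    hop = max(1, int(block * (1.0 - overlap)))
    out = np.zeros(K, dtype=F.dtype)
    count = np.zeros(K)
    for start in range(0, K - block + 1, hop):
        sub = F[:, start:start + block]            # counterpart sub vectors from all banks
        C = statistical_measure(sub)               # statistical measure for this sub region
        eigenvalues, V = eigen_decomposition(C)
        v = select_principal(eigenvalues, V)       # per sub vector mixing coefficients
        out[start:start + block] += combine(sub, v)
        count[start:start + block] += 1.0
    count[count == 0] = 1.0                        # trailing coefficients not covered by a full block
    return out / count                             # overlapped samples are averaged here
```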
In some embodiments of the invention the input to the M filter banks 303 may comprise a number of audio frames concatenated to form a super frame. In such embodiments each individual audio frame which comprises the super frame may be processed individually by the M filter banks 303. The super frame may then be formed by concatenating the sets of filter bank coefficients corresponding to the individual audio frames. The process of formulating the super frame may be done for each filter bank output signal 304_A to 304_M.
In embodiments of the invention which deploy a super frame structure, each concatenated super frame as produced at the outputs 304_A to 304_M of the plurality of filter banks 303 may be arranged as a row vector within the PCA observation matrix F. In such embodiments of the invention the PCA processing steps of 701 to 709 may be performed over the entire length of the super frame.
In further embodiments of the invention which deploy a super frame of concatenated frames of filter bank coefficients, the concatenated super frame may be divided into a number of sub vectors. In these embodiments the PCA process of steps 701 to 709 may be performed over the length of the corresponding sub vectors from each filter bank.
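The formation of the super frame and of the corresponding PCA observation matrix may be illustrated, under assumptions, by the short sketch below; the number of frames per super frame, the coefficient length K and the variable names are chosen only for the example.

```python
import numpy as np

def build_super_frames(frames_per_bank):
    """Concatenate per-frame coefficient vectors into one super frame per bank.

    frames_per_bank: list of M arrays, each of shape (n_frames, K).
    Returns an (M, n_frames * K) observation matrix, one super frame per row.
    """
    return np.vstack([np.concatenate(frames) for frames in frames_per_bank])

# Example: M = 2 filter banks, 4 frames of K = 8 coefficients each.
rng = np.random.default_rng(1)
bank_a_frames = rng.standard_normal((4, 8))
bank_b_frames = rng.standard_normal((4, 8))
F = build_super_frames([bank_a_frames, bank_b_frames])
print(F.shape)   # (2, 32) -- one 32-element super frame per filter bank
```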
It is to be appreciated in embodiments of the invention that, as a result of calculating the spectral coefficients on a frame by frame basis, the frequency spectrum of any resultant signal may be displayed in the form of a three-dimensional plot called a spectrogram. The spectrogram of a signal displays a plot of the energy of the frequency content of the signal as it changes in time.
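As an illustration only, the sketch below renders a matrix of frame-by-frame magnitude spectra as a spectrogram; the synthetic chirp signal, the 256-sample Hann window, the hop size and the use of matplotlib are all assumptions of the example rather than features of the embodiments.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic test signal: a chirp sampled at 8 kHz (illustrative only).
fs = 8000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * (200 + 400 * t) * t)

# Frame-by-frame magnitude spectra: 256-sample Hann-windowed frames, 50% hop.
frame_len, hop = 256, 128
frames = [signal[i:i + frame_len] * np.hanning(frame_len)
          for i in range(0, len(signal) - frame_len, hop)]
spectra = np.abs(np.fft.rfft(np.array(frames), axis=1))

# Time on the horizontal axis, frequency on the vertical, energy in dB.
plt.imshow(20 * np.log10(spectra.T + 1e-12), origin="lower", aspect="auto",
           extent=[0, len(signal) / fs, 0, fs / 2])
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Spectrogram of the test signal")
plt.show()
```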
Figure 8 is a schematic showing the combining of two filter bank signals according to a particular embodiment of the invention. The embodiment depicted by Figure 8 comprises the method of dividing the output from each filter bank into a number of sub vectors, and then performing the processing steps of 405 and 407 across all filter banks on a sub vector by sub vector basis.
The signals in Figure 8 are depicted in the form of spectrograms. In this embodiment of the invention the first filter bank coefficient signal may be obtained by filtering the input audio signal according to a high time resolution, as depicted by spectrogram 801 in Figure 8. The second filter bank coefficient signal may be obtained by filtering the input signal according to a low time resolution. This second filter bank signal may be depicted by the spectrogram 803 in Figure 8.
Figure 8 depicts the mixing of a filter bank coefficient sub vector q 8011 from the first filter bank with the filter bank coefficient sub vector q 8031 from the second filter bank according to an embodiment of the invention. In this depiction of the embodiment of the invention the sub vectors q from each of the first and second filter banks 8011 and 8031 are passed to the mixing analyser 305 via the connections 304_A and 304_B. The mixing analyser 305 may then analyse the sub vectors q from each of the filter banks in order to determine the mixing coefficient signal 306. In this particular deployment of the embodiment of the invention the mixing coefficient signal 306 will be a vector comprising two elements. The mixer 307 may then use the mixing coefficient signal 306 to weight and combine the elements of the sub vectors q 8011 and 8031 from each of the filter banks in order to determine the combined sub vector q 8051.
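The two filter bank case of Figure 8 may be made concrete with the following worked sketch, in which the covariance between the two sub vectors q yields a 2 x 2 matrix, its principal eigenvector provides the two-element mixing coefficient signal 306, and the mixer reduces to a weighted sum. The numerical values of the sub vectors, the absolute-value step and the unit-sum normalisation of the eigenvector are assumptions of the example.

```python
import numpy as np

# Two corresponding sub vectors q, one from each filter bank (illustrative values).
sub_q_bank1 = np.array([0.9, 1.2, 0.8, 1.5, 1.1, 0.7])   # high time resolution bank
sub_q_bank2 = np.array([1.0, 1.0, 1.1, 0.9, 1.0, 1.0])   # low time resolution bank

# 2 x 2 covariance matrix between the two sub vectors.
cov = np.cov(np.vstack([sub_q_bank1, sub_q_bank2]))

# Principal eigenvector -> two-element mixing coefficient signal.
eigvals, eigvecs = np.linalg.eigh(cov)
mix = np.abs(eigvecs[:, np.argmax(eigvals)])
mix = mix / mix.sum()

# Mixer: weight each sub vector and add them to form the combined sub vector q.
combined_sub_q = mix[0] * sub_q_bank1 + mix[1] * sub_q_bank2
print(mix, combined_sub_q)
```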
It is to be appreciated that embodiments of the invention may effectuate the adaptive combining of multiple filter bank signals by adaptively mixing the signals according to dominant features present in the audio signal at the time of mixing.
Additionally, the adaptive mixing (or combining) as implemented by embodiments of the invention may be viewed as a form of data driven mixing, where the mixing coefficients are continually being adapted to the dominant characteristics of the input audio signal. In other words the mixing coefficients are continually being adapted on a frame by frame basis according to the results of the principal component analysis (PCA) as determined by the mixing analyser 305.
It is to be further appreciated that this adaptive mixing or combining effect may be facilitated by utilising the multi filter bank approach as described by embodiments of the invention. By using this approach different characteristics of the input audio signal may be apportioned to the various filter banks depending on the time resolution of the input windowing function.
For example, when an audio signal exhibits an onset characteristic, in other words where the signal changes from a low energy region to a high energy region, it may be more suitably captured by a filter bank deploying a high time resolution (low frequency resolution) window. In this example the mixing coefficients from the PCA as calculated by the mixing analyser 305 may determine that the high time resolution filter bank signal forms the predominant contribution to the combined filter bank signal.
In a further example, the audio signal may exhibit a steady state characteristic. In this case the signal may be more accurately represented by the filter bank deploying a low time resolution (high frequency resolution) window. In this particular example the mixing coefficients from the PCA may bias the relative contribution within the combined filter bank signal in favour of the low time resolution filter bank signal.
Figure 9 depicts a further embodiment of the invention where the input to the adaptive mixer 104 may be arranged to receive an audio signal 901 comprising a plurality of M channels. Each channel of the input audio signal may then be assigned to one of M individual filter banks 303, whereby the time resolution of each filter bank may be adapted or tuned to the particular characteristics of the input audio channel. The filter bank coefficients 304 from each of the filter banks may then be combined using the mixing analyser 305 and mixer 307 according to the embodiments of the invention.
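One possible, hedged reading of the Figure 9 arrangement is sketched below, in which each input channel is transformed with its own window length before the resulting coefficient vectors are passed to the mixing analyser 305 and mixer 307. The window lengths, the use of a zero-padded FFT magnitude as the filter bank, and the function and variable names are assumptions introduced for this sketch.

```python
import numpy as np

def channel_filter_banks(channels, window_lengths, n_fft=512):
    """Transform each audio channel with its own window length.

    channels:       list of M 1-D arrays (one per input channel).
    window_lengths: list of M window lengths, tuned per channel.
    Each frame is zero-padded to n_fft so all banks share one coefficient grid
    (an assumption of this sketch, not a requirement of the embodiment).
    Returns an (M, n_fft // 2 + 1) matrix of magnitude coefficients for frame 0.
    """
    rows = []
    for ch, win_len in zip(channels, window_lengths):
        frame = ch[:win_len] * np.hanning(win_len)
        rows.append(np.abs(np.fft.rfft(frame, n=n_fft)))
    return np.vstack(rows)

# Example: two microphone channels analysed with different time resolutions.
rng = np.random.default_rng(2)
mic_1 = rng.standard_normal(1024)
mic_2 = rng.standard_normal(1024)
coeffs = channel_filter_banks([mic_1, mic_2], window_lengths=[128, 512])
print(coeffs.shape)   # (2, 257) -- ready for the mixing analyser and mixer
```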
Embodiments of the invention configured to receive multiple audio input signals may be particularly advantageous for combining audio signals from different sources. For example, the embodiment as depicted in Figure 9 may be used to combine the plurality of signals from a microphone array.
Although the above examples describe embodiments of the invention operating an adaptive mixer within a codec within an electronic device 10 or apparatus, it would be appreciated that the invention as described above may be implemented as part of any audio processing stage within a chain of audio processing stages.
Thus user equipment may comprise an adaptive mixer such as those described in embodiments of the invention above.
It shall be appreciated that the term user equipment is intended to cover any suitable type of wireless user equipment, such as mobile telephones, portable data processing devices or portable web browsers.
Furthermore elements of a public land mobile network (PLMN) may also comprise audio codecs as described above.
In general, the various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof.
For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
The embodiments of this invention may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware. Further in this regard it should be noted that any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions. The software may be stored on such physical media as memory chips, or memory blocks implemented within the processor, magnetic media such as hard disk or floppy disks, and optical media such as for example DVD and the data variants thereof, CD.
The memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASIC), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.
Embodiments of the invention may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
Programs, such as those provided by Synopsys, Inc. of Mountain View, California and Cadence Design, of San Jose, California automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules. Once the design for a semiconductor circuit has been completed, the resultant design, in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or "fab" for fabrication.
The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the exemplary embodiment of this invention. However, various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims.
However, all such and similar modifications of the teachings of this invention will still fall within the scope of this invention as defined in the appended claims.

Claims (33)

  1. 1. A method comprising: determining a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; determining a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and combining at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
  2. 2. The method according to claim 1, wherein determining the first plurality of frequency components and the second plurality of components comprises: determining the first plurality of frequency coefficients with a first time resolution; and determining the second plurality of frequency coefficients with a second time resolution.
  3. 3. The method according to claim 2, wherein the first time resolution is determined by windowing the audio signal with a window function of a first length, and wherein the second time resolution is determined by windowing the audio signal with a window function of a second length.
  4. 4. The method according to claim 3, further comprising changing at least one of: the length of the audio signal windowed with the first length window function to be equal to the length of the audio signal windowed with the second length window; and the length of the audio signal windowed with the second length window function to be equal to the length of the audio signal windowed with the first length window.
  5. 5. The method according to claims 1 to 4, wherein determining the at least two mixing coefficients comprises: calculating a statistical measure between the first plurality of frequency coefficients and the second plurality of frequency coefficients; determining at least two uncorrelated basis vectors from the statistical measure; calculating a basis vector rating measure for each of the at least two uncorrelated basis vectors; and selecting a principal basis vector from the at least two uncorrelated basis vectors, wherein the elements of the principal basis vector form the at least two mixing coefficients.
  6. 6. The method according to claim 5, wherein selecting the principal basis vector comprises: determining a maximum basis vector rating measure; and selecting the uncorrelated basis vector associated with the maximum basis vector.
  7. 7. The method according to claims 5 and 6, wherein the statistical measure is a covariance matrix, wherein the uncorrelated basis vector is an eigenvector, and wherein the basis vector rating measure is an eigenvalue associated with the eigenvector.
  8. 8. The method according to claims 5 and 6, wherein the statistical measure is a correlation matrix, wherein the uncorrelated basis vector is an eigenvector, and wherein the basis vector rating measure is an eigenvalue associated with the eigenvector.
  9. 9. The method according to claims 1 to 8, wherein combining the at least one frequency coefficient from the first plurality of frequency coefficients with the at least one frequency coefficient from the second plurality of frequency coefficients comprises: weighting the at least one frequency coefficient from the first plurality of frequency coefficients; weighting the at least one frequency coefficient from the second plurality of frequency coefficients; and adding the at least one frequency coefficient from the first plurality of frequency coefficients to the second plurality of frequency coefficients.
  10. 10. The method according to claim 9, wherein weighting the at least one frequency coefficient from the first plurality of frequency coefficients comprises: multiplying the at least one frequency coefficient from the first plurality of frequency coefficients with one of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
  11. 11. The method according to claim 10, wherein weighting the at least one frequency coefficient from the second plurality of frequency coefficients comprises: multiplying the at least one frequency coefficient from the second plurality of frequency coefficients with a further of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
  12. 12. The method according to any of claims 1 to 11, wherein each of the at least two pluralities of frequency coefficients comprises at least two groups of frequency coefficients, and wherein at least one frequency coefficient from a first of at least two groups of frequency coefficients is the same frequency coefficient as at least one frequency coefficient from a second of the at least two groups of frequency coefficients.
  13. 13. The method according to any one of claims 1 to 12 wherein the audio signal comprises at least two audio channel signals, wherein the determining of the first plurality of frequency coefficients from the first of at least two filter banks comprises transforming a first of the audio channel signals; and the determining of the second plurality of frequency coefficients from the second of the at least two filter banks comprises transforming a second of the audio channel signals.
  14. 14. The method according to any one of claims 1 to 13, wherein determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal comprises: applying principal component analysis to the first plurality of frequency coefficients and the second plurality of frequency coefficients.
  15. 15. An apparatus comprising a processor configured to: determine a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; determine a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; determine at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and combine at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
  16. 16. The apparatus according to claim 15, further configured to determine the first plurality of frequency components and the second plurality of components by: determining the first plurality of frequency coefficients with a first time resolution; and determining the second plurality of frequency coefficients with a second time resolution.
  17. 17. The apparatus according to claim 16, further configured to window the audio signal with a window function of a first length to determine the first time resolution, and window the audio signal with a window function of a second length to determine the second time resolution.
  18. 18. The apparatus according to claim 17, further configured to change at least one of: the length of the audio signal windowed with the first length window function to be equal to the length of the audio signal windowed with the second length window; and the length of the audio signal windowed with the second length window function to be equal to the length of the audio signal windowed with the first length window.
  19. 19. The apparatus according to claims 15 to 18, further configured to determine the at least two mixing coefficients by being configured to: calculate a statistical measure between the first plurality of frequency coefficients and the second plurality of frequency coefficients; determine at least two uncorrelated basis vectors from the statistical measure; calculate a basis vector rating measure for each of the at least two uncorrelated basis vectors; and select a principal basis vector from the at least two uncorrelated basis vectors, wherein the elements of the principal basis vector form the at least two mixing coefficients.
  20. 20. The apparatus according to claim 19, further configured to select the principal basis vector by being configured to: determine a maximum basis vector rating measure; and select the uncorrelated basis vector associated with the maximum basis vector.
  21. 21. The apparatus according to claims 19 and 20, wherein the statistical measure is a covariance matrix, wherein the uncorrelated basis vector is an eigenvector, and wherein the basis vector rating measure is an eigenvalue associated with the eigenvector.
  22. 22. The apparatus according to claims 19 and 20, wherein the statistical measure is a correlation matrix, wherein the uncorrelated basis vector is an eigenvector, and wherein the basis vector rating measure is an eigenvalue associated with the eigenvector.
  23. 23. The apparatus according to claims 15 to 22, further configured to combine the at least one frequency coefficient from the first plurality of frequency coefficients with the at least one frequency coefficient from the second plurality of frequency coefficients by being configured to: weight the at least one frequency coefficient from the first plurality of frequency coefficients; weight the at least one frequency coefficient from the second plurality of frequency coefficients; and add the at least one frequency coefficient from the first plurality of frequency coefficients to the second plurality of frequency coefficients.
  24. 24. The apparatus according to claim 23, further configured to weight the at least one frequency coefficient from the first plurality of frequency coefficients by being configured to: multiply the at least one frequency coefficient from the first plurality of frequency coefficients with one of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
  25. 25. The apparatus according to claim 10, further configured to weight the at least one frequency coefficient from the second plurality of frequency coefficients by being configured to: multiply the at least one frequency coefficient from the second plurality of frequency coefficients with a further of the at least two mixing coefficients dependent on the statistical characteristics of the audio signal.
  26. 26. The apparatus according to any of claims 15 to 25, wherein each of the at least two pluralities of frequency coefficients comprises at least two groups of frequency coefficients, and wherein at least one frequency coefficient from a first of at least two groups of frequency coefficients is the same frequency coefficient as at least one frequency coefficient from a second of the at least two groups of frequency coefficients.
  27. 27. The apparatus according to any one of claims 15 to 26 wherein the audio signal comprises at least two audio channel signals, wherein the apparatus is configured to determine the first plurality of frequency coefficients from the first of at least two filter banks by transforming a first of the audio channel signals; and determine the second plurality of frequency coefficients from the second of the at least two filter banks by transforming a second of the audio channel signals.
  28. 28. The apparatus according to any one of claims 15 to 27, further configured to determine at least two mixing coefficients dependent on the statistical characteristics of the audio signal by being configured to: apply principal component analysis to the first plurality of frequency coefficients and the second plurality of frequency coefficients.
  29. 29. A computer-readable medium encoded with instructions that, when executed by a computer, perform: determining a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; determining a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and combining at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
  30. 30. An apparatus comprising: means for determining a first plurality of frequency coefficients from a first of at least two filter banks by transforming an audio signal; means for determining a second plurality of frequency coefficients from a second of the at least two filter banks by transforming the audio signal; means for determining at least two mixing coefficients dependent on the statistical characteristics of the audio signal; and means for combining at least one frequency coefficient from the first plurality of frequency coefficients with at least one frequency coefficient from the second plurality of frequency coefficients, wherein the combining is dependent on the at least two mixing coefficients.
  31. 31. The apparatus as claimed in claims 15 to 28, comprising an encoder.
  32. 32. An electronic device comprising apparatus as claimed in claims 15 to 28.
  33. 33. A chipset comprising apparatus as claimed in claims 15 to 28.
GB0823164A 2008-12-18 2008-12-18 Combining frequency coefficients based on at least two mixing coefficients which are determined on statistical characteristics of the audio signal Withdrawn GB2466286A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB0823164A GB2466286A (en) 2008-12-18 2008-12-18 Combining frequency coefficients based on at least two mixing coefficients which are determined on statistical characteristics of the audio signal
PCT/EP2009/066270 WO2010069773A1 (en) 2008-12-18 2009-12-02 Audio signal processing using at least two filterbanks

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0823164A GB2466286A (en) 2008-12-18 2008-12-18 Combining frequency coefficients based on at least two mixing coefficients which are determined on statistical characteristics of the audio signal

Publications (2)

Publication Number Publication Date
GB0823164D0 GB0823164D0 (en) 2009-01-28
GB2466286A true GB2466286A (en) 2010-06-23

Family

ID=40343874

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0823164A Withdrawn GB2466286A (en) 2008-12-18 2008-12-18 Combining frequency coefficients based on at least two mixing coefficients which are determined on statistical characteristics of the audio signal

Country Status (2)

Country Link
GB (1) GB2466286A (en)
WO (1) WO2010069773A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002080146A1 (en) * 2001-03-30 2002-10-10 University Of Bath Audio compression
US20050219068A1 (en) * 2000-11-30 2005-10-06 Jones Aled W Acoustic communication system
JP2008209768A (en) * 2007-02-27 2008-09-11 Mitsubishi Electric Corp Noise eliminator

Also Published As

Publication number Publication date
WO2010069773A1 (en) 2010-06-24
GB0823164D0 (en) 2009-01-28

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)