WO2011000408A1 - Audio coding - Google Patents

Info

Publication number
WO2011000408A1
WO2011000408A1 PCT/EP2009/058165 EP2009058165W WO2011000408A1 WO 2011000408 A1 WO2011000408 A1 WO 2011000408A1 EP 2009058165 W EP2009058165 W EP 2009058165W WO 2011000408 A1 WO2011000408 A1 WO 2011000408A1
Authority
WO
WIPO (PCT)
Prior art keywords
series
samples
sub
spectral band
frequency spectral
Prior art date
Application number
PCT/EP2009/058165
Other languages
French (fr)
Inventor
Mikko Tapio Tammi
Lasse Juhani Laaksonen
Adriana Vasilache
Anssi Sakari RÄMÖ
Original Assignee
Nokia Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Corporation
Priority to PCT/EP2009/058165
Publication of WO2011000408A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038: Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques

Definitions

  • Embodiments of the present invention relate to audio coding.
  • they relate to coding high frequencies of an audio signal utilizing the low frequency content of the audio signal.
  • Audio encoding is commonly employed in apparatus for storing or transmitting a digital audio signal.
  • a high compression ratio enables better storage capacity or more efficient transmission through a channel.
  • it is also important to maintain the perceptual quality of the compressed signal.
  • SBR spectral band replication
  • An intermediate form between conventional spectral coding and bandwidth extension is to adaptively copy selected portions of lower frequency spectral band to model the higher frequency spectral band.
  • WO07072088 teaches dividing the higher frequency spectral band into smaller spectral sub bands.
  • systematic searches are used to find the portions of the larger lower frequency spectral band of the audio signal that are most similar to the smaller higher frequency spectral sub bands.
  • a higher frequency spectral sub band can then be parametrically encoded by providing a parameter that identifies the most similar portion of the larger lower frequency spectral band.
  • the searches are computationally intensive.
  • the provided parameter is used to replicate the appropriate portions of the lower frequency spectral band in the appropriate higher frequency spectral sub bands.
  • a method comprising: processing a selected subset of a higher series of samples forming a higher frequency spectral band of an audio signal and a lower series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the higher series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
  • a system comprising: an encoding apparatus configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying, using a parameter, a sub-series of the lower series of samples; and a decoding apparatus configured to replicate the series of samples forming the higher frequency spectral band using the sub-series of the lower series of samples identified by the parameter.
  • an apparatus comprising: circuitry configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
  • an apparatus comprising: processing means for processing a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
  • a computer program which when run on a processor enables the processor to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
  • a computer program which when run on a processor enables the processor to select a subset of a higher series of samples in the frequency domain that form a higher frequency spectral band of an audio signal; process the selected subset of the higher series of samples and a lower series of samples in the frequency domain forming a lower frequency spectral band of the audio signal to select a sub-series of the lower series of samples; and parametrically encode the higher series of samples by identifying the selected sub-series of the lower series of samples.
  • a module comprising: circuitry configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
  • Fig 1 schematically illustrates an audio encoding apparatus
  • Fig 2 schematically illustrates a parametric coding block
  • FIG. 3A schematically illustrates an illustrative example of a higher series of samples
  • Fig 3B schematically illustrates an illustrative example of a subset of the higher series of samples
  • Fig 4 schematically illustrates a system comprising an audio encoding apparatus and an audio decoding apparatus
  • Fig 5 schematically illustrates a controller
  • Fig 6 schematically illustrates a computer readable physical medium
  • Fig 7 schematically illustrates a method of processing a selected subset of a higher series of samples and a lower series of samples to parametrically encode the higher series of samples by identifying a sub-series of the lower series of samples.
  • Fig 1 schematically illustrates an audio encoding apparatus 2.
  • the audio encoding apparatus 2 processes digital audio 3 to produce encoded data 5 that represents the digital audio using less information.
  • the information content of the digital audio signal 3 is compressed to encoded data 5.
  • Fig 4 illustrates the audio encoding apparatus 2 in a system 8 that also comprises an audio decoding apparatus 4.
  • the audio decoding apparatus 4 processes the encoded data 5 to produce digital audio 7.
  • the digital audio 7 comprises less information than the original digital audio 3
  • the encoding and decoding processes are designed to maintain perceptually high quality audio. This may, for example, be achieved by using a psychoacoustic model for encoding/decoding a lower frequency spectral band of the digital audio and using a coding technique making use of the lower frequency spectral band for encoding/decoding a higher spectral band.
  • the audio encoding apparatus 2 comprises: a transformer block 10 for converting the digital audio 3 from the time domain into the frequency domain, an audio coding block 12 for encoding a lower frequency spectral band of the digital audio; and one or more parametric coding blocks 14 for parametrically encoding one or more higher frequency spectral bands of the digital audio.
  • the transformer 10 receives as input the time domain digital audio 3 and produces as output a series X of N samples representing the spectrum of the digital audio.
  • the boundaries of the lower series $X_L(k)$ and the one or more higher series $X_H^j(k)$ may overlap in some embodiments and not overlap in other embodiments. In the following described embodiments they do not overlap.
  • the boundaries of the one or more higher series $X_H^j(k)$ may overlap in some embodiments and not overlap in other embodiments. In the following described embodiments they do not overlap.
  • the size $n_j$ of a higher series $X_H^j(k)$ of samples may be less than the size L of the lower series $X_L(k)$ of samples, e.g. $n_j < L$ for all j.
  • the whole of the series X may be spanned by the lower series $X_L(k)$ and the one
  • the transformer block 10 may use a modified discrete cosine transform.
  • Other transforms which represent the signal in the frequency domain with real-valued coefficients, such as the discrete sine transform, can be utilized as well.
  • Audio coding
  • the audio encoding block 12 in this example may use a psychoacoustic model to encode the lower series of samples $X_L(k)$ to produce encoded audio 13.
  • the encoded audio may be a component of the encoded data 5.
  • the audio encoding block 12 may also decode the encoded audio 13 to produce a synthesized lower series $\hat{X}_L(k)$ which represents the lower series of samples $X_L(k)$ available at a decoding apparatus 4.
  • the synthesized lower series $\hat{X}_L(k)$ may be psycho-acoustically equivalent to the lower series of samples $X_L(k)$.
  • the synthesized lower series $\hat{X}_L(k)$ may be psycho-acoustically as similar as possible to the lower series of samples $X_L(k)$, given the constraints imposed, for example, on the bit rate of the encoded data, the processing resources used by the encoding process, etc.
  • Coding higher frequencies
  • the parametric coding blocks $14_j$ parametrically encode the higher frequency spectral bands $X_H^j(k)$ of the digital audio.
  • the output of each of the parametric coding blocks $14_j$ is a set of parameters representing the higher frequency band $15_j$.
  • the parameters representing the higher frequency band $15_j$ may be components of the encoded data 5.
  • An example of a parametric coding block 14 is schematically illustrated in Fig 2.
  • One input to the coding block 14 is the higher series $X_H^j(k)$ of samples representing the higher frequency spectral band j of the digital audio.
  • Another input to the coding block 14 is the lower series of samples representing the lower frequency spectral band of the digital audio.
  • the input lower series of samples may be in some embodiments the original lower series of samples $X_L(k)$. In other embodiments it may be the synthesized lower series of samples $\hat{X}_L(k)$. Let us assume for the purpose of the description of this example that the lower series of samples representing the lower frequency spectral band of the digital audio is the synthesized lower series of samples $\hat{X}_L(k)$.
  • the parametric coding block 14 may comprise a subset selection block 20 for selecting a subset $\tilde{X}_H^j(k)$ of the higher series of samples $X_H^j(k)$ and a sub-series selection block 22 for selecting a sub-series of the lower series of samples $\hat{X}_L(k)$ that is suitable for coding the higher series of samples $X_H^j(k)$.
  • Subset selection
  • Fig 3A schematically illustrates an illustrative example of a higher series of samples $X_H^j(k)$.
  • the samples are plotted on an x-y co-ordinate system with k plotted on the x-axis and the amplitude of the sample $X_H^j(k)$ plotted on the y-axis.
  • Fig 3B schematically illustrates an illustrative example of a subset $\tilde{X}_H^j(k)$ of the higher series of samples.
  • the samples are plotted on an x-y co-ordinate system with k plotted on the x-axis and the amplitude of the sample $\tilde{X}_H^j(k)$ plotted on the y-axis.
  • for certain values of k the sample $\tilde{X}_H^j(k)$ is the same as the sample $X_H^j(k)$, and for other values of k the sample $\tilde{X}_H^j(k)$ is null valued.
  • a null value results in the sample either being ignored in a later calculation of a similarity cost function or being processed economically in that calculation.
  • the subset selection block 20 selects a subset $\tilde{X}_H^j(k)$ of a higher series of samples $X_H^j(k)$ by, for example, selecting the values of k for which the sample $\tilde{X}_H^j(k)$ is null valued.
  • the methodology used to produce the subset $\tilde{X}_H^j(k)$ of a higher series of samples $X_H^j(k)$ illustrated in Fig 3B maintains the $\tilde{n}_j$ samples with the largest absolute values, and sets all the other values to zero.
  • the value of $\tilde{n}_j$ may be selected independently for every one of the different higher series of samples, or the same value can be used for all the different higher series of samples.
  • the high amplitude spectral peaks in the spectrum are maintained which retains the most perceptually important information and also the information that is most influential in the similarity cost function.
  • Discontinuities or gaps are introduced into the continuous spectral band represented by the higher series of samples $X_H^j(k)$.
  • the subset selection block 20 may select a subset $\tilde{X}_H^j(k)$ of a higher series of samples by including psycho-acoustically significant samples and excluding psycho-acoustically insignificant samples.
  • the subset selection block 20 may select a subset $\tilde{X}_H^j(k)$ of a higher series of samples based upon the amplitudes of the higher series of samples. It may, for example, select the Z1 highest values, or the highest Z2% of values. It may, for example, use a statistical model to select the subset $\tilde{X}_H^j(k)$. For example, it may select those samples with an amplitude greater than Z3 standard deviations from the mean amplitude.
  • the subset selection block 20 may select a subset $\tilde{X}_H^j(k)$ of a higher series of samples based upon the maxima in the amplitudes of the higher series of samples. It may, for example, select the Z1 highest maxima, or the highest Z2% of maxima.
  • the subset selection block 20 uses a criterion to select specific samples (e.g. the $\tilde{n}_j$ samples with the largest absolute values) and then applies the selection to only those specific samples (e.g. maintains only the $\tilde{n}_j$ samples).
  • the subset selection block 20 uses a criterion to select specific samples (e.g. the $\tilde{n}_j$ samples with the largest absolute values) and then selects, for inclusion in the subset $\tilde{X}_H^j(k)$, not only those specific samples but also one or more of the samples adjacent to those specific samples.
  • the subset selection block 20 may use a predetermined methodology for selecting the subset. Alternatively, the subset selection block 20 may select which one of a plurality of different methodologies is used.
  • Processing
  • the sub-series selection block 22 processes the selected subset $\tilde{X}_H^j(k)$ of a higher series of samples and the lower series of samples $\hat{X}_L(k)$ to parametrically encode the higher series of samples $X_H^j(k)$ by identifying a sub-series of the lower series of samples.
  • the sub-series selection block 22 determines a similarity cost function S(d), that is dependent upon the selected subset $\tilde{X}_H^j(k)$ and a putative sub-series $\hat{X}_L(k + d)$ of the lower series of samples, for each one of a plurality of putative sub-series of the lower series.
  • the sub-series selection block 22 selects the putative sub-series $\hat{X}_L(k + d)$ of the lower series having the best similarity cost function S(d). It identifies the position of the selected putative sub-series $\hat{X}_L(k + d)$ within the lower series using a parameter (d).
  • the subset $\tilde{X}_H^j(k)$ of the higher series of samples $X_H^j(k)$ is obtained from the subset selection block 20.
  • the lower series of samples $\hat{X}_L(k)$ is obtained from, in the example of Fig 1, the decoder block 12.
  • d is set to 0.
  • $S_{max}$ is set to zero.
  • $d_{max}$ is set to zero.
  • the value d determines the putative sub-series $\hat{X}_L(k + d)$ of the lower series of samples $\hat{X}_L(k)$.
  • a similarity cost function S(d) that is dependent upon the selected subset $\tilde{X}_H^j(k)$ and the current putative sub-series $\hat{X}_L(k + d)$ of the lower series of samples is determined.
  • Equation (1A) expresses an example of the similarity cost function as a cross-correlation.
  • Equation (1B) expresses another example of the similarity cost function as a normalized cross-correlation.
  • $n_j$ is the length of the $j$th sub band.
  • the similarity cost function is a function of $\tilde{X}_H^j(k)$ as opposed to being a function of
  • the similarity cost function comprises processing of each of the samples in the selected subset $\tilde{X}_H^j(k)$ with the respective corresponding sample in the putative sub-series $\hat{X}_L(k + d)$.
  • the method then moves to block 46.
  • the method moves to block 48. Otherwise the method moves to block 38, where d is incremented by one and a new current putative sub-series $\hat{X}_L(k + d)$ is defined for the search loop.
  • the position of the selected putative sub-series $\hat{X}_L(k + d_{max})$ within the lower series is identified using the parameter $d_{max}(j)$.
  • the range of allowed d values can be quite large (for example 256 different values) and thus a large number of S(d) values are computed in the loop of Fig 7.
  • the numerator of (1A) & (1B) requires $n_j$ multiplications as well as $n_j - 1$ additions for every d.
  • the numerator of (1A) & (1B) is a source of complexity.
  • when the subset $\tilde{X}_H^j(k)$ is of size $\tilde{n}_j$, only $\tilde{n}_j$ multiplications and $\tilde{n}_j - 1$ additions are needed in the numerator of (1A) & (1B) for every d, as all the other products are known to be zero.
  • the total complexity of the correlation computation in the numerator of Equation (1A) or (1B) reduces from 15000 multiplications and 14850 additions to 1500 multiplications and 1350 additions, which is a significant reduction.
  • the reduced subset $\tilde{X}_H^j(k)$ significantly decreases the complexity required by the correlation calculation (equation (1A) or (1B)).
  • the reduced subset may be achieved by including in the calculation of the similarity cost function only the perceptually most important spectral components.
  • the current putative sub-series $\hat{X}_L(k + d)$ and the subset $\tilde{X}_H^j(k)$ of the higher series of samples are derived from the same frame of digital audio 3.
  • the search for the putative sub-series $\hat{X}_L(k + d)$ that best matches the subset $\tilde{X}_H^j(k)$ of the higher series of samples may range across multiple audio frames.
  • the size of the higher series of samples and the size of the lower series of samples are predetermined. In other implementations the size of first series and/or the size of the second series may be dynamically varied.
  • the first scaling factor $\alpha_1(j)$ may be determined in the scaling parameter block 24.
  • the second scaling factor $\alpha_2(j)$ may be determined in the scaling parameter block 26.
  • the first scaling factor $\alpha_1(j)$ is dependent upon the selected subset $\tilde{X}_H^j(k)$ of the higher series of samples.
  • the first scaling factor is a function of $\tilde{X}_H^j(k)$ as opposed to being a function of $X_H^j(k)$.
  • the first scaling factor operates on the linear domain to match the high amplitude peaks in the spectrum:
  • Equation (2) expresses an example of a suitable first scaling factor as a normalized cross-correlation.
  • the numerators of Equation (1A) or (1B) and Equation (2) are the same.
  • the denominators of Equation (1A) or (1B) and Equation (2) are related.
  • the numerator and/or the denominator calculated for $S(d_{max})$ in Equation (1A) may be re-used to calculate the first scaling factor.
  • the first scaling factor may be computed as a function of $X_H^j(k)$ instead of a function of $\tilde{X}_H^j(k)$, for example as shown in equation (3).
  • the second scaling factor $\alpha_2(j)$ operates on the logarithmic domain and is used to provide a better match with the energy and the logarithmic domain shape.
  • the second scaling factor is a function of the whole of the higher series of samples $X_H^j(k)$.
  • Equation (4) expresses an example of a suitable second scaling factor:
  • the overall synthesized sub band is then obtained from an expression in which the sign term is -1 if $\alpha_1(j)$ times the corresponding sample of the selected sub-series is negative and otherwise 1.
  • the output of each of the parametric coding blocks $14_j$ is a set of parameters representing the higher frequency band $15_j$.
  • the parameters representing the higher frequency band $15_j$ include the parameter $d_{max}(j)$ which identifies a sub-series of the lower series of samples $\hat{X}_L(k)$ suitable for producing the higher series of samples
  • the audio decoding apparatus 4 processes the encoded data 5 to produce digital audio 7.
  • the encoded data 5 comprises encoded audio 13 (encoding the lower series of samples $X_L(k)$) and the parameters representing the higher frequency band $15_j$.
  • the decoding apparatus 4 is configured to decode the encoded audio 13 to produce the lower series of samples $\hat{X}_L(k)$.
  • the decoding apparatus 4 is configured to replicate the higher series of samples $X_H^j(k)$ forming the higher frequency spectral band using the sub-series $\hat{X}_L(k + d_{max}(j))$ of the lower series of samples identified by the parameter $d_{max}(j)$.
  • each of the parametric coding blocks $14_1$, $14_2$, ..., $14_M$ may be provided as a distinct block or a single block may be reused with different inputs as the respective parametric coding blocks $14_1$, $14_2$, ..., $14_M$.
  • a block may be a hardware block such as circuitry.
  • a block may be a software block implemented via computer code.
  • the subset selection block 20 and the sub-series selection block 22 may be implemented by a single hardware block or by a single software block.
  • the subset selection block 20 and the sub-series selection block 22 may be implemented using distinct hardware blocks and/or software blocks.
  • a hardware block comprises circuitry.
  • the scaling parameter blocks 24, 26 are optional. When present, one or more of the scaling parameter blocks may be integrated with the sub-series selection block 22 or may be integrated into a single block.
  • a software block or software blocks, a hardware block or hardware blocks and a mixture of software block(s) and hardware blocks may be provided by the apparatus 2. Examples of apparatus include modules, consumer devices, portable devices, personal devices, audio recorders, audio players, multimedia devices etc.
  • the apparatus 2 may comprise: circuitry 22 configured to process a selected subset of a series of samples $X_H^j(k)$ forming a higher frequency spectral band of an audio signal and a series $\hat{X}_L(k)$ of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples $X_H^j(k)$ forming the higher frequency spectral band by identifying a sub-series of the lower series of samples using a parameter $d_{max}(j)$.
  • Fig 5 schematically illustrates a controller 50 suitable for use in an encoding apparatus 2 and/or a decoding apparatus.
  • Implementation of a controller can be in hardware alone (a circuit, a processor, etc.), can have certain aspects in software (including firmware) alone, or can be a combination of hardware and software (including firmware).
  • a controller may be implemented using instructions that enable hardware
  • the controller 50 illustrated in Fig 5 comprises a processor 52 and a memory 54.
  • the processor 52 is configured to read from and write to the memory 54.
  • the processor 52 may also comprise an output interface 53 via which data and/or commands are output by the processor 52 and an input interface 55 via which data and/or commands are input to the processor 52.
  • the memory 54 stores a computer program 56 comprising computer program instructions that, when loaded into the processor 52, control the operation of the encoding apparatus 2 and/or decoding apparatus 4.
  • the computer program instructions 56 provide the logic and routines that enable the apparatus to perform the methods illustrated in Figs 1 to 4 and 7.
  • the processor 52 by reading the memory 54 is able to load and execute the computer program 56.
  • the computer program may arrive at the apparatus via any suitable delivery mechanism 58.
  • the delivery mechanism 58 may be, for example, a computer-readable physical storage medium as illustrated in Fig 6, a computer program product, a memory device, a record medium such as a CD-ROM or DVD, or an article of manufacture that tangibly embodies the computer program 56.
  • the delivery mechanism may be a signal configured to reliably transfer the computer program 56.
  • the apparatus may propagate or transmit the computer program 56 as a computer data signal.
  • although the memory 54 is illustrated as a single component, it may be implemented as one or more separate components, some or all of which may be integrated/removable and/or may provide permanent/semi-permanent/
  • references to 'computer-readable storage medium', 'computer program product', 'tangibly embodied computer program' etc. or a 'controller', 'computer', 'processor' etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other devices.
  • programmable processor or firmware such as, for example, the programmable content of a hardware device whether instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device etc.
  • 'module' refers to a unit or apparatus that excludes certain
  • the blocks illustrated in the Figs may represent steps in a method and/or sections of code in the computer program 56.
  • the illustration of a particular order to the blocks does not necessarily imply that there is a required or preferred order for the blocks and the order and arrangement of the blocks may be varied. Furthermore, it may be possible for some steps to be omitted.
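
The subset selection and search loop described in the list above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `select_subset` and `search_offset` are hypothetical, the subset rule is the "keep the largest absolute values, zero the rest" example, and the normalized form of the cost (absolute correlation divided by the square root of the window energy) is an assumed shape, since Equations (1A) and (1B) are not reproduced in this text.

```python
import numpy as np

def select_subset(x_high, keep):
    """Keep the `keep` largest-magnitude samples of x_high; set the rest to zero."""
    x_high = np.asarray(x_high, dtype=float)
    subset = np.zeros_like(x_high)
    if keep > 0:
        idx = np.argsort(np.abs(x_high))[-keep:]  # indices of the largest magnitudes
        subset[idx] = x_high[idx]
    return subset

def search_offset(subset, x_low, normalize=True):
    """Return (d_max, S_max) over every offset d at which the sub-series fits.

    Only the non-zero positions of `subset` enter the correlation sum,
    which is where the saving in multiplications comes from.
    """
    x_low = np.asarray(x_low, dtype=float)
    n = len(subset)
    nz = np.flatnonzero(subset)      # positions that actually contribute
    d_max, s_max = 0, 0.0            # both start at zero, as in the text
    for d in range(len(x_low) - n + 1):
        window = x_low[d:d + n]      # putative sub-series starting at offset d
        s = float(np.dot(subset[nz], window[nz]))
        if normalize:                # assumed form of a normalized cost
            energy = float(np.dot(window, window))
            s = abs(s) / np.sqrt(energy) if energy > 0 else 0.0
        if s > s_max:
            d_max, s_max = d, s
    return d_max, s_max
```

A caller would pass the higher series for one band to `select_subset`, then pass the result together with the lower series to `search_offset`; the returned offset plays the role of the parameter that identifies the sub-series.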

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A method including: processing a selected subset of a higher series of samples forming a higher frequency spectral band of an audio signal and a lower series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the higher series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.

Description

TITLE
Audio coding
FIELD OF THE INVENTION
Embodiments of the present invention relate to audio coding. In particular, they relate to coding high frequencies of an audio signal utilizing the low frequency content of the audio signal.
BACKGROUND TO THE INVENTION
Audio encoding is commonly employed in apparatus for storing or transmitting a digital audio signal. A high compression ratio enables better storage capacity or more efficient transmission through a channel. However, it is also important to maintain the perceptual quality of the compressed signal.
There may be good correlation between a low frequency region and a higher frequency region of an audio signal. This may be utilized, for example, by using a bandwidth extension technique which, instead of encoding the signal of the high frequency region, aims to model the high frequency region by using a copy of a signal at the low frequency region and adjusting the copied spectral envelope to match the high frequency region. Another example is spectral band replication (SBR) coding, which proposes that a higher frequency spectral band should not itself be coded/decoded but should be replicated based on a pre-selected segment from a decoded lower frequency spectral band. However, these methods only try to maintain the overall shape of the spectral envelope at the high frequency region, whereas the fine structure of the original spectrum, which may be quite different, is not considered.
An intermediate form between conventional spectral coding and bandwidth extension is to adaptively copy selected portions of the lower frequency spectral band to model the higher frequency spectral band. WO07072088 teaches dividing the higher frequency spectral band into smaller spectral sub bands. During encoding, systematic searches are used to find the portions of the larger lower frequency spectral band of the audio signal that are most similar to the smaller higher frequency spectral sub bands. A higher frequency spectral sub band can then be parametrically encoded by providing a parameter that identifies the most similar portion of the larger lower frequency spectral band. The searches are computationally intensive. At decoding, the provided parameter is used to replicate the appropriate portions of the lower frequency spectral band in the appropriate higher frequency spectral sub bands.
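The decoding step described above, in which a parameter per sub band selects the portion of the lower band to copy, might look like the following Python sketch. The function name `replicate_high_bands` and the representation of the parameter as a plain integer offset are assumptions made for illustration; any scaling of the copied samples is left out.

```python
import numpy as np

def replicate_high_bands(x_low, offsets, band_lengths):
    """Rebuild a full spectrum by copying portions of the lower band.

    offsets[j] is the (hypothetical) integer parameter for sub band j and
    band_lengths[j] is that sub band's length in samples.
    """
    x_low = np.asarray(x_low, dtype=float)
    bands = []
    for d, n in zip(offsets, band_lengths):
        if d < 0 or d + n > len(x_low):
            raise ValueError("offset does not identify a portion inside the lower band")
        bands.append(x_low[d:d + n].copy())  # replicate the identified portion
    # Lower band first, then the replicated higher sub bands in order.
    return np.concatenate([x_low] + bands)
```

Because only the offsets are transmitted for the higher sub bands, the decoder needs nothing beyond the decoded lower band to run this step.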
BRIEF DESCRIPTION OF VARIOUS EMBODIMENTS OF THE INVENTION
According to various, but not necessarily all, embodiments of the invention there is provided a method comprising: processing a selected subset of a higher series of samples forming a higher frequency spectral band of an audio signal and a lower series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the higher series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
The use of a selected subset of a higher series of samples forming a higher frequency spectral band of an audio signal advantageously reduces the processing load.
According to various, but not necessarily all, embodiments of the invention there is provided a system comprising: an encoding apparatus configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying, using a parameter, a sub-series of the lower series of samples; and a decoding apparatus configured to replicate the series of samples forming the higher frequency spectral band using the sub-series of the lower series of samples identified by the parameter.
According to various, but not necessarily all, embodiments of the invention there is provided an apparatus comprising: circuitry configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
According to various, but not necessarily all, embodiments of the invention there is provided an apparatus comprising: processing means for processing a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
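The reduction in processing load attributed above to the selected subset can be illustrated with a simple operation count. The sketch below assumes that each candidate offset costs one multiplication per contributing sample and one fewer addition; the function name `correlation_ops` and the example sizes (a band of 100 samples, a subset of 10 samples, 150 candidate offsets) are assumptions chosen because they reproduce the totals quoted earlier in this document.

```python
def correlation_ops(contributing_samples, n_offsets):
    """Count (multiplications, additions) for a correlation sum over all offsets."""
    mults = n_offsets * contributing_samples
    adds = n_offsets * (contributing_samples - 1)
    return mults, adds

# Assumed example sizes: full band of 100 samples versus a subset of 10 samples.
full_search = correlation_ops(100, 150)
subset_search = correlation_ops(10, 150)
```

Under these assumed sizes the count falls from 15000 multiplications and 14850 additions to 1500 multiplications and 1350 additions.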
According to various, but not necessarily all, embodiments of the invention there is provided a computer program which when run on a processor enables the processor to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
According to various, but not necessarily all, embodiments of the invention there is provided a computer program which when run on a processor enables the processor to select a subset of a higher series of samples in the frequency domain that form a higher frequency spectral band of an audio signal; process the selected subset of the higher series of samples and a lower series of samples in the frequency domain forming a lower frequency spectral band of the audio signal to select a sub-series of the lower series of samples; and parametrically encode the higher series of samples by identifying the selected sub-series of the lower series of samples.
According to various, but not necessarily all, embodiments of the invention there is provided a module comprising: circuitry configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
BRIEF DESCRIPTION OF THE DRAWINGS
For a better understanding of various examples of embodiments of the present invention, reference will now be made by way of example only to the accompanying drawings in which:
Fig 1 schematically illustrates an audio encoding apparatus;
Fig 2 schematically illustrates a parametric coding block;
Fig 3A schematically illustrates an illustrative example of a higher series of samples;
Fig 3B schematically illustrates an illustrative example of a subset of the higher series of samples;
Fig 4 schematically illustrates a system comprising an audio encoding apparatus and an audio decoding apparatus;
Fig 5 schematically illustrates a controller;
Fig 6 schematically illustrates a computer readable physical medium; and
Fig 7 schematically illustrates a method of processing a selected subset of a higher series of samples and a lower series of samples to parametrically encode the higher series of samples by identifying a sub-series of the lower series of samples.
DETAILED DESCRIPTION OF VARIOUS EMBODIMENTS OF THE INVENTION
Fig 1 schematically illustrates an audio encoding apparatus 2. The audio encoding apparatus 2 processes digital audio 3 to produce encoded data 5 that represents the digital audio using less information. The information content of the digital audio signal 3 is compressed to encoded data 5.
Fig 4 illustrates the audio encoding apparatus 2 in a system 8 that also comprises an audio decoding apparatus 4. The audio decoding apparatus 4 processes the encoded data 5 to produce digital audio 7. Although the digital audio 7 comprises less information than the original digital audio 3, the encoding and decoding processes are designed to maintain perceptually high quality audio. This may, for example, be achieved by using a psychoacoustic model for encoding/decoding a lower frequency spectral band of the digital audio and using a coding technique making use of the lower frequency spectral band for encoding/decoding a higher spectral band.
Referring back to Fig 1, the audio encoding apparatus 2 comprises: a transformer block 10 for converting the digital audio 3 from the time domain into the frequency domain; an audio coding block 12 for encoding a lower frequency spectral band of the digital audio; and one or more parametric coding blocks 14 for parametrically encoding one or more higher frequency spectral bands of the digital audio.
Transformer
The transformer 10 receives as input the time domain digital audio 3 and produces as output a series X of N samples representing the spectrum of the digital audio.
A lower series $X_L(k)$ of the N samples, k = 1, 2, ..., L, represents a lower frequency spectral band of the digital audio. One or more higher series $X_H^j(k)$ of the N samples, where j = 1, ..., M and k = 0, 1, 2, ..., $n_j$, represent one or more higher frequency spectral bands of the digital audio. $n_j$ may be a constant or some function of j.
The boundaries of the lower series $X_L(k)$ and the one or more higher series $X_H^j(k)$ may overlap in some embodiments and not overlap in other embodiments. In the following described embodiments they do not overlap.
The boundaries of the one or more higher series $X_H^j(k)$ may overlap in some embodiments and not overlap in other embodiments. In the following described embodiments they do not overlap.
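The non-overlapping partition described above can be sketched as follows. This is a minimal illustration, not the transformer block itself: the function name `split_bands` and the requirement that the band sizes add up to the length of the spectrum are assumptions made for the example.

```python
def split_bands(spectrum, lower_size, higher_sizes):
    """Split a spectrum into one lower series and M non-overlapping higher series.

    Hypothetical helper: assumes the bands are contiguous and together
    cover every sample of the spectrum.
    """
    if lower_size + sum(higher_sizes) != len(spectrum):
        raise ValueError("band sizes must span the whole spectrum")
    lower = list(spectrum[:lower_size])
    higher = []
    start = lower_size
    for size in higher_sizes:
        # Each higher series starts where the previous band ended.
        higher.append(list(spectrum[start:start + size]))
        start += size
    return lower, higher


lower, higher = split_bands(list(range(10)), 4, [3, 3])
```

Here a ten-sample spectrum is divided into a four-sample lower series and two three-sample higher series.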
The size $n_j$ of a higher series $X_H^j(k)$ of samples may be less than the size L of the lower series $X_L(k)$ of samples, e.g. $n_j < L$ for all j. The whole of the series X may be spanned by the lower series $X_L(k)$ and the one or more higher series $X_H^j(k)$, e.g. $N = L + \sum_{j=1}^{M} n_j$.
The transformer block 10 may use a modified discrete cosine transform. Other transforms which represent a signal in the frequency domain with real-valued coefficients, such as the discrete sine transform, can be utilized as well.
Audio coding
The audio encoding block 12 in this example may use a psychoacoustic model to encode the lower series of samples $X_L(k)$ to produce encoded audio 13. The encoded audio may be a component of the encoded data 5.
The audio encoding block 12 may also decode the encoded audio 13 to produce a synthesized lower series $\hat{X}_L(k)$ which represents the lower series of samples $X_L(k)$ available at a decoding apparatus 4. The synthesized lower series $\hat{X}_L(k)$ may be psycho-acoustically equivalent to the lower series of samples $X_L(k)$. In some embodiments the synthesized lower series $\hat{X}_L(k)$ may be psycho-acoustically as similar as possible to the lower series of samples $X_L(k)$, given the constraints imposed, for example, on the bit-rate of the encoded data, the processing resources used by the encoding process, etc.
Coding higher frequencies
The parametric coding blocks 14_j parametrically encode the higher frequency spectral bands $X_H^j(k)$ of the digital audio. The output of each of the parametric coding blocks 14_j is a set of parameters representing the higher frequency band 15_j. The parameters representing the higher frequency band 15_j may be components of the encoded data 5. An example of a parametric coding block 14_j is schematically illustrated in Fig 2.
One input to the coding block 14_j is the higher series $X_H^j(k)$ of samples representing the higher frequency spectral band j of the digital audio.
Another input to the coding block 14_j is the lower series of samples representing the lower frequency spectral band of the digital audio. The input lower series of samples may in some embodiments be the original lower series of samples $X_L(k)$. In other embodiments it may be the synthesized lower series of samples $\hat{X}_L(k)$. Let us assume for the purpose of the description of this example that the lower series of samples representing the lower frequency spectral band of the digital audio is the synthesized lower series of samples $\hat{X}_L(k)$.
Referring to Fig 2, the parametric coding block 14_j may comprise a subset selection block 20 for selecting a subset $\tilde{X}_H^j(k)$ of the higher series of samples $X_H^j(k)$ and a sub-series selection block 22 for selecting a sub-series of the lower series of samples $\hat{X}_L(k)$ that is suitable for coding the higher series of samples $X_H^j(k)$.
The selection of a subset $\tilde{X}_H^j(k)$ of the higher series of samples $X_H^j(k)$, and the use of that subset in determining the sub-series of the lower series of samples $\hat{X}_L(k)$, significantly reduces the number of calculations required compared with using the whole higher series of samples $X_H^j(k)$ to determine the sub-series of the lower series of samples $\hat{X}_L(k)$.
Subset selection
Fig 3A schematically illustrates an illustrative example of a higher series of samples $X_H^j(k)$. The samples are plotted on an x-y co-ordinate system with k plotted on the x-axis and the amplitude of the sample $X_H^j(k)$ plotted on the y-axis.
Fig 3B schematically illustrates an illustrative example of a subset $\tilde{X}_H^j(k)$ of the higher series of samples. The samples are plotted on an x-y co-ordinate system with k plotted on the x-axis and the amplitude of the sample $\tilde{X}_H^j(k)$ plotted on the y-axis.
It will be noted that for some values of k the sample $\tilde{X}_H^j(k)$ is the same as the sample $X_H^j(k)$, and that for other values of k the sample $\tilde{X}_H^j(k)$ is null valued. A null value results in either it being ignored in a future calculation of a similarity cost function, or in it being economically processed in the similarity cost function. The subset selection block 20 selects a subset $\tilde{X}_H^j(k)$ of a higher series of samples $X_H^j(k)$ by, for example, selecting the values of k for which the sample $\tilde{X}_H^j(k)$ is null valued.
Many different methodologies may be used for the selection of the subset $\tilde{X}_H^j(k)$ of a higher series of samples $X_H^j(k)$.
For example, the methodology used to produce the subset $\tilde{X}_H^j(k)$ of a higher series of samples $X_H^j(k)$ illustrated in Fig 3B maintains the $h_j$ samples with the biggest absolute values, and sets all the other values to zero. The value of $h_j$ may be selected independently for every one of the different higher series of samples, or the same value can be used for all the different higher series of samples. As a result, the high amplitude spectral peaks in the spectrum are maintained, which retains the most perceptually important information and also the information that is most influential in the similarity cost function. Discontinuities or gaps are introduced into the continuous spectral band represented by the higher series of samples $X_H^j(k)$.
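The largest-magnitude selection just described can be sketched in a few lines. The function name `select_largest` and the sample values are hypothetical and chosen only for illustration; ties are broken in favour of the lower index, which is a choice the text does not specify.

```python
def select_largest(higher, h):
    """Keep the h samples with the largest absolute value and zero all others."""
    # Indices ordered from largest to smallest magnitude (stable for ties).
    order = sorted(range(len(higher)), key=lambda k: abs(higher[k]), reverse=True)
    keep = set(order[:h])
    return [x if k in keep else 0.0 for k, x in enumerate(higher)]


subset = select_largest([0.1, -0.9, 0.3, 0.05, 0.7, -0.2], 2)
```

With h = 2 only the two spectral peaks (-0.9 and 0.7) survive, and every other position becomes a null value that later stages can skip.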
Other different methodologies may be used.
For example, the subset selection block 20 may select a subset $\tilde{X}_H^j(k)$ of a higher series of samples by including psycho-acoustically significant samples and excluding psycho-acoustically insignificant samples. For example, the subset selection block 20 may select a subset $\tilde{X}_H^j(k)$ of a higher series of samples based upon the amplitudes of the higher series of samples. It may, for example, select the Z1 highest values, or the highest Z2% of values. It may, for example, use a statistical model to select the subset $\tilde{X}_H^j(k)$. For example, it may select those samples with an amplitude greater than Z3 standard deviations from the mean amplitude.
For example, the subset selection block 20 may select a subset $\tilde{X}_H^j(k)$ of a higher series of samples based upon the maxima in the amplitudes of the higher series of samples. It may, for example, select the Z1 highest maxima, or the highest Z2% of maxima.
In the methodology described with reference to Fig 3A, the subset selection block 20 uses a criterion to select specific samples (e.g. the $h_j$ samples with the biggest absolute values) and then restricts the subset to only those specific samples (e.g. maintains only the $h_j$ samples). In an alternative methodology, the subset selection block 20 uses a criterion to select specific samples (e.g. the $h_j$ samples with the biggest absolute values) and then selects, for inclusion in the subset $\tilde{X}_H^j(k)$, not only those specific samples but also one or more of the samples adjacent to those specific samples.
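The alternative methodology, in which samples adjacent to the selected peaks are also retained, might look like the sketch below. The function name and the `reach` parameter (how many neighbours to keep on each side) are assumptions introduced for the example; the text only says that one or more adjacent samples may be included.

```python
def select_with_neighbours(higher, h, reach=1):
    """Keep the h largest-magnitude samples plus `reach` neighbours on each side."""
    n = len(higher)
    order = sorted(range(n), key=lambda k: abs(higher[k]), reverse=True)
    keep = set()
    for k in order[:h]:
        # Clamp the neighbourhood to the boundaries of the band.
        for m in range(max(0, k - reach), min(n, k + reach + 1)):
            keep.add(m)
    return [x if k in keep else 0.0 for k, x in enumerate(higher)]


subset = select_with_neighbours([0.1, -0.9, 0.3, 0.05, 0.7, -0.2], 1)
```

With a single peak at index 1 and one neighbour on each side, indices 0 to 2 are retained and the rest are zeroed.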
Due to complexity limitations it is possible that, for example, a total of Y non-zero samples can be selected across all M higher band series, i.e. $\sum_{j=1}^{M} h_j = Y$. It is possible to analyze the perceptual importance of every higher series $X_H^j(k)$ and allocate the number of non-zero samples $h_j$ based on that. For example, the number of non-zero samples $h_j$ for the higher series $X_H^j(k)$ can be defined by $h_j = h_{j\_min} + p_j$, where $h_{j\_min}$ is the minimum number of non-zero components allocated to $X_H^j(k)$ in any case and $p_j$ has been selected using some suitable perceptual criteria such that $\sum_{j=1}^{M} (h_{j\_min} + p_j) = Y$.
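One way to realise such an allocation is sketched below. The text leaves the perceptual criteria open, so the use of a per-band importance weight (for instance band energy) and the largest-remainder rounding are assumptions made purely for illustration, as is the function name `allocate_nonzero`.

```python
def allocate_nonzero(importance, h_min, total):
    """Return h_j = h_min[j] + p_j for every band so that the h_j sum to `total`.

    `importance` is an assumed per-band perceptual weight (positive numbers);
    the spare budget is shared in proportion to it.
    """
    spare = total - sum(h_min)
    weight = sum(importance)
    if spare < 0 or weight <= 0:
        raise ValueError("budget too small or weights not positive")
    shares = [spare * w / weight for w in importance]
    extra = [int(s) for s in shares]  # integer part of each share
    leftover = spare - sum(extra)
    # Hand the remaining samples to the bands with the largest fractional parts.
    by_remainder = sorted(range(len(shares)),
                          key=lambda j: shares[j] - extra[j], reverse=True)
    for j in by_remainder[:leftover]:
        extra[j] += 1
    return [m + p for m, p in zip(h_min, extra)]


counts = allocate_nonzero([3.0, 1.0], [2, 2], 12)
```

In this example the eight spare samples are shared 6:2, giving $h_1 = 8$ and $h_2 = 4$. A practical encoder would presumably also cap each $h_j$ at the band size $n_j$, which the sketch does not do.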
The subset selection block 20 may use a predetermined methodology for selecting the subset. Alternatively, the subset selection block 20 may select which one of a plurality of different methodologies is used.
Processing
The sub-series selection block 22 processes the selected subset $\tilde{X}_H^j(k)$ of a higher series of samples and the lower series of samples $\hat{X}_L(k)$ to parametrically encode the higher series of samples $X_H^j(k)$ by identifying a sub-series of the lower series of samples. The sub-series selection block 22 determines a similarity cost function S(d), which is dependent upon the selected subset $\tilde{X}_H^j(k)$ and a putative sub-series $\hat{X}_L(k + d)$ of the lower series of samples, for each one of a plurality of putative sub-series of the lower series.
It selects the best sub-series $\hat{X}_L^j(k) = \hat{X}_L(k + d)$ by choosing the putative sub-series $\hat{X}_L(k + d)$ of the lower series having the best similarity cost function S(d). It identifies the position of the selected putative sub-series $\hat{X}_L(k + d)$ within the lower series using a parameter (d).
An example of a suitable method 30 is illustrated in Fig 7.
At block 32, the subset $\tilde{X}_H^j(k)$ of the higher series of samples $X_H^j(k)$ is obtained from the subset selection block 20.
At block 34, the lower series of samples $\hat{X}_L(k)$ is obtained from, in the example of Fig 1, the decoder block 12.
At block 36, initialization of the search loop occurs: d is set to zero, $S_{max}$ is set to zero and $d_{max}$ is set to zero.
The value d determines the putative sub-series $\hat{X}_L(k + d)$ of the lower series of samples $\hat{X}_L(k)$. At block 40, a similarity cost function S(d) that is dependent upon the selected subset $\tilde{X}_H^j(k)$ and the current putative sub-series $\hat{X}_L(k + d)$ of the lower series of samples is determined.
One example of a similarity cost function is the inverse of the Euclidean distance; another example is the normalized correlation. Equation (1A) expresses an example of the similarity cost function as a cross-correlation.
[Equation (1A): reproduced only as an image in the original publication]
Equation (1B) expresses another example of the similarity cost function as a normalized cross-correlation.
[Equation (1B): reproduced only as an image in the original publication]
In (1A), $n_j$ is the length of the j-th sub band.
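Because the equation images are not available in this text, the following is only a plausible reading of Equations (1A) and (1B), written to agree with the surrounding description: a shared numerator that costs $n_j$ multiplications and $n_j - 1$ additions per value of d, and a denominator in each. The exact denominators, and the decorations $\tilde{X}_H^j$ (subset) and $\hat{X}_L$ (synthesized lower series), are assumptions rather than a transcription of the original; the sums run over the $n_j$ samples of sub band j.

```latex
% Assumed form of (1A): cross-correlation normalised by the lower-band energy
S(d) = \frac{\sum_{k} \tilde{X}_H^j(k)\,\hat{X}_L(k+d)}
            {\sqrt{\sum_{k} \hat{X}_L(k+d)^2}}

% Assumed form of (1B): cross-correlation normalised by both energies
S(d) = \frac{\sum_{k} \tilde{X}_H^j(k)\,\hat{X}_L(k+d)}
            {\sqrt{\sum_{k} \tilde{X}_H^j(k)^2 \,\sum_{k} \hat{X}_L(k+d)^2}}
```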
The similarity cost function is a function of the subset $\tilde{X}_H^j(k)$ as opposed to being a function of the whole higher series $X_H^j(k)$.
In this example, the similarity cost function comprises processing of each of the samples in the selected subset $\tilde{X}_H^j(k)$ with the respective corresponding sample in the putative sub-series $\hat{X}_L(k + d)$.
At block 42, if the current putative sub-series $\hat{X}_L(k + d)$ of the lower series has a better similarity cost function S(d) than the current value of $S_{max}$, then the method moves to block 44; otherwise it moves to block 46.
At block 44, the current best sub-series $\hat{X}_L^j(k) = \hat{X}_L(k + d_{max})$ is updated by setting $d_{max}(j) = d$ and $S_{max} = S(d)$. The method then moves to block 46.
At block 46, if the search has completed (d = D), the method moves to block 48. Otherwise the method moves to block 38, where d is incremented by one and a new current putative sub-series $\hat{X}_L(k + d)$ is defined for the search loop.
At block 48, the position of the selected putative sub-series $\hat{X}_L(k + d_{max})$ within the lower series is identified using the parameter $d_{max}(j)$.
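The search loop of Fig 7 can be sketched as follows. The function name `find_best_shift` is hypothetical, and the cost function used (cross-correlation divided by the square root of the energy of the putative sub-series) is an assumed stand-in for Equation (1A), whose image is not reproduced here. The sketch visits only the non-zero samples of the subset when forming the numerator, which is the saving the method relies on.

```python
import math


def find_best_shift(subset, lower, max_shift):
    """Return (d_max, S_max) for the sub-series of `lower` that best matches `subset`."""
    # Only non-zero subset samples contribute to the correlation numerator.
    nonzero = [(k, x) for k, x in enumerate(subset) if x != 0.0]
    n = len(subset)
    best_d, best_s = 0, 0.0          # block 36: initialisation
    for d in range(max_shift + 1):   # block 38: step through the shifts
        if d + n > len(lower):
            break                    # putative sub-series would run off the end
        num = sum(x * lower[k + d] for k, x in nonzero)
        energy = sum(v * v for v in lower[d:d + n])
        s = num / math.sqrt(energy) if energy > 0.0 else 0.0  # block 40
        if s > best_s:               # blocks 42 and 44: keep the best so far
            best_d, best_s = d, s
    return best_d, best_s            # block 48: d_max identifies the sub-series


lower = [0.0, 0.2, 1.0, 0.1, -0.8, 0.3, 0.0, 0.0]
best_d, best_s = find_best_shift([1.0, 0.0, -0.8], lower, 5)
```

In this toy example the subset matches the lower series exactly at samples 2 to 4, so the search selects d = 2.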
The range of allowed d values (number of search loops) can be quite large (for example 256 different values) and thus a large number of S(d) values are computed in the loop of Fig 7. The numerator of (1A) and (1B) requires $n_j$ multiplications as well as $n_j - 1$ additions for every d. Thus the numerator of (1A) and (1B) is a source of complexity. With the proposed method, as the subset $\tilde{X}_H^j(k)$ is of size $h_j$, only $h_j$ multiplications and $h_j - 1$ additions are needed in the numerator of (1A) and (1B) for every d, as all the other products are known to be zero.
As an example, if the higher series of samples $X_H^j(k)$ has 100 samples but the subset $\tilde{X}_H^j(k)$ has 10 samples, and there are 150 different delay values d, the total complexity of the correlation computation in the numerator of Equation (1A) or (1B) reduces from 15000 multiplications and 14850 additions to 1500 multiplications and 1350 additions, which is a significant reduction.
The reduced subset $\tilde{X}_H^j(k)$ significantly decreases the complexity required by the correlation calculation (Equation (1A) or (1B)). The reduced subset may be achieved by including in the calculation of the similarity cost function only the perceptually most important spectral components.
In the similarity cost function S(d) defined at Equation (1A) or (1B), the current putative sub-series $\hat{X}_L(k + d)$ and the subset $\tilde{X}_H^j(k)$ of the higher series of samples are derived from the same frame of digital audio 3. In other implementations, the search for the putative sub-series $\hat{X}_L(k + d)$ that best matches the subset $\tilde{X}_H^j(k)$ of the higher series of samples may range across multiple audio frames.
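The operation counts quoted in the worked example above follow directly from the per-shift cost of the numerator. The short check below reproduces them; the helper name `correlation_ops` is introduced only for this illustration.

```python
def correlation_ops(samples_per_shift, num_shifts):
    """Multiplications and additions needed for the correlation numerator."""
    mults = samples_per_shift * num_shifts
    adds = (samples_per_shift - 1) * num_shifts
    return mults, adds


full = correlation_ops(100, 150)    # whole higher series, 150 delay values
reduced = correlation_ops(10, 150)  # 10-sample subset, 150 delay values
```

The full search costs 15000 multiplications and 14850 additions, while the subset search costs 1500 and 1350 respectively, a tenfold reduction in multiplications.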
In the described implementation, the size of the higher series of samples and the size of the lower series of samples are predetermined. In other implementations the size of first series and/or the size of the second series may be dynamically varied.
Scaling
Referring back to Fig 2, in this example, the most similar match $\hat{X}_L^j(k) = \hat{X}_L(k + d_{max})$ may be scaled using two scaling factors $\alpha_1(j)$ and $\alpha_2(j)$. The first scaling factor $\alpha_1(j)$ may be determined in the scaling parameter block 24. The second scaling factor $\alpha_2(j)$ may be determined in the scaling parameter block 26.
The first scaling factor $\alpha_1(j)$ is dependent upon the selected subset $\tilde{X}_H^j(k)$ of the higher series of samples. The first scaling factor is a function of the subset $\tilde{X}_H^j(k)$ as opposed to being a function of the whole higher series $X_H^j(k)$.
The first scaling factor operates on the linear domain to match the high amplitude peaks in the spectrum.
Equation (2) expresses an example of a suitable first scaling factor as a normalized cross-correlation.
[Equation (2): reproduced only as an image in the original publication]
Notice that $\alpha_1(j)$ can take both positive and negative values.
The numerator of Equation (1A) or (1B) and Equation (2) are the same. The denominators of Equation (1A) or (1B) and Equation (2) are related. The numerator and/or the denominator calculated for $S(d_{max})$ in Equation (1A) may be re-used to calculate the first scaling factor.
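A first scaling factor of this kind might be computed as in the sketch below. Since the image of Equation (2) is not available, the formula used here (the correlation numerator divided by the energy of the selected sub-series, i.e. a least-squares gain) is an assumption that merely fits the description: it shares its numerator with the similarity cost function and can be positive or negative. The function name is likewise hypothetical.

```python
def first_scaling_factor(subset, best_subseries):
    """Assumed linear-domain gain matching the selected sub-series to the subset."""
    # Same numerator as the similarity cost function, so it can be re-used.
    num = sum(x * y for x, y in zip(subset, best_subseries))
    den = sum(y * y for y in best_subseries)
    return num / den if den > 0.0 else 0.0


alpha1 = first_scaling_factor([2.0, 0.0, -4.0], [1.0, 0.0, -2.0])
```

Here the subset is exactly twice the selected sub-series, so the gain is 2.0; an inverted sub-series would give a negative gain.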
Alternatively, the first scaling factor may be computed as a function of the whole higher series $X_H^j(k)$ instead of a function of the subset $\tilde{X}_H^j(k)$, for example as shown in Equation (3).
[Equation (3): reproduced only as an image in the original publication]
The second scaling factor $\alpha_2(j)$ operates on the logarithmic domain and is used to provide a better match with the energy and the logarithmic domain shape. The second scaling factor is a function of the whole of the higher series of samples $X_H^j(k)$, as opposed to being a function of the subset $\tilde{X}_H^j(k)$.
Equation (4) expresses an example of a suitable second scaling factor:
[Equation (4): reproduced only as an image in the original publication]
where $M_j = \max(\log_{10}(|\alpha_1(j)\hat{X}_L^j(k)|))$.
The overall synthesized sub band $\hat{X}_H^j(k)$ is then obtained as
[Equation: reproduced only as an image in the original publication]
where $\zeta(k)$ is -1 if $\alpha_1(j)\hat{X}_L^j(k)$ is negative and otherwise 1.
The output of each of the parametric coding blocks 14_j is a set of parameters representing the higher frequency band 15_j. The parameters representing the higher frequency band 15_j include the parameter $d_{max}(j)$, which identifies a sub-series of the lower series of samples $\hat{X}_L(k)$ suitable for producing the higher series of samples $X_H^j(k)$, and the scaling factors $\alpha_1(j)$, $\alpha_2(j)$.
The audio decoding apparatus 4 processes the encoded data 5 to produce digital audio 7. The encoded data 5 comprises encoded audio 13 (encoding the lower series of samples $X_L(k)$) and the parameters representing the higher frequency band 15_j.
The decoding apparatus 4 is configured to decode the encoded audio 13 to produce the lower series of samples $\hat{X}_L(k)$. The decoding apparatus 4 is configured to replicate the higher series of samples $X_H^j(k)$ forming the higher frequency spectral band using the sub-series $\hat{X}_L^j(k)$ of the lower series of samples identified by the parameter $d_{max}(j)$.
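The replication step at the decoder can be sketched as below. This is a simplified illustration under stated assumptions: it applies only a linear gain (standing in for the first scaling factor) and omits the logarithmic-domain shaping by the second scaling factor, because the corresponding equations are not reproduced in this text. The function name `replicate_high_band` and its parameters are hypothetical.

```python
def replicate_high_band(lower, d_max, size, alpha1=1.0):
    """Copy `size` samples of the decoded lower series starting at d_max and scale them."""
    if d_max < 0 or d_max + size > len(lower):
        raise ValueError("sub-series lies outside the lower series")
    # The parameter d_max identifies where the sub-series sits in the lower band.
    return [alpha1 * v for v in lower[d_max:d_max + size]]


band = replicate_high_band([0.0, 0.2, 1.0, 0.1, -0.8, 0.3], 2, 3, alpha1=0.5)
```

The decoder needs only the decoded lower series and the transmitted parameters; no higher-band samples are sent.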
Referring to Figs 1 and 2, each of the parametric coding blocks 14_1, 14_2, ..., 14_M may be provided as a distinct block, or a single block may be reused with different inputs as the respective parametric coding blocks 14_1, 14_2, ..., 14_M. A block may be a hardware block such as circuitry. A block may be a software block implemented via computer code.
Referring to Fig 2, the subset selection block 20 and the sub series selection block may be implemented by a single hardware block or by a single software block. Alternatively, the subset selection block 20 and the sub series selection block may be implemented using distinct hardware blocks and/or software blocks. A hardware block comprises circuitry. Referring to Fig 2, the scaling parameter blocks 24, 26 are optional. When present, one or more of the scaling parameter blocks may be integrated with the sub series selection block 22 or may be integrated into a single block. A software block or software blocks, a hardware block or hardware blocks and a mixture of software block(s) and hardware blocks may be provided by the apparatus 2. Examples of apparatus include modules, consumer devices, portable devices, personal devices, audio recorders, audio players, multimedia devices etc.
The apparatus 2 may comprise: circuitry 22 configured to process a selected subset $\tilde{X}_H^j(k)$ of a series of samples $X_H^j(k)$ forming a higher frequency spectral band of an audio signal and a series $\hat{X}_L(k)$ of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples $X_H^j(k)$ forming the higher frequency spectral band by identifying a sub-series $\hat{X}_L^j(k)$ of the lower series of samples using a parameter $d_{max}(j)$.
Fig 5 schematically illustrates a controller 50 suitable for use in an encoding apparatus 2 and/or a decoding apparatus.
Implementation of a controller can be in hardware alone (a circuit, a processor...), have certain aspects in software including firmware alone or can be a combination of hardware and software (including firmware). A controller may be implemented using instructions that enable hardware functionality, for example, by using executable computer program instructions in a general-purpose or special-purpose processor that may be stored on a computer readable storage medium (disk, memory etc) to be executed by such a processor.
The controller 50 illustrated in Fig 5 comprises a processor 52 and a memory 54.
The processor 52 is configured to read from and write to the memory 54. The processor 52 may also comprise an output interface 53 via which data and/or commands are output by the processor 52 and an input interface 55 via which data and/or commands are input to the processor 52.
The memory 54 stores a computer program 56 comprising computer program instructions that, when loaded into the processor 52, control the operation of the encoding apparatus 2 and/or decoding apparatus 4. The computer program instructions 56 provide the logic and routines that enable the apparatus to perform the methods illustrated in Figs 1 to 4 and 7. The processor 52 by reading the memory 54 is able to load and execute the computer program 56.
The computer program may arrive at the apparatus via any suitable delivery mechanism 58. The delivery mechanism 58 may be, for example, a computer-readable physical storage medium as illustrated in Fig 6, a computer program product, a memory device, a record medium such as a CD-ROM or DVD, or an article of manufacture that tangibly embodies the computer program 56. The delivery mechanism may be a signal configured to reliably transfer the computer program 56.
The apparatus may propagate or transmit the computer program 56 as a computer data signal. Although the memory 54 is illustrated as a single component it may be implemented as one or more separate components some or all of which may be integrated/removable and/or may provide permanent/semi-permanent/dynamic/cached storage.
References to 'computer-readable storage medium', 'computer program product', 'tangibly embodied computer program' etc. or a 'controller', 'computer', 'processor' etc. should be understood to encompass not only computers having different architectures such as single/multi-processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other devices. References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device whether instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device etc.
Although a coding apparatus 2 and a decoding apparatus 4 have been described, it should be appreciated that a single apparatus may have the functionality to act as the coding apparatus 2 and/or the decoding apparatus 4.
As used here 'module' refers to a unit or apparatus that excludes certain parts/components that would be added by an end manufacturer or a user.
The blocks illustrated in the Figs may represent steps in a method and/or sections of code in the computer program 56. The illustration of a particular order to the blocks does not necessarily imply that there is a required or preferred order for the blocks and the order and arrangement of the blocks may be varied. Furthermore, it may be possible for some steps to be omitted.
Although embodiments of the present invention have been described in the preceding paragraphs with reference to various examples, it should be appreciated that modifications to the examples given can be made without departing from the scope of the invention as claimed. Features described in the preceding description may be used in combinations other than the combinations explicitly described.
Although functions have been described with reference to certain features, those functions may be performable by other features whether described or not.
Although features have been described with reference to certain embodiments, those features may also be present in other embodiments whether described or not.
Whilst endeavoring in the foregoing specification to draw attention to those features of the invention believed to be of particular importance it should be understood that the Applicant claims protection in respect of any patentable feature or combination of features hereinbefore referred to and/or shown in the drawings whether or not particular emphasis has been placed thereon.
I/we claim:

Claims

1. A method comprising:
processing a selected subset of a higher series of samples forming a higher frequency spectral band of an audio signal and a lower series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the higher series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
2. A method as claimed in claim 1 , further comprising:
creating the selected subset by selecting a subset of a higher series of samples in the frequency domain that form a higher frequency spectral band of an audio signal; processing the selected subset of the higher series of samples and a lower series of samples in the frequency domain forming a lower frequency spectral band of the audio signal to select a sub-series of the lower series of samples; and
parametrically encoding the higher series of samples by identifying the selected sub-series of the lower series of samples.
3. A method as claimed in claim 1 or 2, wherein the higher series has fewer samples than the lower series.
4. A method as claimed in claim 1 or 2, wherein the higher series and the lower series are non-overlapping.
5. A method as claimed in any preceding claim, further comprising, for each one of different multiple higher series of samples forming different higher frequency spectral bands,
processing a selected subset for each one of multiple higher series of samples with the lower series of samples to parametrically encode each of the multiple higher series of samples by identifying, for each higher series of samples, a sub-series of the lower series of samples.
6. A method as claimed in any preceding claim, further comprising:
selecting a subset for each one of multiple different higher series of samples; processing each of the selected subsets and the first series of samples to select multiple sub-series of the first series of samples; and
parametrically encoding the multiple second series of samples by identifying the multiple selected sub-series of the first series of samples.
7. A method as claimed in claim 5 or 6, wherein each of the multiple higher series has fewer samples than the lower series.
8. A method as claimed in claim 5, 6 or 7, wherein the multiple different higher series are non-overlapping.
9. A method as claimed in any preceding claim further comprising: creating the higher series of samples and the lower series of samples using a modified discrete cosine transform to transform a digital audio signal.
10. A method as claimed in any preceding claim further comprising psycho-acoustic encoding of the lower series of samples.
11. A method as claimed in any preceding claim further comprising psycho-acoustic encoding and then decoding the lower series of samples before processing the selected subset of a higher series of samples and the lower series of samples to parametrically encode the higher series of samples by identifying a sub-series of the lower series of samples.
12. A method as claimed in any preceding claim, further comprising selecting a subset of a higher series of samples by modifying selected samples within the higher series of samples to have zero value.
13. A method as claimed in any preceding claim, further comprising selecting a subset of a higher series of samples by including psycho-acoustically significant samples and excluding psycho-acoustically insignificant samples
14. A method as claimed in any preceding claim, further comprising selecting a subset of a higher series of samples based upon the amplitudes of the higher series of samples
15. A method as claimed in claim 14, further comprising selecting a subset of a higher series of samples based upon the maxima in the amplitudes of the higher series of samples
16. A method as claimed in any preceding claim, further comprising selecting a subset of a higher series of samples based upon a statistical model
17. A method as claimed in any preceding claim, further comprising selecting a subset of a higher series of samples by selecting one of a plurality of different methodologies for determining a subset of a higher series of samples
18. A method as claimed in any preceding claim, wherein processing the selected subset and the lower series of samples to parametrically encode the higher series of samples by identifying a sub-series of the lower series of samples comprises: determining a similarity cost function, that is dependent upon the selected subset and a putative sub-series of the lower series of samples, for each one of a plurality of putative sub-series of the lower series;
selecting the putative sub-series of the lower series having the best similarity cost function; and
identifying the position of the selected putative sub-series within the lower series using a parameter.
19. A method as claimed in claim 18, wherein the similarity cost function, comprises processing of each of the samples in the selected subset with the respective corresponding sample in the putative sub-series.
20. A method as claimed in claim 18 or 19, wherein the similarity cost function, comprises correlation of the selected subset and the putative sub-series.
21. A method as claimed in claim 20 wherein at least part of the correlation result for the selected putative sub-series is re-used to calculate a scaling factor.
22. A method as claimed in any preceding claim, wherein a first scaling factor is dependent upon the selected subset of the higher series of samples.
23. A method as claimed in any one of claims 1 to 21, wherein a first scaling factor is dependent upon the whole of the selected subset of the higher series of samples.
24. A method as claimed in claim 22 or 23, wherein a second scaling factor is dependent upon the whole of the higher series of samples.
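One possible reading of the two scaling factors of claims 22 to 24 is sketched below. The formulas are assumptions chosen for illustration and are not stated in the claims: the first factor is taken to be a least-squares gain computed over the selected subset only, and the second an energy-matching gain computed over the whole higher series. The name `scaling_factors` and the toy data are hypothetical.

```python
import math


def scaling_factors(higher, candidate, subset):
    """Return (first, second) gains for a candidate copied from the lower band."""
    # First factor: least-squares gain using only the selected positions.
    cross = sum(higher[i] * candidate[i] for i in subset)
    energy = sum(candidate[i] ** 2 for i in subset)
    first = cross / energy if energy else 0.0
    # Second factor: match the energy of the whole higher band after
    # the first factor has been applied to the candidate.
    target = sum(x * x for x in higher)
    scaled = sum((first * c) ** 2 for c in candidate)
    second = math.sqrt(target / scaled) if scaled else 0.0
    return first, second


higher_band = [2.0, 0.5, 1.0, 0.5]
candidate = [1.0, 0.5, 0.5, 0.5]
first, second = scaling_factors(higher_band, candidate, subset=[0, 2])
print(first, round(second, 4))
```

In this sketch the first factor aligns the copied peaks with the original peaks, and the second corrects the overall band energy once that alignment has been made.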
25. A computer program which when run on a processor enables the processor to perform the method of any one of claims 1 to 24.
26. An apparatus configured to perform the method of any one of claims 1 to 24.
27. A module configured to perform the method of any one of claims 1 to 24.
28. A system comprising:
an encoding apparatus configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying, using a parameter, a sub-series of the lower series of samples; and
a decoding apparatus configured to replicate the series of samples forming the higher frequency spectral band using the sub-series of the lower series of samples identified by the parameter.
29. A system as claimed in claim 28, wherein the decoding apparatus is configured to decode data received from the encoding apparatus to produce the lower series of samples from which the sub-series of the lower series of samples is obtained.
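The decoder side of claims 28 and 29 might be sketched as follows. This is a minimal sketch that assumes the parameter is an integer offset into the decoded lower series and that a single gain accompanies it; the name `replicate_higher_band` and its arguments are hypothetical.

```python
def replicate_higher_band(decoded_lower, offset, length, gain=1.0):
    """Rebuild the higher band by copying and scaling a run of the decoded lower band."""
    if offset < 0 or offset + length > len(decoded_lower):
        raise ValueError("parameter points outside the decoded lower band")
    # Copy the sub-series identified by the parameter and apply the gain.
    return [gain * x for x in decoded_lower[offset:offset + length]]


decoded_lower = [0.0, 1.0, 0.25, 0.5, 0.125, 0.0]
print(replicate_higher_band(decoded_lower, offset=1, length=4, gain=2.0))
# [2.0, 0.5, 1.0, 0.25]
```

Because the sub-series is taken from the lower series that the decoder has itself produced, only the offset and any scaling factors need to be conveyed for the higher band in this sketch.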
30. Apparatus comprising:
circuitry configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
31. An apparatus as claimed in claim 30, further comprising: circuitry configured to select a subset of a higher series of samples in the frequency domain that form a higher frequency spectral band of an audio signal;
32. Apparatus comprising:
processing means for processing a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
33. An apparatus as claimed in claim 32 further comprising:
means for selecting a subset of a higher series of samples in the frequency domain that form a higher frequency spectral band of an audio signal;
means for processing the selected subset of the higher series of samples and a lower series of samples in the frequency domain forming a lower frequency spectral band of the audio signal to select a sub-series of the lower series of samples; and
means for parametrically encoding the higher series of samples by identifying the selected sub-series of the lower series of samples.
34. A computer program which when run on a processor enables the processor to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
35. A computer program which when run on a processor enables the processor to select a subset of a higher series of samples in the frequency domain that form a higher frequency spectral band of an audio signal;
process the selected subset of the higher series of samples and a lower series of samples in the frequency domain forming a lower frequency spectral band of the audio signal to select a sub-series of the lower series of samples; and
parametrically encode the higher series of samples by identifying the selected sub-series of the lower series of samples.
36. A computer readable physical medium tangibly embodying the computer program as claimed in claim 34 or 35.
37. A module comprising:
circuitry configured to process a selected subset of a series of samples forming a higher frequency spectral band of an audio signal and a series of samples forming a lower frequency spectral band of the audio signal to parametrically encode the series of samples forming the higher frequency spectral band by identifying a sub-series of the lower series of samples.
PCT/EP2009/058165 2009-06-30 2009-06-30 Audio coding WO2011000408A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2009/058165 WO2011000408A1 (en) 2009-06-30 2009-06-30 Audio coding

Publications (1)

Publication Number Publication Date
WO2011000408A1 true WO2011000408A1 (en) 2011-01-06

Family

ID=41557557

Country Status (1)

Country Link
WO (1) WO2011000408A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1441330A2 (en) * 2002-12-23 2004-07-28 Samsung Electronics Co., Ltd. Method of encoding and/or decoding digital audio using time-frequency correlation and apparatus performing the method
WO2007052088A1 (en) * 2005-11-04 2007-05-10 Nokia Corporation Audio compression

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2770506A4 (en) * 2011-10-19 2015-02-25 Panasonic Ip Corp America Encoding device and encoding method
WO2013057895A1 (en) 2011-10-19 2013-04-25 パナソニック株式会社 Encoding device and encoding method
US20140244274A1 (en) * 2011-10-19 2014-08-28 Panasonic Corporation Encoding device and encoding method
US9336787B2 (en) 2011-10-28 2016-05-10 Panasonic Intellectual Property Corporation Of America Encoding apparatus and encoding method
JPWO2013061530A1 (en) * 2011-10-28 2015-04-02 Panasonic Intellectual Property Corporation of America Encoding apparatus and encoding method
EP2772913A4 (en) * 2011-10-28 2015-05-06 Panasonic Ip Corp America Encoding apparatus and encoding method
WO2013061530A1 (en) 2011-10-28 2013-05-02 パナソニック株式会社 Encoding apparatus and encoding method
US9472200B2 (en) 2011-10-28 2016-10-18 Panasonic Intellectual Property Corporation Of America Encoding apparatus and encoding method
JP2017049620A (en) * 2011-10-28 2017-03-09 Panasonic Intellectual Property Corporation of America Encoding device and encoding method
EP3321931A1 (en) 2011-10-28 2018-05-16 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoding apparatus and encoding method
JP2018132776A (en) * 2011-10-28 2018-08-23 フラウンホッファー−ゲゼルシャフト ツァ フェルダールング デァ アンゲヴァンテン フォアシュンク エー.ファオ Encoding device and encoding method
US10134410B2 (en) 2011-10-28 2018-11-20 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding apparatus and encoding method
EP3624119A1 (en) 2011-10-28 2020-03-18 Fraunhofer Gesellschaft zur Förderung der Angewand Encoding apparatus and encoding method
US10607617B2 (en) 2011-10-28 2020-03-31 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding apparatus and encoding method
US9997171B2 (en) 2014-05-01 2018-06-12 Gn Hearing A/S Multi-band signal processor for digital audio signals

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09780019

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09780019

Country of ref document: EP

Kind code of ref document: A1