EP3252763A1 - Codeur audio à faible retard (Low-delay audio encoder)


Info

Publication number
EP3252763A1
EP3252763A1
Authority
EP
European Patent Office
Prior art keywords
audio
signal
frame
encoded
audio signal
Prior art date
Legal status
Withdrawn
Application number
EP16171853.1A
Other languages
German (de)
English (en)
Inventor
Adriana Vasilache
Anssi RÄMÖ
Current Assignee
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to EP16171853.1A
Publication of EP3252763A1
Status: Withdrawn

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16: Vocoder architecture
    • G10L19/18: Vocoders using multiple modes
    • G10L19/22: Mode decision, i.e. based on audio signal content versus external parameters

Definitions

  • the example and non-limiting embodiments of the present invention relate to very low-delay coding of audio signals at high sound quality.
  • When such an audio coding technique is applied in an audio processing system that involves e.g. capturing and processing an audio signal, encoding the captured/processed audio signal, transmitting the encoded audio signal from one entity to another, decoding the received encoded audio signal and reproducing the decoded audio signal, the overall processing delay typically increases clearly beyond the mere coding delay, thereby rendering such audio coding techniques unsuitable for applications that cannot tolerate long latency, such as telephony, wireless microphones or audio co-creation systems.
  • Speech coding techniques such as adaptive multi-rate (AMR), adaptive multi-rate wideband (AMR-WB) and 3GPP enhanced voice services (EVS) employ coding delay in the range of 25 to 32 ms, which makes them somewhat better suited for some latency-critical applications.
  • these coding techniques are speech coding techniques that operate on bandwidth-limited audio signals at relatively low bit-rates, thereby providing an audio quality that is not suited for applications that require high-quality full-band audio.
  • there are also speech coding techniques, such as ITU-T G.726, G.728 and G.722, that enable very low coding delay, even in a range below 1 ms, but these coding techniques likewise operate on voice-band audio (e.g. at 8 or 16 kHz sampling frequency) and provide a rather modest compression ratio.
  • Some recently introduced audio coding techniques such as Opus (in a low-delay mode) and AAC-ULD enable relatively low coding delay in a range from 2.5 to 20 ms for full-band audio at a relatively good sound quality.
  • the AAC-ULD coding technique enables good sound quality using a coding delay of approximately 8 ms at bit-rates around 72 to 96 kilobits per second (kbps) or using a coding delay of approximately 2 ms at bit-rates around 128 to 192 kbps.
  • a method for encoding a frame of an input audio signal that comprises a time series of input samples into a frame of an encoded audio signal comprising encoding said frame of the input audio signal using at least two of a plurality of audio encoding modes, wherein each of said plurality of audio encoding modes is arranged to encode the frame of the input audio signal into a respective encoded signal, wherein said plurality of audio encoding modes include at least a first audio encoding mode that comprises linear predictive filtering of said time series of input samples using linear predictive filter coefficients computed using a backward prediction into a residual signal that comprises a respective time series of residual samples and quantizing the time series of residual samples, and a second audio encoding mode that comprises directly quantizing the time series of input samples, and selecting, in accordance with a mode selection rule, one of the respective encoded signals as the frame of the encoded audio signal.
  • the selecting one of the respective encoded signals as the frame of the encoded audio signal may comprise: computing a respective distortion value for each of said respective encoded signals; and selecting the respective encoded signal that results in the smallest distortion value as the frame of the encoded audio signal.
  • the computing of a distortion value for a given respective encoded signal may comprise: creating a reconstructed audio signal on basis of the given respective encoded signal; and computing the distortion value as a value that is indicative of the difference between said frame of the input audio signal and the reconstructed audio signal.
  • the first audio encoding mode may comprise computing the linear predictive filter coefficients on basis of a reconstructed audio signal derived on basis of one or more frames of encoded audio signal that immediately precede said frame of the input audio signal.
  • the first audio encoding mode may comprise encoding said time series of the residual samples by using a first gain-shape encoder to generate a first gain and first relative sample values that represent said frame of the residual signal.
  • the first audio encoding mode may comprise quantizing the first gain and the first relative sample values that represent said frame of the residual signal by using a first pyramidally truncated lattice quantizer.
  • the second audio encoding mode may comprise encoding said time series of the input samples by using a second gain-shape encoder to generate a second gain and second relative sample values that represent said frame of the input audio signal.
  • the second audio encoding mode may comprise quantizing the second gain and the second relative sample values that represent said frame of the input audio signal by using a second pyramidally truncated lattice quantizer.
  • the second gain-shape encoder may comprise the first gain-shape encoder; and the second pyramidally truncated lattice quantizer may comprise the first pyramidally truncated lattice quantizer.
  • the method may further comprise providing an indication of the selected audio encoding mode in said frame of the encoded audio signal.
  • a method for decoding a frame of an encoded audio signal into a frame of a reconstructed audio signal that comprises a time series of output samples comprising decoding said frame of the encoded audio signal with one of a plurality of audio decoding modes, wherein said plurality of audio decoding modes include at least a first audio decoding mode and a second audio decoding mode, wherein the first audio decoding mode comprises dequantizing encoded residual parameters received in said frame of the encoded audio signal into a frame of reconstructed residual signal that comprises a time series of reconstructed residual samples and linear predictive filtering of said time series of reconstructed residual samples into said time series of output samples using linear predictive filter coefficients computed using a backward prediction, and wherein the second audio decoding mode comprises directly dequantizing encoded signal-domain parameters received in said frame of the encoded audio signal into said time series of output samples.
  • the method may further comprise: receiving an indication of one of the plurality of audio encoding modes; and decoding said frame of the encoded audio signal using one of the plurality of audio decoding modes in accordance with said received indication.
  • the first audio decoding mode may comprise computing the linear predictive filter coefficients on basis of a plurality of samples of reconstructed audio signal that immediately precede said frame of the reconstructed audio signal.
  • the encoded residual parameters may comprise a first gain and first relative sample values that represent said frame of the reconstructed residual signal; and the first audio decoding mode may comprise decoding said first gain and said first relative sample values using a first gain-shape decoder.
  • the first audio decoding mode may comprise dequantizing the first gain and the first relative sample values by using a first pyramidally truncated lattice quantizer.
  • the encoded signal-domain parameters may comprise a second gain and second relative sample values that represent said frame of the reconstructed audio signal; and the second audio decoding mode may comprise decoding said second gain and said second relative sample values using a second gain-shape decoder.
  • the second audio decoding mode may comprise dequantizing the second gain and the second relative sample values by using a second pyramidally truncated lattice quantizer.
  • the second gain-shape encoder may comprise the first gain-shape encoder; and the second pyramidally truncated lattice quantizer may comprise the first pyramidally truncated lattice quantizer.
  • an apparatus for encoding a frame of an input audio signal that comprises a time series of input samples into a frame of an encoded audio signal configured to: encode said frame of the input audio signal using at least two of a plurality of audio encoding modes, wherein each of said plurality of audio encoding modes is arranged to encode the frame of the input audio signal into a respective encoded signal, wherein said plurality of audio encoding modes include at least a first audio encoding mode configured to linear predictive filter said time series of input samples using linear predictive filter coefficients computed using a backward prediction into a residual signal that comprises a respective time series of residual samples and quantize the time series of residual samples, and a second audio encoding mode configured to directly quantize the time series of input samples, and select, in accordance with a mode selection rule, one of the respective encoded signals as the frame of the encoded audio signal.
  • an apparatus for decoding a frame of an encoded audio signal into a frame of a reconstructed audio signal that comprises a time series of output samples configured to decode said frame of the encoded audio signal with one of a plurality of audio decoding modes, wherein said plurality of audio decoding modes include at least a first audio decoding mode and a second audio decoding mode, wherein the first audio decoding mode is configured to dequantize encoded residual parameters received in said frame of the encoded audio signal into a frame of reconstructed residual signal that comprises a time series of reconstructed residual samples and linear predictive filter said time series of reconstructed residual samples into said time series of output samples using linear predictive filter coefficients computed using a backward prediction, and wherein the second audio decoding mode is configured to directly dequantize encoded signal-domain parameters received in said frame of the encoded audio signal into said time series of output samples.
  • an apparatus for encoding a frame of an input audio signal that comprises a time series of input samples into a frame of an encoded audio signal comprising audio encoding means for encoding said frame of the input audio signal using at least two of a plurality of audio encoding modes, wherein each of said plurality of audio encoding modes is arranged to encode the frame of the input audio signal into a respective encoded signal, wherein said plurality of audio encoding modes include at least a first audio encoding mode that comprises linear predictive filtering of said time series of input samples using linear predictive filter coefficients computed using a backward prediction into a residual signal that comprises a respective time series of residual samples and quantizing the time series of residual samples, and a second audio encoding mode that comprises directly quantizing the time series of input samples, and selection means for selecting, in accordance with a mode selection rule, one of the respective encoded signals as the frame of the encoded audio signal.
  • an apparatus for decoding a frame of an encoded audio signal into a frame of a reconstructed audio signal that comprises a time series of output samples comprising audio decoding means for decoding said frame of the encoded audio signal with one of a plurality of audio decoding modes, wherein said plurality of audio decoding modes include at least a first audio decoding mode and a second audio decoding mode, wherein the first audio decoding mode comprises dequantizing encoded residual parameters received in said frame of the encoded audio signal into a frame of reconstructed residual signal that comprises a time series of reconstructed residual samples and linear predictive filtering of said time series of reconstructed residual samples into said time series of output samples using linear predictive filter coefficients computed using a backward prediction, and wherein the second audio decoding mode comprises directly dequantizing encoded signal-domain parameters received in said frame of the encoded audio signal into said time series of output samples.
  • an apparatus for encoding a frame of an input audio signal that comprises a time series of input samples into a frame of an encoded audio signal comprises at least one processor; and at least one memory including computer program code, which when executed by the at least one processor, causes the apparatus to: encode said frame of the input audio signal using at least two of a plurality of audio encoding modes, wherein each of said plurality of audio encoding modes is arranged to encode the frame of the input audio signal into a respective encoded signal, wherein said plurality of audio encoding modes include at least a first audio encoding mode configured to linear predictive filter said time series of input samples using linear predictive filter coefficients computed using a backward prediction into a residual signal that comprises a respective time series of residual samples and quantize the time series of residual samples, and a second audio encoding mode configured to directly quantize the time series of input samples, and select, in accordance with a mode selection rule, one of the respective encoded signals as the frame of the encoded audio signal.
  • an apparatus for decoding a frame of an encoded audio signal into a frame of a reconstructed audio signal that comprises a time series of output samples comprising at least one processor; and at least one memory including computer program code, which when executed by the at least one processor, causes the apparatus to: decode said frame of the encoded audio signal with one of a plurality of audio decoding modes, wherein said plurality of audio decoding modes include at least a first audio decoding mode and a second audio decoding mode, wherein the first audio decoding mode is configured to dequantize encoded residual parameters received in said frame of the encoded audio signal into a frame of reconstructed residual signal that comprises a time series of reconstructed residual samples and linear predictive filter said time series of reconstructed residual samples into said time series of output samples using linear predictive filter coefficients computed using a backward prediction, and wherein the second audio decoding mode is configured to directly dequantize encoded signal-domain parameters received in said frame of the encoded audio signal into said time series of output samples.
  • a computer program comprising computer readable program code configured to cause performing at least a method according to the example embodiment described in the foregoing when said program code is executed on a computing apparatus.
  • FIG. 1 schematically illustrates a block diagram of some components and/or entities of an audio processing system 100.
  • the audio processing system comprises an audio capturing entity 110 for capturing an input audio signal 115 that represents at least one sound, an audio encoding entity 120 for encoding the input audio signal 115 into an encoded audio signal 125, an audio decoding entity 130 for decoding the encoded audio signal 125 obtained from the audio encoding entity into a reconstructed audio signal 135, and an audio reproduction entity 140 for playing back the reconstructed audio signal 135.
  • the audio capturing entity 110 may comprise e.g. a microphone, an arrangement of two or more microphones or a microphone array, each operable for capturing a respective sound signal.
  • the audio capturing entity 110 serves to process one or more sound signals that each represent an aspect of the captured sound into the (single-channel) input audio signal 115 for provision to the audio encoding entity 120 and/or for storage in a storage means for subsequent use.
  • the audio encoding entity 120 employs an audio coding algorithm, referred to herein as an audio encoder, to process the input audio signal 115 into the encoded audio signal 125.
  • the audio encoder may be considered to implement a transform from a signal domain (the input audio signal 115) to the compressed domain (the encoded audio signal 125).
  • the audio encoding entity 120 may further include a pre-processing entity for processing the input audio signal 115 from a format in which it is received from the audio capturing entity 110 into a format suited for the audio encoder. This pre-processing may involve, for example, level control of the input audio signal 115 and/or modification of frequency characteristics of the input audio signal 115 (e.g. low-pass, high-pass or bandpass filtering).
  • the pre-processing may be provided as a pre-processing entity that is separate from the audio encoder, as a sub-entity of the audio encoder or as a processing entity whose functionality is shared between a separate pre-processing and the audio encoder.
  • the audio decoding entity 130 employs an audio decoding algorithm, referred to herein as an audio decoder, to process the encoded audio signal 125 into the reconstructed audio signal 135.
  • the audio decoder may be considered to implement a transform from the encoded domain (the encoded audio signal 125) back to the signal domain (the reconstructed audio signal 135).
  • the audio decoding entity 130 may further include a post-processing entity for processing the reconstructed audio signal 135 from a format in which it is received from the audio decoder into a format suited for the audio reproduction entity 140. This post-processing may involve, for example, level control of the reconstructed audio signal 135 and/or modification of frequency characteristics of the reconstructed audio signal 135 (e.g. low-pass, high-pass or bandpass filtering).
  • the post-processing may be provided as a post-processing entity that is separate from the audio decoder, as a sub-entity of the audio decoder or as a processing entity whose functionality is shared between a separate post-processing and the audio decoder.
  • the audio reproduction entity 140 may comprise, for example, headphones, a headset, a loudspeaker or an arrangement of one or more loudspeakers.
  • the audio processing system 100 may include a storage means for storing pre-captured or pre-created audio signals, among which the audio input signal for provision to the audio encoding entity 120 can be selected.
  • the audio processing system 100 may comprise a storage means for storing the reconstructed audio signal 135 for subsequent analysis, processing, playback and/or transmission to a further entity.
  • the dotted vertical line in Figure 1 serves to denote that, typically, the audio encoding entity 120 and the audio decoding entity 130 are provided in separate devices that may be connected to each other via a network or via a transmission channel.
  • the network/channel may enable a wireless connection, a wired connection or a combination of the two between the audio encoding entity 120 and the audio decoding entity 130.
  • the audio encoding entity 120 may further comprise a (first) network interface for encapsulating the encoded audio signal 125 into a sequence of protocol data units (PDUs) for transfer to the decoding entity 130 over a network/channel, whereas the audio decoding entity 130 may further comprise a (second) network interface for decapsulating the encoded audio signal 125 from the sequence of PDUs received from the audio encoding entity 120 over the network/channel.
  • Figure 2 illustrates a block diagram of some components and/or entities of an audio encoder 121 that may be provided as part of the audio encoding entity 120 according to an example.
  • the audio encoder 121 combines encoding in a signal domain and in an excitation domain to enable high sound quality in combination with a low delay, as will be described in more detail in examples in the following.
  • the audio encoding entity 120 may include further components or entities in addition to the audio encoder 121, e.g. the pre-processing entity referred to in the foregoing, which may be arranged to process the input audio signal 115 before passing it to the audio encoder 121.
  • the audio encoder 121 carries out encoding of the input audio signal 115 into the encoded audio signal 125, i.e. the audio encoder 121 implements a transform from the signal domain to the encoded domain.
  • the audio encoder 121 may be arranged to process the input audio signal 115 arranged into a sequence of input frames, each input frame including a digital audio signal at a predefined sampling frequency and comprising a time series of input samples.
  • the audio encoder 121 employs a fixed predefined frame length.
  • the frame length may be a selectable frame length that may be selected from a plurality of predefined frame lengths, or the frame length may be an adjustable frame length that may be selected from a predefined range of frame lengths.
  • a frame length may be defined as the number of samples L included in the frame, which at the predefined sampling frequency maps to a corresponding duration in time.
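  • for example, at a 48 kHz sampling frequency a frame length of L = 48 samples maps to a duration of 48 / 48000 s = 1 ms.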
  • the audio encoder 121 includes two signal paths: a first signal path that involves a linear predictive coding (LPC) encoder 122 followed by a residual encoder 124, and a second signal path that involves a signal-domain encoder 126, which can also be referred to as a time sample domain encoder.
  • LPC encoding is a coding technique well known in the art and it makes use of short-term redundancies in the input audio signal 115.
  • the LPC encoder 122 carries out an LPC encoding procedure to process the input audio signal 115 into a residual signal 123, which is provided as input to the residual encoder 124.
  • the residual encoder 124 carries out a residual encoding procedure to process the residual signal 123 into a first encoded signal 125-1 for provision to the selection entity 128.
  • the signal-domain encoder 126 carries out an input signal encoding procedure to process the input audio signal 115 into a second encoded signal 125-2 for provision to the selection entity 128.
  • the selection entity 128 further receives the input audio signal 115 and carries out selection of one of the first and second encoded signals 125-1, 125-2 as the encoded audio signal 125.
  • the input audio signal 115 is processed into the respective encoded signal 125-1, 125-2 frame by frame.
  • the LPC encoder 122 carries out the LPC encoding for a frame of input audio signal 115 and produces a corresponding frame of the residual signal 123, which in turn is processed by the residual encoder 124 into a corresponding frame of the first encoded signal 125-1.
  • the signal-domain encoder 126 processes the frame of input audio signal 115 into a corresponding frame of the second encoded signal 125-2.
  • the first signal path constitutes a first audio encoding mode and the second signal path constitutes a second audio encoding mode.
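  • For illustration, the frame-level flow through the two signal paths and the selection may be sketched as follows (a minimal Python sketch; the function names and the mean-squared-deviation selection criterion are assumptions drawn from the description, not a normative implementation):

    import numpy as np

    def encode_frame(x, lpc_analysis, residual_encode, signal_encode, reconstruct):
        # x: one frame (time series of L input samples) of the input audio signal 115.
        # First signal path (first audio encoding mode): LPC analysis filtering
        # into the residual signal 123, then residual encoding into 125-1.
        residual = lpc_analysis(x)
        enc1 = residual_encode(residual)
        # Second signal path (second audio encoding mode): direct encoding of
        # the time-domain samples into 125-2.
        enc2 = signal_encode(x)
        # Selection entity 128: reconstruct locally and keep the candidate with
        # the smaller distortion (here a mean squared deviation, MSD).
        d1 = np.mean((x - reconstruct(enc1, mode=0)) ** 2)
        d2 = np.mean((x - reconstruct(enc2, mode=1)) ** 2)
        # The frame of the encoded audio signal 125 carries the selected signal
        # together with a coding mode indication.
        return (enc1, 0) if d1 < d2 else (enc2, 1)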
  • the first and second signal paths (i.e. the first and second audio encoding modes, respectively) outlined above and described in more detail in the following serve as non-limiting examples and hence one or both of the first and second signal paths may include additional processing components or entities.
  • the first signal path may further comprise a long-term prediction (LTP) encoder that encodes the residual signal 123 provided by the LPC encoder 122 into a second residual signal for provision instead of the residual signal 123 to the residual encoder 124 for residual encoding therein.
  • LTP encoding is a coding technique well known in the art and makes use of long(er)-term redundancies in the input audio signal (e.g. the periodicity of a voice of a human subject).
  • the LTP encoder may provide an improvement for encoding of input audio signals 115 that include a periodic or a quasi-periodic signal component whose periodicity falls into the range of long(er)-term redundancies (e.g. a voice of a human subject).
  • the LPC encoder 122 carries out an LPC analysis based on past values of the reconstructed audio signal 135 using a backward prediction technique known in the art.
  • a 'local' copy of the reconstructed audio signal 135 may be stored in a past audio buffer, which may be provided e.g. in a memory in the audio encoder 121 or in the LPC encoder 122, thereby making the reconstructed audio signal 135 available for the LPC analysis in the LPC encoder 122.
  • the references to the reconstructed audio signal 135 in context of the audio encoder 121 refer to the local copy available therein. This aspect will be described in more detail later below.
  • the LPC filter coefficients $a_i$, $i = 0{:}K_{LPC}$ (with $a_0 = 1$), may be derived using the backward prediction, e.g. by minimizing the prediction error norm $\left\| \hat{x}(t') - \sum_{i=1}^{K_{LPC}} a_i\,\hat{x}(t'-i) \right\|$ over the analysis window $t' \in (t - N_{LPC}) : (t - 1)$, where $N_{LPC}$ denotes the analysis window length (in number of samples), $\hat{x}(t - N_{LPC} : t - 1)$ denotes a signal reconstructed on basis of one or more past frames of the encoded audio signal, i.e. the most recent samples of the reconstructed audio signal 135, and $\|\cdot\|$ denotes an applied norm, e.g. the Euclidean norm.
  • the backward prediction computes LPC filter coefficients on basis of past samples of the reconstructed audio signal and carries out LPC analysis filtering for a frame of the input audio signal 115 using the computed LPC filter coefficients to produce a corresponding frame of the residual signal 123.
  • the LPC analysis filtering involves processing a time series of input samples into a corresponding time series of residual samples.
  • the LPC analysis filtering to compute the residual signal 123 on basis of the input audio signal 115 may be carried out e.g. as $r(t) = x(t) - \sum_{i=1}^{K_{LPC}} a_i\, x(t-i)$, $t = 1{:}L$, where L denotes the frame length (in number of samples).
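  • A minimal sketch of the backward-adaptive LPC analysis follows, assuming a windowed autocorrelation method with a Levinson-Durbin recursion; the patent only requires some backward prediction over the analysis window, so the window shape, this particular route and the use of past reconstructed samples as filter memory are assumptions:

    import numpy as np

    def backward_lpc_analysis(past_recon, frame, k_lpc=40):
        # Compute LPC coefficients from past *reconstructed* samples only, so
        # the decoder can derive an identical set without transmitted LPC data.
        w = past_recon * np.hanning(len(past_recon))      # windowed analysis buffer
        r = np.array([np.dot(w[:len(w) - i], w[i:]) for i in range(k_lpc + 1)])
        r[0] += 1e-9                                      # numerical safety
        a = np.zeros(k_lpc + 1)
        a[0] = 1.0
        err = r[0]
        for m in range(1, k_lpc + 1):                     # Levinson-Durbin recursion
            acc = r[m] + np.dot(a[1:m], r[1:m][::-1])
            k = -acc / err
            a[1:m] = a[1:m] + k * a[1:m][::-1]
            a[m] = k
            err *= (1.0 - k * k)
        pred = -a[1:]                                     # predictor coefficients a_1..a_K
        # Analysis filtering r(t) = x(t) - sum_i a_i x(t-i); the memory for
        # samples preceding the frame is taken from the reconstructed past.
        buf = np.concatenate([past_recon[-k_lpc:], frame])
        res = np.empty(len(frame))
        for t in range(len(frame)):
            res[t] = buf[k_lpc + t] - np.dot(pred, buf[t:t + k_lpc][::-1])
        return pred, res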
  • the LPC encoder 122 passes the residual signal 123 to the residual encoder 124 for computation of the first encoded signal 125-1 therein.
  • the LPC encoder 122 may further pass the LPC filter coefficients computed therein to the residual encoder 124 for subsequent forwarding to the selection entity 128 or the LPC encoder 122 may pass the computed LPC filter coefficients directly to the selection entity 128.
  • the backward prediction in the LPC encoder 122 employs a predefined window length, denoted as N_LPC, implying that the backward prediction bases the LPC analysis on the N_LPC most recent samples of the reconstructed audio signal 135.
  • the analysis window covers the 608 most recent samples of the reconstructed audio signal 135, which at a sampling frequency of 48 kHz corresponds to approximately 12.7 ms.
  • a shorter or longer window may be employed instead, e.g. a window having a duration of 16 ms or a duration selected from the range 12 to 30 ms.
  • a suitable length of the analysis window depends also on the existence and/or characteristics of other encoding components employed in the first audio encoding mode.
  • the first audio encoding mode may, additionally, involve LTP referred to in the foregoing, and the range of delays considered by the LTP encoder may have an effect on the most appropriate choice for the temporal length of the analysis window for the backward predictive LPC analysis.
  • the analysis window has a predefined shape, which may be selected in view of desired LPC analysis characteristics.
  • Several analysis windows for the LPC analysis applicable for the LPC encoder 122 are known in the art, e.g. a (modified) Hamming window and a (modified) Hanning window, as well as hybrid windows such as one specified in the ITU-T Recommendation G.728 (section 3.3).
  • the LPC encoder 122 employs a predefined LPC model order, denoted as K_LPC, resulting in a set of K_LPC LPC filter coefficients. Since the LPC analysis in the LPC encoder 122 relies on past values of the reconstructed audio signal 135, there is no need to transmit parameters descriptive of the computed LPC filter coefficients to the decoding entity 130; instead, the decoding entity 130 is able to compute an identical set of LPC filter coefficients for LPC synthesis filtering therein on basis of the reconstructed audio signal 135 available in the audio decoding entity 130.
  • a high LPC model order K_LPC may be employed since it does not have an effect on the resulting bit-rate of the encoded audio signal 125, thereby enabling accurate modeling of the spectral envelope of the input audio signal 115, especially for input audio signals 115 that include a periodic or a quasi-periodic signal component.
  • the required computing capacity increases with increasing LPC model order K_LPC, and hence selection of the most appropriate LPC model order K_LPC for a given use case may involve a trade-off between the desired accuracy of modeling the spectral envelope of the input audio signal 115 and the available computational resources.
  • the LPC model order K_LPC may be selected as a value between 30 and 60.
  • the residual encoder 124 carries out a residual encoding procedure that involves computing the first encoded signal 125-1 on basis of the residual signal 123 received from the LPC encoder 122.
  • the residual encoding may employ, for example, a gain-shape coding technique (e.g. a gain-shape encoder) known in the art, where the relative amplitudes of samples in a frame of the residual signal 123 are encoded separately from the gain of the frame of the residual signal 123.
  • the encoded residual parameters for a frame of the residual signal 123 hence include a vector v_r (or two or more sub-vectors v_r,i) of amplitude values and a gain value g_r, where a reconstructed frame of the residual signal 123 can be formed by multiplying each amplitude value of the vector v_r (or the two or more sub-vectors v_r,i) by the gain value g_r.
  • the gain-shape coding technique makes use of pyramidally truncated lattice quantization in generating quantized values of the vector v_r (or the sub-vectors v_r,i), whereas the quantized value of the gain g_r may be generated separately, e.g. by using a suitable scalar quantizer.
  • a coding technique different from the gain-shape coding and/or quantization technique different from the lattice quantization may be employed instead.
  • the lattice quantization has the advantage that it enables a computationally feasible approach for encoding relatively long vectors (e.g. 48 samples or even more) at a good quantization accuracy without the need to store large codebooks for the residual encoder 124.
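  • As a simplified illustration of the gain-shape principle, the sketch below quantizes the shape by rounding to the integer lattice Z^L and the gain with a separate scalar quantizer; the pyramidal truncation of the lattice and the gain codebook values are omitted or assumed, so this is not the patented quantizer:

    import numpy as np

    GAIN_LEVELS = np.geomspace(1e-4, 1.0, 64)   # assumed scalar gain codebook

    def gain_shape_encode(frame):
        # Split a frame (numpy array) into a gain g and relative sample values v
        # such that frame is approximately g * v.
        g = max(np.max(np.abs(frame)) / 4.0, 1e-12)        # scale shape to small integers
        gi = int(np.argmin(np.abs(GAIN_LEVELS - g)))       # scalar-quantized gain index
        v = np.round(frame / GAIN_LEVELS[gi]).astype(int)  # lattice point in Z^L
        return gi, v

    def gain_shape_decode(gi, v):
        # Reconstruction multiplies each relative sample value by the gain.
        return GAIN_LEVELS[gi] * v.astype(float)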
  • the residual encoder 124 passes the encoded parameters that are descriptive of the residual signal 123 as the first encoded signal 125-1 to the selection entity 128. In a scenario where the residual encoder 124 has received the LPC filter coefficients from the LPC encoder 122, it may further pass the LPC filter coefficients to the selection entity 128 together with the first encoded signal 125-1.
  • the zero-input response of the LPC analysis filter derived in the LPC encoder 122 can be removed from the residual signal 123 before encoding the residual signal 123 in the residual encoder 124.
  • the zero-input response removal may be provided, for example, as part of the LPC encoder 122 (before passing the residual signal 123 obtained by the LPC analysis filtering to the residual encoder 124) or in the residual encoder 124 (before carrying out the encoding procedure therein).
  • the zero input response may be computed e.g. as $z(t) = \sum_{i=1}^{K_{LPC}} a_i\, z(t-i)$, $t = 1{:}L$, with $z(t) = \hat{x}(t)$ for $t \leq 0$, where $a_i$, $i = 1{:}K_{LPC}$, denote the LPC filter coefficients, L denotes the frame length (in number of samples), and $\hat{x}(t - K_{LPC} + 1 : t)$ denotes a signal reconstructed on basis of one or more past frames of the encoded audio signal, i.e. the most recent samples of the reconstructed audio signal 135.
  • the computation of the zero input response is a recursive process: for the first sample of the zero input response all x ( t ) refer to past samples of the reconstructed audio signal 135, whereas the following samples of the zero input response are computed at least in part using signal samples computed for the zero input response.
  • the calculated zero input response is added back to the reconstructed audio signal 135. Consequently, also in the audio decoder, after reconstructing the residual signal therein and filtering it through the LPC synthesis filter, the zero input response is added to the reconstructed audio signal 135, as described in the following.
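  • A minimal sketch of the recursive zero input response computation described above (the predictor-coefficient convention follows the analysis sketch given earlier; names are assumptions):

    import numpy as np

    def zero_input_response(pred, past_recon, L):
        # pred: predictor coefficients a_1..a_K; past_recon: filter memory,
        # i.e. at least K most recent samples of the reconstructed signal 135.
        k = len(pred)
        buf = list(past_recon[-k:])
        zir = []
        for _ in range(L):
            # The first sample depends only on past reconstructed samples; the
            # following samples partly on earlier zero-input-response samples.
            y = sum(pred[i - 1] * buf[-i] for i in range(1, k + 1))
            zir.append(y)
            buf.append(y)
        return np.array(zir)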
  • In the second audio encoding mode, the signal-domain encoder 126, also referred to as the time sample encoder 126 (as described in the foregoing), carries out an encoding procedure that involves computing the second encoded signal 125-2 directly on basis of the input audio signal 115.
  • the signal-domain encoder 126 may directly encode and/or quantize the time series of input samples, i.e. the input samples that constitute a frame of the input audio signal 115, into encoded signal-domain parameters that are descriptive of the frame of the input audio signal 115.
  • the signal-domain encoder 126 further passes the encoded signal-domain parameters as the second encoded signal 125-2 to the selection entity 128.
  • the signal-domain encoder 126 employs the same or similar coding technique as applied in the residual encoder 124. Such an approach enables efficient re-use of components within the audio encoder 121 while enabling high quality of the reconstructed audio.
  • the signal-domain encoder 126 may employ a gain-shape coding technique (e.g. a gain-shape encoder) known in the art (as outlined in the foregoing), wherein the vector of amplitude values is denoted as v_s (or two or more sub-vectors denoted as v_s,i) and the gain value is denoted as g_s, and use the pyramidally truncated lattice quantization (e.g. the Z48 lattice) in generating quantized values of the vector v_s (or the sub-vectors v_s,i), together with a suitable separate scalar quantizer for generating the quantized value of the gain g_s.
  • the signal-domain encoder 126 employs a coding technique and/or quantization technique different from those employed in the residual encoder 124. While this approach would fall short of providing the benefit that arises from sharing the respective component(s) with the residual encoder 124, on the other hand it may enable tailoring the respective coding techniques and/or quantization techniques employed in the residual encoder 124 and the signal-domain encoder 126 in accordance with characteristics of the respective input signals these coding entities are arranged to process.
  • the selection entity 128 receives, for each frame, the first and second encoded signals 125-1, 125-2 together with the input audio signal 115 and the LPC filter coefficients computed in the LPC encoder 122. Based at least in part on this information, the selection entity 128 selects one of the first and second encoded signals 125-1, 125-2 for provision in the encoded audio signal 125.
  • the selection entity 128 computes a first distortion value D1 on basis of the first encoded signal 125-1 and the input audio signal 115, which first distortion value D1 is descriptive of the difference between the input audio signal 115 and a first reconstructed audio signal that is derivable on basis of the first encoded signal 125-1.
  • the selection entity 128 derives the first reconstructed audio signal by carrying out LPC synthesis filtering of a reconstructed residual signal by using the LPC filter coefficients derived for the current frame in the LPC encoder 122.
  • the reconstructed residual signal may be received as side information from the residual encoder 124 or the selection entity 128 may apply the encoded parameters carried in the first encoded signal 125-1 to derive the reconstructed residual signal therein.
  • the selection entity 128 may compute the first distortion value D1 e.g. as a mean squared deviation (MSD) between the first reconstructed audio signal and the input audio signal 115 or as a mean absolute deviation (MAD) between the first reconstructed audio signal and the input audio signal 115.
  • the selection entity 128 further computes a second distortion value D2 on basis of the second encoded signal 125-2 and the input audio signal 115, which second distortion value D2 is descriptive of the difference between the input audio signal 115 and a second reconstructed audio signal that is derivable on basis of the second encoded signal 125-2.
  • the second reconstructed audio signal may be received as side information from the signal-domain encoder 126 or the selection entity 128 may apply the encoded parameters carried in the second encoded signal 125-2 to derive the second reconstructed audio signal therein.
  • the selection entity 128 may derive the second distortion value D2, for example, as the MSD or the MAD between the second reconstructed audio signal and the input audio signal 115.
  • the selection entity 128 may select one of the first and second encoded signals 125-1, 125-2 for the encoded audio signal 125 on basis of a comparison of the first and second distortion values D1 and D2.
  • the selection entity may select the first encoded signal 125-1 for the current frame in response to the first distortion value D1 being smaller than the second distortion value D2 (e.g. in case D1 < D2 holds true) and, conversely, select the second encoded signal 125-2 for the current frame in response to the first distortion value D1 being larger than or equal to the second distortion value D2 (e.g. in case D1 ≥ D2 holds true).
  • the selection entity 128 may select the second encoded signal 125-2 for the current frame in case the first distortion value D1 exceeds the second distortion value D2 by at least a predefined margin.
  • Application of the margin serves to avoid unnecessary frame-to-frame switching between selecting the first and second encoded signals 125-1, 125-2 for the encoded audio signal 125 by favoring the first audio encoding mode, which also involves the LPC encoding. This enhances sound quality in the reconstructed audio signal 135 by avoiding switching that is likely to result in distortion, especially at high frequencies.
  • the margin may be defined as a relative value or as an absolute value:
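  • For illustration, one possible margin rule is sketched below; the threshold values are assumptions, not values from the patent:

    def select_mode(d1, d2, rel_margin=1.1, abs_margin=0.0):
        # Favor the first audio encoding mode (LPC path): the second mode is
        # chosen only when its distortion wins by at least the margin, which
        # damps frame-to-frame switching between the modes.
        return 1 if d1 > rel_margin * d2 + abs_margin else 0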
  • the selection entity 128 appends an indication of the selected one of the first and second encoded signals 125-1, 125-2 to the selected encoded signal to provide the encoded audio signal 125 for the current frame.
  • Such indication may be referred to as a coding mode indication that serves to identify which one of the first and second audio encoding modes has been selected by the selection entity 128 to represent the current frame.
  • the coding mode indication enables the decoding entity 130 to correctly reconstruct the audio signal therein.
  • the audio encoder 121 stores at least a predefined number of most recent samples of the reconstructed audio signal 135 to enable the backward prediction in the LPC encoder 122. As described in the foregoing, this may be implemented by generating a local copy of the reconstructed audio signal 135 in the audio encoder 121 (e.g. in the selection entity 128) and storing the local copy of the reconstructed audio signal 135 in the past audio buffer in the LPC encoder 122 or otherwise within the audio encoder 121. In this regard, the past audio buffer stores at least the N_LPC most recent samples of the reconstructed audio signal 135 to cover the analysis window applied by the LPC encoder 122.
  • the selection entity 128 After having selected one of the first and second encoded signals 125-1, 125-2 for the current frame, the selection entity 128 updates the past audio buffer by discarding the L oldest samples in the past audio buffer and, depending on the selection of the first or the second encoded signal 125-1 to represent the current frame, inserting corresponding one of the first and second reconstructed audio signals in the past audio buffer to facilitate LPC analysis in the next frame.
  • Figure 3 illustrates a block diagram of some components and/or entities of an audio decoder 131 that may be provided as part of the audio decoding entity 130 according to an example.
  • the audio decoder 131 carries out decoding of the encoded audio signal 125 into the reconstructed audio signal 135, thereby serving to implement a transform from the encoded domain (back) to the signal domain and, in a way, reversing the encoding operation carried out in the audio encoder 121.
  • the audio decoder 131 processes the encoded audio signal 125 frame by frame.
  • the audio decoder 131 likewise has two signal paths: a first signal path that involves a residual decoder 134 followed by an LPC decoder 132 and a second signal path that involves a signal-domain decoder 136.
  • a frame of the encoded audio signal 125 received at the audio decoder 131 is processed through one of the first and second signal paths in accordance with the coding mode indication received in the encoded audio signal 125.
  • the first and second signal paths in the audio decoder 131 constitute first and second audio decoding modes, respectively.
  • a selection entity 138 receives the frame of encoded audio signal 125, reads the coding mode indication for the current frame, extracts the encoded signal from the frame of encoded audio signal 125, and passes the extracted encoded signal to one of the first and second signal paths in the audio decoder 131 accordingly.
  • in case the coding mode indication indicates that the encoded signal from the first signal path was selected for the current frame in the audio encoder 121, the encoded signal in the encoded audio signal 125 comprises the first encoded signal 125-1 and the selection entity 138 passes this signal to the first signal path in the audio decoder 131 for decoding according to the first audio decoding mode.
  • conversely, in case the coding mode indication indicates that the encoded signal from the second signal path was selected, the encoded signal in the encoded audio signal 125 comprises the second encoded signal 125-2 and the selection entity 138 passes this signal to the second signal path in the audio decoder 131 for decoding according to the second audio decoding mode.
  • the residual decoder 134 processes the first encoded signal 125-1 into a reconstructed residual signal 133, which is provided as input to the LPC decoder 132, which in turn carries out LPC synthesis on basis of the reconstructed residual signal 133 to output a reconstructed audio signal 135-1, which will serve as the reconstructed audio signal 135.
  • the signal-domain decoder 136 processes the second encoded signal 125-2 into a reconstructed audio signal 135-2, which will serve as the reconstructed audio signal 135.
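  • For illustration, the decoder-side dispatch driven by the coding mode indication may be sketched as follows (function names are assumptions):

    def decode_frame(payload, mode, residual_decode, lpc_synthesize, signal_decode):
        # mode is the coding mode indication read from the frame of the
        # encoded audio signal 125 by the selection entity 138.
        if mode == 0:
            res = residual_decode(payload)   # reconstructed residual signal 133
            return lpc_synthesize(res)       # reconstructed audio signal 135-1
        return signal_decode(payload)        # reconstructed audio signal 135-2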
  • the residual decoder 134 carries out a residual decoding procedure that involves computing the reconstructed residual signal 133 on basis of the first encoded signal 125-1 received from the selection entity 138.
  • a frame of the reconstructed residual signal 133 is provided as a respective time series of reconstructed residual samples.
  • the reconstructed residual signal 133 is passed to the LPC decoder 132 for LPC synthesis therein.
  • In order to enable meaningful reconstruction of the residual signal, the residual decoder 134 must employ the same or an otherwise matching residual coding technique as employed in the residual encoder 124.
  • the residual decoding procedure involves dequantizing the encoded residual parameters received as part of the encoded audio signal 125 and using the dequantized residual parameters to create a frame of the reconstructed residual signal 133, i.e. the time series of reconstructed residual samples.
  • the gain-shape coding technique (e.g. a gain-shape decoder) may be employed, where the dequantization may comprise using the received encoded residual parameters to find the vector v_r (or the two or more sub-vectors v_r,i) of amplitude values and the gain value g_r, and creation of the frame of the reconstructed residual signal 133 may comprise multiplying each amplitude value of the vector v_r (or the two or more sub-vectors v_r,i) by the gain value g_r.
  • the LPC decoder 132 carries out the LPC analysis based on past values of the reconstructed audio signal 135 using the same backward prediction technique as applied in the LPC encoder 122. Hence, the backward prediction computes LPC filter coefficients on basis of past samples of the reconstructed audio signal 135.
  • the LPC decoder further carries out LPC synthesis filtering of the reconstructed residual signal 133 by using the LPC filter coefficients derived for the current frame in the LPC decoder 132, thereby generating the reconstructed audio signal 135-1.
  • the LPC synthesis filtering in the LPC decoder 132 involves processing a time series of reconstructed residual samples into a corresponding time series of output samples that hence constitute a corresponding frame of the reconstructed audio signal 135.
  • the LPC decoder 132 may find the LPC filter coefficients for the LPC synthesis therein, for example, using the procedure outlined in the foregoing for the LPC encoder 122.
  • the LPC synthesis may be carried out e.g. as $\hat{x}(t) = \hat{r}(t) + \sum_{i=1}^{K_{LPC}} a_i\, \hat{x}(t-i)$, $t = 1{:}L$, where L denotes the frame length (in number of samples).
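  • A minimal sketch of the LPC synthesis filtering, mirroring the analysis sketch given for the encoder (the coefficient convention and names are assumptions):

    import numpy as np

    def lpc_synthesis(pred, res, past_recon):
        # pred: predictor coefficients a_1..a_K derived by the same backward
        # prediction as in the encoder; res: frame of reconstructed residual
        # samples; past_recon: at least K most recent reconstructed samples.
        k = len(pred)
        buf = list(past_recon[-k:])
        for t in range(len(res)):
            # x̂(t) = r̂(t) + sum_i a_i x̂(t-i)
            buf.append(res[t] + sum(pred[i - 1] * buf[-i] for i in range(1, k + 1)))
        return np.array(buf[k:])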
  • since the past samples of the reconstructed audio signal 135 used for the LPC analysis in the audio decoder 131 match those available in the audio encoder 121, the resulting LPC filter coefficients are also the same or similar.
  • the past values of the reconstructed audio signal 135 required for the LPC analysis in the LPC decoder 132 are stored in a past audio buffer, which may be provided e.g. in a memory in the audio decoder 131 or in the LPC decoder 132.
  • the LPC decoder 132 After having derived the reconstructed audio signal 135-1, the LPC decoder 132 further adds the zero input response of the LPC synthesis filter to the reconstructed audio signal 135-1 before using the reconstructed audio signal 135-1 from the LPC decoder 132 as the reconstructed audio signal 135 provided as output from the audio decoder 131 and before using this signal to update the past audio buffer of the audio decoder 131 (as will be described later in this text).
  • the zero input response may be calculated on basis of the reconstructed audio signal 135-1, for example, as described in the foregoing for computation of the zero input response in the audio encoder 121.
  • In the second signal path of the audio decoder 131, the signal-domain decoder 136, which may be alternatively referred to as a time sample decoder or as a time sample domain decoder, carries out a decoding procedure that involves computing the reconstructed audio signal 135-2 directly on basis of the encoded signal-domain parameters received as part of the second encoded signal 125-2 received from the selection entity 138. Consequently, a frame of the reconstructed audio signal 135-2 is provided as a respective time series of output samples. In order to enable meaningful reconstruction of the audio signal, the signal-domain decoder 136 must employ the same or an otherwise matching coding technique as employed in the signal-domain encoder 126.
  • the decoding procedure involves dequantizing the encoded signal-domain parameters and using the dequantized signal-domain parameters to create a frame of the reconstructed audio signal 135-1.
  • the gain-shape coding technique (e.g. a gain-shape decoder) may be employed, where the dequantization may comprise using the received encoded signal-domain parameters to find the vector v_s (or the two or more sub-vectors v_s,i) of amplitude values and the gain value g_s, and creation of the frame of the reconstructed audio signal 135-2 may comprise multiplying each amplitude value of the vector v_s (or the two or more sub-vectors v_s,i) by the gain value g_s.
  • the audio decoder 131 stores at least the N_LPC most recent samples of the reconstructed audio signal 135 to enable the backward prediction in the LPC decoder 132. This may be implemented by storing a sufficient number of most recent samples in the past audio buffer of the audio decoder 131. After having carried out decoding using one of the first and second decoding modes, the audio decoder 131 updates the past audio buffer therein by discarding the L oldest samples in the past audio buffer and inserting the samples of the reconstructed audio signal 135 in the past audio buffer to facilitate the LPC analysis in the next frame.
  • the audio decoder 131 carries out the LPC analysis to derive the LPC filter coefficients therein also for those frames of the audio signal that are encoded by the audio encoder 121 using the second encoding mode.
  • the LPC synthesis for such frames may be carried out by the LPC decoder 132.
  • the audio decoder 131 further carries out the LPC analysis filtering (e.g. by the LPC decoder 132) of the current frame of the reconstructed audio signal 135 to derive the respective residual signal also in the audio decoder 131.
  • the residual signal derived in the audio decoder 131 is employed as part of the memory of the LPC synthesis filter in decoding of the following frame of the encoded audio signal 125.
  • r(t) denotes the residual signal obtained (by the LPC analysis filtering) in the audio decoder 131 and $(h_1, h_2, \ldots, h_n)$ denotes the impulse response of the LPC synthesis filter.
  • the residual encoder 124 and the signal-domain encoder 126 of the audio encoder 121 employ the same or substantially the same bit-rate of the encoded audio signal to ensure constant or substantially constant bit-rate regardless of the currently employed audio encoding mode.
  • the bit-rate of the encoded audio signal may be selected, for example, from the range from 80 to 150 kilobits per second (kbps), e.g. as approximately 100 kbps, 119 kbps or 133 kbps, depending on the desired tradeoff between the required transmission bandwidth and sound quality in the reconstructed audio signal 135.
  • the encoded audio signal 125 is provided as frames of 100, 119 or 133 bits, respectively.
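  • as a worked check of this mapping: frames of 133 bits at 133 kbps imply 133000 / 133 = 1000 frames per second, i.e. 1 ms frames, which at a 48 kHz sampling frequency corresponds to L = 48 samples per frame (consistent with the 48-sample vectors mentioned in the foregoing).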
  • Tables 1, 2 and 3 in the following provide examples of the performance gain enabled by an audio coding arrangement that makes use of the audio encoder 121 and the audio decoder 131 according to respective examples.
  • Each of Tables 1, 2 and 3 provides respective signal to noise ratio (SNR) values computed for 12 test signals that comprise audio of different characteristics (identified in the first column of a table).
  • the second column of the table provides the SNR obtained by using a reference audio coding arrangement that enables only the first audio encoding mode operated at a certain bit-rate while the third column of the table provides the SNR obtained by using an audio coding arrangement that makes use of the audio encoder 121 and the audio decoder 131 arranged to operate at the same bit-rate as the reference audio coding arrangement.
  • the fourth column of the table indicates the relative increase in the SNR obtained by using the audio coding arrangement that makes use of the audio encoder 121 and the audio decoder 131 instead of the reference audio coding arrangement at the same bit-rate, while the fifth column of the table indicates the percentage of frames for which the second encoding mode has been selected by the audio encoder 121.
  • Tables 1, 2 and 3 provide this information for the two audio coding arrangements operated at 133 kbps, 119 kbps and 100 kbps, respectively.
  • the operation of the audio encoder 121 and the audio decoder 131 is described using an example that involves two audio encoding modes in the audio encoder 121 and respective two audio decoding modes in the audio decoder 131.
  • This is a non-limiting example and in other examples an arrangement where the audio encoder 121 comprises two or more audio encoding modes and the audio decoder 131 comprises respective two or more audio decoding modes may be employed instead.
  • the audio encoder 121 may include three audio encoding modes, including the first and second audio encoding modes described in the foregoing together with a third audio encoding mode that is otherwise similar to the first audio encoding mode but further includes the LTP encoder envisaged in the foregoing as an exemplifying variation of the first signal path.
  • the audio encoder 121 carries out the encoding procedure via two or more signal paths that each correspond to a respective audio encoding mode.
  • the selection entity 128 receives the encoded signals 125-k from each of the signal paths, derives respective reconstructed audio signals, and derives for each reconstructed audio signal a respective distortion value Dk that is descriptive of the difference between the input audio signal 115 and the reconstructed audio signal that is derivable on basis of the respective encoded signal 125-k.
  • Each of the distortion values Dk may be computed, for example, as the MSD or the MAD as described in the foregoing.
  • the selection entity 138 extracts the coding mode indication and the encoded signal from a frame of the encoded audio signal 125 and carries out audio decoding on basis of the extracted encoded signal using the indicated audio decoding mode.
  • Figure 4 depicts an outline of a method 200, which serves as an exemplifying method for encoding a frame of the input audio signal 115 that comprises a time series of input samples into a corresponding frame of the encoded audio signal 125 according to an example.
  • the method 200 commences from encoding the frame of the input audio signal 115 using at least two of a plurality of audio encoding modes that include at least the first audio encoding mode and the second audio encoding mode.
  • the method 200 comprises encoding the frame of the input audio signal 115 using the first audio encoding mode that comprises linear predictive filtering of the time series of input samples using linear predictive filter coefficients computed using a backward prediction into a residual signal 123 that comprises a respective time series of residual samples and quantizing the time series of residual samples, as indicated in block 210.
  • the method 200 further comprises encoding the frame of input audio signal 115 using the second audio encoding mode that comprises directly quantizing the time series of input samples, as indicated in block 220.
  • the method 200 further comprises selecting one of the input audio signal 115 encoded using the first audio encoding mode and the input audio signal 115 encoded using the second audio encoding mode for provision as the encoded audio signal 125, as indicated in block 230.
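  • The two-mode procedure of blocks 210, 220 and 230 can be sketched end to end as follows. This is a sketch under stated assumptions, not the patent's implementation: the analysis order, the history length, the Levinson-Durbin analysis, and the uniform scalar quantizer standing in for the codec's actual quantizer are all choices made for the illustration. The property the sketch does preserve is that the linear predictive filter coefficients are computed using backward prediction, i.e. from previously reconstructed samples only, so the decoder can recompute them without any transmitted coefficients.

```python
import numpy as np

LPC_ORDER = 16          # assumed analysis order; not a value taken from the patent
HISTORY_LEN = 1024      # assumed length of the reconstructed-sample history
QUANT_STEP = 1.0 / 512  # step of the stand-in uniform scalar quantizer

def autocorr(x, order):
    """Autocorrelation lags 0..order of the history buffer."""
    n = len(x)
    return np.array([np.dot(x[:n - i], x[i:]) for i in range(order + 1)])

def levinson_durbin(r, order):
    """Levinson-Durbin recursion; returns the LPC polynomial a with a[0] == 1."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    e = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / e
        a[1:i] += k * a[i - 1:0:-1]
        a[i] = k
        e *= 1.0 - k * k
    return a

def backward_lpc(history, order=LPC_ORDER):
    """Coefficients from previously *reconstructed* samples only (backward
    prediction), so the decoder can recompute them itself."""
    r = autocorr(history, order)
    r[0] = r[0] * 1.0001 + 1e-9  # white-noise correction keeps 1/A(z) stable
    return levinson_durbin(r, order)

def lpc_residual(frame, history, a):
    """A(z)-filter the frame into its prediction residual."""
    p = len(a) - 1
    buf = np.concatenate([history[-p:], frame])
    return np.array([np.dot(a, buf[n + p::-1][:p + 1]) for n in range(len(frame))])

def lpc_synthesis(residual, history, a):
    """1/A(z)-filter a residual frame back into the signal domain."""
    p = len(a) - 1
    buf = np.concatenate([history[-p:], np.zeros(len(residual))])
    for n in range(len(residual)):
        buf[n + p] = residual[n] - np.dot(a[1:], buf[n + p - 1::-1][:p])
    return buf[p:]

def quantize(x, step=QUANT_STEP):
    return np.round(x / step).astype(np.int64)

def dequantize(q, step=QUANT_STEP):
    return q.astype(np.float64) * step

def encode_frame(frame, history):
    """Blocks 210-230: encode with both modes, keep the less distorted one."""
    # First mode (block 210): backward-adaptive LPC + residual quantization.
    a = backward_lpc(history)
    q_res = quantize(lpc_residual(frame, history, a))
    rec0 = lpc_synthesis(dequantize(q_res), history, a)
    # Second mode (block 220): direct quantization of the time-domain samples.
    q_sig = quantize(frame)
    rec1 = dequantize(q_sig)
    # Selection (block 230): the smaller mean squared difference wins.
    d0 = np.mean((frame - rec0) ** 2)
    d1 = np.mean((frame - rec1) ** 2)
    mode, payload, rec = (0, q_res, rec0) if d0 <= d1 else (1, q_sig, rec1)
    return (mode, payload), np.concatenate([history, rec])[-HISTORY_LEN:]
```

Note that the history is updated with the locally reconstructed frame rather than the input frame, which keeps the encoder-side filter state identical to the decoder-side state.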
  • the method 200 generalizes into encoding the input audio signal 115 using a desired number of audio encoding modes (e.g. two or more) and selecting the input audio signal 115 encoded using one of the audio encoding modes for provision as the encoded audio signal 125.
  • Figure 5 depicts an outline of a method 300, which serves as an exemplifying method for decoding a frame of the encoded audio signal 125 into a corresponding frame of the reconstructed audio signal 135 that comprises a time series of output samples according to an example.
  • the method 300 commences from receiving an indication of the employed audio encoding mode, as indicated in block 310, and decoding the encoded audio signal 125 using one of a plurality of audio decoding modes in accordance with the received indication of the employed audio encoding mode.
  • the method 300 further comprises decoding the frame of encoded audio signal 125 using the first audio decoding mode in response to the received indication indicating the first audio encoding mode, wherein the first audio decoding mode comprises dequantizing encoded residual parameters received in the frame of the encoded audio signal 125 into a frame of reconstructed residual signal 133 that comprises a time series of reconstructed residual samples and linear predictive filtering of the time series of reconstructed residual samples into the time series of output samples using linear predictive filter coefficients computed using backward prediction, as indicated in block 320.
  • the method 300 further comprises decoding the frame of encoded audio signal 125 using the second audio decoding mode in response to the received indication indicating the second audio encoding mode, wherein the second audio decoding mode comprises directly dequantizing encoded signal-domain parameters received in the frame of encoded audio signal 125 into the time series of output samples.
  • the method 300 generalizes into decoding the frame of encoded audio signal 125 using one of a plurality of audio decoding modes (including two or more audio decoding modes) in accordance with the received indication of the audio encoding mode employed by the audio encoder 121.
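  • Reusing the helpers from the encoder sketch above, a matching decoder for the method 300 dispatches on the received mode indication; because its history of decoded samples mirrors the encoder's history of reconstructed samples, the backward-predicted filter coefficients come out identical on both sides. Again a sketch under the same assumptions, not the patent's implementation; the frame length and sampling rate in the round-trip check are illustrative choices.

```python
def decode_frame(encoded, history):
    """Method 300: decode one frame according to the received mode indication."""
    mode, payload = encoded
    if mode == 0:
        # First decoding mode (block 320): dequantize the residual parameters
        # and run them through the 1/A(z) synthesis filter, with coefficients
        # recomputed by backward prediction from already-decoded samples.
        a = backward_lpc(history)
        out = lpc_synthesis(dequantize(payload), history, a)
    else:
        # Second decoding mode (block 330): directly dequantize the
        # signal-domain parameters into the time series of output samples.
        out = dequantize(payload)
    return out, np.concatenate([history, out])[-HISTORY_LEN:]

if __name__ == "__main__":
    # Round trip on a synthetic tone; 5 ms frames at 48 kHz.
    sr, flen = 48_000, 240
    x = 0.5 * np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
    enc_h, dec_h = np.zeros(HISTORY_LEN), np.zeros(HISTORY_LEN)
    for s in range(0, len(x) - flen + 1, flen):
        enc, enc_h = encode_frame(x[s:s + flen], enc_h)
        out, dec_h = decode_frame(enc, dec_h)
        assert np.allclose(out, enc_h[-flen:])  # decoder tracks encoder state
```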
  • the method 200 may be provided, for example, in the audio encoding entity 120 or in a device that operates as or implements the audio encoding entity 120.
  • the method 300 may be provided, for example, in the audio decoding entity 130 or in a device that operates as or implements the audio decoding entity 130.
  • the method 200 and/or the method 300 may be varied in a number of ways, e.g. in accordance with the examples provided in context of description of the audio encoder 121 and the audio decoder 131 in the foregoing.
  • Figure 6 illustrates a block diagram of some components of an exemplifying apparatus 400.
  • the apparatus 400 may comprise further components, elements or portions that are not depicted in Figure 6.
  • the apparatus 400 may be employed in implementing e.g. the audio encoder 121 or the audio decoder 131.
  • the apparatus 400 further comprises a processor 416 and a memory 415 for storing data and computer program code 417.
  • the memory 415 and a portion of the computer program code 417 stored therein may be further arranged to, with the processor 416, implement the function(s) described in the foregoing in context of the audio encoder 121 or the audio decoder 131.
  • the apparatus 400 comprises a communication portion 412 for communication with other devices.
  • the communication portion 412 comprises at least one communication apparatus that enables wired or wireless communication with other apparatuses.
  • a communication apparatus of the communication portion 412 may also be referred to as a respective communication means.
  • the apparatus 400 may further comprise user I/O (input/output) components 418 that may be arranged, possibly together with the processor 416 and a portion of the computer program code 417, to provide a user interface for receiving input from a user of the apparatus 400 and/or providing output to the user of the apparatus 400 to control at least some aspects of operation of the audio encoder 121 or the audio decoder 131 implemented by the apparatus 400.
  • the user I/O components 418 may comprise hardware components such as a display, a touchscreen, a touchpad, a mouse, a keyboard, and/or an arrangement of one or more keys or buttons, etc.
  • the user I/O components 418 may be also referred to as peripherals.
  • the processor 416 may be arranged to control operation of the apparatus 400 e.g. in accordance with a portion of the computer program code 417 and possibly further in accordance with the user input received via the user I/O components 418 and/or in accordance with information received via the communication portion 412.
  • Although the processor 416 is depicted as a single component, it may be implemented as one or more separate processing components.
  • Although the memory 415 is depicted as a single component, it may be implemented as one or more separate components, some or all of which may be integrated/removable and/or may provide permanent/semi-permanent/dynamic/cached storage.
  • the computer program code 417 stored in the memory 415 may comprise computer-executable instructions that control one or more aspects of operation of the apparatus 400 when loaded into the processor 416.
  • the computer-executable instructions may be provided as one or more sequences of one or more instructions.
  • the processor 416 is able to load and execute the computer program code 417 by reading the one or more sequences of one or more instructions included therein from the memory 415.
  • the one or more sequences of one or more instructions may be configured to, when executed by the processor 416, cause the apparatus 400 to carry out operations, procedures and/or functions described in the foregoing in context of the audio encoder 121 or the audio decoder 131.
  • the apparatus 400 may comprise at least one processor 416 and at least one memory 415 including the computer program code 417 for one or more programs, the at least one memory 415 and the computer program code 417 configured to, with the at least one processor 416, cause the apparatus 400 to perform operations, procedures and/or functions described in the foregoing in context of the audio encoder 121 or the audio decoder 131.
  • the computer programs stored in the memory 415 may be provided e.g. as a respective computer program product comprising at least one computer-readable non-transitory medium having the computer program code 417 stored thereon, which computer program code, when executed by the apparatus 400, causes the apparatus 400 at least to perform operations, procedures and/or functions described in the foregoing in context of the audio encoder 121 or the audio decoder 131.
  • the computer-readable non-transitory medium may comprise a memory device or a record medium such as a CD-ROM, a DVD, a Blu-ray disc or another article of manufacture that tangibly embodies the computer program.
  • the computer program may be provided as a signal configured to reliably transfer the computer program.
  • reference(s) to a processor should not be understood to encompass only programmable processors, but also dedicated circuits such as field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), signal processors, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP16171853.1A 2016-05-30 2016-05-30 Codeur audio à faible retard Withdrawn EP3252763A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP16171853.1A EP3252763A1 (fr) 2016-05-30 2016-05-30 Codeur audio à faible retard

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP16171853.1A EP3252763A1 (fr) 2016-05-30 2016-05-30 Codeur audio à faible retard

Publications (1)

Publication Number Publication Date
EP3252763A1 true EP3252763A1 (fr) 2017-12-06

Family

ID=56087174

Family Applications (1)

Application Number Title Priority Date Filing Date
EP16171853.1A Withdrawn EP3252763A1 (fr) 2016-05-30 2016-05-30 Codeur audio à faible retard

Country Status (1)

Country Link
EP (1) EP3252763A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0417739A2 (fr) * 1989-09-11 1991-03-20 Fujitsu Limited Appareil pour codage de parole utilisant un codage multimode
US20020069075A1 (en) * 1998-05-26 2002-06-06 Gilles Miet Transceiver for selecting a source coder based on signal distortion estimate
US20120093213A1 (en) * 2009-06-03 2012-04-19 Nippon Telegraph And Telephone Corporation Coding method, coding apparatus, coding program, and recording medium therefor
US20120232913A1 (en) * 2011-03-07 2012-09-13 Terriberry Timothy B Methods and systems for bit allocation and partitioning in gain-shape vector quantization for audio coding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
THOMAS R. FISCHER: "A Pyramid Vector Quantizer", IEEE TRANSACTIONS ON INFORMATION THEORY, vol. 32, July 1986 (1986-07-01), pages 568 - 583

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2811412C1 (ru) * 2020-04-28 2024-01-11 Method for encoding linear prediction coding parameters and encoding device

Similar Documents

Publication Publication Date Title
JP7244609B2 (ja) Method and system for encoding the left and right channels of a stereo sound signal, selecting between 2-subframe and 4-subframe models depending on the bit budget
US11978460B2 (en) Truncateable predictive coding
KR101125429B1 (ko) Method and apparatus for generating an enhancement layer within an audio coding system
TWI498882B (zh) Audio decoder
JP5283046B2 (ja) Selective scaling mask computation based on peak detection
JP5285162B2 (ja) Selective scaling mask computation based on peak detection
US7904292B2 (en) Scalable encoding device, scalable decoding device, and method thereof
JPWO2007116809A1 (ja) Stereo speech encoding device, stereo speech decoding device, and methods thereof
JP2011013560A (ja) Audio encoding device, audio encoding method, computer program for audio encoding, and video transmission device
JP4842147B2 (ja) Scalable encoding device and scalable encoding method
RU2715026C1 (ru) Encoding device for processing an input signal and decoding device for processing an encoded signal
EP3762923A1 (fr) Audio coding
JPWO2008132850A1 (ja) Stereo speech encoding device, stereo speech decoding device, and methods thereof
WO2010016270A1 (fr) Quantization device, encoding device, quantization method, and encoding method
JP2009512895A (ja) Signal coding and decoding based on spectral dynamics
JPWO2008132826A1 (ja) Stereo speech encoding device and stereo speech encoding method
US11176954B2 (en) Encoding and decoding of multichannel or stereo audio signals
JPWO2008090970A1 (ja) Stereo encoding device, stereo decoding device, and methods thereof
EP3252763A1 (fr) Codeur audio à faible retard
CN114097029A (zh) Packet loss concealment for DirAC-based spatial audio coding
JP5774490B2 (ja) Encoding device, decoding device, and methods thereof
WO2018073486A1 (fr) Low-delay audio coding
WO2019173195A1 (fr) Signals in transform-based audio codecs

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20180607