EP2045800A1 - Method and apparatus for transcoding - Google Patents

Method and apparatus for transcoding

Publication number
EP2045800A1
EP2045800A1 EP07117956A EP07117956A EP2045800A1 EP 2045800 A1 EP2045800 A1 EP 2045800A1 EP 07117956 A EP07117956 A EP 07117956A EP 07117956 A EP07117956 A EP 07117956A EP 2045800 A1 EP2045800 A1 EP 2045800A1
Authority
EP
European Patent Office
Prior art keywords
sampling frequency
predictive coding
linear predictive
codec format
coding coefficients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP07117956A
Other languages
German (de)
French (fr)
Inventor
Christophe Beaugeant
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Solutions and Networks Oy
Original Assignee
Nokia Siemens Networks Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Siemens Networks Oy
Priority to EP07117956A
Publication of EP2045800A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/173 Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • the present invention relates to transcoding.
  • networks of different type have been developed, like mobile GSM, UMTS, CDMA and IP, providing alternative ways to the 'classical' circuit switched network.
  • the interconnection of all these networks leads to an interoperability problem regarding transmission of speech.
  • non-compatible speech standards have been adopted in the different networks, although most of the codecs at medium rate (5-16.5 kbit/s for narrowband codecs, 5-25 kbit/s for wideband codecs) are based on the same model, Code Excited Linear Prediction (CELP).
  • CELP Code Excited Linear Prediction
  • the simplest method to provide inter-connectivity consists of decoding one codec standard compressed bitstream A and re-encoding it into the other codec standard bitstream B. This conventional method is called tandem transcoding. It suffers from several problems such as complexity, delay and degradation of speech.
  • bitstreams A and B transmit a similar set of parameters, such as Linear Prediction Coding (LPC) coefficients, pitch delays, fixed codebook indexes and fixed and adaptive gains.
  • LPC Linear Prediction Coding
  • the key idea of smart transcoding consists of avoiding the computation of parameters already available.
  • An intelligent mapping and quantization of the parameters available in bitstream A into bitstream B parameters allow the skipping of many functions and hence reduce the computation load of the transcoding. As depicted in Figure 1 , only a partial decoding is necessary to extract the parameters from bitstream A. Their mapping as well as a partial encoding then builds the accurate bitstream B.
  • LPC Linear Prediction Coefficients vector
  • An object of the present invention is thus to provide a method and an apparatus for implementing the method so as to solve the above problem or at least to alleviate it.
  • the objects of the invention are achieved by a method, a computer program product, an apparatus and a module which are characterized by what is stated in the independent claims.
  • the preferred embodiments of the invention are disclosed in the dependent claims.
  • the invention is based on recognizing the problem and on the realization that in transcoding between two codec formats employing different sampling frequencies, the LPC coefficients of the LPC filter of the target codec format can be estimated by applying a modification on the sampling frequency of the extracted LPC coefficients.
  • An advantage of the method and apparatus of the invention is that it enables smart transcoding between two codec formats employing different sampling frequencies.
  • the following embodiments are exemplary. Although the specification may refer to "an", "one", or "some" embodiment(s) in several locations, this does not necessarily mean that each such reference is to the same embodiment(s), or that the feature only applies to a single embodiment. Single features of different embodiments may also be combined to provide other embodiments.
  • the present invention is applicable to any communication system or any combination of different communication systems such as GSM (Global System for Mobile Communications), WCDMA (Wideband Code Division Multiple Access), WLAN (Wireless Local Area Network), UMTS (Universal Mobile Telecommunications System), CDMA and/or IP (Internet Protocol) standard, or any other suitable standard/non-standard communication means.
  • GSM Global System for Mobile Communications
  • WCDMA Wideband Code Division Multiple Access
  • WLAN Wireless Local Area Network
  • UMTS Universal Mobile Telecommunications System
  • IP Internet Protocol
  • the communication system may be a fixed communication system or a wireless communication system or a communication system utilizing both fixed networks and wireless networks.
  • the protocols used and the specifications of communication systems, especially in wireless communication, develop rapidly. Such a development may require extra changes to an embodiment. Therefore, all terms and expressions should be interpreted broadly and they are intended to illustrate, not to restrict, the embodiment. In the following, different embodiments will be described using, as an example, a system architecture to which the embodiments may be applied, without restricting the embodiment to such an architecture, however.
  • the invention generally deals with the process of finding an estimation of filter B(z) when LPC filter A(z) is known.
  • B̃(z) denotes the filter obtained from A(z).
  • the coefficients of the constructed filter B̃(z) can be mapped into encoder B in a similar way as in smart transcoding based on LPC mapping, thus avoiding the computation of the LPC coefficients within encoder B and accordingly saving computation load.
  • Filters A(z) and B(z) are AR models of two signals with different sampling frequencies.
  • coefficients a_i need to be extrapolated if N > M or interpolated if N < M.
  • modification of the sampling frequency comprises up-sampling the extracted linear predictive coding coefficients when the sampling frequency of the target codec (B) format is higher than the sampling frequency of the source codec (A) format.
  • the up-sampling factor is equal to the ratio of the sampling frequency of the target codec format to the sampling frequency of the source codec format.
  • modification of the sampling frequency comprises down-sampling the extracted linear predictive coding coefficients when the sampling frequency of the target codec format is lower than the sampling frequency of the source codec format.
  • the down-sampling factor is equal to the ratio of the sampling frequency of the second codec format to the sampling frequency of the first codec format.
  • the number of coefficients b̃_i can be further adjusted to the number of coefficients of target LPC filter B(z) if necessary. For instance, if M*Fs(B)/Fs(A) > N, the number of coefficients b̃_i can be restricted to N, and if M*Fs(B)/Fs(A) < N, N - M*Fs(B)/Fs(A) zeros can be added to the vector [b̃_i].
  • FIG. 2 is a block diagram of an apparatus according to an embodiment. Different modules or units 10, 20 and 30 of the apparatus may be implemented in one or more physical or logical entities. Figure 2 is a simplified diagram that only shows some elements and functional entities relevant to understanding the various embodiments described here and whose implementation may differ from what is shown. The connections shown in Figure 2 are logical connections; the actual physical connections may be different.
  • bitstream A of codec format A enters decoder A 10.
  • Decoder A 10 may be a plain decoder or a codec unit, for example.
  • bitstream A is partially decoded by extracting at least LPC coefficients from bitstream A. Other parameters, such as pitch delays, fixed codebook indexes, and fixed and adaptive gains, may also be extracted.
  • the LPC coefficients and possible other extracted parameters as well as the partially decoded bitstream (signal) are further transmitted to a frequency modification unit 30.
  • the frequency modification unit 30 applies a modification of the sampling frequency to the LPC coefficients according to the embodiments described above.
  • the partially decoded bitstream (signal) is up-sampled or down-sampled from the sampling frequency employed by source codec format A to the sampling frequency employed by target codec format B. This is also preferably done in the frequency modification unit 30.
  • the modified LPC coefficients and possible other parameters as well as the modified signal are then transmitted to encoder B 20.
  • Encoder B 20 may be a plain encoder or a codec unit, for example.
  • in encoder B 20, the modified LPC coefficients are mapped into LPC coefficients of codec format B and the partially decoded bitstream is encoded into a bitstream of codec format B using the mapped LPC coefficients.
  • the partial decoding and encoding, i.e. the extraction of LPC coefficients in decoder A and the mapping of parameters and encoding in encoder B, can be performed in a similar manner as in existing transcoding solutions. Therefore, they need not be discussed in more detail here. It should be further noted that not only existing mapping schemes but also any future mapping schemes may be utilized.
  • the modification of the sampling frequency can be implemented in many different ways. Concrete performance of smart transcoding depends on the way the modification of the sampling frequency is done.
  • One possible problem related to up-sampling and down-sampling deals with smoothing that may appear either in the low or the high frequencies of the vector [b̃_i]. Therefore, it is preferable to enhance the obtained [b̃_i] by properly resynthesizing the lower or higher frequencies.
  • a separate step may be used before the mapping step 3 above, in which an appropriate property of filter B̃(z), such as (but not restricted to) frequency response in the low and high frequency, is assured.
  • the following will now describe in more detail an implementation example according to an embodiment.
  • the example presents transcoding between AMR 12.2 kbit/s and AMR-WB 23.05 kbit/s codecs. It should be noted, however, that the use of the invention is not restricted to any particular codec format or standard or a particular mode of a given codec format.
  • FR Full Rate
  • HR Half Rate
  • EFR Enhanced Full Rate
  • AMR Adaptive Multi-Rate
  • AMR-WB Adaptive Multi Rate WideBand
  • EVRC Enhanced Variable Rate Codec
  • VMR-WB Variable-Rate Multi-Mode Wideband
  • the source codec format is AMR and the target codec format is AMR-WB.
  • the correct amount of LPC coefficients may be obtained directly by the modification of the sampling frequency.
  • a low pass filter is preferably applied to the up-sampled signal to avoid aliasing.
  • a down-sampling by a factor 2 is then achieved.
  • the total up-sampling factor is 3/2.
  • considering Fs(B) and Fs(A), the factors of the up-sampling and down-sampling could have been 8 and 5, respectively. However, the factors 3 and 2 can lead to better performance because the smoothing applied to the low or high frequencies is less pronounced.
  • the obtained [b̃_i] can be enhanced through the following exemplary processing: the zeros of filter B̃(z) are modified by taking into account the zeros of filter A(z) and of an additional estimated filter B̃_1(z).
  • B̃(z) is preferably designed so that smoothing is applied to the high frequencies and no smoothing to the low frequencies.
  • B̃_1(z) is designed conversely, with strong smoothing in the low frequencies and little smoothing in the high frequencies.
  • A(z) presents only information and zeros in the low frequencies (since LPC filter A(z) models an 8 kHz signal and has no zeros above 4 kHz). Accordingly, A(z) and B̃_1(z) are considered as two additional filters which apply a correction to the zeros of B̃(z) in the low and high frequencies, respectively. This permits an accurate estimation of B(z), providing good performance of the smart transcoding based on mapping of the LPC.
  • an up-sampling was applied to the LPC coefficients because of transcoding from 8 kHz AMR codec format to 12.8 kHz AMR-WB codec format.
  • Transcoding from e.g. AMR-WB codec format to AMR codec format can be arranged in a similar manner but by applying down-sampling to the LPC coefficients instead of up-sampling.
  • An apparatus may be implemented as one unit (e.g. a transcoding unit) or as two or more separate units that are configured to implement the functionality of the various embodiments described.
  • the term 'unit' refers generally to a physical or logical entity, such as a physical device or a part thereof or a software routine.
  • units 10, 20 and 30 may be physically separate units or implemented as one entity.
  • An apparatus can be implemented by means of a computer or corresponding digital signal processing equipment with suitable software therein, for example.
  • a computer or digital signal processing equipment preferably comprises at least a working memory (RAM) providing storage area used for arithmetical operations and a central processing unit (CPU), such as a general-purpose digital signal processor (DSP).
  • the CPU may comprise a set of registers, an arithmetic logic unit, and a control unit.
  • the control unit is controlled by a sequence of program instructions transferred to the CPU from the RAM.
  • the control unit may contain a number of microinstructions for basic operations. The implementation of microinstructions may vary depending on the CPU design.
  • the program instructions may be coded by a programming language, which may be a high-level programming language, such as C, Java, etc., or a low-level programming language, such as a machine language, or an assembler.
  • the computer may also have an operating system which may provide system services to a computer program written with the program instructions. It is also possible to use a specific integrated circuit or circuits, or corresponding components and devices for implementing the functionality according to any one of the embodiments.
  • the invention can be implemented in existing system elements, such as various communication system elements, or by using separate dedicated elements or devices in a centralized or distributed manner.
  • An example of such a system element is a media gateway or an internet protocol telephony gateway.
  • Present elements for communication systems typically comprise processors and memory that can be utilized in the functions according to the embodiments.
  • All modifications and configurations required for implementing an embodiment in existing devices may be performed as software routines, which may be implemented as added or updated software routines.
  • software can be provided as a computer program product comprising computer program code which, when run on a computer, causes the computer or corresponding arrangement to perform the functionality according to the invention as described above.
  • Such a computer program code can be stored on a computer readable medium, such as suitable memory means, e.g. a flash memory or a disc memory, from which it is loadable to the unit or units executing the program code.
  • such a computer program code implementing the invention can be loaded to the unit or units executing the computer program code via a suitable data network, for example, and it can replace or update a possibly existing program code.
  • a frequency modification unit 30 may be implemented as a module for interfacing between two codec formats.
  • a module may be a physical device, a part of a physical device or a software module, for example.
  • a module is configured to modify the sampling frequency of extracted linear predictive coding coefficients according to the various embodiments described.
  • the module may comprise an up/down-sampling unit.
  • the module is configured to receive the linear predictive coding coefficients extracted from a bitstream from a decoder and to send the linear predictive coding coefficients obtained from the modification of the sampling frequency to an encoder.
  • the module may comprise e.g. suitable input and output terminals and receiving and sending units in connection thereto.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A method and apparatus for transcoding comprising means (10) for partially decoding a first bitstream of a first codec format by extracting at least linear predictive coding coefficients from the first bitstream; means (20) for mapping the extracted linear predictive coding coefficients into linear predictive coding coefficients of a second codec format; and means (20) for encoding the partially decoded first bitstream into a second bitstream of a second codec format using the mapped linear predictive coding coefficients, wherein the apparatus comprises means (30) for modifying the sampling frequency of the extracted linear predictive coding coefficients before the mapping of the extracted linear predictive coding coefficients.

Description

    FIELD OF THE INVENTION
  • The present invention relates to transcoding.
    BACKGROUND OF THE INVENTION
  • In the last decades, networks of different types have been developed, like mobile GSM, UMTS, CDMA and IP, providing alternatives to the 'classical' circuit switched network. The interconnection of all these networks leads to an interoperability problem regarding transmission of speech. Indeed, non-compatible speech standards have been adopted in the different networks, although most of the codecs at medium rate (5-16.5 kbit/s for narrowband codecs, 5-25 kbit/s for wideband codecs) are based on the same model, Code Excited Linear Prediction (CELP). The simplest method to provide inter-connectivity consists of decoding one codec standard compressed bitstream A and re-encoding it into the other codec standard bitstream B. This conventional method is called tandem transcoding. It suffers from several problems such as complexity, delay and degradation of speech.
  • Recently, so-called 'smart transcoding' solutions have been proposed, which exploit the fact that the different standards are based on the CELP principle. They aim at reducing the complexity of the transcoding, since many functions at encoder B can be skipped, at decreasing the delay, and at enhancing the quality or at least matching the quality of normal transcoding. The basic idea is to use the redundancy between the standards to avoid recomputing parameters that have already been computed. Reference is made to Figure 1, which shows the principle of smart transcoding. When transcoding from a bitstream format of codec A into a bitstream format of codec B, bitstream A is first decoded in decoder A. The obtained decoded signal is then encoded into target format B (bitstream B) by encoder B. If both codecs are CELP codecs, bitstreams A and B transmit a similar set of parameters, such as Linear Prediction Coding (LPC) coefficients, pitch delays, fixed codebook indexes and fixed and adaptive gains. The key idea of smart transcoding consists of avoiding the computation of parameters already available. An intelligent mapping and quantization of the parameters available in bitstream A into bitstream B parameters allows many functions to be skipped and hence reduces the computation load of the transcoding. As depicted in Figure 1, only a partial decoding is necessary to extract the parameters from bitstream A. Their mapping as well as a partial encoding then builds the accurate bitstream B.
  • The article by C. Beaugeant and H. Taddei, "Quality and Computation Load Reduction achieved by applying Smart Transcoding between CELP Speech Codecs", Eusipco, Poland, September 2007, gives several examples of possible mapping of parameters in the case of G.729.A and AMR codecs.
  • One of the possible parameters mapped between speech codecs in transcoding is the Linear Prediction Coefficients vector (LPC). The mapping of the LPC coefficients is relatively straightforward when the speech codecs are applied to the signal at the same sampling frequency. A transposition of the LPCs from decoder A to encoder B leads to good quality and reduction of complexity as shown in the above-mentioned article. However, such a solution cannot be applied when codecs A and B employ different sampling frequencies. In that case if the LPC filters of codec A and B model signals of different sampling frequencies, it leads to a different number of coefficients and different meanings of the LPC coefficients. Existing solutions that provide mapping of LPC parameters for smart transcoding purposes are only based on the mapping of the LPC filter at the same sampling frequency (e.g. narrowband signal at 8 kHz).
    BRIEF DESCRIPTION OF THE INVENTION
  • An object of the present invention is thus to provide a method and an apparatus for implementing the method so as to solve the above problem or at least to alleviate it. The objects of the invention are achieved by a method, a computer program product, an apparatus and a module which are characterized by what is stated in the independent claims. The preferred embodiments of the invention are disclosed in the dependent claims.
  • The invention is based on recognizing the problem and on the realization that in transcoding between two codec formats employing different sampling frequencies, the LPC coefficients of the LPC filter of the target codec format can be estimated by applying a modification on the sampling frequency of the extracted LPC coefficients.
  • An advantage of the method and apparatus of the invention is that it enables smart transcoding between two codec formats employing different sampling frequencies.
    BRIEF DESCRIPTION OF THE DRAWINGS
  • In the following the invention will be described in greater detail by means of preferred embodiments with reference to the accompanying drawings, in which
    • Figure 1 is a block diagram showing the principle of smart transcoding; and
    • Figure 2 is a block diagram of an embodiment.
    DETAILED DESCRIPTION OF THE INVENTION
  • The following embodiments are exemplary. Although the specification may refer to "an", "one", or "some" embodiment(s) in several locations, this does not necessarily mean that each such reference is to the same embodiment(s), or that the feature only applies to a single embodiment. Single features of different embodiments may also be combined to provide other embodiments. The present invention is applicable to any communication system or any combination of different communication systems such as GSM (Global System for Mobile Communications), WCDMA (Wideband Code Division Multiple Access), WLAN (Wireless Local Area Network), UMTS (Universal Mobile Telecommunications System), CDMA and/or IP (Internet Protocol) standard, or any other suitable standard/non-standard communication means. The communication system may be a fixed communication system or a wireless communication system or a communication system utilizing both fixed networks and wireless networks. The protocols used and the specifications of communication systems, especially in wireless communication, develop rapidly. Such a development may require extra changes to an embodiment. Therefore, all terms and expressions should be interpreted broadly and they are intended to illustrate, not to restrict, the embodiment. In the following, different embodiments will be described using, as an example, a system architecture to which the embodiments may be applied, without restricting the embodiment to such an architecture, however.
  • Notation and environment: in the following, two codecs A and B based on an LPC analysis at sampling frequencies Fs(A) and Fs(B), respectively, are considered. The CELP codec family is a subset of such codecs. A transcoding scheme in which a signal s_A(t) at sampling frequency Fs(A) is encoded by encoder A is considered. In a 'classical' transcoding scheme (i.e. without smart transcoding), the signal is decoded into a PCM signal, which is resampled to the sampling frequency Fs(B) to give a signal s_B(t), and s_B(t) is then encoded by encoder B.
  • The LPC analysis within encoder A provides an autoregressive (AR) model of signal s_A(t), so that an approximation of the signal s_A(t) is given by:
    $$\hat{s}_A(t) = \sum_{i=1}^{M} a_i \, s_A(t-i).$$
  • In this example the LPC filter
    $$A(z) = \sum_{i=0}^{M} a_i z^{-i},$$
    with a_0 = 1, is considered.
  • Similarly, encoder B provides an AR estimate of s_B(t) through its LPC analysis:
    $$\hat{s}_B(t) = \sum_{i=1}^{N} b_i \, s_B(t-i).$$
  • In that case the LPC filter is:
    $$B(z) = \sum_{i=0}^{N} b_i z^{-i} \quad \text{with } b_0 = 1.$$
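The AR prediction above can be illustrated with a minimal Python sketch. The function name `ar_predict`, the zero treatment of samples before the start of the signal, and the toy coefficient values are assumptions made for illustration only; they are not taken from the specification.

```python
def ar_predict(signal, coeffs):
    """Return s_hat with s_hat[t] = sum_{i=1..M} coeffs[i-1] * signal[t-i].

    Samples before the start of the signal are treated as zero
    (an assumption made for this sketch).
    """
    order = len(coeffs)
    predicted = []
    for t in range(len(signal)):
        acc = 0.0
        for i in range(1, order + 1):
            if t - i >= 0:
                acc += coeffs[i - 1] * signal[t - i]
        predicted.append(acc)
    return predicted


# Toy second-order example with made-up coefficients a_1, a_2.
s_a = [1.0, 0.5, 0.25, 0.125]
a = [0.5, 0.0]
s_hat = ar_predict(s_a, a)
```

For this toy signal, which halves at every step, the predictor with a_1 = 0.5 reproduces every sample except the first.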
  • With this notation, the invention generally deals with the process of finding an estimate of filter B(z) when LPC filter A(z) is known. Let B̃(z) denote the filter obtained from A(z). The coefficients of the constructed filter B̃(z) can be mapped into encoder B in a similar way as in smart transcoding based on LPC mapping, thus avoiding the computation of the LPC coefficients within encoder B and accordingly saving computation load. Filters A(z) and B(z) are AR models of two signals with different sampling frequencies. To obtain the coefficients of filter B̃(z), coefficients a_i need to be extrapolated if N > M or interpolated if N < M. The interpolation/extrapolation can be seen as a modification of the sampling frequency of signal [a_i], i=0...M (or alternatively [a_i], i=1...M) from sampling frequency Fs(A) to sampling frequency Fs(B). Accordingly, in an embodiment, finding the coefficients of B̃(z) that approximate filter B(z) can be done through the following steps:
    1. Extracting LPC coefficients [a_i], i=1...M, from bitstream A in decoder A.
    2. Applying a modification of the sampling frequency on LPC coefficients [a_i], i=1...M, thus obtaining coefficients b̃_i.
    3. Mapping coefficients b̃_i in encoder B for quantization and for computation of the rest of the coefficients of encoder B (in CELP codecs e.g. pitch, gains, fixed codebook).
  • In step 2 above it is alternatively possible to apply the modification of the sampling frequency on LPC coefficients [a_i], i=0...M. In that case the target LPC filter is preferably forced to have b_0 = 1. According to an embodiment, modification of the sampling frequency comprises up-sampling the extracted linear predictive coding coefficients when the sampling frequency of the target codec (B) format is higher than the sampling frequency of the source codec (A) format. According to an embodiment, the up-sampling factor is equal to the ratio of the sampling frequency of the target codec format to the sampling frequency of the source codec format. According to an embodiment, modification of the sampling frequency comprises down-sampling the extracted linear predictive coding coefficients when the sampling frequency of the target codec format is lower than the sampling frequency of the source codec format. According to an embodiment, the down-sampling factor is equal to the ratio of the sampling frequency of the second codec format to the sampling frequency of the first codec format. Accordingly, when applying a modification of the sampling frequency to LPC coefficients [a_i], i=1...M, from sampling frequency Fs(A) to sampling frequency Fs(B), M*Fs(B)/Fs(A) coefficients b̃_i are obtained. According to an embodiment, the number of coefficients b̃_i can be further adjusted to the number of coefficients of target LPC filter B(z) if necessary. For instance, if M*Fs(B)/Fs(A) > N, the number of coefficients b̃_i can be restricted to N, and if M*Fs(B)/Fs(A) < N, N - M*Fs(B)/Fs(A) zeros can be added to the vector [b̃_i].
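The coefficient-count bookkeeping described above can be sketched as follows. The helper names `expected_count` and `fit_to_order` are hypothetical, and rounding a non-integer M*Fs(B)/Fs(A) to the nearest integer is an assumption, since the text does not say how fractional counts are handled.

```python
def expected_count(m, fs_a, fs_b):
    """Number of coefficients after resampling M coefficients: M * Fs(B) / Fs(A).

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(m * fs_b / fs_a)


def fit_to_order(coeffs, n):
    """Restrict the vector to n coefficients, or append zeros until it has n."""
    if len(coeffs) >= n:
        return list(coeffs[:n])
    return list(coeffs) + [0.0] * (n - len(coeffs))


# Source order M = 10 at 8 kHz, target LPC analysis at 12.8 kHz.
count = expected_count(10, 8000, 12800)
truncated = fit_to_order([0.9, 0.4, 0.1], 2)   # too many coefficients: restrict
padded = fit_to_order([0.9], 3)                # too few coefficients: add zeros
```

With these values the resampled vector already has the 16 coefficients of a 16-coefficient target filter, so no truncation or zero padding would be needed.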
  • Figure 2 is a block diagram of an apparatus according to an embodiment. Different modules or units 10, 20 and 30 of the apparatus may be implemented in one or more physical or logical entities. Figure 2 is a simplified diagram that only shows some elements and functional entities relevant to understanding the various embodiments described here and whose implementation may differ from what is shown. The connections shown in Figure 2 are logical connections; the actual physical connections may be different. In the example shown, bitstream A of codec format A enters decoder A 10. Decoder A 10 may be a plain decoder or a codec unit, for example. In decoder A 10, bitstream A is partially decoded by extracting at least LPC coefficients from bitstream A. Other parameters, such as pitch delays, fixed codebook indexes, and fixed and adaptive gains, may also be extracted. The LPC coefficients and any other extracted parameters, as well as the partially decoded bitstream (signal), are further transmitted to a frequency modification unit 30. The frequency modification unit 30 applies a modification of the sampling frequency to the LPC coefficients according to the embodiments described above. According to an embodiment, the partially decoded bitstream (signal) is up-sampled or down-sampled from the sampling frequency employed by source codec format A to the sampling frequency employed by target codec format B. This is also preferably done in the frequency modification unit 30. The modified LPC coefficients and any other parameters, as well as the modified signal, are then transmitted to encoder B 20. Encoder B 20 may be a plain encoder or a codec unit, for example. In encoder B 20 the modified LPC coefficients are mapped into LPC coefficients of codec format B and the partially decoded bitstream is encoded into a bitstream of codec format B using the mapped LPC coefficients. It should be noted that the partial decoding, i.e. the extraction of LPC coefficients in decoder A, and the mapping of parameters and encoding in encoder B can be performed in a similar manner as in existing transcoding solutions. Therefore, they need not be discussed in more detail here. It should be further noted that not only existing mapping schemes but also any future mapping schemes may be used.
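The data flow of Figure 2 can be sketched as a chain of three callables. The sketch only illustrates the order of operations; the type `PartiallyDecoded`, the function `transcode` and the placeholder units are hypothetical names introduced here, and the placeholders do no real codec work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PartiallyDecoded:
    """What decoder A hands on: LPC coefficients, the partially decoded signal,
    and any other extracted parameters (pitch delays, gains, ...)."""
    lpc: List[float]
    signal: List[float]
    other: Dict[str, object] = field(default_factory=dict)


def transcode(bitstream_a: bytes,
              decoder_a: Callable[[bytes], PartiallyDecoded],
              frequency_modification: Callable[[PartiallyDecoded], PartiallyDecoded],
              encoder_b: Callable[[PartiallyDecoded], bytes]) -> bytes:
    """Chain the three units of Figure 2: decoder A 10 -> unit 30 -> encoder B 20."""
    extracted = decoder_a(bitstream_a)            # partial decoding
    modified = frequency_modification(extracted)  # sampling-frequency modification
    return encoder_b(modified)                    # mapping and encoding into format B


# Placeholder units, only to show the call order (assumptions, not real codecs).
def fake_decoder(bits: bytes) -> PartiallyDecoded:
    return PartiallyDecoded(lpc=[1.0, -0.5], signal=[float(b) for b in bits])


def fake_modifier(d: PartiallyDecoded) -> PartiallyDecoded:
    return PartiallyDecoded(lpc=d.lpc + [0.0], signal=d.signal, other=d.other)


def fake_encoder(d: PartiallyDecoded) -> bytes:
    return bytes([len(d.lpc), len(d.signal)])


print(transcode(b"\x01\x02\x03", fake_decoder, fake_modifier, fake_encoder))
```

Passing the units in as callables mirrors the statement that units 10, 20 and 30 may be separate entities or parts of one entity.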
  • The modification of the sampling frequency (step 2 above) can be implemented in many different ways. The actual performance of smart transcoding depends on how the modification of the sampling frequency is done. One possible problem with up-sampling and down-sampling is smoothing, which may appear in either the low-frequency or the high-frequency part of the resampled coefficient vector. Therefore, it is preferable to enhance the resampled vector by properly resynthesizing the lower or higher frequencies. To achieve this, a separate step may be used before the mapping step 3 above, in which an appropriate property of the resampled filter, such as (but not restricted to) its frequency response at low and high frequencies, is assured.
  • The following describes in more detail an implementation example according to an embodiment. The example presents transcoding between the AMR 12.2 kbit/s and AMR-WB 23.05 kbit/s codecs. It should be noted, however, that the use of the invention is not restricted to any particular codec format or standard or a particular mode of a given codec format. For example, the following codec formats could be used in connection with the invention: Full Rate (FR), Half Rate (HR), Enhanced Full Rate (EFR), Adaptive Multi-Rate (AMR), Adaptive Multi Rate WideBand (AMR-WB), Adaptive Multi Rate WideBand plus, G.723.1, G.729, G.729.1, Enhanced Variable Rate Codec (EVRC), Variable-Rate Multi-Mode Wideband (VMR-WB) and Speex.
  • In the example the source codec format is AMR and the target codec format is AMR-WB. The AMR codec processes signals at a sampling frequency of Fs (A) = 8 kHz and provides an LPC analysis on 10 coefficients. The AMR-WB codec operates with a signal of 16 kHz and its LPC analysis is done on a signal of Fs (B) = 12.8 kHz (a down-sampling is applied within the encoding). The LPC filter of the AMR-WB has 16 coefficients. In this case M* Fs (B)/Fs (A) = N. Thus, the correct number of LPC coefficients may be obtained directly by the modification of the sampling frequency. The modification of the sampling frequency may be done in two phases such that first an up-sampling of vector [ai ], i=0...M by a factor of 3 is applied. A low-pass filter is preferably applied to the up-sampled signal to avoid aliasing. A down-sampling by a factor of 2 is then applied. Thus the total up-sampling factor is 3/2. It has to be noted that, considering Fs (B) and Fs (A), the up-sampling and down-sampling factors could have been 8 and 5, respectively, since 12.8/8 = 8/5. But the numbers 3 and 2 can lead to a better performance, as the smoothing applied to the low or high frequency is less significant. Considering that i=0...M, b0 is set to 1 and thus the resulting number of coefficients is 1 + 10*3/2 = 16.
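The two-phase resampling of the example can be sketched in plain Python. This is one possible reading under stated assumptions, not the filter design of the application: the coefficients a1...a10 are resampled and b0 = 1 is prepended, which reproduces the count 1 + 10*3/2 = 16, and the low-pass filter is a simple linear-interpolation kernel chosen here only for illustration. All function names are hypothetical.

```python
def upsample(x, factor):
    """Insert factor-1 zeros between consecutive samples."""
    out = [0.0] * (len(x) * factor)
    out[::factor] = x
    return out


def lowpass_same(x, kernel):
    """Centered FIR filtering; the output has the same length as the input."""
    half = len(kernel) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(kernel):
            idx = n + half - k
            if 0 <= idx < len(x):
                acc += h * x[idx]
        out.append(acc)
    return out


def resample_lpc_3_over_2(a):
    """Map [a0..a10] (11 values) to 16 target coefficients with b0 forced to 1."""
    tail = list(a[1:])                      # a1..aM, here 10 values
    stuffed = upsample(tail, 3)             # up-sampling by 3: 30 values
    kernel = [1 / 3, 2 / 3, 1.0, 2 / 3, 1 / 3]  # assumed linear-interpolation low-pass
    smooth = lowpass_same(stuffed, kernel)
    decimated = smooth[::2]                 # down-sampling by 2: 15 values
    return [1.0] + decimated                # b0 = 1 plus 15 values = 16


a = [1.0] + [0.1 * i for i in range(1, 11)]
b = resample_lpc_3_over_2(a)
print(len(b))
```

With this kernel the original coefficient values survive at every third up-sampled position, and the positions in between are linear interpolations of their neighbours.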
  • Additionally, in the exemplary embodiment, the resampled coefficients can be enhanced through the following exemplary processing: the zeros of the resampled filter are modified by taking into account the zeros of filter A(z) and of an additional estimated filter. Such an operation makes it possible to avoid the smoothing in the down-sampling phase of the above example, which tends to reduce the number of zeros of the LPC analysis. The resampled filter is preferably designed so that smoothing is applied to the high frequency and no smoothing to the low frequency. The additional estimated filter is designed the other way round, with strong smoothing in the low frequency and little smoothing in the high frequency. A(z) contains information and zeros only in the low frequency (since LPC filter A(z) models an 8 kHz signal and has no zeros above 4 kHz). Accordingly, A(z) and the additional estimated filter act as two additional filters which apply a correction to the zeros of the resampled filter in the low and high frequency, respectively. This permits an accurate estimation of B(z), providing good performance of the smart transcoding based on mapping of the LPC.
  • In the above-described detailed example, an up-sampling was applied to the LPC coefficients because of transcoding from 8 kHz AMR codec format to 12.8 kHz AMR-WB codec format. Transcoding from e.g. AMR-WB codec format to AMR codec format can be arranged in a similar manner but by applying down-sampling to the LPC coefficients instead of up-sampling.
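Whether up-sampling or down-sampling applies, and with which integer factors, follows from the two sampling frequencies. The helper below is a hedged sketch; the name `resampling_factors` and the returned labels are assumptions made for illustration.

```python
from fractions import Fraction


def resampling_factors(fs_source_hz, fs_target_hz):
    """Return (direction, up_factor, down_factor) for the exact ratio Fs(B)/Fs(A)."""
    ratio = Fraction(fs_target_hz, fs_source_hz)  # reduced to lowest terms
    if ratio > 1:
        direction = "up-sampling"
    elif ratio < 1:
        direction = "down-sampling"
    else:
        direction = "none"
    return direction, ratio.numerator, ratio.denominator


print(resampling_factors(8000, 12800))   # AMR to AMR-WB
print(resampling_factors(12800, 8000))   # AMR-WB to AMR
```

For AMR to AMR-WB the exact ratio is 8/5 and for the reverse direction 5/8; the detailed example above uses the simpler factors 3 and 2 instead of the exact ones.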
  • An apparatus according to an embodiment, such as the one shown in Figure 2, may be implemented as one unit (e.g. a transcoding unit) or as two or more separate units that are configured to implement the functionality of the various embodiments described. Here the term 'unit' refers generally to a physical or logical entity, such as a physical device or a part thereof or a software routine. For example, units 10, 20 and 30 may be physically separate units or implemented as one entity.
  • An apparatus according to any one of the embodiments can be implemented by means of a computer or corresponding digital signal processing equipment with suitable software therein, for example. Such a computer or digital signal processing equipment preferably comprises at least a working memory (RAM) providing a storage area used for arithmetical operations and a central processing unit (CPU), such as a general-purpose digital signal processor (DSP). The CPU may comprise a set of registers, an arithmetic logic unit, and a control unit. The control unit is controlled by a sequence of program instructions transferred to the CPU from the RAM. The control unit may contain a number of microinstructions for basic operations. The implementation of microinstructions may vary depending on the CPU design. The program instructions may be coded in a programming language, which may be a high-level programming language, such as C or Java, or a low-level programming language, such as a machine language or an assembler. The computer may also have an operating system which may provide system services to a computer program written with the program instructions. It is also possible to use a specific integrated circuit or circuits, or corresponding components and devices, for implementing the functionality according to any one of the embodiments.
  • The invention can be implemented in existing system elements, such as various communication system elements, or by using separate dedicated elements or devices in a centralized or distributed manner. An example of such a system element is a media gateway or an internet protocol telephony gateway. Present elements for communication systems typically comprise processors and memory that can be utilized in the functions according to the embodiments. Thus, all modifications and configurations required for implementing an embodiment in existing devices may be performed as software routines, which may be implemented as added or updated software routines. If the functionality of the embodiments is implemented by software, such software can be provided as a computer program product comprising computer program code which, when run on a computer, causes the computer or corresponding arrangement to perform the functionality according to the invention as described above. Such a computer program code can be stored on a computer readable medium, such as suitable memory means, e.g. a flash memory or a disc memory, from which it is loadable to the unit or units executing the program code. In addition, such a computer program code implementing the invention can be loaded to the unit or units executing the computer program code via a suitable data network, for example, and it can replace or update a possibly existing program code.
  • A frequency modification unit 30 may be implemented as a module for interfacing between two codec formats. Such a module may be a physical device, a part of a physical device or a software module, for example. According to an embodiment, such a module is configured to modify the sampling frequency of extracted linear predictive coding coefficients according to the various embodiments described. For this purpose the module may comprise an up/down-sampling unit. Further, such a module is configured to receive from a decoder the linear predictive coding coefficients extracted from a bitstream and to send the linear predictive coding coefficients obtained from the modification of the sampling frequency to an encoder. For this purpose the module may comprise, e.g., suitable input and output terminals and receiving and sending units in connection thereto.
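As a software module, the interfacing role described above might look like the following sketch. The class name `FrequencyModificationModule`, its method names and the injected callables are hypothetical; the resampling operation itself is supplied from outside and stands in for the up/down-sampling unit.

```python
from typing import Callable, List, Sequence


class FrequencyModificationModule:
    """Receives LPC coefficients from a decoder, resamples them, forwards them to an encoder."""

    def __init__(self,
                 resample: Callable[[Sequence[float]], List[float]],
                 send_to_encoder: Callable[[List[float]], None]) -> None:
        self._resample = resample   # stands in for the up/down-sampling unit
        self._send = send_to_encoder  # stands in for the output terminal

    def receive_from_decoder(self, lpc: Sequence[float]) -> None:
        """Input terminal: modify the sampling frequency, then pass the result on."""
        self._send(self._resample(lpc))


# Usage with placeholder callables (assumptions, not real codec interfaces).
received: List[List[float]] = []
module = FrequencyModificationModule(lambda c: list(c) + [0.0], received.append)
module.receive_from_decoder([1.0, -0.5])
print(received)
```

Injecting the resampler and the output callback keeps the module independent of any particular decoder or encoder, in line with the statement that it interfaces between two codec formats.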
  • It will be obvious to a person skilled in the art that, as the technology advances, the inventive concept can be implemented in various ways. The invention and its embodiments are not limited to the examples described above but may vary within the scope of the claims.

Claims (23)

  1. A method for transcoding, the method comprising:
    partially decoding a first bitstream of a first codec format by extracting at least linear predictive coding coefficients from the first bitstream;
    mapping the extracted linear predictive coding coefficients into linear predictive coding coefficients of a second codec format; and
    encoding the partially decoded first bitstream into a second bitstream of a second codec format using the mapped linear predictive coding coefficients, characterized in that the first and second codec formats employ different sampling frequencies and in that the method comprises:
    modifying the sampling frequency of the extracted linear predictive coding coefficients before the mapping of the extracted linear predictive coding coefficients.
  2. A method according to claim 1, characterized in that the modifying of the sampling frequency of the extracted linear predictive coding coefficients comprises:
    up-sampling the extracted linear predictive coding coefficients when the sampling frequency of the second codec format is higher than the sampling frequency of the first codec format.
  3. A method according to claim 2, characterized in that the up-sampling factor is equal to the ratio of the sampling frequency of the second codec format to the sampling frequency of the first codec format.
  4. A method according to claim 1, 2 or 3, characterized in that the modifying of the sampling frequency of the extracted linear predictive coding coefficients comprises:
    down-sampling the extracted linear predictive coding coefficients when the sampling frequency of the second codec format is lower than the sampling frequency of the first codec format.
  5. A method according to claim 4, characterized in that the down-sampling factor is equal to the ratio of the sampling frequency of the second codec format to the sampling frequency of the first codec format.
  6. A method according to any one of claims 1 to 5, characterized in that the method comprises up-sampling or down-sampling the partially decoded first bitstream from the sampling frequency employed by the first codec format to the sampling frequency employed by the second codec format before encoding.
  7. A method according to any one of claims 1 to 6, characterized in that the method comprises adjusting the number of linear predictive coding coefficients after modifying the sampling frequency of the extracted linear predictive coding coefficients to the number of coefficients required for encoding the partially decoded first bitstream into a second bitstream of a second codec format.
  8. A method according to any one of claims 1 to 7, characterized in that the first and/or the second codec format is selected from the following: Full Rate, Half Rate, Enhanced Full Rate, Adaptive Multi-Rate, Adaptive Multi Rate WideBand, Adaptive Multi Rate WideBand plus, G.723.1, G.729, G.729.1, Enhanced Variable Rate Codec, Variable-Rate Multi-Mode Wideband and Speex.
  9. A method according to any one of claims 1 to 8, characterized in that the first and second codec formats are Adaptive Multi-Rate employing a sampling frequency of 8 kHz and Adaptive Multi Rate WideBand employing a sampling frequency of 12.8 kHz.
  10. A computer program product comprising computer program code, wherein the execution of the program code in a computer causes the computer to carry out the steps of the method according to any one of claims 1 to 9.
  11. An apparatus for transcoding comprising:
    means for partially decoding a first bitstream of a first codec format by extracting at least linear predictive coding coefficients from the first bitstream;
    means for mapping the extracted linear predictive coding coefficients into linear predictive coding coefficients of a second codec format; and
    means for encoding the partially decoded first bitstream into a second bitstream of a second codec format using the mapped linear predictive coding coefficients, characterized in that the apparatus comprises means for modifying the sampling frequency of the extracted linear predictive coding coefficients before the mapping of the extracted linear predictive coding coefficients.
  12. An apparatus according to claim 11, characterized in that the means for modifying the sampling frequency of the extracted linear predictive coding coefficients is arranged to up-sample the extracted linear predictive coding coefficients when the sampling frequency of the second codec format is higher than the sampling frequency of the first codec format.
  13. An apparatus according to claim 12, characterized in that the up-sampling factor is equal to the ratio of the sampling frequency of the second codec format to the sampling frequency of the first codec format.
  14. An apparatus according to claim 11, 12 or 13, characterized in that the means for modifying the sampling frequency of the extracted linear predictive coding coefficients is arranged to down-sample the extracted linear predictive coding coefficients when the sampling frequency of the second codec format is lower than the sampling frequency of the first codec format.
  15. An apparatus according to claim 14, characterized in that the down-sampling factor is equal to the ratio of the sampling frequency of the second codec format to the sampling frequency of the first codec format.
  16. An apparatus according to any one of claims 11 to 15, characterized in that the apparatus comprises means for up-sampling or down-sampling the partially decoded first bitstream from the sampling frequency employed by the first codec format to the sampling frequency employed by the second codec format before encoding.
  17. An apparatus according to any one of claims 11 to 16, characterized in that the apparatus comprises means for adjusting the number of linear predictive coding coefficients after the modification of the sampling frequency of the extracted linear predictive coding coefficients to the number of coefficients required for encoding the partially decoded first bitstream into a second bitstream of a second codec format.
  18. An apparatus according to any one of claims 11 to 17, characterized in that the first and/or the second codec format is selected from the following: Full Rate, Half Rate, Enhanced Full Rate, Adaptive Multi-Rate, Adaptive Multi Rate WideBand, Adaptive Multi Rate WideBand plus, G.723.1, G.729, G.729.1, Enhanced Variable Rate Codec, Variable-Rate Multi-Mode Wideband and Speex.
  19. An apparatus according to any one of claims 11 to 18, characterized in that the first and second codec formats are Adaptive Multi-Rate employing a sampling frequency of 8 kHz and Adaptive Multi Rate WideBand employing a sampling frequency of 12.8 kHz.
  20. A module for interfacing between codec formats, characterized in that the module comprises:
    means for receiving from a decoder linear predictive coding coefficients extracted from a bitstream;
    means for modifying the sampling frequency of the extracted linear predictive coding coefficients; and
    means for sending the linear predictive coding coefficients obtained from the modification of the sampling frequency to an encoder.
  21. A module according to claim 20, characterized in that the module comprises:
    means for receiving from the decoder a partially decoded bitstream from which at least the linear predictive coding coefficients have been extracted;
    means for up-sampling or down-sampling the partially decoded bitstream from the sampling frequency employed by the decoder to the sampling frequency employed by the encoder; and
    means for sending the up- or down-sampled partially decoded bitstream to the encoder.
  22. A module according to claim 20 or 21, characterized in that the module is a module for a gateway.
  23. A module according to claim 22, characterized in that the gateway is a media gateway or an internet protocol telephony gateway.
EP07117956A 2007-10-05 2007-10-05 Method and apparatus for transcoding Withdrawn EP2045800A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07117956A EP2045800A1 (en) 2007-10-05 2007-10-05 Method and apparatus for transcoding

Publications (1)

Publication Number Publication Date
EP2045800A1 true EP2045800A1 (en) 2009-04-08

Family

ID=39110850

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07117956A Withdrawn EP2045800A1 (en) 2007-10-05 2007-10-05 Method and apparatus for transcoding

Country Status (1)

Country Link
EP (1) EP2045800A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003003770A1 (en) * 2001-06-26 2003-01-09 Nokia Corporation Method for transcoding audio signals, transcoder, network element, wireless communications network and communications system
US20040102966A1 (en) * 2002-11-25 2004-05-27 Jongmo Sung Apparatus and method for transcoding between CELP type codecs having different bandwidths
US20040111257A1 (en) * 2002-12-09 2004-06-10 Sung Jong Mo Transcoding apparatus and method between CELP-based codecs using bandwidth extension
US6829579B2 (en) * 2002-01-08 2004-12-07 Dilithium Networks, Inc. Transcoding method and system between CELP-based speech codes
US20050053130A1 (en) * 2003-09-10 2005-03-10 Dilithium Holdings, Inc. Method and apparatus for voice transcoding between variable rate coders
US20070124138A1 (en) * 2003-12-10 2007-05-31 France Telecom Transcoding between the indices of multipulse dictionaries used in compressive coding of digital signals
EP1796084A1 (en) * 2004-11-04 2007-06-13 Matsushita Electric Industrial Co., Ltd. Vector conversion device and vector conversion method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105869653A (en) * 2016-05-31 2016-08-17 华为技术有限公司 Voice signal processing method and related device and system
WO2017206432A1 (en) * 2016-05-31 2017-12-07 华为技术有限公司 Voice signal processing method, and related device and system
US10218856B2 (en) 2016-05-31 2019-02-26 Huawei Technologies Co., Ltd. Voice signal processing method, related apparatus, and system

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase (Free format text: ORIGINAL CODE: 0009012)
AK Designated contracting states (Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR)
AX Request for extension of the european patent (Extension state: AL BA HR MK RS)
AKX Designation fees paid
REG Reference to a national code (Ref country code: DE; Ref legal event code: 8566)
STAA Information on the status of an ep patent application or granted ep patent (Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN)
18D Application deemed to be withdrawn (Effective date: 20091009)
Effective date: 20091009