EP2045800A1 - Transkodierverfahren und -vorrichtung - Google Patents

Transkodierverfahren und -vorrichtung (Transcoding method and apparatus)

Publication number
EP2045800A1
EP2045800A1 EP07117956A EP07117956A EP2045800A1 EP 2045800 A1 EP2045800 A1 EP 2045800A1 EP 07117956 A EP07117956 A EP 07117956A EP 07117956 A EP07117956 A EP 07117956A EP 2045800 A1 EP2045800 A1 EP 2045800A1
Authority
EP
European Patent Office
Prior art keywords
sampling frequency
predictive coding
linear predictive
codec format
coding coefficients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP07117956A
Other languages
English (en)
French (fr)
Inventor
Christophe Beaugeant
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Solutions and Networks Oy
Original Assignee
Nokia Siemens Networks Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2007-10-05
Filing date
2007-10-05
Publication date
2009-04-08
Application filed by Nokia Siemens Networks Oy
Priority to EP07117956A
Publication of EP2045800A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/173 Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • the present invention relates to transcoding.
  • networks of different types have been developed, such as mobile GSM, UMTS, CDMA and IP, providing alternatives to the 'classical' circuit-switched network.
  • the interconnection of all these networks leads to an interoperability problem regarding transmission of speech.
  • incompatible speech standards have been adopted in the different networks, although most of the codecs at medium rate (5-16.5 kbit/s for narrowband codecs, 5-25 kbit/s for wideband codecs) are based on the same model, Code Excited Linear Prediction (CELP).
  • CELP Code Excited Linear Prediction
  • the simplest method to provide inter-connectivity consists of decoding one codec standard compressed bitstream A and re-encoding it into the other codec standard bitstream B. This conventional method is called tandem transcoding. It suffers from several problems such as complexity, delay and degradation of speech.
  • bitstreams A and B transmit a similar set of parameters, such as Linear Prediction Coding (LPC) coefficients, pitch delays, fixed codebook indexes and fixed and adaptive gains.
  • LPC Linear Prediction Coding
  • the key idea of smart transcoding consists of avoiding the computation of parameters already available.
  • An intelligent mapping and quantization of the parameters available in bitstream A into bitstream B parameters allows many functions to be skipped and hence reduces the computation load of the transcoding. As depicted in Figure 1, only a partial decoding is necessary to extract the parameters from bitstream A. Their mapping, together with a partial encoding, then builds the accurate bitstream B.
  • LPC Linear Prediction Coefficients vector
  • An object of the present invention is thus to provide a method and an apparatus for implementing the method so as to solve the above problem or at least to alleviate it.
  • the objects of the invention are achieved by a method, a computer program product, an apparatus and a module which are characterized by what is stated in the independent claims.
  • the preferred embodiments of the invention are disclosed in the dependent claims.
  • the invention is based on recognizing the problem and on the realization that in transcoding between two codec formats employing different sampling frequencies, the LPC coefficients of the LPC filter of the target codec format can be estimated by applying a modification on the sampling frequency of the extracted LPC coefficients.
  • An advantage of the method and apparatus of the invention is that it enables smart transcoding between two codec formats employing different sampling frequencies.
  • the following embodiments are exemplary. Although the specification may refer to "an", "one", or "some" embodiment(s) in several locations, this does not necessarily mean that each such reference is to the same embodiment(s), or that the feature only applies to a single embodiment. Single features of different embodiments may also be combined to provide other embodiments.
  • the present invention is applicable to any communication system or any combination of different communication systems such as GSM (Global System for Mobile Communications), WCDMA (Wideband Code Division Multiple Access), WLAN (Wireless Local Area Network), UMTS (Universal Mobile Telecommunications System), CDMA and/or IP (Internet Protocol) standard, or any other suitable standard/non-standard communication means.
  • GSM Global System for Mobile Communications
  • WCDMA Wideband Code Division Multiple Access
  • WLAN Wireless Local Area Network
  • UMTS Universal Mobile Telecommunications System
  • IP Internet Protocol
  • the communication system may be a fixed communication system or a wireless communication system or a communication system utilizing both fixed networks and wireless networks.
  • the protocols used and the specifications of communication systems, especially in wireless communication, develop rapidly. Such development may require extra changes to an embodiment. Therefore, all terms and expressions should be interpreted broadly; they are intended to illustrate, not to restrict, the embodiment. In the following, different embodiments will be described using, as an example, a system architecture to which the embodiments may be applied, without, however, restricting the embodiment to such an architecture.
  • the invention generally deals with the process of finding an estimate of filter $B(z)$ when LPC filter $A(z)$ is known.
  • $\hat{B}(z)$ denotes the filter obtained from $A(z)$.
  • the coefficients of the constructed filter $\hat{B}(z)$ can be mapped into encoder B in a similar way as in smart transcoding based on LPC mapping, thus avoiding the computation of the LPC coefficients within encoder B and accordingly saving computation load.
  • Filters $A(z)$ and $B(z)$ are autoregressive (AR) models of two signals with different sampling frequencies.
  • coefficients $a_i$ need to be extrapolated if $N > M$ or interpolated if $N < M$.
  • modification of the sampling frequency comprises up-sampling the extracted linear predictive coding coefficients when the sampling frequency of the target codec (B) format is higher than the sampling frequency of the source codec (A) format.
  • the up-sampling factor is equal to the ratio of the sampling frequency of the target codec format to the sampling frequency of the source codec format.
  • modification of the sampling frequency comprises down-sampling the extracted linear predictive coding coefficients when the sampling frequency of the target codec format is lower than the sampling frequency of the source codec format.
  • the down-sampling factor is equal to the ratio of the sampling frequency of the target codec format to the sampling frequency of the source codec format.
  • the number of coefficients $\hat{b}_i$ can be further adjusted to the number of coefficients of target LPC filter $B(z)$ if necessary. For instance, if $M \cdot F_s(B)/F_s(A) > N$, the number of coefficients $\hat{b}_i$ can be restricted to $N$, and if $M \cdot F_s(B)/F_s(A) < N$, then $N - M \cdot F_s(B)/F_s(A)$ zeros can be added to the vector $[\hat{b}_i]$.
  • FIG. 2 is a block diagram of an apparatus according to an embodiment. Different modules or units 10, 20 and 30 of the apparatus may be implemented in one or more physical or logical entities. Figure 2 is a simplified diagram that only shows some elements and functional entities relevant to understanding the various embodiments described here and whose implementation may differ from what is shown. The connections shown in Figure 2 are logical connections; the actual physical connections may be different.
  • bitstream A of codec format A enters decoder A 10.
  • Decoder A 10 may be a plain decoder or a codec unit, for example.
  • bitstream A is partially decoded by extracting at least LPC coefficients from bitstream A. Other parameters, such as pitch delays, fixed codebook indexes, and fixed and adaptive gains, may also be extracted.
  • the LPC coefficients and possible other extracted parameters as well as the partially decoded bitstream (signal) are further transmitted to a frequency modification unit 30.
  • the frequency modification unit 30 applies a modification of the sampling frequency to the LPC coefficients according to the embodiments described above.
  • the partially decoded bitstream (signal) is up-sampled or down-sampled from the sampling frequency employed by source codec format A to the sampling frequency employed by target codec format B. This is also preferably done in the frequency modification unit 30.
  • the modified LPC coefficients and possible other parameters as well as the modified signal are then transmitted to encoder B 20.
  • Encoder B 20 may be a plain encoder or a codec unit, for example.
  • in encoder B 20, the modified LPC coefficients are mapped into LPC coefficients of codec format B, and the partially decoded bitstream is encoded into a bitstream of codec format B using the mapped LPC coefficients.
  • the partial decoding and encoding, i.e. the extraction of LPC coefficients in decoder A and the mapping of parameters and encoding in encoder B, can be performed in a similar manner as in existing transcoding solutions. Therefore, they need not be discussed in more detail here. It should be further noted that not only existing mapping schemes but also any future mapping schemes may be utilized.
  • the modification of the sampling frequency can be implemented in many different ways. Concrete performance of smart transcoding depends on the way the modification of the sampling frequency is done.
  • One possible problem related to up-sampling and down-sampling is smoothing that may appear in either the low or the high frequencies of the vector $[\hat{b}_i]$. Therefore, it is preferable to enhance the obtained $[\hat{b}_i]$ by properly resynthesizing the lower or higher frequencies.
  • a separate step may be used before the mapping step 3 above, in which an appropriate property of filter $\hat{B}(z)$, such as (but not restricted to) the frequency response in the low and high frequencies, is assured.
  • the following will now describe in more detail an implementation example according to an embodiment.
  • the example presents transcoding between AMR 12.2 kbit/s and AMR-WB 23.05 kbit/s codecs. It should be noted, however, that the use of the invention is not restricted to any particular codec format or standard or a particular mode of a given codec format.
  • FR Full Rate
  • HR Half Rate
  • EFR Enhanced Full Rate
  • AMR Adaptive Multi-Rate
  • AMR-WB Adaptive Multi-Rate Wideband; further codecs named are Adaptive Multi-Rate Wideband plus, G.723.1, G.729, G.729.1, Enhanced Variable Rate Codec (EVRC), Variable-Rate Multi-Mode Wideband (VMR-WB) and Speex.
  • EVRC Enhanced Variable Rate Codec
  • VMR-WB Variable-Rate Multi-Mode Wideband
  • the source codec format is AMR and the target codec format is AMR-WB.
  • the correct amount of LPC coefficients may be obtained directly by the modification of the sampling frequency.
  • a low pass filter is preferably applied to the up-sampled signal to avoid aliasing.
  • a down-sampling by a factor 2 is then achieved.
  • the total up-sampling factor is 3/2.
  • given the ratio of $F_s(B)$ to $F_s(A)$ (12.8 kHz / 8 kHz = 8/5), the factors of the up-sampling and down-sampling could have been 8 and 5, respectively. However, the factors 3 and 2 can lead to better performance because the smoothing applied to the low or high frequencies is less significant.
  • the obtained $[\hat{b}_i]$ can be enhanced through the following exemplary processing: the zeros of filter $\hat{B}(z)$ are modified by taking into account the zeros of filter $A(z)$ and of an additional estimated filter $\hat{B}_1(z)$.
  • $\hat{B}(z)$ is preferably designed so that smoothing is applied to the high frequencies and no smoothing to the low frequencies.
  • $\hat{B}_1(z)$ is designed conversely, with strong smoothing in the low frequencies and little smoothing in the high frequencies.
  • $A(z)$ carries information and zeros only in the low frequencies (since LPC filter $A(z)$ models an 8 kHz signal and has no zeros above 4 kHz). Accordingly, $A(z)$ and $\hat{B}_1(z)$ are used as two additional filters that apply a correction to the zeros of $\hat{B}(z)$ in the low and high frequencies, respectively. This permits an accurate estimation of $B(z)$, providing good performance of the smart transcoding based on mapping of the LPC coefficients.
  • an up-sampling was applied to the LPC coefficients because of transcoding from 8 kHz AMR codec format to 12.8 kHz AMR-WB codec format.
  • Transcoding from e.g. AMR-WB codec format to AMR codec format can be arranged in a similar manner but by applying down-sampling to the LPC coefficients instead of up-sampling.
  • An apparatus may be implemented as one unit (e.g. a transcoding unit) or as two or more separate units that are configured to implement the functionality of the various embodiments described.
  • the term 'unit' refers generally to a physical or logical entity, such as a physical device or a part thereof or a software routine.
  • units 10, 20 and 30 may be physically separate units or implemented as one entity.
  • An apparatus can be implemented by means of a computer or corresponding digital signal processing equipment with suitable software therein, for example.
  • a computer or digital signal processing equipment preferably comprises at least a working memory (RAM) providing storage area used for arithmetical operations and a central processing unit (CPU), such as a general-purpose digital signal processor (DSP).
  • the CPU may comprise a set of registers, an arithmetic logic unit, and a control unit.
  • the control unit is controlled by a sequence of program instructions transferred to the CPU from the RAM.
  • the control unit may contain a number of microinstructions for basic operations. The implementation of microinstructions may vary depending on the CPU design.
  • the program instructions may be coded by a programming language, which may be a high-level programming language, such as C, Java, etc., or a low-level programming language, such as a machine language, or an assembler.
  • the computer may also have an operating system which may provide system services to a computer program written with the program instructions. It is also possible to use a specific integrated circuit or circuits, or corresponding components and devices, for implementing the functionality according to any one of the embodiments.
  • the invention can be implemented in existing system elements, such as various communication system elements, or by using separate dedicated elements or devices in a centralized or distributed manner.
  • An example of such a system element is a media gateway or an internet protocol telephony gateway.
  • Present elements for communication systems typically comprise processors and memory that can be utilized in the functions according to the embodiments.
  • All modifications and configurations required for implementing an embodiment in existing devices may be performed as software routines, which may be implemented as added or updated software routines.
  • software can be provided as a computer program product comprising computer program code which, when run on a computer, causes the computer or corresponding arrangement to perform the functionality according to the invention as described above.
  • Such a computer program code can be stored on a computer readable medium, such as suitable memory means, e.g. a flash memory or a disc memory, from which it is loadable to the unit or units executing the program code.
  • such a computer program code implementing the invention can be loaded to the unit or units executing the computer program code via a suitable data network, for example, and it can replace or update a possibly existing program code.
  • a frequency modification unit 30 may be implemented as a module for interfacing between two codec formats.
  • a module may be a physical device, a part of a physical device or a software module, for example.
  • a module is configured to modify the sampling frequency of extracted linear predictive coding coefficients according to the various embodiments described.
  • the module may comprise an up/down-sampling unit.
  • the module is configured to receive the linear predictive coding coefficients extracted from a bitstream from a decoder and to send the linear predictive coding coefficients obtained from the modification of the sampling frequency to an encoder.
  • the module may comprise e.g. suitable input and output terminals and receiving and sending units in connection thereto.
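The frequency modification described above (up-sampling the extracted LPC coefficient vector, low-pass filtering, down-sampling, then adjusting the number of coefficients to that of the target filter) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: the function names, the Hamming-windowed sinc low-pass filter, its 31 taps, the sample coefficient values, and the orders 10 and 16 are all choices made for the example, and the enhancement of the low and high frequencies mentioned in the text is not covered.

```python
import math


def windowed_sinc_lowpass(cutoff, num_taps):
    """Hamming-windowed sinc FIR low-pass (assumed design, odd num_taps).

    cutoff is expressed as a fraction of the sampling rate.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def resample_coefficients(coeffs, up, down, num_taps=31):
    """Resample a coefficient vector by the rational factor up/down."""
    # 1. Up-sampling: insert (up - 1) zeros after every coefficient.
    stuffed = []
    for c in coeffs:
        stuffed.append(c)
        stuffed.extend([0.0] * (up - 1))

    # 2. Low-pass filtering of the up-sampled vector; the gain `up`
    #    compensates for the inserted zeros.
    cutoff = 0.5 / max(up, down)
    taps = [up * t for t in windowed_sinc_lowpass(cutoff, num_taps)]
    delay = (num_taps - 1) // 2  # compensate the filter's group delay
    filtered = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, t in enumerate(taps):
            idx = n + delay - k
            if 0 <= idx < len(stuffed):
                acc += t * stuffed[idx]
        filtered.append(acc)

    # 3. Down-sampling: keep every `down`-th value.
    return filtered[::down]


def adjust_order(coeffs, target_order):
    """Truncate to target_order, or pad with zeros if too short."""
    if len(coeffs) >= target_order:
        return coeffs[:target_order]
    return coeffs + [0.0] * (target_order - len(coeffs))


if __name__ == "__main__":
    # Hypothetical source coefficients (M = 10), for illustration only.
    a = [0.9, -0.4, 0.2, -0.1, 0.05, -0.02, 0.01, 0.0, 0.0, 0.0]
    b_hat = resample_coefficients(a, up=3, down=2)  # total factor 3/2
    b_hat = adjust_order(b_hat, 16)                 # assumed target N = 16
    print(len(b_hat))
```

With the total factor 3/2 from the example, ten input coefficients yield fifteen values, so one zero is appended to reach the assumed target length of sixteen. Transcoding in the opposite direction would use the same routine with the factors swapped.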

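The choice between the exact factors 8 and 5 and the smaller factors 3 and 2 in the AMR to AMR-WB example is a matter of simple arithmetic, which the following sketch makes explicit. The function names are hypothetical, and using `Fraction.limit_denominator` to find a small-integer approximation is only an illustration; the text does not say how the factors 3 and 2 were selected.

```python
from fractions import Fraction


def resampling_factors(fs_source_hz, fs_target_hz, max_down=None):
    """Return (up, down) so that up/down equals or approximates
    fs_target / fs_source.

    If max_down is given, the ratio is replaced by the closest fraction
    whose denominator (the down-sampling factor) does not exceed it.
    """
    ratio = Fraction(fs_target_hz, fs_source_hz)
    if max_down is not None:
        ratio = ratio.limit_denominator(max_down)
    return ratio.numerator, ratio.denominator


def resampled_length(num_coeffs, up, down):
    """Number of values left after zero insertion by `up` and keeping
    every `down`-th value (ceiling of num_coeffs * up / down)."""
    return -(-(num_coeffs * up) // down)


if __name__ == "__main__":
    # 8 kHz source (AMR) and 12.8 kHz target (AMR-WB), as in the example.
    print(resampling_factors(8000, 12800))              # exact ratio
    print(resampling_factors(8000, 12800, max_down=2))  # small factors
```

The exact ratio 12.8/8 reduces to 8/5, while restricting the down-sampling factor to at most 2 gives 3/2, the total up-sampling factor quoted in the text. `resampled_length` shows how the number of coefficients changes with each choice, which determines whether truncation or zero padding is needed afterwards.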
Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP07117956A 2007-10-05 2007-10-05 Transkodierverfahren und -vorrichtung Withdrawn EP2045800A1 (de)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07117956A EP2045800A1 (de) 2007-10-05 2007-10-05 Transkodierverfahren und -vorrichtung

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP07117956A EP2045800A1 (de) 2007-10-05 2007-10-05 Transkodierverfahren und -vorrichtung

Publications (1)

Publication Number Publication Date
EP2045800A1 (de) 2009-04-08

Family

ID=39110850

Family Applications (1)

Application Number Title Priority Date Filing Date
EP07117956A Withdrawn EP2045800A1 (de) 2007-10-05 2007-10-05 Transkodierverfahren und -vorrichtung

Country Status (1)

Country Link
EP (1) EP2045800A1 (de)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003003770A1 (en) * 2001-06-26 2003-01-09 Nokia Corporation Method for transcoding audio signals, transcoder, network element, wireless communications network and communications system
US20040102966A1 (en) * 2002-11-25 2004-05-27 Jongmo Sung Apparatus and method for transcoding between CELP type codecs having different bandwidths
US20040111257A1 (en) * 2002-12-09 2004-06-10 Sung Jong Mo Transcoding apparatus and method between CELP-based codecs using bandwidth extension
US6829579B2 (en) * 2002-01-08 2004-12-07 Dilithium Networks, Inc. Transcoding method and system between CELP-based speech codes
US20050053130A1 (en) * 2003-09-10 2005-03-10 Dilithium Holdings, Inc. Method and apparatus for voice transcoding between variable rate coders
US20070124138A1 (en) * 2003-12-10 2007-05-31 France Telecom Transcoding between the indices of multipulse dictionaries used in compressive coding of digital signals
EP1796084A1 (de) * 2004-11-04 2007-06-13 Matsushita Electric Industrial Co., Ltd. Vector conversion device and vector conversion method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105869653A (zh) * 2016-05-31 2016-08-17 华为技术有限公司 Voice signal processing method, related apparatus, and system
WO2017206432A1 (zh) * 2016-05-31 2017-12-07 华为技术有限公司 Voice signal processing method, related apparatus, and system
US10218856B2 (en) 2016-05-31 2019-02-26 Huawei Technologies Co., Ltd. Voice signal processing method, related apparatus, and system

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK RS

AKX Designation fees paid
REG Reference to a national code

Ref country code: DE

Ref legal event code: 8566

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20091009