CA2293165A1 - Method for transmitting data in wireless speech channels - Google Patents

Method for transmitting data in wireless speech channels

Info

Publication number
CA2293165A1
CA2293165A1 CA002293165A CA2293165A CA2293165A1 CA 2293165 A1 CA2293165 A1 CA 2293165A1 CA 002293165 A CA002293165 A CA 002293165A CA 2293165 A CA2293165 A CA 2293165A CA 2293165 A1 CA2293165 A1 CA 2293165A1
Authority
CA
Canada
Prior art keywords
speech
vocoder
output
codebook
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002293165A
Other languages
French (fr)
Inventor
Michael Charles Recchione
Steven A. Benno
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia of America Corp
Original Assignee
Lucent Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lucent Technologies Inc filed Critical Lucent Technologies Inc
Publication of CA2293165A1 publication Critical patent/CA2293165A1/en
Abandoned legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/018Audio watermarking, i.e. embedding inaudible data in the audio signal
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16Vocoder architecture
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals

Abstract

Non-speech information is sent in the bits allocated to the output of one or both of a vocoder's codebooks by setting the gain for the corresponding codebook to zero.

With the gain set to zero, the codebook output will not be interpreted by the receiving vocoder. In this way, it is possible to transmit additional information in a way that is totally transparent to the vocoder. Applications of this technique for sending "secret" messages include, but are not limited to, transmitting parameters for generating non-speech signals. As an example, information to generate call waiting tones, DTMF, or TTY/TDD characters can be clandestinely embedded in the compressed bit stream so that these non-speech tones can be regenerated.

Description

METHOD FOR TRANSMITTING DATA IN WIRELESS SPEECH CHANNELS
Background of the Invention
1. Field of the Invention
The present invention relates to telecommunications and, more particularly, to transmitting data in wireless speech channels.
2. Description of the Prior Art
A voice encoder/decoder (vocoder) is used to compress voice signals so as to reduce the transmission bandwidth over a communications channel. By reducing the bandwidth per call, it becomes possible to place more calls over the same channel. There exists a class of vocoders known as code excited linear prediction (CELP) vocoders. In these vocoders, the speech is modeled by a series of filters. The parameters of these filters can be transmitted in far fewer bits than the original speech. It is also necessary to transmit the input (or excitation) to these filters in order to reconstruct the original speech.
Because it would require too much bandwidth to transmit the excitation directly, a crude approximation is made by replacing the excitation with a few non-zero pulses. The locations of these pulses can be transmitted using very few bits, and this crude approximation to the original excitation is adequate to reproduce high quality speech. The excitation is represented by a fixed codebook contribution and an associated gain. The quasi-periodicity found in speech is represented by an adaptive codebook output and an associated gain. The fixed codebook output and its associated gain, the adaptive codebook output and its associated gain, and the filter parameters (also known as linear predictive coder parameters) are transmitted to represent the encoded speech signal.
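For readers who prefer to see the transmitted quantities laid out explicitly, the following is a minimal sketch, in Python, of how the per-frame parameters just listed might be grouped. The field names and types are illustrative assumptions, not values taken from the patent or from any particular CELP standard.

```python
# Illustrative sketch only: field names are assumptions, not taken from the
# patent or from any specific CELP standard.
from dataclasses import dataclass
from typing import List

@dataclass
class CelpFrame:
    """Parameters a CELP encoder transmits for one frame of speech."""
    lpc_coefficients: List[float]  # linear predictive coder (filter) parameters
    pitch_index: int               # adaptive codebook index (pitch delay)
    pitch_gain: float              # adaptive codebook gain
    fixed_index: int               # fixed codebook index (pulse positions)
    fixed_gain: float              # fixed codebook gain
```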
Vocoders were initially designed to compress speech by modeling its characteristics and transmitting the parameters of that model in far fewer bits than the speech itself would require. As wireless phones become more commonplace, people increasingly expect to use them for the same range of non-speech applications as traditional landline phones, such as accessing voice mail and receiving call waiting tones.
Recently, the FCC has mandated that text telephones for the hearing impaired (TTY/TDD) work with digital cellular phones. The problem with non-speech applications is that they do not fit the vocoder's speech model. When non-speech signals are passed through the vocoder, the decoded result is not always acceptable. The problem is further exacerbated by the fact that wireless phones operate in an error-prone environment: to recover from random transmission errors, the vocoder relies on its speech model, and because non-speech signals do not match this model, the reconstruction is inadequate.
Summary of the Invention
The present invention sends information in the bits allocated to the output of one or both of the codebooks by setting the gain for the corresponding codebook to zero.
With the gain set to zero, the codebook output will not be interpreted by the receiving vocoder. In this way, it is possible to transmit additional information in a way that is totally transparent to the vocoder. Applications of this technique for sending "secret" messages include, but are not limited to, transmitting parameters for generating non-speech signals. As an example, information to generate call waiting tones, DTMF, or TTY/TDD characters can be clandestinely embedded in the compressed bit stream so that these non-speech tones can be regenerated.
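As a rough illustration of this bit-substitution idea, the sketch below (building on the illustrative CelpFrame structure above) replaces one codebook index with arbitrary data bits and forces the matching gain to zero. It is an assumption-laden sketch, not the patented implementation.

```python
# Hypothetical embedding step: the data rides in a codebook's index bits, and the
# matching gain is clamped to zero so the receiving vocoder's synthesis filter
# never "hears" the substituted bits. CelpFrame is the illustrative structure
# defined earlier in this description.
def embed_data(frame: CelpFrame, data_bits: int, use_fixed_codebook: bool = True) -> CelpFrame:
    if use_fixed_codebook:
        frame.fixed_index = data_bits   # non-speech payload in the fixed codebook bits
        frame.fixed_gain = 0.0          # zero gain: payload never excites the filter
    else:
        frame.pitch_index = data_bits   # alternatively, use the adaptive codebook bits
        frame.pitch_gain = 0.0
    return frame
```

Because the gain is zero, a legacy receiver simply synthesizes nothing from that codebook's contribution, while a receiver aware of the convention can read the index bits as data.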
Brief Description of the Drawings
FIG. 1 is a block diagram of a typical vocoder;
FIG. 2 illustrates the major functions of encoder 14 of vocoder 10; and
FIG. 3 is a functional block diagram of decoder 20 of vocoder 10.
Detailed Description
FIG. 1 illustrates a block diagram of a typical vocoder. Vocoder 10 receives digitized speech on input 12. The digitized speech is an analog speech signal that has been passed through an analog-to-digital converter and broken into frames, where each frame is typically on the order of 20 milliseconds. The signal at input 12 is passed to encoder section 14, which encodes the speech so as to decrease the amount of bandwidth used to transmit the speech.
The encoded speech is made available at output 16 and is received by the decoder section of a similar or identical vocoder at the other end of the communication channel. Likewise, encoded speech is received by vocoder 10 through input 18 and is passed to decoder section 20. Decoder section 20 uses the encoded signals received from the transmitting vocoder to produce digitized speech at output 22.
Vocoders are well known in the communications art. For example, vocoders are described in "Speech and Audio Coding for Wireless and Network Applications," edited by Bishnu S. Atal, Vladimir Cuperman, and Allen Gersho, Kluwer Academic Publishers, 1993.
Vocoders are widely available and manufactured by companies such as Qualcomm Incorporated of San Diego, California, and Lucent Technologies Inc., of Murray Hill, New Jersey.
FIG. 2 illustrates the major functions of encoder 14 of vocoder 10. A digitized speech signal is received at input 12 and is passed to linear predictive coder 40.
Linear predictive coder 40 performs a linear predictive analysis of the incoming speech once per frame. Linear predictive analysis is well known in the art and produces a linear predictive synthesis model of the vocal tract based on the input speech signal. The linear predictive parameters, or coefficients, describing this model are transmitted as part of the encoded speech signal through output 16. Coder 40 uses this model to produce a residual speech signal, which represents the excitation that the model uses to reproduce the input speech signal. The residual speech signal is made available at output 42 and is provided to input 48 of open-loop pitch search unit 50, to an input of adaptive codebook unit 72, and to fixed codebook unit 82.
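A compact sketch of the analysis performed by coder 40 is given below. It assumes numpy and scipy are available and uses a plain autocorrelation/Levinson-Durbin recursion, which is one common way to carry out linear predictive analysis rather than the specific method of any particular vocoder.

```python
# Sketch of linear predictive analysis for one frame (coder 40): estimate an
# order-p all-pole model and filter the frame through the prediction-error
# filter A(z) to obtain the residual used by the codebook searches.
import numpy as np
from scipy.signal import lfilter

def lpc_analysis(frame: np.ndarray, order: int = 10):
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1:n + order]  # autocorrelation r[0..order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12                 # small bias guards against an all-zero frame
    for i in range(1, order + 1):      # Levinson-Durbin recursion
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i] += k * a[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)
    residual = lfilter(a, [1.0], frame)  # excitation estimate: frame filtered by A(z)
    return a, residual
```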
Impulse response unit 60 receives the linear predictive parameters from coder 40 and generates the impulse response of the model produced by coder 40. This impulse response is used in the adaptive and fixed codebook units.
Open-loop pitch search unit 50 uses the residual speech signal from coder 40 to estimate the pitch of the speech and provides a pitch value, commonly called the pitch period or pitch delay, at output 52. The pitch delay signal from output 52 and the impulse response signal from output 64 of impulse response unit 60 are received by input 70 of adaptive codebook unit 72.
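One simple way to realize the open-loop pitch search just described is a normalized-autocorrelation search over a plausible lag range; the sketch below follows that approach, with the lag limits chosen as an assumption rather than taken from the patent.

```python
# Rough open-loop pitch search (unit 50): choose the lag that maximizes the
# normalized autocorrelation of the residual. Lag limits are illustrative.
import numpy as np

def open_loop_pitch(residual: np.ndarray, min_lag: int = 20, max_lag: int = 143) -> int:
    best_lag, best_score = min_lag, -np.inf
    for lag in range(min_lag, min(max_lag, len(residual) - 1) + 1):
        x, y = residual[lag:], residual[:-lag]
        score = np.dot(x, y) / (np.sqrt(np.dot(x, x) * np.dot(y, y)) + 1e-12)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag   # reported to adaptive codebook unit 72 as the pitch delay
```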
Adaptive codebook unit 72 produces a pitch gain output and a pitch index output, which become part of encoded speech output 16 of vocoder 10. Output 74 of adaptive codebook 72 also provides the pitch gain and pitch index signals to input 80 of fixed codebook unit 82.

Additionally, adaptive codebook 72 provides an excitation signal and an adaptive codebook target signal to input 80.
Adaptive codebook 72 produces its outputs using the digitized speech signal from input 12 and the residual speech signal produced by linear predictive coder 40. Adaptive codebook 72 uses these two signals to form an adaptive codebook target signal. The adaptive codebook target signal is used as an input to fixed codebook 82 and as an input to the computation that produces the pitch gain, pitch index, and excitation outputs of adaptive codebook unit 72. Additionally, the adaptive codebook target signal, the pitch delay signal from open-loop pitch search unit 50, and the impulse response from impulse response unit 60 are used to produce the pitch index, pitch gain, and excitation signals which are passed to fixed codebook unit 82. The manner in which these signals are computed is well known in the vocoder art.
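The sketch below gives one simplified reading of the adaptive codebook step: the candidate vector is the past excitation delayed by the pitch lag, and the gain is its least-squares projection onto the target. Real CELP coders refine the lag and apply a perceptually weighted synthesis filter, which is omitted here; the function and its arguments are assumptions made for illustration.

```python
# Simplified adaptive codebook contribution (unit 72): delayed past excitation
# plus a least-squares gain against the adaptive codebook target.
import numpy as np

def adaptive_codebook(past_excitation: np.ndarray, target: np.ndarray, pitch_lag: int):
    n = len(target)
    vector = past_excitation[-pitch_lag:]
    if len(vector) < n:                          # short lags: periodically extend
        vector = np.tile(vector, int(np.ceil(n / len(vector))))
    vector = vector[:n]
    gain = float(np.dot(target, vector) / (np.dot(vector, vector) + 1e-12))
    return vector, gain                          # contribution is gain * vector
```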
Fixed codebook 82 uses the inputs received from input 80 to produce a fixed gain output and a fixed index output, which are used as part of the encoded speech at output 16. The fixed codebook unit attempts to model the stochastic part of linear predictive coder 40's residual speech signal. A target for the fixed codebook search is produced by determining the error between the current adaptive codebook target signal and the residual speech signal. The fixed codebook search produces the fixed gain and fixed index signals for excitation pulses so as to minimize this error. The manner in which the fixed gain and fixed index signals are computed using the outputs from adaptive codebook unit 72 is well known in the vocoder art.
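As a toy illustration of the search described above, the sketch below exhaustively scores each entry of a small fixed codebook against the part of the target the adaptive codebook could not explain. Algebraic pulse codebooks and perceptual weighting, which practical coders use, are deliberately omitted.

```python
# Toy fixed codebook search (unit 82): pick the entry and least-squares gain
# that minimize the error left after the adaptive codebook contribution.
import numpy as np

def fixed_codebook_search(codebook: np.ndarray, target_residual: np.ndarray):
    best_index, best_gain, best_err = 0, 0.0, np.inf
    for index, entry in enumerate(codebook):      # codebook shape: (num_entries, frame_len)
        gain = float(np.dot(target_residual, entry) / (np.dot(entry, entry) + 1e-12))
        err = float(np.sum((target_residual - gain * entry) ** 2))
        if err < best_err:
            best_index, best_gain, best_err = index, gain, err
    return best_index, best_gain                  # become the fixed index and fixed gain
```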
Switches 90 and 92 are used to send data in place of the bits that are used to send the fixed codebook output and the adaptive codebook output, respectively. When the contacts of the switches are in position "A", the associated codebook output is replaced by data or other information and the associated codebook gain is set to zero or substantially zero. As a result, the scaled codebook output or excitation produced at a receiver will be zero or substantially zero and therefore will not have an adverse effect on the filter being used by the receiving vocoder to model the speech that is normally transmitted.
FIG. 3 illustrates a functional block diagram of decoder 20 of vocoder 10.
Encoded speech signals are received at input 18 of decoder 20 and are passed to decoder 100. Decoder 100 produces fixed and adaptive code vectors corresponding to the fixed index and pitch index signals, respectively. These code vectors are passed to the excitation construction portion of unit 110 along with the pitch gain and fixed gain signals.
The pitch gain signal is used to scale the adaptive vector, which was produced using the pitch index signal, and the fixed gain signal is used to scale the fixed vector, which was obtained using the fixed index signal. Decoder 100 passes the linear predictive coder parameters to the filter, or model synthesis, section of unit 110. Unit 110 then uses the scaled vectors to excite the filter that is synthesized using the linear predictive coefficients produced by linear predictive coder 40, and produces an output signal which is representative of the digitized speech originally received at input 12. Optionally, post filter 120 may be used to shape the spectrum of the digitized speech signal that is produced at output 22.
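A condensed sketch of this decoder path is shown below: the two code vectors are scaled by their gains, summed into an excitation, and passed through the all-pole synthesis filter built from the received linear predictive coefficients. Post filter 120 is omitted, and the function signature is an assumption made for illustration.

```python
# Condensed decoder sketch (unit 110): scale, sum, and synthesize.
import numpy as np
from scipy.signal import lfilter

def decode_frame(a: np.ndarray, adaptive_vec: np.ndarray, pitch_gain: float,
                 fixed_vec: np.ndarray, fixed_gain: float) -> np.ndarray:
    excitation = pitch_gain * adaptive_vec + fixed_gain * fixed_vec
    return lfilter([1.0], a, excitation)   # all-pole synthesis filter 1/A(z)
```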
When data rather than speech information is being transmitted, the pitch index (the adaptive codebook output) and/or the fixed index (the fixed codebook output) are used to receive the data. The effect of this non-speech data on the filter synthesized by unit 110 is eliminated because the gain value associated with the pitch index or fixed index is zero.
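The receiver-side counterpart of the embedding sketch given earlier might look as follows. Treating a substantially zero gain as the marker for embedded data is an assumption of this sketch; in practice the endpoints would agree on a convention for distinguishing data frames from ordinary frames that happen to carry a very small gain.

```python
# Hypothetical extraction step at the receiver: if a codebook gain is
# (substantially) zero, interpret the corresponding index bits as embedded data.
# CelpFrame is the illustrative structure defined earlier; the threshold is a guess.
def extract_data(frame: CelpFrame, threshold: float = 1e-6):
    payload = []
    if abs(frame.fixed_gain) <= threshold:
        payload.append(("fixed", frame.fixed_index))
    if abs(frame.pitch_gain) <= threshold:
        payload.append(("adaptive", frame.pitch_index))
    return payload   # empty list: the frame carried ordinary speech parameters
```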
The functional block diagrams can be implemented in various forms. Each block can be implemented individually using microprocessors or microcomputers, or they can be implemented together using a single microprocessor or microcomputer. It is also possible to implement each or all of the functional blocks using programmable digital signal processing devices or specialized devices available from the aforementioned manufacturers or other semiconductor manufacturers.

Claims (5)

1. A method for transmitting non-speech information over a speech channel, CHARACTERIZED BY the steps of:
transmitting non-speech information in place of pitch index information; and transmitting a pitch gain value having a value of substantially zero.
2. The method of claim 1, CHARACTERIZED IN THAT the non-speech information is DTMF information.
3. The method of claim 1, CHARACTERIZED IN THAT the non-speech information is TTY/TDD information.
4. A method for transmitting non-speech information over a speech channel, CHARACTERIZED BY the steps of:
transmitting first non-speech information in place of fixed index information;
and transmitting an index gain value having a value of substantially zero.
5. The method of claim 4, further CHARACTERIZED BY the steps of:
transmitting second non-speech information in place of pitch index information; and transmitting a pitch gain value having a value of substantially zero.
CA002293165A 1999-01-11 1999-12-30 Method for transmitting data in wireless speech channels Abandoned CA2293165A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US22810299A 1999-01-11 1999-01-11
US09/228,102 1999-01-11

Publications (1)

Publication Number Publication Date
CA2293165A1 true CA2293165A1 (en) 2000-07-11

Family

ID=22855803

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002293165A Abandoned CA2293165A1 (en) 1999-01-11 1999-12-30 Method for transmitting data in wireless speech channels

Country Status (7)

Country Link
EP (1) EP1020848A2 (en)
JP (1) JP2000209663A (en)
KR (1) KR20000053407A (en)
CN (1) CN1262577A (en)
AU (1) AU6533799A (en)
BR (1) BR0000002A (en)
CA (1) CA2293165A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2002228646A1 (en) * 2000-11-07 2002-05-21 Ericsson Inc. Method of and apparatus for detecting tty type calls in cellular systems
US7310596B2 (en) 2002-02-04 2007-12-18 Fujitsu Limited Method and system for embedding and extracting data from encoded voice code
JP4330346B2 (en) * 2002-02-04 2009-09-16 富士通株式会社 Data embedding / extraction method and apparatus and system for speech code
US7932851B1 (en) * 2002-10-15 2011-04-26 Itt Manufacturing Enterprises, Inc. Ranging signal structure with hidden acquisition code
US7970606B2 (en) 2002-11-13 2011-06-28 Digital Voice Systems, Inc. Interoperable vocoder
US7634399B2 (en) 2003-01-30 2009-12-15 Digital Voice Systems, Inc. Voice transcoder
EP1455509A3 (en) * 2003-03-03 2005-01-05 FREQUENTIS GmbH Method and system for speech recording
US8359197B2 (en) 2003-04-01 2013-01-22 Digital Voice Systems, Inc. Half-rate vocoder
FR2859566B1 (en) * 2003-09-05 2010-11-05 Eads Telecom METHOD FOR TRANSMITTING AN INFORMATION FLOW BY INSERTION WITHIN A FLOW OF SPEECH DATA, AND PARAMETRIC CODEC FOR ITS IMPLEMENTATION
US7752039B2 (en) 2004-11-03 2010-07-06 Nokia Corporation Method and device for low bit rate speech coding
DE102007007627A1 (en) * 2006-09-15 2008-03-27 Rwth Aachen Method for embedding steganographic information into signal information of signal encoder, involves providing data information, particularly voice information, selecting steganographic information, and generating code word
US8036886B2 (en) 2006-12-22 2011-10-11 Digital Voice Systems, Inc. Estimation of pulsed speech model parameters
US11270714B2 (en) 2020-01-08 2022-03-08 Digital Voice Systems, Inc. Speech coding using time-varying interpolation

Also Published As

Publication number Publication date
JP2000209663A (en) 2000-07-28
AU6533799A (en) 2000-07-13
BR0000002A (en) 2002-01-02
CN1262577A (en) 2000-08-09
KR20000053407A (en) 2000-08-25
EP1020848A2 (en) 2000-07-19

Similar Documents

Publication Publication Date Title
EP0920693B1 (en) Method and apparatus for improving the voice quality of tandemed vocoders
JP4927257B2 (en) Variable rate speech coding
KR100923891B1 (en) Method and apparatus for interoperability between voice transmission systems during speech inactivity
EP0848374B1 (en) A method and a device for speech encoding
KR100487943B1 (en) Speech coding
EP1202251A2 (en) Transcoder for prevention of tandem coding of speech
EP1535277B1 (en) Bandwidth-adaptive quantization
JPH11126098A (en) Voice synthesizing method and device therefor, band width expanding method and device therefor
US8055499B2 (en) Transmitter and receiver for speech coding and decoding by using additional bit allocation method
CN1200404C (en) Relative pulse position of code-excited linear predict voice coding
EP1020848A2 (en) Method for transmitting auxiliary information in a vocoder stream
JPH11259100A (en) Method for encoding exciting vector
US20030065507A1 (en) Network unit and a method for modifying a digital signal in the coded domain
AU6203300A (en) Coded domain echo control
JP2001265397A (en) Method and device for vocoding input signal
EP1397655A1 (en) Method and device for coding speech in analysis-by-synthesis speech coders
US20050102136A1 (en) Speech codecs
Choudhary et al. Study and performance of amr codecs for gsm
JP3496618B2 (en) Apparatus and method for speech encoding / decoding including speechless encoding operating at multiple rates
KR20050027272A (en) Speech communication unit and method for error mitigation of speech frames
EP0930608A1 (en) Vocoder with efficient, fault tolerant excitation vector encoding
KR20050059572A (en) Apparatus for changing audio level and method thereof

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued