GB2332841A - Speech communication systems - Google Patents

Speech communication systems

Publication number
GB2332841A
Authority
GB
United Kingdom
Prior art keywords
voice message
words
signal
string
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9727179A
Other versions
GB9727179D0 (en
GB2332841B (en
Inventor
Howard Thomas
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Motorola Solutions UK Ltd
Original Assignee
Motorola Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Ltd filed Critical Motorola Ltd
Priority to GB9727179A priority Critical patent/GB2332841B/en
Publication of GB9727179D0 publication Critical patent/GB9727179D0/en
Publication of GB2332841A publication Critical patent/GB2332841A/en
Application granted granted Critical
Publication of GB2332841B publication Critical patent/GB2332841B/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B1/00 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/66 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission for reducing bandwidth of signals; for improving efficiency of transmission
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0018 Speech coding using phonetic or linguistical decoding of the source; Reconstruction using text-to-speech synthesis

Abstract

Apparatus for compressing a voice message prior to radio transmission includes a speech recogniser 3 and a comparator 6 for identifying those words which are redundant to the meaning of the message and eliminating them from the transmitted signal. Alternatively, longer words may be replaced with shorter synonyms. A syntax extractor 4 produces an error check signal which is used at the receiver to verify the accuracy of the received message, the result being relayed back to the transmitter. The speech recogniser 3 compares each word of a digital bit stream from encoder 1 with stored templates, and recognised words are compared at 6 with templates of redundant words stored at 7. The compressed word chain from comparator 6, along with the output of syntax extractor 4, is modulated at 8 and transmitted. In the receiver (fig.2 not shown), the received signal is demodulated and decoded for reconstruction as speech.

Description

SPEECH COMMUNICATION SYSTEMS
This invention relates to digital communication systems and particularly to the transmission and reception of voice data over an air interface.
In a typical digital radio communication system, there are components that digitally encode and decode speech for communication over radio frequencies. For example, in the GSM (Global System for Mobile Communications) a speech transcoder provides the encoding and decoding ability in one component and is sometimes referred to as a speech codec.
Speech encoders are designed to utilise techniques which exploit the redundancy in the speech signal in order to reduce the number of bits required to represent the speech signal. This is an important consideration when large quantities of speech are to be held in storage media (such as a voice mail system) or when a limited bandwidth is available to transmit the speech signal over a telecommunications channel.
The present invention aims to provide a more efficient means for transmission of voice messages than is presently possible using known techniques. This is achieved by extracting the essential meaning of a voiced message or spoken text and rejecting any redundant information prior to transmission.
Accordingly, the invention comprises, in a first aspect, voice message transmission apparatus including: means for digitising an input analogue speech signal comprising the voice message to produce a first digital output signal representing a string of words, means for recognising individual words comprising said string of words, and means for eliminating from said string of words those words which are redundant to the meaning of the voice message to produce a second digital output signal representing a modified string of words.
In one embodiment, the apparatus further includes means for modulating onto a carrier said second digital output signal for onward transmission.
Thus, the invention has the advantage that the information content of a voice message can be described and transmitted in a highly compressed manner with a high degree of redundancy being removed from any spoken text.
By this means, the efficiency of transmission of a voice message is greatly enhanced. This contributes to a reduced call-time (or air time in the case of radio communications) and a consequent cost saving. It also lessens the power requirements on transmission and receiving equipment.
The means for digitising an input analogue speech signal may comprise any suitable known speech encoder which, for example, utilises the well-known two-stage process of sampling and quantisation (i.e. pulse-code modulation). Codecs which utilise a non-linear quantisation process or devices which employ adaptive quantisation or differential quantisation, for example, are equally suitable.
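The two-stage sampling and quantisation process mentioned above can be illustrated with a minimal sketch. It assumes a uniform quantiser and an input limited to the range -1.0 to 1.0; the function names, the 8 kHz sample rate and the 8-bit word length are illustrative choices and are not taken from the specification.

```python
import math

def pcm_encode(signal, sample_rate_hz, duration_s, bits):
    """Sample a continuous-time signal and quantise each sample uniformly.

    `signal` is a function of time (seconds) returning an amplitude that is
    expected to lie in [-1.0, 1.0]. All names here are hypothetical.
    """
    levels = 2 ** bits
    n_samples = round(sample_rate_hz * duration_s)
    codes = []
    for n in range(n_samples):
        x = signal(n / sample_rate_hz)            # stage 1: sampling
        x = max(-1.0, min(1.0, x))                # clip to the quantiser range
        # stage 2: map [-1, 1] onto the integer codes 0 .. levels-1 (rounded)
        codes.append(int((x + 1.0) / 2.0 * (levels - 1) + 0.5))
    return codes

def pcm_decode(codes, bits):
    """Map integer codes back to amplitudes in [-1.0, 1.0]."""
    levels = 2 ** bits
    return [c / (levels - 1) * 2.0 - 1.0 for c in codes]

def tone(t):
    """A 440 Hz test tone standing in for the analogue speech input."""
    return math.sin(2.0 * math.pi * 440.0 * t)

codes = pcm_encode(tone, sample_rate_hz=8000, duration_s=0.01, bits=8)
decoded = pcm_decode(codes, bits=8)
```

With a uniform quantiser of this kind the reconstruction error of any in-range sample is bounded by half a quantisation step, which is why increasing the number of bits reduces the distortion.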
The above-mentioned speech encoders all operate directly on the time domain speech signal. Also suitable are other known devices which operate by encoding a modified or transformed version of the speech signal, for example, linear predictive coders.
The number of bits used for encoding the speech signal may be chosen to be greater for those words which are essential to the meaning of the voice message than for those which are not. These essential words would then be less affected by transmission errors or corruption.
The means for recognising individual words may comprise any suitable speech recogniser. Devices which utilise dynamic time warping or hidden Markov modelling, for example, are equally suitable. Such devices operate a pattern matching process by comparing a processed signal comprising the input word with stored word patterns.
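As an illustration of the pattern matching process, the following is a minimal sketch of dynamic time warping applied to word templates. It assumes each word is represented by a short sequence of scalar features; real recognisers use multi-dimensional acoustic feature vectors, and the template values and function names below are invented for the example.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognise(features, templates):
    """Return the template word whose stored pattern is closest to the input."""
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))

# Hypothetical stored word patterns (toy one-dimensional features).
TEMPLATES = {
    "at": [1.0, 3.0, 1.0],
    "the": [2.0, 2.0, 4.0, 4.0],
    "john": [5.0, 1.0, 5.0, 1.0],
}

# The same "at" pattern spoken more slowly: each feature is stretched in time.
spoken = [1.0, 1.0, 3.0, 3.0, 1.0]
best = recognise(spoken, TEMPLATES)
```

The warping step is what lets a slowly spoken word match a template recorded at a different speaking rate.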
The second digital output signal (which represents a modified voice message) may be modulated onto a carrier by any one of several known methods, for example frequency shift keying.
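A minimal sketch of binary frequency shift keying is given below for illustration. The tone frequencies, sample rate and symbol length are assumptions chosen so that each bit period contains a whole number of cycles of either tone; the specification does not prescribe any of these values, and the simple energy detector stands in for a practical demodulator.

```python
import math

def fsk_modulate(bits, f0_hz=1200.0, f1_hz=2400.0,
                 sample_rate_hz=48000, samples_per_bit=40):
    """Phase-continuous binary FSK: bit 0 -> f0_hz tone, bit 1 -> f1_hz tone."""
    phase = 0.0
    samples = []
    for bit in bits:
        step = 2.0 * math.pi * (f1_hz if bit else f0_hz) / sample_rate_hz
        for _ in range(samples_per_bit):
            samples.append(math.sin(phase))
            phase += step
    return samples

def fsk_demodulate(samples, f0_hz=1200.0, f1_hz=2400.0,
                   sample_rate_hz=48000, samples_per_bit=40):
    """Non-coherent detection: pick the tone with more energy in each bit period."""
    def tone_energy(chunk, freq_hz):
        w = 2.0 * math.pi * freq_hz / sample_rate_hz
        i = sum(s * math.cos(w * n) for n, s in enumerate(chunk))
        q = sum(s * math.sin(w * n) for n, s in enumerate(chunk))
        return i * i + q * q

    bits = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        chunk = samples[start:start + samples_per_bit]
        bits.append(1 if tone_energy(chunk, f1_hz) > tone_energy(chunk, f0_hz) else 0)
    return bits

payload = [1, 0, 1, 1, 0]
waveform = fsk_modulate(payload)
recovered = fsk_demodulate(waveform)
```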
The modulated signal may then be transmitted over a communication channel in accordance with the usual practice which may utilise multiplexing techniques such as frequency division multiple access or time division multiple access.
The modulated, transmitted signal may then be detected and demodulated by conventional receiving apparatus to reproduce the voice message. Alternatively, the voice message may be audibly reconstructed by means of a speech synthesizer. One known way of generating synthetic speech firstly requires a set of control parameters for producing a particular utterance. These parameters can conveniently be derived by analysis of the original voice message by the speech encoder. In the case of linear prediction, the analysis process is automatic and provided that the prediction error signal is adequately reproduced, the resulting resynthesised speech can be of high quality and virtually indistinguishable from the original.
In one embodiment, the voice transmission apparatus further includes speech understanding means. Known speech understanding systems typically incorporate several interacting knowledge sources such as acoustic, phonetic and syntactic knowledge sources. This knowledge is usually in the form of a set of rules for each knowledge source. One type of speech understanding system suitable for this application is a speech processor which is adapted to extract from the first digital output signal information relating to the "prosody" of the voice message. In this context "prosody" means and includes, for example, pitch, syntax, intonation, emphasis, rhythm and spectral characteristics. Some or all of this information may be encoded and transmitted along with the transmitted modified voice message for reception by the receiving apparatus.
Therefore, in a second aspect, the present invention comprises voice message receiving apparatus including: means for receiving via a communication link, a communications signal comprising a first data signal representing a voice message and a second data signal representing "prosody" of the voice message, means for decoding the first data signal for reproducing the voice message, and means for comparing the first and second data signals to generate an error signal.
By virtue of this second aspect of the invention, any corruption of the original voice message which has occurred in the transmission process can be detected in a receiver by comparing the prosody information content of the original message with the received voice message.
In a further embodiment, the error signal may be transmitted back to the transmission apparatus. On reception of the error signal, the transmission apparatus is adapted to, for example, retransmit the voice message over a different communications channel, or adjust the channel modulation level or the relevant encoding parameters (e.g. bit rate) and transmit a reconfigured signal. This process can be repeated until an acceptable error signal is derived at the receiving apparatus. This error measurement can be used in the same way as the RXQUAL error rate parameter in a GSM system, for purposes including handover, power control and network quality.
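The feedback loop described above can be sketched as follows. This is only an illustration under stated assumptions: `send` is a hypothetical callable standing in for one complete transmit, receive and error-report cycle, `toy_link` is an invented deterministic error model, and the channel names, bit rates and threshold are made up for the example.

```python
def retransmit_until_acceptable(send, candidate_params, max_error):
    """Try successive transmission parameter sets until the reported error is acceptable.

    Returns (params, error, attempts), or (None, None, attempts) if every
    candidate parameter set was tried without success.
    """
    attempts = 0
    for params in candidate_params:
        attempts += 1
        error = send(params)          # error signal fed back by the receiver
        if error <= max_error:
            return params, error, attempts
    return None, None, attempts

def toy_link(params):
    """Invented error model: channel 'ch2' is cleaner, higher bit rates help."""
    base_error = {"ch1": 0.30, "ch2": 0.12}[params["channel"]]
    return base_error * 8000 / params["bit_rate"]

# Options mirroring the text: change channel first, then raise the encoder bit rate.
options = [
    {"channel": "ch1", "bit_rate": 8000},
    {"channel": "ch2", "bit_rate": 8000},
    {"channel": "ch2", "bit_rate": 16000},
]
chosen, error, attempts = retransmit_until_acceptable(toy_link, options, max_error=0.1)
```

Ordering the candidate parameter sets from cheapest to most expensive keeps the air-time cost of a retransmission as low as the link allows.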
Some embodiments of the invention will now be described, by way of example only, with reference to the following drawings of which: Figure 1 is a schematic block diagram of voice message transmission apparatus in accordance with the invention and Figure 2 is a schematic block diagram of voice message receiving apparatus in accordance with the invention.
In Figure 1, a speech encoder 1 has an input on line 2 comprising a voice message in analogue form. An output of the speech encoder 1 is connected to a speech recogniser 3, a syntax extractor 4 and a memory 5.
An output of the speech recogniser 3 is connected to a first input of a comparator 6. The second input of the comparator 6 is connected to word store 7 and a third input to an output of the syntax extractor 4. The output of the comparator 6 is connected to the memory 5 whose output along with a second output from the syntax extractor 4 is connected to modulator 8.
The modulator's output signal on line 9 is transmitted over an air interface by means of an antenna 10 via a transmit/receive duplexer 11, for reception by the apparatus of Figure 2. A second output of the duplexer 11 is connected to an error rate detector 12 whose output is fed to a second input of the speech encoder 1.
In Figure 2, an incoming signal is received by a second antenna 13 and passed through a second transmit/receive duplexer 14 to a demodulator 15. The demodulator 15 has two outputs. A first output on line 16 (relating to syntax) is input to an error signal generator 17. A second output on line 18 representing voice data is connected to a decoder 19 and to a second input of the error signal generator 17. The output of the decoder 19 is connected via line 20 to a loudspeaker (not shown) via suitable conditioning electronics (not shown).
In operation, with reference to Figure 1, a speech signal comprising a voice message is encoded into digital form by the speech encoder 1. The output from the speech encoder 1 consists of a digital bit stream which represents a string of words comprising the voice message. This bitstream is operated on by the syntax extractor 4 which generates an output error check signal for transmission via the modulator 8. The syntax extractor is programmed to extract the meaning of the voice message.
The bit-stream is also operated on by the speech recogniser 3 which performs a pattern matching exercise comparing each word (digitally represented) in the word string with stored templates. Each recognised word is then fed into the comparator 6 for filtering out unnecessary utterances. Also connected to the comparator 6 is the word store 7 in which are stored templates of words which are generally redundant to the meaning of a spoken message. In this example, the words stored are "the", "on", "in", "at" and meaningless utterances such as "um" and "er".
Words such as these are looked for in the voice message word string by the comparator 6 and rejected.
As an example, consider the voice message "John will meet you at the conference in Cannes on February 17". Deleting the words "at", "the", "in", "on" still leaves the message comprehensible and accurate yet considerably reduces the overall duration of the message. (The syntax extractor output will also be indicative of the sense of a modified message).
Words that are to be retained in the message are held in the memory 5 and those to be rejected are not stored. The action of the memory 5 is under the control of the comparator 6.
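The filtering performed by the comparator 6, word store 7 and memory 5 can be sketched at the text level as follows. The sketch assumes the recogniser has already produced the words as strings; the function and variable names are illustrative, and the list returned plays the role of the retained words held in the memory.

```python
# Contents of the word store, taken from the example in the description.
WORD_STORE = {"the", "on", "in", "at", "um", "er"}

def eliminate_redundant(words, word_store=WORD_STORE):
    """Keep only words that are not listed in the store of redundant words."""
    memory = []                       # retained words, in their original order
    for word in words:
        if word.lower() not in word_store:
            memory.append(word)       # rejected words are simply never stored
    return memory

message = "John will meet you at the conference in Cannes on February 17".split()
compressed = eliminate_redundant(message)
# compressed == ["John", "will", "meet", "you", "conference", "Cannes", "February", "17"]
```

For the example message, twelve words are reduced to eight, which is the saving the description refers to.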
When the comparator 6 has completed its word filtering process, the modified voice message (now considerably compressed into fewer words and therefore fewer bits of data) is fed out of the memory 5 and modulated onto a carrier by the modulator 8. The output from the syntax extractor 4 is also modulated onto a carrier by the modulator 8 and the resulting RF signals are transmitted via the duplexer 11 and antenna 10 in accordance with conventional techniques.
With reference to Figure 2 the transmitted signal is received and demodulated in the demodulator 15. The digital data stream output by the demodulator 15 and representing the voice message is decoded by the speech decoder 19 for reconstruction as an audible voice message. The digital data stream representing the error check signal is sent to the error signal generator 17. This device uses the information received from the syntax extractor 4 to search for corruption of the received voice message which is also input to the error signal generator 17 from the demodulator 15. If a syntax error is detected, then this fact is relayed back to the transmission apparatus of Figure 1. The voice message is then retransmitted using different transmission parameters. One option is to retransmit the same message over a different communications channel.
Another option is to re-encode the voice message by increasing the bit rate of the speech encoder 1. A further option is to alter the modulation level in the modulator 8.
On reception of the retransmitted signal, the error signal generator 17 performs a further error analysis. Depending on the result, the error signal generator 17 can signal to the transmitting apparatus and to the recipient of the reconstructed audio message that the transmission is valid or it can request the transmitting apparatus to change the variable parameters and retransmit until the error level is acceptable.
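The receiver-side check can be illustrated with a sketch. The specification does not define the format of the error check signal, so the following assumes, purely for illustration, that it is a sequence of coarse word-class tags produced from a small lexicon; `TOY_LEXICON`, `syntax_signature` and `error_signal` are hypothetical names, and the receiver is assumed to recompute the tags from the words it received and count disagreements.

```python
# Hypothetical lexicon mapping known words to coarse word classes.
TOY_LEXICON = {
    "john": "NOUN", "will": "AUX", "meet": "VERB", "you": "PRON",
    "conference": "NOUN", "cannes": "NOUN", "february": "NOUN",
}

def syntax_signature(words):
    """Assumed form of the error check signal: one word-class tag per word."""
    tags = []
    for word in words:
        if word.isdigit():
            tags.append("NUM")
        else:
            tags.append(TOY_LEXICON.get(word.lower(), "UNK"))
    return tags

def error_signal(received_words, received_signature):
    """Fraction of positions where the recomputed tags disagree with the sent ones."""
    local = syntax_signature(received_words)
    length = max(len(local), len(received_signature))
    if length == 0:
        return 0.0
    mismatches = sum(1 for a, b in zip(local, received_signature) if a != b)
    mismatches += abs(len(local) - len(received_signature))   # missing or extra words
    return mismatches / length

sent = "John will meet you conference Cannes February 17".split()
signature = syntax_signature(sent)               # transmitted alongside the message

clean = error_signal(sent, signature)
corrupted = "John will meat you conference Cannes February 17".split()
damaged = error_signal(corrupted, signature)     # "meat" is not a known verb here
```

A non-zero result would be relayed back to the transmitting apparatus, which could then change its parameters and retransmit.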
In an alternative embodiment, the word store 7 is also provided with synonyms which may be used to replace longer words having the same meaning.
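A minimal sketch of the synonym substitution is given below. The synonym pairs are invented for the example, and the rule of substituting only when the replacement is strictly shorter is an assumption made for the sketch rather than a requirement stated in the description.

```python
# Hypothetical synonym entries held alongside the redundant-word templates.
SYNONYMS = {
    "approximately": "about",
    "purchase": "buy",
    "commence": "start",
}

def shorten(words, synonyms=SYNONYMS):
    """Replace a word with its stored synonym when that synonym is shorter."""
    result = []
    for word in words:
        replacement = synonyms.get(word.lower())
        if replacement is not None and len(replacement) < len(word):
            result.append(replacement)
        else:
            result.append(word)
    return result

shortened = shorten("We commence at approximately noon".split())
# shortened == ["We", "start", "at", "about", "noon"]
```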

Claims (11)

  1. Voice message transmission apparatus including: means for digitising an input analogue speech signal comprising the voice message to produce a first digital output signal representing a string of words, means for recognising individual words comprising said string of words, and means for eliminating from said string of words those words which are redundant to the meaning of the voice message to produce a second digital output signal representing a modified string of words.
  2. Voice message transmission apparatus according to Claim 1 and further including speech processing means for extracting from the first digital output signal, information relating to the prosody of the voice message.
  3. Voice message transmission apparatus according to Claim 1 and further including speech processing means for extracting from the first digital output signal, information relating to the syntax of the voice message.
  4. Voice transmission apparatus according to any preceding Claim and further including means for modulating onto a carrier digital data representative of the voice message for radio transmission.
  5. Voice message receiving apparatus including: means for receiving via a communication link a communications signal comprising a first data signal representing a voice message and a second data signal representing prosody of the voice message, means for decoding the first data signal for reproducing the voice message, and means for comparing the first and second data signals to generate an error signal.
  6. Voice message receiving apparatus including: means for receiving via a communication link a first data signal representing a transmitted voice message and a second data signal relating to the syntax of the transmitted voice message, and means for extracting syntax information from the transmitted voice message as received and comparing the extracted syntax information with the second data signal.
  7. A method for compressing a voice message signal for transmission and reception via a telecommunications channel including the steps: digitising an input analogue speech signal comprising the voice message signal to produce a first digital output signal representing a string of words, recognising individual words comprising said string of words, and eliminating from said string of words those words which are redundant to the meaning of the voice message to produce a second digital output signal representative of a compressed version of the voice message.
  8. A method according to Claim 5 and further including steps of: extracting prosody information from the first digital output signal, and comparing said compressed version of the voice message with said prosody information to produce an error signal.
  9. Voice message transmission apparatus substantially as hereinbefore described with reference to Figure 1.
  10. Voice message receiving apparatus substantially as hereinbefore described with reference to Figure 2.
  11. A method of transmitting and receiving a voice message signal substantially as hereinbefore described with reference to the drawings.
GB9727179A 1997-12-24 1997-12-24 Speech communication systems Expired - Fee Related GB2332841B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB9727179A GB2332841B (en) 1997-12-24 1997-12-24 Speech communication systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB9727179A GB2332841B (en) 1997-12-24 1997-12-24 Speech communication systems

Publications (3)

Publication Number Publication Date
GB9727179D0 GB9727179D0 (en) 1998-02-25
GB2332841A GB2332841A (en) 1999-06-30
GB2332841B GB2332841B (en) 2002-10-30

Family

ID=10824124

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9727179A Expired - Fee Related GB2332841B (en) 1997-12-24 1997-12-24 Speech communication systems

Country Status (1)

Country Link
GB (1) GB2332841B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1103954A1 (en) * 1998-10-26 2001-05-30 Lancaster Equities Limited Digital speech acquisition, transmission, storage and search system and method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
INSPEC abstract of Journal article "System for transmitting high quality digital voice messages..." in Telecommunications vol.2, no.7, p.38-9, July 1968, USA. *
INSPEC abstract of Journal article "Reduction of data flow.." by Becker in International Conference on remote data processing p.65 by Colloque International sur la Teleinformatique, Paris, France, 1969 *
JAPIO abstract of Japanese patent JP 040077044 (NIPPON) 11.3.92 *

Also Published As

Publication number Publication date
GB9727179D0 (en) 1998-02-25
GB2332841B (en) 2002-10-30

Similar Documents

Publication Publication Date Title
US5778335A (en) Method and apparatus for efficient multiband celp wideband speech and music coding and decoding
CN1878001B (en) Apparatus and method of encoding audio data, and apparatus and method of decoding encoded audio data
KR100357254B1 (en) Method and Apparatus for Generating Comfort Noise in Voice Numerical Transmission System
EP1101289B1 (en) Method for inserting auxiliary data in an audio data stream
EP1498873B1 (en) Improved excitation for higher band coding in a codec utilizing frequency band split coding methods
US5873059A (en) Method and apparatus for decoding and changing the pitch of an encoded speech signal
CN1809872B (en) Apparatus and method for encoding an audio signal and apparatus and method for decoding an encoded audio signal
KR100574031B1 (en) Speech Synthesis Method and Apparatus and Voice Band Expansion Method and Apparatus
US4815134A (en) Very low rate speech encoder and decoder
RU2144261C1 (en) Transmitting system depending for its operation on different coding
MXPA05000285A (en) Method and device for efficient in-band dim-and-burst signaling and half-rate max operation in variable bit-rate wideband speech coding for cdma wireless systems.
EP1262956A3 (en) Signal encoding method and apparatus
JP2006099124A (en) Automatic voice/speaker recognition on digital radio channel
KR970022701A (en) Voice encoding method and apparatus
US20080052085A1 (en) Sound encoder and sound decoder
JP4805506B2 (en) Predictive speech coder using coding scheme patterns to reduce sensitivity to frame errors
US7016832B2 (en) Voiced/unvoiced information estimation system and method therefor
US6104994A (en) Method for speech coding under background noise conditions
Ramprashad A two stage hybrid embedded speech/audio coding structure
EP1020848A2 (en) Method for transmitting auxiliary information in a vocoder stream
GB2332841A (en) Speech communication systems
EP1159738B1 (en) Speech synthesizer based on variable rate speech coding
Ding Wideband audio over narrowband low-resolution media
KR100469270B1 (en) Device and Method for processing speech signal in communication system
WO1996011531A2 (en) Transmission system utilizing different coding principles

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20061224