WO2000046795A1 - Speech synthesizer based on variable rate speech coding - Google Patents
- Publication number
- WO2000046795A1 (PCT/US2000/002900)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- rate
- speech
- variable
- variable rate
- parameters
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/24—Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
Definitions
- the present invention relates to speech synthesis. More particularly, the present invention relates to synthesis of speech encoded by a variable rate vocoder. The invention further relates to use of speech synthesis with wireless communication devices.
- Electronic speech synthesis is useful in a number of applications. More and more, computers and other electronic equipment are providing the option of voiced prompts as a user interface. For example, speech may be utilized for reading electronic mail messages, for generating spoken prompts in a voice response system, or for providing directions to a driver in a vehicle.
- A first type of speech synthesizer is text-to-speech (TTS) based. A TTS based system converts ordinary text into intelligible and natural sounding speech. It is useful for applications needing an automatic conversion of arbitrary input text into intelligible and natural sounding speech output. It is especially useful where large vocabularies and/or dynamically changing data are involved.
- the TTS system is useful in applications such as providing automatic voice alerts and prompts, proofreading, telephone access to databases, and conversion of electronic mail to voice mail or audio output. Because TTS is flexible and powerful, it offers utility in many applications.
- TTS output can have a machine tone if the synthesizer does not closely simulate human speech intonation. Accordingly, TTS is not a practical choice for applications with limited memory and processing resources, such as in small portable wireless devices, remotely located communication devices or computers, and the like.
- a second type of speech synthesizer is Voice Coder (Vocoder) based.
- A vocoder compresses voiced speech, or audio signals, by extracting parameters that relate to a model of human speech generation. Vocoders have been developed to compress input speech that has been digitally converted to a rate of 64 kilobits per second (kbps) down to 13 kbps, 8 kbps, or even lower rates.
- a vocoder based speech synthesizer generates certain parameters of or for the speech to be synthesized. The parameters are stored in some type of memory, preferably flash type, and are decoded upon speech synthesis. Because the parameters of all words to be synthesized need to be stored in memory, vocoder based speech synthesizers are more suitable for applications that do not require large vocabularies. They are especially suitable for systems having limited memory and processing resources.
- the present invention is an apparatus and method for speech synthesis based on variable rate vocoding.
- the speech to be synthesized is encoded by a variable rate vocoder.
- a variable rate vocoder encodes a frame of speech at one of a set of predetermined rates based on the speech activity taking place within the frame of speech.
- the variable rate vocoder is a code excited linear prediction (CELP) encoder having four bit rates.
- an input speech signal is encoded into speech parameters at one of the four rates using a CELP encoding scheme for the selected rate.
- the speech parameters are generally provided to a decoder which performs a variable rate decoding scheme corresponding to the variable rate encoding scheme utilized.
- the decoder produces speech samples, which are provided to a coder-decoder or codec for digital-to-analog conversion.
- the resulting analog signal generated by the codec is then broadcast through a speaker or other known audio output device as synthesized speech.
- the speech synthesizer of the present invention is especially suitable for use in wireless communication systems in which variable rate vocoding is already implemented.
- the existing vocoding resources may be employed for speech synthesis.
- DSP elements, already present or easily incorporated, can be used in conjunction with a small amount of memory to provide the speech synthesizer function.
- a speech synthesizer based on variable rate vocoding is able to provide good speech quality without requiring a large amount of memory.
- the level of compression provided by a variable rate vocoder makes it suitable for applications with limited memory.
- FIG. 1 is a block diagram of a variable rate vocoder.
- FIG. 2 is a block diagram of the speech synthesizer of the present invention.
- the present invention provides an apparatus and method for synthesizing speech which is very useful when used with wireless communication equipment.
- the invention can take advantage of existing signal processing resources in wireless communication equipment or a minimum of additional hardware to synthesize speech in a manner that provides high speech quality and requires a small memory size.
- the present invention is very useful when employed in conjunction with a variety of known communication devices or systems, and it is described below in relation to a CDMA wireless communication system.
- it is contemplated that it is particularly well suited for specific applications, such as hands-free car kits used to mount and operate wireless devices in vehicles.
- This is not a limitation of the present invention, which can also be used with other types of communication devices, including those communicating in wired, wire line, or optical cable type systems, and those using other signal modulation techniques.
- An exemplary wireless communication system makes use of code division multiple access (CDMA) modulation techniques.
- Other multiple access and modulation techniques include time division multiple access (TDMA), frequency division multiple access (FDMA), and amplitude modulation (AM).
- a speech synthesizer may be implemented in wireless communication devices or equipment for a number of reasons.
- Speech synthesis may be part of a voice recognition system in a wireless telephone or a "hands-free" car kit used to support operation in a vehicle.
- A speech synthesizer can provide information in audible form when a device user or operator cannot visually observe an output screen or indicators on the device. For example, information can be provided to allow device operation or output when a vehicle driver or machinery operator cannot safely look closely at the communication device.
- the speech synthesizer would also allow for hands free operation of devices by providing voice prompts for operations to be performed.
- the speech synthesizer may ask for the name of a person to be called, allowing the device to automatically dial a telephone number, or ask for a command to be implemented, such as dialing, storing, opening mail, terminating a call attempt, or shutting down.
- the speech synthesizer of the present invention makes use of the vocoder circuitry already present in a number of wireless devices such as wireless telephones and other products used by communication service subscribers to generate voiced speech.
- the speech synthesizer is based on a variable rate vocoder.
- a variable rate vocoder uses speech activity to vary its instantaneous data rate.
- When the user is speaking, the vocoder encoder uses a large number of bits to encode the speech samples.
- When the user is silent, the vocoder encoder uses fewer bits to encode the background noise.
- An exemplary embodiment of a variable rate vocoder is described in U.S. Patent No. 5,414,796, entitled "Variable Rate Vocoder," assigned to the assignee of the present invention and incorporated herein by reference.
- Variable rate vocoders are commonly used in CDMA type communication systems to increase system capacity by decreasing the number of bits generally used by each communication signal.
- a variable rate vocoder may, for example, be implemented in the CDMA communication system of Patent No. 4,901,307 discussed above.
- different users communicate using the same bandwidth, but using different code channels.
- a variable rate vocoder in a CDMA communication system takes advantage of the fact that a user is only speaking actively about 40% of the time on any given channel. By sending fewer bits when a user is silent, the variable rate vocoder allows more users to share the same bandwidth.
- A schematic block diagram of a typical variable rate vocoder is shown in FIG. 1 and is indicated generally by 100.
- The vocoder shown in FIG. 1 uses four different data rates, although it should be understood that a different number of data rates may be employed instead, as would be known in the art. In the set of four rates, if the peak rate is 13.2 kbps, then full rate corresponds to 13.2 kbps, 1/2 rate corresponds to approximately 6.2 kbps, 1/4 rate corresponds to approximately 2.7 kbps, and 1/8 rate corresponds to approximately 1.0 kbps. Note that the actual bit rates for rates other than the full rate are approximate because of the use of overhead bits, as is well understood in the art. Referring still to FIG. 1, it can be seen that variable rate vocoder 100 includes an encoder 102 and a decoder 104.
- Encoder 102 receives speech samples for frames of speech data as an input, for example, as 8-bit PCM samples at a 64 kbps data rate, in either mu-law or a-law format. Encoder 102 encodes these speech samples into speech parameters at one of four data rates, depending on the speech activity. The input speech samples are also provided to rate determination element 106.
- Rate determination element 106 may implement any of a number of rate decision algorithms.
- energy thresholds relative to the background noise energy level are used to determine the speech activity, and thereby the rate, at which the input samples are to be encoded. If the energy of the current frame of speech samples is far above the background noise energy, then the rate determination element 106 will determine that the frame is to be encoded at full rate. If the energy of the current frame is close to the background noise energy, then rate determination element 106 will determine that the frame is to be encoded at eighth rate, and so forth, as is known.
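The energy-threshold decision described above can be sketched roughly as follows. This is a minimal illustration, not the rate decision algorithm of the referenced vocoder: the function names and the decibel margins in `thresholds_db` are hypothetical values chosen only to make the example concrete.

```python
import math

FULL, HALF, QUARTER, EIGHTH = "full", "half", "quarter", "eighth"


def frame_energy(samples):
    """Mean squared amplitude of one frame of linear PCM samples."""
    return sum(s * s for s in samples) / len(samples)


def select_rate(energy, noise_energy, thresholds_db=(15.0, 9.0, 3.0)):
    """Pick an encoding rate from frame energy relative to background noise.

    thresholds_db holds hypothetical margins (dB above the noise floor) for
    full, half, and quarter rate; they are not taken from the patent.
    """
    # Clamp both energies so a silent frame cannot cause log10(0).
    ratio_db = 10.0 * math.log10(max(energy, 1e-12) / max(noise_energy, 1e-12))
    full_db, half_db, quarter_db = thresholds_db
    if ratio_db >= full_db:
        return FULL      # far above the noise floor: active speech
    if ratio_db >= half_db:
        return HALF
    if ratio_db >= quarter_db:
        return QUARTER
    return EIGHTH        # close to the noise floor: background noise only
```

A frame whose energy is far above the noise estimate maps to full rate, and a frame near the noise floor maps to eighth rate, which matches the two cases the text spells out. The intermediate margins are a design choice that would need tuning.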
- a first mode measure is the target matching signal to noise ratio (TMSNR) from the previous encoding frame, which provides information on how well the encoding model is performing by comparing a synthesized speech signal with the input speech signal.
- a second mode measure is the normalized autocorrelation function (NACF), which measures periodicity in the speech frame.
- A third mode measure is the zero crossings (ZC) parameter.
- A fourth mode measure is the prediction gain differential (PGD).
- a fifth measure is the energy differential (ED), which compares the energy in the current frame to an average frame energy.
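Three of the mode measures listed above are simple enough to sketch directly. The code below is an illustrative approximation only: the lag range searched by `nacf` (20 to 120 samples) and the function names are assumptions, and TMSNR and PGD are omitted because they depend on encoder internals that the text does not describe.

```python
import math


def zero_crossings(frame):
    """ZC: count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def nacf(frame, min_lag=20, max_lag=120):
    """NACF: largest normalized autocorrelation over candidate pitch lags.

    Values near 1 indicate a strongly periodic (voiced) frame. The lag
    range is a hypothetical choice for this sketch.
    """
    best = 0.0
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        head, tail = frame[lag:], frame[:-lag]
        num = sum(a * b for a, b in zip(head, tail))
        den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        if den > 0.0:
            best = max(best, num / den)
    return best


def energy_differential_db(frame, average_energy):
    """ED: frame energy relative to an average frame energy, in dB."""
    energy = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(energy, 1e-12) / max(average_energy, 1e-12))
```

For example, a sinusoid with a 40-sample period yields an NACF close to 1, while an alternating sequence such as `[1, -1, 1, -1]` has a zero crossing between every pair of samples.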
- rate determination logic selects an encoding rate for each frame of input speech data.
- The values of the various mode measures select one of, say, four or more modes in which to operate. That is, the value detected for each mode measure, relative to a threshold or other criterion, determines which encoding rate is selected, based on a preselected pattern or hierarchy. For example, if the value for NACF is less than a pre-selected threshold and ZC is greater than a second pre-selected threshold, one rate could be selected. However, if these conditions are not met but ED is lower than a third threshold, then a quarter rate might be selected.
- Other rate determination schemes may be adopted by rate determination element 106.
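The threshold hierarchy in the example above might look like the sketch below. The text does not say which rate the first condition selects, so the choice of half rate here is an assumption, as are the three threshold values, the fall-through to full rate, and all of the names.

```python
from dataclasses import dataclass


@dataclass
class ModeMeasures:
    nacf: float   # normalized autocorrelation (periodicity)
    zc: int       # zero crossings in the frame
    ed_db: float  # energy differential versus the average frame energy


# Hypothetical thresholds, chosen only to make the example concrete.
NACF_THRESHOLD = 0.4
ZC_THRESHOLD = 60
ED_THRESHOLD_DB = -10.0


def select_rate_from_measures(m: ModeMeasures) -> str:
    """Walk the conditions in a fixed order; the first match decides."""
    # Low periodicity together with many zero crossings: "one rate could be
    # selected". Half rate is assumed here for illustration.
    if m.nacf < NACF_THRESHOLD and m.zc > ZC_THRESHOLD:
        return "half"
    # Otherwise, a frame well below the average energy gets quarter rate.
    if m.ed_db < ED_THRESHOLD_DB:
        return "quarter"
    # Fall-through default (an assumption; the text leaves this open).
    return "full"
```

Ordering the checks as a chain of early returns mirrors the "preselected pattern or hierarchy" wording: later conditions are only consulted when earlier ones fail.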
- a signal indicating the data rate determined by rate determination element 106 is provided to a switch 108.
- Switch 108 selects an element for encoding a frame of input speech samples from among a full rate encoding element 110, a half rate encoding element 112, a quarter rate encoding element 114, and an eighth rate encoding element 116, as designated by the data rate signal.
- the selected encoding element encodes the speech samples to produce a signal of an encoded data packet.
- Rate determination element 106 also provides a signal indicating the data rate to a switch 118, which selects the same encoding element as switch 108 so that the signal of the encoded data packet generated by the selected encoding element can be provided to an output of the variable rate vocoder.
- Each of the encoding elements 110, 112, 114, and 116 is configured to encode speech using a predetermined encoding scheme.
- A linear-prediction-based encoding scheme, such as the Code Excited Linear Predictive (CELP) encoder, is used in a preferred embodiment.
- A CELP coder is described in the paper "A 4.8 Kbps Code Excited Linear Predictive Coder," by Thomas E. Tremain et al., Proceedings of the Mobile Satellite Conference, 1988.
- Linear-prediction-based encoders compress speech by removing the natural redundancies inherent in speech. Speech typically exhibits short term redundancies resulting from the mechanical action of the lips and tongue, and long term redundancies resulting from the vibration of the vocal cords.
- Linear predictive schemes model these operations as filters, remove the redundancies, and then model the resulting residual signal as white gaussian noise.
- Linear predictive coders, therefore, achieve a reduced bit rate by transmitting filter coefficients and quantized noise rather than a full bandwidth speech signal.
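To make the short-term prediction idea concrete, the sketch below estimates predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion, then computes the residual that remains after prediction. This is a generic textbook formulation, not the CELP encoder of the invention: it covers only the short-term filter, omits the long-term (pitch) filter and any codebook search, and the function names are illustrative.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    Returns (coeffs, error), where the prediction is
    x_hat[n] = sum(coeffs[k - 1] * x[n - k] for k in 1..order).
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # signal is all zero or already perfectly predicted
        # Reflection coefficient for this stage.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error


def residual(x, coeffs):
    """Prediction error e[n] = x[n] - x_hat[n], using zeros before n = 0."""
    out = []
    for n in range(len(x)):
        pred = sum(c * x[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(x[n] - pred)
    return out
```

For a decaying signal such as `x[n] = 0.9 ** n`, a first-order predictor recovers a coefficient very close to 0.9 and the residual carries far less energy than the input, which is the redundancy removal the text refers to.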
- a linear predictive coding scheme that employs variable rates offers further reductions in bit rate without compromising the quality of speech.
- the full rate encoding element 110 encodes the parameters of the input speech signal using more bits to better preserve the characteristics of the input.
- eighth rate encoding element 116 encodes the parameters using fewer bits since there is typically little detail or useful information to be captured. Transitions between periods of active speech and periods with no detected speech are encoded by half rate encoding element 112 and quarter rate encoding element 114.
- decoder 104 receives a signal of the encoded speech parameters as well as a signal indicating the rate used to encode the speech.
- a rate extraction element 128 receives this input signal and determines the data rate of the speech.
- A signal of the data rate is also provided to a switch 130, which selects the decoding element from a set of decoding elements to properly decode the input parameters.
- four decoding elements, full rate decoding element 120, half rate decoding element 122, quarter rate decoding element 124, and eighth rate decoding element 126 are provided for decoding the speech parameters at the four possible rates.
- the selected decoding element decodes the input parameters based on the data rate to produce a signal of decoded samples, which typically are 64 kbps pulse code modulated (PCM) samples.
- a signal of the data rate determined by rate extraction element 128 is also provided to a switch 132.
- Switch 132 selects the same decoding element as switch 130 so that a signal of the decoded samples is provided to an output of the vocoder.
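The decoder-side routing described above (rate extraction followed by selection of the matching decoding element) can be sketched as a dispatch table. The packet layout is an assumption made for this example: a one-byte rate code followed by the parameter bytes. The decoders are stubs that only report what they received, since the actual CELP decoding schemes are outside the scope of this sketch.

```python
from typing import Callable, Dict, Tuple

# Hypothetical rate codes for the assumed one-byte packet header.
RATE_CODES = {"full": 0, "half": 1, "quarter": 2, "eighth": 3}
RATE_NAMES = {code: name for name, code in RATE_CODES.items()}


def make_stub_decoder(rate: str) -> Callable[[bytes], str]:
    # Stand-in for decoding elements 120, 122, 124, and 126. A real element
    # would run the decoding scheme for its rate and return PCM samples.
    def decode(payload: bytes) -> str:
        return f"{rate}-rate frame, {len(payload)} parameter bytes"
    return decode


DECODERS: Dict[str, Callable[[bytes], str]] = {
    rate: make_stub_decoder(rate) for rate in RATE_CODES
}


def extract_rate(packet: bytes) -> Tuple[str, bytes]:
    """Role of rate extraction element 128 (assumed header format)."""
    if not packet:
        raise ValueError("empty packet")
    code = packet[0]
    if code not in RATE_NAMES:
        raise ValueError(f"unknown rate code {code}")
    return RATE_NAMES[code], packet[1:]


def decode_packet(packet: bytes) -> str:
    """Route the payload to the decoder for its rate (switches 130, 132)."""
    rate, payload = extract_rate(packet)
    return DECODERS[rate](payload)
```

Keeping the rate with each stored packet is what lets a synthesizer replay frames later without rerunning rate determination.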
- In FIG. 2, a block diagram of a speech synthesis system operating according to the principles of the present invention, which incorporates a variable rate vocoder, is shown.
- the speech synthesis system comprises a variable rate encoder 202 and a speech synthesizer 204.
- An example of the variable rate encoder 202 is encoder 102 of FIG. 1.
- Variable rate encoder 202 receives a speech signal as input, and encodes the speech at one of a set of predetermined rates.
- variable rate encoder 202 is a CELP encoder that generates speech parameters at one of the rates based on the speech activity in the input segment of speech.
- The variable rate decoder is an enhanced variable rate decoder such as that described in relation to the IS-127 standard.
- Encoding rate decisions are based on "mode measures," as discussed above.
- The different combinations of criteria used to make rate selections are used to create what is termed "reduced rate mode" or "modes," and referred to more simply as mode 0, mode 1, mode 2, and so forth, as would be understood by those skilled in the art.
- the present invention can take advantage of such modes for purposes of speech synthesis.
- the speech received by variable rate encoder 202 may be a word or a phrase from a pre-selected vocabulary that a communication device such as a wireless telephone, carkit, or other communication device is designed to synthesize.
- the vocabulary would include prompts and alerts to be given to a device user. For example, by extracting and synthesizing five individual vocabulary words: 'call', 'redial', 'program', 'or' and 'exit', the speech synthesizer may be designed to provide the prompts "call, redial, program, or exit" in solicitation of a response from the user.
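Assembling a prompt from individually stored words, as in the example above, amounts to a lookup and concatenation. In the sketch below the vocabulary maps each word to a placeholder byte string standing in for that word's stored vocoder parameters; the storage layout and the names are hypothetical.

```python
# Placeholder byte strings stand in for each word's encoded parameters.
VOCABULARY = {
    word: f"<{word}>".encode()
    for word in ("call", "redial", "program", "or", "exit")
}


def build_prompt(words, vocabulary):
    """Return the stored parameter blocks for a prompt, in playback order."""
    missing = [w for w in words if w not in vocabulary]
    if missing:
        raise KeyError(f"words not in stored vocabulary: {missing}")
    return [vocabulary[w] for w in words]


# The five-word prompt from the text, ready to hand to a decoder in sequence.
prompt = build_prompt(["call", "redial", "program", "or", "exit"], VOCABULARY)
```

Because each word is stored once, other prompts built from the same words need no additional memory.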
- the speech synthesizer may be designed to provide previously stored information, such as in phone books, look-up tables, or databases, to a device user in response to various device inputs, including audio.
- the speech received by variable rate encoder 202 is encoded, and the encoded parameters are provided to a memory element or circuit 206 of the speech synthesizer 204 for storage.
- Memory 206 is intended to hold or store the parameters over some time for operation of the desired device. However, it is also generally desirable to have the parameters stored in a manner that makes them updateable or replaceable, such as when the vocabulary needs to be changed for changing conditions or upgrades to device features. Therefore, memory 206 is configured in the form of non-volatile but re-writable memory, which can be accomplished using flash type memory elements, as is well known in the art.
- variable rate encoder 202 may receive a speech signal input during operation of the communication device. For example, in response to a prompt from the speech synthesizer, the user may provide a spoken response.
- Variable rate encoder 202 will then encode the user's speech, and the encoded parameters may be provided to flash memory 206 for storage, and/or provided to a voice recognizer (not shown) for voice recognition purposes. In this manner, the parameters are input post manufacture such as immediately upon the device entering useful service or over time, such as by building a personal vocabulary library for each device (vocoder) user, related to that user's requirements.
- Flash memory 206 should be of a size that is sufficient to store the parameters of the pre-selected vocabulary as well as the parameters of speech anticipated from the user. Thus, the size of flash memory 206 may vary based on the requirements of the specific application. Post manufacture storage may have an advantage of reducing memory requirements where each device user does not require as extensive a vocabulary as compared to what a manufacturer would have to install to cover an entire larger device market.
- The speech synthesizer can record names or other words, like 'Fred Smith', by detecting the endpoints of the target or desired phrase or speech, removing silence or redundancies, and encoding it. Therefore, speech can be recorded "on-line" and used later to synthesize speech output.
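Endpoint detection with silence removal can be approximated with a per-frame energy test, as sketched below. The text does not specify a detection method, so this is one simple possibility: the 160-sample frame length and the energy threshold are assumed values, and only leading and trailing silence is trimmed.

```python
def trim_silence(samples, frame_len=160, threshold=1000.0):
    """Keep the span from the first to the last frame above the threshold.

    A frame counts as active when its mean squared amplitude exceeds
    `threshold`. Both parameters are hypothetical and would need tuning.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples), frame_len)]
    active = [i for i, f in enumerate(frames)
              if sum(s * s for s in f) / len(f) > threshold]
    if not active:
        return []  # nothing but silence was recorded
    start = active[0] * frame_len
    end = min((active[-1] + 1) * frame_len, len(samples))
    return samples[start:end]
```

Trimming before encoding means that no stored frames are spent on silence surrounding the recorded phrase.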
- Variable rate encoder 202 may be configured based on the available memory and the voice quality required. In the system having four rates wherein the full rate is 13 kbps, the average rate will generally be 5.88 kbps based on 40% voice activity. The use of the variable rates will provide high speech quality. If, however, the memory size is limited, variable rate encoder 202 may be configured to operate at, say, a fixed half-rate of approximately 800 bytes per second. Otherwise, the rate may be selected from a subset of the predetermined set of rates instead of the whole set of rates. For example, the reduced rate modes discussed above can be used to select various rates. In one embodiment of the invention, the rates are divided into a set of four modes, labeled as modes 0, 1, 2, and 3.
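The figures in this paragraph can be checked with a short calculation. The sketch assumes a two-state model in which active frames are coded at the 13.2 kbps peak rate given for FIG. 1 and all other frames at the 1.0 kbps eighth rate. That split is an assumption, but it reproduces the stated 5.88 kbps average at 40% voice activity. The function names are illustrative.

```python
# Approximate rates from the description of FIG. 1, in kilobits per second.
RATES_KBPS = {"full": 13.2, "half": 6.2, "quarter": 2.7, "eighth": 1.0}


def average_rate_kbps(voice_activity, active_rate="full", idle_rate="eighth"):
    """Average bit rate when a fraction `voice_activity` of frames is active."""
    return (voice_activity * RATES_KBPS[active_rate]
            + (1.0 - voice_activity) * RATES_KBPS[idle_rate])


def storage_bytes(duration_s, rate_kbps):
    """Memory needed to store `duration_s` seconds of parameters."""
    return duration_s * rate_kbps * 1000.0 / 8.0


avg = average_rate_kbps(0.40)                             # 0.4*13.2 + 0.6*1.0
half_rate_bps = storage_bytes(1.0, RATES_KBPS["half"])    # bytes per second
```

Under these assumptions `avg` comes to 5.88 kbps, and a fixed half rate of 6.2 kbps works out to 775 bytes per second, consistent with the "approximately 800 bytes per second" quoted above. The same helper gives a rough flash budget for a vocabulary of known total duration.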
- variable rate encoder 202 may switch between different modes of operation (variable rate, all half-rate, a subset of the variable rates, etc.) based on the instantaneous requirements of the application. Because there may be a trade off between voice quality and memory size, the configuration to be adopted will depend on the application being implemented.
- The speech parameters stored in flash memory 206 will be provided to a variable rate decoder 208 when speech synthesis is desired.
- the variable rate decoder 208 is configured to decode the parameters generated by corresponding variable rate encoder 202.
- Variable rate decoder 208 will be implemented as part of a digital signal processor (DSP) used within the communication device.
- Such DSPs are used as or to form the processing elements for signal coding/decoding, combining, CDMA coding, power adjustment, and so forth. Since such elements are typically used in wireless devices, and many other devices in which the invention may find use, advantage can be taken of their presence to very cost effectively implement the present invention.
- a stand-alone decoder within or using a DSP requires a very small amount of memory (both program and data) to attain speech synthesis capability.
- the speech synthesizer can be implemented using well known DSP circuits and devices such as commercially available from Analog Devices and Qualcomm Incorporated.
- The decoded parameters, typically in the form of pulse code modulated (PCM) samples, are then provided to a codec 210.
- Codec 210 converts the PCM samples from a digital format to an analog signal.
- the analog signal is provided to speaker or other known audio output device 212, which projects or broadcasts synthesized speech into the surrounding device environment where it can be heard.
- a speech synthesizer based on variable rate vocoding is provided by the present invention.
- the speech synthesizer is especially suitable for use in wireless communication devices that already comprise a variable rate vocoder.
- An existing variable rate vocoder may be employed by the speech synthesizer through the use of appropriate changes in program or operational instructions, or using control hardware.
- the compression achieved may allow a pre-determined vocabulary to be stored in a memory of limited size associated with the wireless device or other equipment with which it interfaces.
- the trade off between voice quality and memory size may be considered in configuring the variable rate vocoder to provide a speech synthesizer with the desired voice quality and memory size.
- the present invention can find application in a variety of communication devices and interface equipment.
- Examples include wireless communication devices such as, but not limited to, cellular and satellite telephones, often referred to as user terminals, subscriber units, mobile stations, or simply "users," "mobiles," or "subscribers."
- other devices are also contemplated, such as message receivers and data transfer devices (e.g., portable computers, personal data assistants, modems, machinery controllers), or interfaces for public telephone networks or dedicated communications channels.
- the invention can be implemented using separate circuits in the form of dedicated components or application specific integrated circuits (ASIC) to form a speech synthesizer which is installed within a desired device. Alternatively, it can be incorporated within other ASICs and devices by using a small amount of additional memory to work with existing digital signal processing elements.
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
- Mobile Radio Communication Systems (AREA)
- Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
- Devices For Executing Special Programs (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP00914511A EP1159738B1 (en) | 1999-02-08 | 2000-02-04 | Speech synthesizer based on variable rate speech coding |
JP2000597796A JP4503853B2 (en) | 1999-02-08 | 2000-02-04 | Speech synthesizer based on variable rate speech coding |
DE60027140T DE60027140T2 (en) | 1999-02-08 | 2000-02-04 | LANGUAGE SYNTHETIZER BASED ON LANGUAGE CODING WITH A CHANGING BIT RATE |
AU35891/00A AU3589100A (en) | 1999-02-08 | 2000-02-04 | Speech synthesizer based on variable rate speech coding |
HK02104772.4A HK1042980B (en) | 1999-02-08 | 2002-06-27 | Speech synthesizer based on variable rate speech coding |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US24660599A | 1999-02-08 | 1999-02-08 | |
US09/246,605 | 1999-02-08 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2000046795A1 true WO2000046795A1 (en) | 2000-08-10 |
WO2000046795A9 WO2000046795A9 (en) | 2001-10-18 |
Family
ID=22931374
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2000/002900 WO2000046795A1 (en) | 1999-02-08 | 2000-02-04 | Speech synthesizer based on variable rate speech coding |
Country Status (10)
Country | Link |
---|---|
EP (1) | EP1159738B1 (en) |
JP (2) | JP4503853B2 (en) |
KR (1) | KR100648872B1 (en) |
CN (1) | CN1212604C (en) |
AT (1) | ATE322731T1 (en) |
AU (1) | AU3589100A (en) |
DE (1) | DE60027140T2 (en) |
ES (1) | ES2263459T3 (en) |
HK (1) | HK1042980B (en) |
WO (1) | WO2000046795A1 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4867076B2 (en) * | 2001-03-28 | 2012-02-01 | 日本電気株式会社 | Compression unit creation apparatus for speech synthesis, speech rule synthesis apparatus, and method used therefor |
KR100425982B1 (en) * | 2001-12-29 | 2004-04-06 | 엘지전자 주식회사 | Voice Data Rate Changing Method in IMT-2000 Network |
KR100651731B1 (en) * | 2003-12-26 | 2006-12-01 | 한국전자통신연구원 | Apparatus and method for variable frame speech encoding/decoding |
CN101692685B (en) * | 2009-10-29 | 2012-05-30 | 中国电信股份有限公司 | Method and system for improving acoustics of polyphonic ringtone |
JP5677470B2 (en) * | 2011-02-03 | 2015-02-25 | パナソニックIpマネジメント株式会社 | Voice reading device, voice output device, voice output system, voice reading method and voice output method |
CN106952651A (en) * | 2017-02-17 | 2017-07-14 | 福建星网智慧科技股份有限公司 | A kind of voice processing apparatus transmits the method and system of voice |
US11404045B2 (en) | 2019-08-30 | 2022-08-02 | Samsung Electronics Co., Ltd. | Speech synthesis method and apparatus |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0762711A2 (en) * | 1995-09-12 | 1997-03-12 | Nokia Mobile Phones Ltd. | Speech storage in a portable cellular telephone |
US5657420A (en) * | 1991-06-11 | 1997-08-12 | Qualcomm Incorporated | Variable rate vocoder |
DE29717372U1 (en) * | 1997-09-29 | 1997-11-27 | Siemens AG, 80333 München | Integrated circuit for a mobile radio with answering machine function |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0331858B1 (en) * | 1988-03-08 | 1993-08-25 | International Business Machines Corporation | Multi-rate voice encoding method and device |
JP3081300B2 (en) * | 1991-10-01 | 2000-08-28 | 三洋電機株式会社 | Residual driven speech synthesizer |
TW271524B (en) * | 1994-08-05 | 1996-03-01 | Qualcomm Inc | |
JPH08263099A (en) * | 1995-03-23 | 1996-10-11 | Toshiba Corp | Encoder |
US6137840A (en) * | 1995-03-31 | 2000-10-24 | Qualcomm Incorporated | Method and apparatus for performing fast power control in a mobile communication system |
US5914950A (en) * | 1997-04-08 | 1999-06-22 | Qualcomm Incorporated | Method and apparatus for reverse link rate scheduling |
2000
- 2000-02-04: KR application KR1020017009887A, patent KR100648872B1 (en); status: not active, IP Right Cessation
- 2000-02-04: AT application AT00914511T, patent ATE322731T1 (en); status: not active, IP Right Cessation
- 2000-02-04: WO application PCT/US2000/002900, patent WO2000046795A1 (en); status: active, IP Right Grant
- 2000-02-04: EP application EP00914511A, patent EP1159738B1 (en); status: not active, Expired - Lifetime
- 2000-02-04: ES application ES00914511T, patent ES2263459T3 (en); status: not active, Expired - Lifetime
- 2000-02-04: JP application JP2000597796A, patent JP4503853B2 (en); status: not active, Expired - Fee Related
- 2000-02-04: CN application CNB00803589XA, patent CN1212604C (en); status: not active, Expired - Fee Related
- 2000-02-04: AU application AU35891/00A, patent AU3589100A (en); status: not active, Abandoned
- 2000-02-04: DE application DE60027140T, patent DE60027140T2 (en); status: not active, Expired - Lifetime

2002
- 2002-06-27: HK application HK02104772.4A, patent HK1042980B (en); status: not active, IP Right Cessation

2009
- 2009-10-30: JP application JP2009250670A, patent JP2010092059A (en); status: active, Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5657420A (en) * | 1991-06-11 | 1997-08-12 | Qualcomm Incorporated | Variable rate vocoder |
EP0762711A2 (en) * | 1995-09-12 | 1997-03-12 | Nokia Mobile Phones Ltd. | Speech storage in a portable cellular telephone |
DE29717372U1 (en) * | 1997-09-29 | 1997-11-27 | Siemens AG, 80333 München | Integrated circuit for a mobile radio with answering machine function |
WO1999017516A1 (en) * | 1997-09-29 | 1999-04-08 | Siemens Aktiengesellschaft | Integrated circuit for a mobile radio telephone with an answerphone function |
Also Published As
Publication number | Publication date |
---|---|
WO2000046795A9 (en) | 2001-10-18 |
EP1159738B1 (en) | 2006-04-05 |
KR20020012157A (en) | 2002-02-15 |
HK1042980B (en) | 2005-12-23 |
AU3589100A (en) | 2000-08-25 |
JP4503853B2 (en) | 2010-07-14 |
DE60027140T2 (en) | 2007-01-11 |
CN1347548A (en) | 2002-05-01 |
ATE322731T1 (en) | 2006-04-15 |
KR100648872B1 (en) | 2006-11-24 |
JP2002536693A (en) | 2002-10-29 |
CN1212604C (en) | 2005-07-27 |
HK1042980A1 (en) | 2002-08-30 |
JP2010092059A (en) | 2010-04-22 |
DE60027140D1 (en) | 2006-05-18 |
EP1159738A1 (en) | 2001-12-05 |
ES2263459T3 (en) | 2006-12-16 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
KR100923891B1 (en) | Method and apparatus for interoperability between voice transmission systems during speech inactivity | |
US6615169B1 (en) | High frequency enhancement layer coding in wideband speech codec | |
JP5149217B2 (en) | Method and apparatus for reducing undesirable packet generation | |
US5251261A (en) | Device for the digital recording and reproduction of speech signals | |
KR100574031B1 (en) | Speech Synthesis Method and Apparatus and Voice Band Expansion Method and Apparatus | |
JP2006099124A (en) | Automatic voice/speaker recognition on digital radio channel | |
KR100351484B1 (en) | Speech coding apparatus and speech decoding apparatus | |
JP2010092059A (en) | Speech synthesizer based on variable rate speech coding | |
ES2371455T3 (en) | PRE-PROCESSING OF DIGITAL AUDIO DATA FOR MOBILE AUDIO CODECS. | |
KR20000053407A (en) | Method for transmitting data in wireless speech channels | |
JP2001242896A (en) | Speech coding/decoding apparatus and its method | |
KR100911278B1 (en) | Sound source supply device and sound source supply method | |
KR100498177B1 (en) | Signal quantizer | |
KR101011320B1 (en) | Identification and exclusion of pause frames for speech storage, transmission and playback | |
JP5199281B2 (en) | System and method for dimming a first packet associated with a first bit rate into a second packet associated with a second bit rate | |
Choudhary et al. | Study and performance of amr codecs for gsm | |
JP3496618B2 (en) | Apparatus and method for speech encoding / decoding including speechless encoding operating at multiple rates | |
US6728344B1 (en) | Efficient compression of VROM messages for telephone answering devices | |
JP2000078246A (en) | Radio telephone system | |
KR20010038033A (en) | Apparatus and Method for generating a receiving ring in a mobile communication system | |
JPH06120889A (en) | Telephone signal transmission method/system in cordless telephone |
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 00803589.X; Country of ref document: CN |
AK | Designated states | Kind code of ref document: A1; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
WWE | WIPO information: entry into national phase | Ref document number: 2000914511; Country of ref document: EP |
ENP | Entry into the national phase | Ref document number: 2000 597796; Country of ref document: JP; Kind code of ref document: A |
WWE | WIPO information: entry into national phase | Ref document number: 1020017009887; Country of ref document: KR |
AK | Designated states | Kind code of ref document: C2; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: C2; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
COP | Corrected version of pamphlet | Free format text: PAGES 1/2-2/2, DRAWINGS, REPLACED BY NEW PAGES 1/2-2/2 |
WWP | WIPO information: published in national office | Ref document number: 2000914511; Country of ref document: EP |
REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
WWP | WIPO information: published in national office | Ref document number: 1020017009887; Country of ref document: KR |
WWG | WIPO information: grant in national office | Ref document number: 2000914511; Country of ref document: EP |
WWG | WIPO information: grant in national office | Ref document number: 1020017009887; Country of ref document: KR |