WO2002025636A2 - Distributed speech recognition using dynamically determined feature vector codebook size - Google Patents
Distributed speech recognition using dynamically determined feature vector codebook size
- Publication number
- WO2002025636A2 (PCT/EP2001/010720)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- string
- recognition
- codebook
- size
- bits
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
Definitions
- the present invention relates to distributed speech recognition (DSR) systems, devices, methods, and signals where speech recognition feature parameters are extracted from speech and encoded at a near or front end, and electromagnetic signals carrying the feature parameters are transmitted to a far or back end where speech recognition is completed.
- the present invention relates to distributed speech recognition where the front end is provided in a wireless mobile communications terminal and the back end is provided via the communications network.
- ETSI European Telecommunications Standards Institute
- DSR is under consideration by ETSI for mobile communications systems since the performance of speech recognition systems using speech signals obtained after transmission over mobile channels can be significantly degraded when compared to using a speech signal which has not passed through an intervening mobile channel. The degradations are a result of both the low bit rate speech coding by the vocoder and channel transmission errors.
- a DSR system overcomes these problems by eliminating the speech coding, and the transmission errors that are normally acceptable for speech intended for human perception but not for speech to be recognized (STBR) by a machine, and instead sends over an error-protected channel a parameterized representation of the speech which is suitable for such automatic recognition.
- a speech recognizer is split into two parts: a first or front end part at the terminal or mobile station which extracts recognition feature parameters, and a second or back end part at the network which completes the recognition from the extracted feature parameters.
- the first part of the recognizer chops an utterance into time intervals called "frames", and for each frame extracts feature parameters, to produce from an utterance a sequence or array of feature parameters.
- the second part of the recognizer feeds the sequence of feature parameters into a Hidden Markov Model (HMM) for each possible word of vocabulary, each HMM for each word having been previously trained by a number of sample sequences of feature parameters from different utterances by the same speaker, or by different speakers if speaker-independence is applied.
- HMM evaluation gives, for each evaluated word, a likelihood that a current utterance is the evaluated word.
- the second part of the recognizer chooses the most likely word as its recognition result.
- DSR in accordance with the Aurora Project does not employ vector quantization (VQ). VQ is a well-known technique for compressing vector data using a codebook, e.g. when sending such data over a channel, wherein each vector is replaced by a corresponding codebook index representing the vector.
- a temporal sequence of vectors is converted to a sequence or string of indices.
- the same codebook is used to recover the sequence of vectors from the sequence or string of indices.
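As a minimal illustration of the vector quantization just described, the sketch below trains a toy codebook with k-means, encodes feature vectors to a string of indices, and decodes them again. The k-means training, the 13-dimensional vectors, and the codebook size are illustrative assumptions, not details taken from this patent.

```python
import numpy as np

def build_codebook(training_vectors, size, iters=20, seed=0):
    """Toy k-means codebook training; 'size' is the codebook size Sz."""
    rng = np.random.default_rng(seed)
    codebook = training_vectors[rng.choice(len(training_vectors), size, replace=False)]
    for _ in range(iters):
        # Assign each training vector to its nearest codeword...
        d = np.linalg.norm(training_vectors[:, None] - codebook[None], axis=2)
        assign = d.argmin(axis=1)
        # ...then move each codeword to the mean of its cluster.
        for k in range(size):
            members = training_vectors[assign == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook

def vq_encode(vectors, codebook):
    """Replace each feature vector by the index q of its nearest codeword."""
    d = np.linalg.norm(vectors[:, None] - codebook[None], axis=2)
    return d.argmin(axis=1)  # the q-string

def vq_decode(q_string, codebook):
    """Recover the (quantized) feature vectors from the index string."""
    return codebook[q_string]
```

In the distributed setting, the near end would run `vq_encode` on the extracted feature vectors to obtain the q-string, and the far end, holding the same codebook, would run `vq_decode` before HMM evaluation.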
- the present invention is based on the idea that the expected ultimate recognition rate for both discrete and continuous speech recognition decreases as vocabulary size increases, but increases as the number of bits per codebook index or the associated codebook size increases. Yet vocabulary size may vary significantly from one dialogue to another. Consequently, it is possible to conserve network resources while maintaining a sufficient expected recognition rate by dynamically adjusting the number of bits per codebook index, or the associated codebook size, in dependence on the number of possible words or utterances which can be spoken and recognized within the framework of a dialogue. In a preferred approach, a tradeoff between bitrate and expected recognition rate is accomplished by optimizing a metric, e.g. minimizing a cost function, which is a function of both bitrate and expected recognition rate. An upper limit on the bitrate of codebook indices is readily determined as the number of bits per codebook index divided by the framing interval for which the codebook index is generated.
- a speech coding method in accordance with the invention, for coding speech to be recognized (STBR) at a near end for completion of word-level recognition by a machine at a far end, in relation to a dialogue between the near and far ends having an associated vocabulary size (V), comprises: extracting recognition feature vectors frame-wise from received speech to be recognized; choosing a number of bits in codebook indices representing recognition feature vectors, or an associated codebook size, corresponding to the dialogue or associated vocabulary size from among a plurality of choices; selecting indices from entries of the codebook having the associated size corresponding to the extracted recognition feature vectors; and forming signals for transmission to the far end, which signals are derived from a string of the selected indices.
- a communication device in accordance with the invention comprises a feature vector extractor, a decision block, a coder for selecting indices from a codebook, and a signal former, wherein the decision block chooses a number of bits per index or associated codebook size corresponding to the dialogue or associated vocabulary size from among a number of choices.
- the formed signals to be transmitted include an indication of the number of bits per codebook index or associated codebook size.
- a speech recognition method at a far end comprises receiving signals which are derived from a string of the indices selected from entries in a codebook corresponding to recognition feature vectors extracted framewise from speech to be recognized, which signals include an indication of the number of bits per codebook index or associated codebook size, obtaining the string of indices from the received signals, obtaining the corresponding recognition feature vectors from the string of indices using a codebook having the associated size, and applying the recognition feature vectors to a word-level recognition process.
- an electromagnetic signal in accordance with the invention is configured such that it has encoded therein first data derived from a string of indices corresponding to entries from a codebook, which entries correspond to recognition feature vectors extracted from speech, and second data indicating a number of bits per codebook index or an associated codebook size.
- Figure 1 shows a distributed speech recognition system including a front or near end speech recognition stage at a mobile station and a far or back end speech recognition stage accessed via the network infrastructure;
- Figures 2A and 2B show the front or near end and the far or back end speech recognition stages of Figure 1, respectively, in accordance with the invention;
- Figures 3 A and 3B show the form of the relationship between recognition rate (RR) and the size (Sz) of the codebook for speech recognition feature vectors, or number of bits (B) needed for an index therefrom, for discrete and continuous speech recognition, respectively;
- Figure 4 shows a flowchart for finding the number of bits (B), within a predetermined range, needed for a codebook index which optimizes a cost function in accordance with the invention; and
- Figure 5 shows the organization of data over time in a signal transmitted between the near and far ends in accordance with the invention.
- the present invention proposes a man-to-machine communication protocol, which the inventor has termed "Wireless Speech Protocol" (WSP), to compress speech that is to be transmitted from a near end to a far end over a wireless link and recognized automatically at the far end, in a manner suited to automatic speech recognition rather than to speech for human perception.
- WSP employs the concept of distributed speech recognition (DSR), in which the speech recognizer is split into two parts, one at the near end and the other at the far end.
- Front end unit 14 is essentially the portion of a traditional word recognizer, either for discrete speech, i.e. speech spoken so as to pause briefly between words, or for natural or continuous speech, which extracts recognition feature vectors from speech inputted from the mobile station microphone 15. It may be implemented by running ROM based software on the usual processing resources (not shown) within mobile station 12, comprising a digital signal processor (DSP) and a microprocessor.
- Communication system 10 further includes a plurality of base stations having different geographical coverage areas, of which base stations 16 and 18 are shown.
- mobile station 12 is shown in communication with base station 16 via a communications link 17. As is known, when mobile station 12 moves from the coverage area of base station 16 to the coverage area of base station 18, a handover, coordinated or steered via a base station controller 20 which is in communication with base stations 16 and 18, takes place, causing mobile station 12 to establish a communication link (not shown) with base station 18 and discontinue the communication link 17 with base station 16.
- Data originating at mobile station 12, including data derived from the output of front end unit 14, is communicated from mobile station 12 to the base station 16, with which the mobile station is currently in communication, and also flows to base station controller 20 and then to a network controller 22 which is coupled to various networks including a data network 24 and other resources, e.g. plain old telephone system (POTS) 26.
- Data derived from the output of front end unit 14 may be carried over wireless link 17 to base station 16 by being multiplexed into a data channel, or a General Packet Radio System (GPRS) channel, or be sent over a Short Message Service (SMS) or similar channel.
- Data network 24 is coupled to an application server 28 which includes a back end speech recognition unit or stage 30.
- Back end unit 30 is essentially the portion of a traditional word recognizer for discrete or natural speech which performs word-level recognition on the recognition feature vectors extracted by front end unit 14, typically using a Hidden Markov Model (HMM).
- Application server 28 may take the form of, or may act in concert with, a gateway, router or proxy server (not shown) coupled to the public Internet 32.
- the result of speech recognition in back end unit 30 causes data and/or speech obtained from application server 28, or by application server 28 from accessible sources such as the public Internet 32, to be sent to mobile station 12 via data network 24, network controller 22, base station controller 20 and base station 16.
- That data may be, for example, voice XML web pages which define the possible utterances in the current dialogue and the associated vocabulary size V, which pages are used by a voice controlled microbrowser 34 or other suitable front end client implemented, e.g., by running ROM based software on the aforementioned processing resources at mobile station 12.
- the speech recognition algorithm divided between front end unit 14 and back end unit 30 may be based on the known Mel-Cepstrum algorithm, which performs well when there is only a low level of background noise at the front end, or such other algorithm as is appropriate for more demanding background noise environment as may be encountered when using a mobile telephone in an automobile.
- the search for and evaluation of suitable algorithms for distributed speech recognition in the mobile telephony context are work items of the aforementioned Aurora project of ETSI. That project has a current target bitrate of 4.8 kbit/s. However, the inventor believes that an average bitrate of about a tenth of the Aurora target bitrate could be achieved using the present invention, in which the quantization of the recognition feature vector space, or the number of bits needed to encode vector quantization codebook indices, is adapted based upon the vocabulary size in the current dialogue.
- the two main types of speech recognizers, the Discrete Hidden Markov Model (HMM) and the Continuous HMM, use different methods to "store" speech characteristics in feature space.
- the number of bits B per codebook index q is related to the codebook size Sz by B = log2(Sz), i.e. Sz = 2^B.
- theoretically, the codebook size Sz in VQ is already optimized for the speech recognition task, and any reduction of the number of bits B per codebook index q will degrade the recognition rate (RR).
- the output of VQ for one frame is a vector.
- a sequence or array of vectors is produced which can be directly fed into a Continuous HMM evaluation stage.
- the number of bits B per codebook index q is required to be large enough to maintain the best recognition rate RR for all kinds of possible recognition tasks.
- the cost of the transmission should be considered.
- the wireless transmission resources are limited and expensive, and a lower number of bits per codebook index results in a lower transmitted bitrate BR. Accordingly, in order to trade off between bitrate BR and recognition rate RR, a suitable metric is used which is a function of both of these parameters.
- Cost = BR - w * RR, where w is a tradeoff weight between the average transmitted bitrate (BR) for the whole utterance and the recognition rate (RR).
- the average bitrate BR, prior to a later-described time-wise compression of a string of codebook indices (q-string), is readily calculated as the number of bits B per codebook index divided by the known fixed interval between the starts of successive frames.
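A worked example of this upper limit; the 10 ms frame interval used below is a hypothetical value, not one specified in this passage.

```python
def bitrate_upper_limit(bits_per_index, frame_interval_s):
    """Upper bound on the q-string bitrate, before any run-length
    compression: B bits are emitted once per frame interval."""
    return bits_per_index / frame_interval_s

# With B = 8 bits per index and a (hypothetical) 10 ms frame interval,
# the q-string bitrate is at most 8 / 0.010 = 800 bit/s, well under
# the 4.8 kbit/s Aurora target mentioned earlier.
```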
- the cost function is optimized on a dialogue-by-dialogue basis, i.e. separately with respect to each "dialogue", instead of with respect to the whole recognition task, which could involve a series or tree of different dialogues.
- the grammar rules attached to each dialogue can greatly reduce the complexity of recognition, so that the bitrate BR, or the number of bits B per codebook index, can be correspondingly reduced without affecting RR too much, thus lowering the cost.
- This can be done using the Receiver Operator Characteristics Language Modeling (ROC-LM) technique. This technique is described in the article "Automated Evaluation of Language Models based on Receiver-Operator-Characteristics Analysis", ICSLP 96, by Yin-Pin Yang and John Deller.
- RR = ∫ f(x|c) [ ∫_{-∞}^{x} f(y|w) dy ]^{V-1} dx, where f(x|c) is the probability density function (p.d.f.) of word-level HMM evaluation results (likelihoods) when correct words are fed into their own word templates (HMMs), f(y|w) is the p.d.f. of word-level HMM evaluation results when wrong words are fed into any randomly-picked word template (HMM), and V is the vocabulary size, assuming this is a word recognizer.
- in Figures 3A and 3B, for discrete and continuous speech recognition respectively, the recognition rate RR is plotted on the vertical axis and the number of bits B (or the corresponding codebook size Sz) on the horizontal axis.
- a "q-string" is the string of codebook indices generated for an utterance. Due to the continuity property of q values in a q-string, a run-length coding scheme may be used to reduce the bitrate, by adding additional bits indicating a run length of a particular q-value.
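A sketch of such a run-length scheme over a q-string; the (index, run length) pair format shown is one possible encoding, as this passage does not fix a particular bit layout.

```python
def rle_encode(q_string):
    """Collapse runs of equal codebook indices into (index, run_length) pairs,
    exploiting the continuity of q values between successive frames."""
    runs = []
    for q in q_string:
        if runs and runs[-1][0] == q:
            runs[-1][1] += 1       # extend the current run
        else:
            runs.append([q, 1])    # start a new run
    return [(q, n) for q, n in runs]

def rle_decode(runs):
    """Expand (index, run_length) pairs back into the original q-string."""
    return [q for q, n in runs for _ in range(n)]
```

For slowly varying speech features the number of pairs is much smaller than the number of frames, which is the source of the bitrate reduction claimed above.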
- front end speech recognition unit is seen to comprise a block 40 which chops speech to be recognized (STBR) into frames and extracts a set of recognition feature parameters for each frame, followed by an adaptive codebook vector quantization block 42 which converts each set of feature parameters for a frame to a feature vector and outputs a codebook index q representing the feature vector.
- the output from the feature parameter extraction block may be sent without any intervening vector quantization, in accordance with a mode of operation indicated herein as "Layer 1", whereas the mode of operation utilizing adaptive codebook vector quantization in accordance with the invention is indicated as "Layer 2".
- the size Sz of the codebook used by adaptive codebook block 42, or the number of bits B per codebook index q, is decided in decision block 44 in response to the vocabulary size V of the current dialogue.
- the B value is initialized to the lowest value in the range, namely 4.
- the recognition rate RR is calculated from the B value and from the vocabulary size V.
- the expected average bitrate BR is calculated from the B value. If the nonlinear relationship between the expected bitrate BR and the B value is not available, then the linear relationship, that bitrate BR is the B value divided by the framing interval, may be substituted, since it constitutes an upper limit on the actual bitrate. As will appear as the discussion proceeds, the actual bitrate is reduced from this upper limit as a result of "time-wise" compression in block 46 of Figure 2A.
- the Cost is calculated as a function of recognition rate RR and bitrate BR.
- the variable Cost_MAX is set equal to the calculated Cost and the variable B_opt is set equal to the current B value.
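The search just described (and shown in the Figure 4 flowchart) can be sketched as follows. Here `expected_rr` stands in for the ROC-LM recognition-rate estimate, the B range starting at 4 mirrors the initialization mentioned above while the upper bound of 10 is an assumption, and the best cost is tracked in a plain variable rather than the flowchart's Cost_MAX.

```python
def choose_bits(vocab_size, frame_interval_s, expected_rr, w,
                b_range=range(4, 11)):
    """Pick the number of bits B per codebook index that minimizes
    Cost = BR - w * RR for the current dialogue's vocabulary size."""
    best_b, best_cost = None, float("inf")
    for b in b_range:
        br = b / frame_interval_s          # upper-limit bitrate for this B
        rr = expected_rr(b, vocab_size)    # e.g. from ROC-LM analysis
        cost = br - w * rr
        if cost < best_cost:
            best_b, best_cost = b, cost
    return best_b
```

With w = 0 the cost reduces to the bitrate alone and the smallest B wins; a large w pushes the choice toward larger codebooks and higher expected recognition rates, which is exactly the tradeoff the cost function encodes.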
- the combination of blocks 40 and 42 effectively compresses or quantizes the speech to be recognized into a string of codebook indices.
- the sequence of sets of feature parameters is inputted to continuous HMM evaluation block 66, and evaluation output is supplied to block 68 wherein the recognition decision is made.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
Abstract
Description
Claims
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP01974259A EP1323157A2 (en) | 2000-09-25 | 2001-09-14 | Distributed speech recognition using dynamically determined feature vector codebook size |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/668,541 US6934678B1 (en) | 2000-09-25 | 2000-09-25 | Device and method for coding speech to be recognized (STBR) at a near end |
US09/668,541 | 2000-09-25 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002025636A2 true WO2002025636A2 (en) | 2002-03-28 |
WO2002025636A3 WO2002025636A3 (en) | 2002-07-11 |
Family
ID=24682732
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2001/010720 WO2002025636A2 (en) | 2000-09-25 | 2001-09-14 | Distributed speech recognition using dynamically determined feature vector codebook size |
Country Status (3)
Country | Link |
---|---|
US (2) | US6934678B1 (en) |
EP (1) | EP1323157A2 (en) |
WO (1) | WO2002025636A2 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11020297B2 (en) | 2015-12-22 | 2021-06-01 | Stryker Corporation | Powered side rail for a patient support apparatus |
US11090209B2 (en) | 2017-06-20 | 2021-08-17 | Stryker Corporation | Patient support apparatus with control system and method to avoid obstacles during reconfiguration |
US11213443B2 (en) | 2017-12-29 | 2022-01-04 | Stryker Corporation | Indication system to identify open space beneath patient support apparatus |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6934678B1 (en) * | 2000-09-25 | 2005-08-23 | Koninklijke Philips Electronics N.V. | Device and method for coding speech to be recognized (STBR) at a near end |
US7353176B1 (en) * | 2001-12-20 | 2008-04-01 | Ianywhere Solutions, Inc. | Actuation system for an agent oriented architecture |
US20040128400A1 (en) * | 2002-12-31 | 2004-07-01 | Krishnamurthy Srinivasan | Method and apparatus for automated gathering of network data |
US7668712B2 (en) * | 2004-03-31 | 2010-02-23 | Microsoft Corporation | Audio encoding and decoding with intra frames and adaptive forward error correction |
US7177804B2 (en) * | 2005-05-31 | 2007-02-13 | Microsoft Corporation | Sub-band voice codec with multi-stage codebooks and redundant coding |
US7831421B2 (en) * | 2005-05-31 | 2010-11-09 | Microsoft Corporation | Robust decoder |
US7707034B2 (en) * | 2005-05-31 | 2010-04-27 | Microsoft Corporation | Audio codec post-filter |
JP4316583B2 (en) * | 2006-04-07 | 2009-08-19 | 株式会社東芝 | Feature amount correction apparatus, feature amount correction method, and feature amount correction program |
US7680664B2 (en) * | 2006-08-16 | 2010-03-16 | Microsoft Corporation | Parsimonious modeling by non-uniform kernel allocation |
US8239195B2 (en) * | 2008-09-23 | 2012-08-07 | Microsoft Corporation | Adapting a compressed model for use in speech recognition |
CN107885756B (en) | 2016-09-30 | 2020-05-08 | 华为技术有限公司 | Deep learning-based dialogue method, device and equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0720307A2 (en) * | 1994-12-28 | 1996-07-03 | Sony Corporation | Digital audio signal coding and/or decoding method |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0475759B1 (en) * | 1990-09-13 | 1998-01-07 | Oki Electric Industry Co., Ltd. | Phoneme discrimination method |
JP2818362B2 (en) * | 1992-09-21 | 1998-10-30 | インターナショナル・ビジネス・マシーンズ・コーポレイション | System and method for context switching of speech recognition device |
US5627939A (en) * | 1993-09-03 | 1997-05-06 | Microsoft Corporation | Speech recognition system and method employing data compression |
US5699456A (en) * | 1994-01-21 | 1997-12-16 | Lucent Technologies Inc. | Large vocabulary connected speech recognition system and method of language representation using evolutional grammar to represent context free grammars |
JPH10190865A (en) | 1996-12-27 | 1998-07-21 | Casio Comput Co Ltd | Mobile terminal voice recognition/format sentence preparation system |
JPH11112669A (en) | 1997-10-06 | 1999-04-23 | Hitachi Ltd | Paging system and voice recognition device used for the same |
US6070136A (en) * | 1997-10-27 | 2000-05-30 | Advanced Micro Devices, Inc. | Matrix quantization with vector quantization error compensation for robust speech recognition |
US6067515A (en) * | 1997-10-27 | 2000-05-23 | Advanced Micro Devices, Inc. | Split matrix quantization with split vector quantization error compensation and selective enhanced processing for robust speech recognition |
US6256607B1 (en) * | 1998-09-08 | 2001-07-03 | Sri International | Method and apparatus for automatic recognition using features encoded with product-space vector quantization |
US6347297B1 (en) * | 1998-10-05 | 2002-02-12 | Legerity, Inc. | Matrix quantization with vector quantization error compensation and neural network postprocessing for robust speech recognition |
US6934678B1 (en) * | 2000-09-25 | 2005-08-23 | Koninklijke Philips Electronics N.V. | Device and method for coding speech to be recognized (STBR) at a near end |
-
2000
- 2000-09-25 US US09/668,541 patent/US6934678B1/en not_active Expired - Fee Related
-
2001
- 2001-09-14 EP EP01974259A patent/EP1323157A2/en not_active Withdrawn
- 2001-09-14 WO PCT/EP2001/010720 patent/WO2002025636A2/en not_active Application Discontinuation
-
2005
- 2005-06-08 US US11/147,951 patent/US7219057B2/en not_active Expired - Fee Related
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0720307A2 (en) * | 1994-12-28 | 1996-07-03 | Sony Corporation | Digital audio signal coding and/or decoding method |
Non-Patent Citations (3)
Title |
---|
BUHRKE E R ET AL: "Application of vector quantized hidden Markov modeling to telephone network based connected digit recognition" ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 1994. ICASSP-94., 1994 IEEE INTERNATIONAL CONFERENCE ON ADELAIDE, SA, AUSTRALIA 19-22 APRIL 1994, NEW YORK, NY, USA,IEEE, 19 April 1994 (1994-04-19), pages I-105-I-108, XP010133582 ISBN: 0-7803-1775-0 * |
DIGALAKIS V V ET AL: "QUANTIZATION OF CEPSTRAL PARAMETERS FOR SPEECH RECOGNITION OVER THEWORLD WIDE WEB" IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, IEEE INC. NEW YORK, US, vol. 17, no. 1, January 1999 (1999-01), pages 82-90, XP000800684 ISSN: 0733-8716 * |
ETSI: "Speech Processing, Transmission and Quality aspects (STQ); Distributed speech recognition ; Front-end feature extraction algorithm ; Compression algorithms" ETSI ES201108 V1.1.2, April 2000 (2000-04), pages 1-20, XP002193738 cited in the application * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11020297B2 (en) | 2015-12-22 | 2021-06-01 | Stryker Corporation | Powered side rail for a patient support apparatus |
US11998499B2 (en) | 2015-12-22 | 2024-06-04 | Stryker Corporation | Powered side rail for a patient support apparatus |
US11090209B2 (en) | 2017-06-20 | 2021-08-17 | Stryker Corporation | Patient support apparatus with control system and method to avoid obstacles during reconfiguration |
US11213443B2 (en) | 2017-12-29 | 2022-01-04 | Stryker Corporation | Indication system to identify open space beneath patient support apparatus |
Also Published As
Publication number | Publication date |
---|---|
EP1323157A2 (en) | 2003-07-02 |
US20050267753A1 (en) | 2005-12-01 |
US7219057B2 (en) | 2007-05-15 |
US6934678B1 (en) | 2005-08-23 |
WO2002025636A3 (en) | 2002-07-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7219057B2 (en) | Speech recognition method | |
KR100391287B1 (en) | Speech recognition method and system using compressed speech data, and digital cellular telephone using the system | |
US5812965A (en) | Process and device for creating comfort noise in a digital speech transmission system | |
KR100594670B1 (en) | Automatic speech/speaker recognition over digital wireless channels | |
US6810379B1 (en) | Client/server architecture for text-to-speech synthesis | |
US6119086A (en) | Speech coding via speech recognition and synthesis based on pre-enrolled phonetic tokens | |
US7941313B2 (en) | System and method for transmitting speech activity information ahead of speech features in a distributed voice recognition system | |
US8050911B2 (en) | Method and apparatus for transmitting speech activity in distributed voice recognition systems | |
FI119533B (en) | Coding of audio signals | |
CN100371988C (en) | Method and apparatus for speech reconstruction within a distributed speech recognition system | |
US20110153326A1 (en) | System and method for computing and transmitting parameters in a distributed voice recognition system | |
CN1653521B (en) | Method for adaptive codebook pitch-lag computation in audio transcoders | |
JPH0683400A (en) | Speech-message processing method | |
JP2001356792A (en) | Method and device for performing automatic speech recognition | |
JPH07311598A (en) | Generation method of linear prediction coefficient signal | |
JP2003501675A (en) | Speech synthesis method and speech synthesizer for synthesizing speech from pitch prototype waveform by time-synchronous waveform interpolation | |
JPH09204199A (en) | Method and device for efficient encoding of inactive speech | |
KR20010022714A (en) | Speech coding apparatus and speech decoding apparatus | |
WO2001095312A1 (en) | Method and system for adaptive distributed speech recognition | |
Ion et al. | A novel uncertainty decoding rule with applications to transmission error robust speech recognition | |
EP1435086B1 (en) | A method and apparatus to perform speech recognition over a voice channel | |
KR101011320B1 (en) | Identification and exclusion of pause frames for speech storage, transmission and playback | |
JP2003005949A (en) | Server client type voice recognizing device and method | |
US20050276235A1 (en) | Packet loss concealment based on statistical n-gram predictive models for use in voice-over-IP speech transmission | |
TW541516B (en) | Distributed speech recognition using dynamically determined feature vector codebook size |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): CN IN JP KR |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR |
|
WWE | Wipo information: entry into national phase |
Ref document number: IN/PCT/2002/765/CHE Country of ref document: IN |
|
AK | Designated states |
Kind code of ref document: A3 Designated state(s): CN IN JP KR |
|
AL | Designated countries for regional patents |
Kind code of ref document: A3 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2001974259 Country of ref document: EP |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWP | Wipo information: published in national office |
Ref document number: 2001974259 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: JP |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 2001974259 Country of ref document: EP |