US6697776B1 - Dynamic signal detector system and method - Google Patents
Dynamic signal detector system and method
- Publication number
- US6697776B1 (Application US09/628,891)
- Authority
- US
- United States
- Prior art keywords
- signal
- encoding
- classification
- voice
- digitized signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime, expires
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/20—Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
Definitions
- The field of this invention relates to signal processing which identifies the type of signal received in order to optimize the transmission and reception of that signal. More particularly, it relates to audio signal processing through an encoder selected to optimize both the quality of the signal on decoding and the use of bandwidth.
- The related art is replete with detectors and encoders which encode audio signals related to speech. Speech signals are processed and parameters are developed in the form of feature vectors, which may be transmitted in digital form and later combined in a decoder to reconstruct the speech.
- Digital speech signals operate on data transmission media having limited available bandwidth. Accordingly, data transmission rates are minimized using various techniques which are geared to optimize speech signals to maintain a high perceptual quality. These systems include all transmission modes such as wireless, Voice Over IP, direct wire, cable, ISDN, modems and the like.
- The International Telecommunication Union has established a number of standards for speech processing. Among these is the G.729 standard, which processes speech at 8 Kbits/second.
- The G.729 standard provides good quality transmission of speech while minimizing bandwidth.
- This standard defines a common way of performing the integration and expansion of speech signals so as to optimize speech quality and ensure communication quality.
- The G.729 standard has been expanded to include music processing capability (Annex E at 11.8 Kbits/second, G.729E). Furthermore, the standards now include DTX (Annex G) functionality for the 11.8 Kbits/second CS-ACELP algorithm in Annex E.
- The G.729G standard provides for music detection immediately following Voice Activity Detection (VAD). The music detection algorithm corrects the decision from the VAD in the presence of music signals.
- The present invention provides a system in which the bit rate encoding or the associated transport mechanism can be changed dynamically, so that different types of signals are encoded at bit rates or with encoding methods optimized to properly reconstruct the input signal, whether speech or non-speech.
- Non-speech signals can include modem signals and facsimile signals.
- The application is driven through a change of parameters that can make the system, for example, a speech or music recognizer over an IP gateway, depending on what signal is to be listened for.
- Although described in terms of voice over IP, the invention is equally applicable to other transmission systems, such as wireless, DSI, voice over cable systems and other transmission systems, and may be operated on a continuous, incremental or packetized/frame basis.
- The dynamic signal detector of the present invention includes three basic components: a recognizing module, which categorizes the type of input signal; an evaluation or classification module, which evaluates the quality of the signal based on the category; and a recommendation module, which, based on the quality of the signal, recommends a change to the standard used to encode the received signals in order to improve quality.
- The dynamic signal detector receives the digitized input signal and uses an algorithm to extract the feature vector parameters for evaluation. These parameters are tested and a determination is made whether a switch of encoding standard or a modification of the transport parameters is required to improve the reconstructed signal. External signals may also be available for evaluation, depending on the particular system.
- The dynamic signal detector may be present at both ends of the communication channel. Each is located on the encoder side, where it detects the digitized signal in the first instance and evaluates the feature vectors to determine the character of the signal. The dynamic signal detector determines whether a quality signal can be generated by the then-current encoder and selects a decreased or increased bit rate or other encoding format as required.
- If the signal is music, a higher bit rate standard than for voice is applied. If the signal is voice, a lower bandwidth standard will do. If the signal is a modem or facsimile signal, a modem or facsimile format is applied.
- This evaluation, recommendation and change can occur on a continuous basis or on a frame-by-frame or packet-by-packet basis, depending on the nature of the signal.
- Statistical techniques for evaluation of frames or packets and their associated recommendations can also be applied over an arbitrary number of samples, or by whatever other means is suitable for the application.
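One simple statistical technique of the kind mentioned here is a majority vote over the last few per-frame recommendations, which keeps a single misclassified frame from triggering a codec switch. The sketch below is only an illustration: the class name, the window length and the use of codec names as plain strings are assumptions, not part of the described system.

```python
from collections import Counter, deque


class RecommendationSmoother:
    """Majority vote over the most recent per-frame recommendations."""

    def __init__(self, window: int = 5) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        # A bounded deque drops the oldest entry once the window is full.
        self._recent = deque(maxlen=window)

    def update(self, recommendation: str) -> str:
        """Record one frame's recommendation and return the smoothed one."""
        self._recent.append(recommendation)
        # On a tie, Counter returns the label encountered first,
        # i.e. the older recommendation is kept.
        return Counter(self._recent).most_common(1)[0][0]
```

With a window of 3, a single frame that recommends a different codec does not change the output; the output changes only once that codec holds the majority of the window.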
- FIG. 1 is a graph of the relationship of bit rate of various types of signals to quality.
- FIG. 2 is a chart relating signal complexity to various encoding standards.
- FIG. 3 is a block diagram of the dynamic signal detector.
- FIG. 4 is a block diagram of a typical PSTN system having an integrated voice over IP system.
- FIG. 5 is a schematic of a packet of data with a header and a payload.
- FIGS. 6A and B are a flow chart of the recognition, classification and recommendation system.
- Quality is a subjective measurement, and techniques such as the Mean Opinion Score (MOS), the E-model (Evaluation Model) for speech, or other mechanisms are used to indicate quality.
- Perceptible speech quality based on the Mean Opinion Score (MOS) is set forth in Table I below; a score of at least 3 is required for the speech to be tolerable.
- FIG. 1 illustrates the different quality considerations for various speech signals, such as clean speech 101, speech with background noise 102 and speech with heavy background noise 103, as compared to music 100, with existing speech coding systems.
- The present invention comprises a recognition module, an evaluation module and a recommendation module. Because of the significant quality drop when low bit-rate speech codecs are used with music signals, it is essential to be able to detect whether the incoming signal is music, active speech or background noise (silence being a special case of background noise).
- The role of the recognition module is to model the perceived quality of an audio signal by extracting the feature vectors.
- The role of the evaluation module is to identify the best tradeoff point given the nature of the incoming signal. For example, if the incoming signal is active speech without background noise, then it is known that coding it with G.723.1 at 6.3 kb/s or above will result in sufficient quality, because the quality curve of FIG. 1 is fairly flat after that point (the saturation region). If the incoming signal is active speech with background noise, the evaluation module may need to identify the type of noise (room noise, car noise, street noise, interfering talker, stationary or non-stationary noise, etc.) and the noise level. The evaluation of the feature vectors resulting in a given circumstance may need to be determined on a limited trial and error basis. Similarly, the module may need to determine whether the incoming signal is vocal music, composed music, or something else.
- The evaluation module might also consider other input, such as the desired tradeoff from a network planning point of view. For example, one user might decide that quality is the most important factor to be considered in the evaluation process, while another user might decide that some degradation is acceptable provided that there is a bit-rate reduction.
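The two user policies described above (quality first, or lowest bit rate with acceptable quality) can be sketched as a selection over a quality curve such as the one in FIG. 1. This is a minimal sketch under stated assumptions: the function name and its parameters are hypothetical, and the example curve uses invented placeholder values, not measured MOS figures.

```python
def pick_bit_rate(quality_by_rate, min_quality, prefer_quality=False):
    """Choose a bit rate (kbit/s) from a {rate: predicted MOS} mapping.

    prefer_quality=True  -> highest predicted quality, ties go to the lower rate.
    prefer_quality=False -> lowest rate whose quality meets min_quality.
    Returns None when no rate is acceptable.
    """
    acceptable = {r: q for r, q in quality_by_rate.items() if q >= min_quality}
    if not acceptable:
        return None
    if prefer_quality:
        return min(acceptable, key=lambda r: (-acceptable[r], r))
    return min(acceptable)


# Placeholder curve: these MOS values are invented for illustration only.
example_curve = {5.3: 3.4, 6.3: 3.8, 8.0: 3.9, 32.0: 4.1}
```

A bit-rate-conscious user with a minimum MOS of 3.5 would get the lowest acceptable rate from this placeholder curve, while a quality-first user would get the rate with the highest predicted score.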
- The recommendation module can be updated from time to time with the characteristics of the various speech coding systems available and recommends the best usage of a particular speech coding system, considering the outcome of the evaluation module and the availability of the various speech coding systems.
- FIG. 2 gives an example of the relative ordering of various signals on a complexity rating of 1 to 10, where 10 is the highest complexity signal, compared to the relative complexity of the encoding standards.
- Silence, being the lowest complexity signal, would be encoded using G.723.1A, while true music would be encoded using G.728 or G.726 ADPCM.
- G.711 could be used to encode any signal, but since it runs at 64 Kbits/s it does not provide any bit rate savings.
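The assignments named in the text for FIG. 2 can be written as a small lookup with G.711 as the catch-all. This is a sketch, assuming string labels for the signal classes; only the silence and music entries and the G.711 fallback come from the text, and the function name is hypothetical.

```python
# Assignments stated in the text; other signal classes are left out
# because the text does not name a standard for them.
STANDARDS_BY_CLASS = {
    "silence": ["G.723.1A"],
    "music": ["G.728", "G.726 ADPCM"],
}

FALLBACK = "G.711"  # 64 kbit/s: works for any signal, no bit rate saving


def candidate_standards(signal_class: str) -> list[str]:
    """Return the standards named for a class, with G.711 as a last resort."""
    return STANDARDS_BY_CLASS.get(signal_class, []) + [FALLBACK]
```

Keeping G.711 at the end of every candidate list reflects its role as a standard that can carry any signal at the cost of all bit rate savings.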
- The purpose of the present invention is to provide a dynamic way to evaluate and encode signals so as to apply a standard which is adequate to encode the signal given its complexity.
- The VAD module and the music detector found in the G.729 Annex G standard basically return a three-level indication: (1) music, (2) active speech and music, and (3) background noise.
- A very simple evaluation module can be found in the TIA IS-127 (CDMA EVRC) standard, which is incorporated herein by reference, or in other standards or techniques which are or may become available from time to time.
- The evaluation or classification module analyzes the complexity of the incoming signal based on a set of predetermined criteria. This module can be viewed as a finer signal classifier that returns a much finer multi-level indication.
- The recommendation module of the present invention takes the particular classification and recommends the use of the best standard available at the time for optimum encoding of the signal evaluated.
- The specific embodiment of the present invention is described in the form of a Voice over IP system which bypasses a typical PSTN network.
- The invention described may be applied to a wireless network, LAN, WAN, direct line network, or virtually any other point-to-point transmission system, and can also apply to other media such as fax over packet, modem over packet, and other communication systems. It is not intended to be limited to the specific embodiment described, nor indeed is the invention limited to a packetized system.
- FIG. 3 illustrates a recognizing module 2 which generates parameters representative of the signal or signal frame being processed.
- The parameters are passed to the evaluation module 3, which evaluates the audio signal based on the parameters to determine the class of the signal as set forth in FIG. 2. This is accomplished by evaluating the parameters (feature vectors) and classifying the signal as silence, background noise, active speech without noise, active speech with background noise, or music. Some trial and error is required to adjust the parameter levels to provide the perceived optimum performance for the particular application.
- A recommendation module 4 makes a recommendation, based on the classification of the complexity of the signal, as to which codec is to be used to code the signal.
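The chain of modules 2, 3 and 4 in FIG. 3 can be sketched as three injected stages. Everything below is an assumption made for illustration: the class and function names are hypothetical, and the toy stages (mean energy with a single threshold, and a two-entry codec table) stand in for the real feature vectors and classification rules, which the text says must be tuned by trial and error.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class DynamicSignalDetector:
    """Chains the three modules of FIG. 3; each stage is injected."""

    recognize: Callable[[Sequence[float]], dict]  # module 2: frame -> parameters
    evaluate: Callable[[dict], str]               # module 3: parameters -> class
    recommend: Callable[[str], str]               # module 4: class -> codec

    def process(self, frame: Sequence[float]) -> str:
        params = self.recognize(frame)
        signal_class = self.evaluate(params)
        return self.recommend(signal_class)


# Toy stages, for illustration only.
def toy_recognize(frame):
    return {"energy": sum(x * x for x in frame) / max(len(frame), 1)}


def toy_evaluate(params, threshold=1e-4):
    return "silence" if params["energy"] < threshold else "active speech"


def toy_recommend(signal_class):
    return {"silence": "G.723.1A", "active speech": "G.729"}[signal_class]


detector = DynamicSignalDetector(toy_recognize, toy_evaluate, toy_recommend)
```

Injecting the stages keeps the wiring independent of any one detection algorithm, so a different recognizer or codec table can be swapped in without touching `process`.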
- The present invention detects that a music signal is present in accordance with the G.729G standard. That signal is evaluated and a determination is made that a higher bandwidth than that currently in use is required.
- The recommendation module 4 then recommends switching the encoding standard to a higher bit rate, such as G.726 ADPCM at 24, 32, or 40 Kbits/second, all of which are adequate for music.
- Other voice standards exist, such as G.723.1 at 5.3 and 6.3 Kbits/second and, most recently, G.729E at 11.8 Kbits/second as noted above.
- The present invention detects the higher bit rate requirements of a signal by determining the character of its feature vectors, either on a frame-by-frame basis or as a continuous signal depending on the system, and classifies the nature of the signal on the continuum of FIG. 2. Based on the user's desired quality versus bit rate tradeoff, as noted above, specific classes of signals can be used to recommend a change in bit rate for input digital audio signals that require a higher bit rate to be properly reconstructed, in accordance with user goals such as balancing bit rate against quality or obtaining the best quality regardless of bit rate. Music signals are but one example of such signals.
- FIG. 4 shows a typical telephone set 5 connected over a twisted wire pair to a central office 6 , which communicates through a standard analog PSTN network 7 to another central office 8 which communicates with another telephone set 9 over a twisted wire pair.
- The PSTN provides dedicated bandwidth in the form of a synchronous stream, because channels are allocated from one end to the other.
- FIG. 4 further shows the central office including a Time Division Multiplex (TDM) module, which multiplexes the data into time segments that are individually evaluated by the dynamic signal detector 1 of the present invention. The detector is usually co-located with the other components of the gateway 12, although its functionality may be located elsewhere where necessary or appropriate.
- The gateway 12 selects the encoder 12a from a group of encoding standards 14 based on the recommendation of the dynamic signal detector 1 and encodes the signal.
- The gateway uses a packetizer 12b to convert the encoded signal data into packetized data, which is then applied to the voice over IP gateway 12.
- The IP gateway 12 is connected to the IP space 13 and communicates with another gateway 12′, which de-packetizes the data using a de-packetizer 12c′. The de-packetized data is decoded by a decoder 12d′ and coupled to a TDM demultiplexor module 19′, which demultiplexes the decoded signal and communicates with the central office 8 and then with the telephone set 9.
- In the reverse direction, the gateway 12 de-packetizes the packet using a de-packetizer 12c; the de-packetized data is decoded by a decoder 12d and coupled to a TDM demultiplexor 19, which communicates with the central office 6 and then with the telephone set 5.
- TDM multiplexing and demultiplexing is one of many choices known in the art for multiplexing the data, and the present invention is not intended to be restricted to TDM.
- The dynamic signal detector 1 is incorporated into the gateway 12 and the gateway 12′, respectively, for each side of the network, although in certain embodiments, e.g. those which do not involve a gateway, the dynamic signal detector 1 may be elsewhere.
- The IP packets 15 which are generated by the gateway 12 and the gateway 12′ include a header 16 and a payload 17.
- The header includes information regarding the environment for the packet, that is, the address and other routing information as well as parametric information.
- The payload 17 contains the encoded data for a given half-duplex (i.e., one-way communication) channel. Two such channels are usually required for full-duplex communication, as is needed for normal interactive communication.
- The IP network is a shared bandwidth network, which means that the available bandwidth may be significantly narrower than in the case of a dedicated network. Accordingly, other standards, such as G.723.1, which runs at 6.3 Kbits/second, or G.729 at 8 Kbits/second, are used for speech.
- Packets are not as safe as information sent over a dedicated network, because a voice packet may be lost. If a packet is dropped, the audio must be rebuilt or played without the missing data, which degrades audio performance. Where the loss is unacceptable, multiple identical packets may be sent so that sufficient packets are received for acceptable speech.
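The benefit of sending identical copies can be estimated with a short calculation. The sketch below assumes that each copy is lost independently with the same probability, which is a simplification (losses on real networks are often bursty); the function name and parameters are hypothetical.

```python
def delivery_probability(loss_rate: float, copies: int) -> float:
    """Probability that at least one of `copies` identical packets arrives.

    Assumes each copy is lost independently with probability `loss_rate`.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # All copies must be lost for the frame to be missing.
    return 1.0 - loss_rate ** copies
```

Under this assumption, a 10% loss rate leaves a 10% chance of a missing frame with one copy and a 1% chance with two, at the cost of doubling the bandwidth used.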
- Table III shows the various PCM format standards which can be utilized to encode audio signals.
- Each of the standards includes parametric information (feature vectors) and the process for detecting and coding required by the standard.
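The bit rate and frame size columns of Table III together determine how many bytes each encoded frame occupies, which is what ultimately fills a packet payload. A minimal sketch of that arithmetic follows; the function name is hypothetical, and rounding up to whole bytes is an assumption about how a partial byte would be carried.

```python
import math


def payload_bytes_per_frame(bit_rate_kbps: float, frame_ms: float) -> int:
    """Whole bytes needed to carry one encoded frame.

    kbit/s times ms gives bits directly (the factors of 1000 cancel).
    """
    bits = bit_rate_kbps * frame_ms
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(bits, 6) / 8)
```

For example, G.729 at 8 kbit/s with a 10 ms frame gives 80 bits, or 10 bytes per frame, while G.711 at 64 kbit/s with a 0.125 ms frame gives one byte per sample.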
- The encoded signal is inserted into the payload 17 of the packet 15, parametric information including formatting information is inserted into the header 16, and the encoded packetized audio is output 26. It should be noted that as the packet traverses the IP network, additional headers may be added during routing.
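The header and payload split of FIG. 5 can be sketched as a pair of small records. The field names below are hypothetical and this is not the layout of any real protocol header; the point is only that the formatting information travels in the header so the receiving gateway can pick the matching decoder.

```python
from dataclasses import dataclass, field


@dataclass
class Header:
    destination: str  # address / routing information
    codec: str        # formatting information, e.g. "G.729"
    params: dict = field(default_factory=dict)  # other parametric information


@dataclass
class Packet:
    header: Header
    payload: bytes    # encoded audio for one one-way channel


def packetize(encoded: bytes, destination: str, codec: str) -> Packet:
    """Wrap one encoded frame so the receiver knows which decoder to use."""
    return Packet(Header(destination, codec), encoded)
```

Because the codec is carried per packet, the encoder can change standards between packets without separately notifying the receiver.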
- Initial detection of music or voice is accomplished by the VAD, but many other systems could be used to perform this function. Whatever system is used, the parameters derived must be sufficient to permit the signal evaluator (classification) module to output data useful in selecting encoders.
- Signal detection schemes are defined in the most recent G.729G recommended standard, in the Telecommunication Standardization Sector document COM 16-<no.>-E entitled ITU-T G.729 Annex G proposed for decision: DTX functionality for G.729 Annex E, which is attached hereto and incorporated herein by reference. The detection algorithm of the detector includes a section to compute relevant parameters and a section to generate a classification based on such parameters.
- Music detection in accordance with G.729G, for example, is based on the determination of the parameters set forth in Table II.
- Vad_dec VAD decision of the current frame.
- Vad_deci VAD decision of the previous frame.
- Lpc_mod flag indicator of either forward or backward adaptive LPC of the previous frame.
- Rc reflection coefficients from LPC analysis.
- Lag_buf buffer of corrected open loop pitch lags of last 5 frames.
- Pgain_buf buffer of closed loop pitch gain of last 5 subframes.
- Energy first autocorrelation coefficient R(0) from LPC analysis.
- LLenergy normalized log energy from VAD module.
- Frm_count counter of the number of processed signal frames.
- Rate selection of speech coder.
- G.729G is useful in detecting non-periodic audio such as music, which in turn is useful in selecting different encoding formats.
- G.729G includes detection for VAD and G.729E parameters.
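The Table II parameters can be grouped into one record per frame before being handed to a classifier. The sketch below only mirrors the parameter list: the Python types, the snake_case names and the length check are assumptions, and it does not reproduce the G.729G detection algorithm itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MusicDetectionParams:
    """Inputs listed in Table II, one instance per processed frame."""

    vad_dec: int            # VAD decision of the current frame
    vad_deci: int           # VAD decision of the previous frame
    lpc_mod: int            # forward/backward adaptive LPC flag, previous frame
    rc: List[float]         # reflection coefficients from LPC analysis
    lag_buf: List[int]      # corrected open-loop pitch lags, last 5 frames
    pgain_buf: List[float]  # closed-loop pitch gains, last 5 subframes
    energy: float           # first autocorrelation coefficient R(0)
    ll_energy: float        # normalized log energy from the VAD module
    frm_count: int          # number of processed signal frames
    rate: int               # selection of speech coder

    def __post_init__(self) -> None:
        # Both buffers are described as holding the last 5 values.
        if len(self.lag_buf) != 5 or len(self.pgain_buf) != 5:
            raise ValueError("lag_buf and pgain_buf must each hold 5 values")
```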
Description
TABLE I
Mean Opinion Score
MOS | QUALITY |
---|---|
5 | Excellent |
4 | Toll- |
3 | Some |
2 | |
1 | Unintelligible |
TABLE III
Audio Coding Standards
Standard | Input Sample Rate | Frequency Bandwidth | Frame size (ms) | Bit-rate (kbps) | Technology |
---|---|---|---|---|---|
G.711 | 8 KHz | 4 KHz | 0.125 | 64 | Non-linear PCM |
G.721 | 8 KHz | 4 KHz | 0.125 | 32 | ADPCM |
G.722 | 16 KHz | 7 KHz | | 64 | ADPCM |
G.723 | 8 KHz | 4 KHz | 0.125 | 24, 40 | ADPCM |
G.723.1 | 8 KHz | 4 KHz | 30 | 5.3, 6.3 (Main body); 0 (DTX), 0.8 (Annex A) | CELP |
G.726 | 8 KHz | 4 KHz | 0.125 | 16, 24, 32, 40 | ADPCM |
G.727 | 8 KHz | 4 KHz | 0.125 | 16, 24, 32, 40 | Embedded ADPCM |
G.728 | 8 KHz | 4 KHz | 2.5 | 16 | LD-CELP |
G.729 | 8 KHz | 4 KHz | 10 | 8 (Main body); 8 (Annex A); 0 (DTX), 1.5 (Annex B); floating-point (Annex C); 6.4 (Annex D); 11.8 (Annex E); D + B = Annex F; E + B = Annex G; D + E = Annex H; Main + A + B + D + E = Annex I | CELP |
IS-54 | 8 KHz | 4 KHz | 20 | 7.95 | VSELP |
IS-96 | 8 KHz | 4 KHz | 20 | 0.8, 2.0, 4.0, 8.5 | CELP, VBR |
IS-733 | 8 KHz | 4 KHz | 20 | 1.0, 2.8, 6.2, 13.3 | CELP, VBR |
IS-127 | 8 KHz | 4 KHz | 20 | 0.8, 4.0, 8.5 | RCELP, VBR |
IS-641 | 8 KHz | 4 KHz | 20 | 7.4 | ACELP |
GSM FR | 8 KHz | 4 KHz | 20 | 13 | RP-LTP |
GSM EFR | 8 KHz | 4 KHz | 20 | 12.2 | ACELP |
GSM AMR | 8 KHz | 4 KHz | 20 | 4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2 | ACELP |
Note:
CELP = Code Excited Linear Prediction
VSELP = Vector-sum excited linear prediction
ACELP = Algebraic CELP
LD-CELP = Low-delay CELP
RCELP = Relaxed CELP
VBR = Variable bit rate
FR = Full Rate
EFR = Enhanced Full-Rate
AMR = Adaptive Multi-Rate
IS- = Interim Standard
DTX = Discontinuous Transmission
TABLE II
Signal Feature Parameters
Parameter | Description |
---|---|
Vad_dec | VAD decision of the current frame. |
Vad_deci | VAD decision of the previous frame. |
Lpc_mod | Flag indicator of either forward or backward adaptive LPC of the previous frame. |
Rc | Reflection coefficients from LPC analysis. |
Lag_buf | Buffer of corrected open loop pitch lags of last 5 frames. |
Pgain_buf | Buffer of closed loop pitch gain of last 5 subframes. |
Energy | First autocorrelation coefficient R(0) from LPC analysis. |
LLenergy | Normalized log energy from VAD module. |
Frm_count | Counter of the number of processed signal frames. |
Rate | Selection of speech coder. |
Claims (16)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/628,891 US6697776B1 (en) | 2000-07-31 | 2000-07-31 | Dynamic signal detector system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/628,891 US6697776B1 (en) | 2000-07-31 | 2000-07-31 | Dynamic signal detector system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
US6697776B1 true US6697776B1 (en) | 2004-02-24 |
Family
ID=31496201
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/628,891 Expired - Lifetime US6697776B1 (en) | 2000-07-31 | 2000-07-31 | Dynamic signal detector system and method |
Country Status (1)
Country | Link |
---|---|
US (1) | US6697776B1 (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5991442A (en) * | 1995-05-10 | 1999-11-23 | Canon Kabushiki Kaisha | Method and apparatus for pattern recognition utilizing gaussian distribution functions |
US6070137A (en) * | 1998-01-07 | 2000-05-30 | Ericsson Inc. | Integrated frequency-domain voice coding using an adaptive spectral enhancement filter |
US6161089A (en) * | 1997-03-14 | 2000-12-12 | Digital Voice Systems, Inc. | Multi-subframe quantization of spectral parameters |
US6418412B1 (en) * | 1998-10-05 | 2002-07-09 | Legerity, Inc. | Quantization using frequency and mean compensated frequency input data for robust speech recognition |
Cited By (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020042254A1 (en) * | 2000-08-11 | 2002-04-11 | Alcatel | Method of evaluating the quality of a radio link in a mobile radiocommunication system |
US7099637B2 (en) * | 2000-08-11 | 2006-08-29 | Alcatel | Method of evaluating the quality of a radio link in a mobile radiocommunication system |
US20040081176A1 (en) * | 2000-10-30 | 2004-04-29 | Elwell John Robert | End-to-end voice over ip streams for telephone calls established via legacy switching systems |
US7848315B2 (en) * | 2000-10-30 | 2010-12-07 | Siemens Enterprise Communications Limited | End-to-end voice over IP streams for telephone calls established via legacy switching systems |
US20020141392A1 (en) * | 2001-03-30 | 2002-10-03 | Yasuo Tezuka | Gateway apparatus and voice data transmission method |
US7941313B2 (en) | 2001-05-17 | 2011-05-10 | Qualcomm Incorporated | System and method for transmitting speech activity information ahead of speech features in a distributed voice recognition system |
US20030061036A1 (en) * | 2001-05-17 | 2003-03-27 | Harinath Garudadri | System and method for transmitting speech activity in a distributed voice recognition system |
US20030061042A1 (en) * | 2001-06-14 | 2003-03-27 | Harinanth Garudadri | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
US20070192094A1 (en) * | 2001-06-14 | 2007-08-16 | Harinath Garudadri | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
US8050911B2 (en) | 2001-06-14 | 2011-11-01 | Qualcomm Incorporated | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
US7203643B2 (en) * | 2001-06-14 | 2007-04-10 | Qualcomm Incorporated | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
US20030135363A1 (en) * | 2001-11-02 | 2003-07-17 | Dunling Li | Speech coder and method |
US7386447B2 (en) * | 2001-11-02 | 2008-06-10 | Texas Instruments Incorporated | Speech coder and method |
WO2005024547A3 (en) * | 2003-08-27 | 2006-05-11 | Mindspeed Tech Inc | Method and system for detecting facsimile communication during a voip session |
WO2005024547A2 (en) * | 2003-08-27 | 2005-03-17 | Mindspeed Technologies, Inc. | Method and system for detecting facsimile communication during a voip session |
US20050047422A1 (en) * | 2003-08-27 | 2005-03-03 | Mindspeed Technologies, Inc. | Method and system for detecting facsimile communication during a VoIP session |
US7545818B2 (en) | 2003-08-27 | 2009-06-09 | Mindspeed Technologies, Inc. | Method and system for detecting facsimile communication during a VoIP session |
US7412376B2 (en) * | 2003-09-10 | 2008-08-12 | Microsoft Corporation | System and method for real-time detection and preservation of speech onset in a signal |
US20090304032A1 (en) * | 2003-09-10 | 2009-12-10 | Microsoft Corporation | Real-time jitter control and packet-loss concealment in an audio signal |
US20050055201A1 (en) * | 2003-09-10 | 2005-03-10 | Microsoft Corporation, Corporation In The State Of Washington | System and method for real-time detection and preservation of speech onset in a signal |
US20080071523A1 (en) * | 2004-07-20 | 2008-03-20 | Matsushita Electric Industrial Co., Ltd | Sound Encoder And Sound Encoding Method |
US7873512B2 (en) * | 2004-07-20 | 2011-01-18 | Panasonic Corporation | Sound encoder and sound encoding method |
US7856355B2 (en) * | 2005-07-05 | 2010-12-21 | Alcatel-Lucent Usa Inc. | Speech quality assessment method and system |
US20070011006A1 (en) * | 2005-07-05 | 2007-01-11 | Kim Doh-Suk | Speech quality assessment method and system |
WO2007016107A3 (en) * | 2005-08-02 | 2008-08-07 | Dolby Lab Licensing Corp | Controlling spatial audio coding parameters as a function of auditory events |
US20090222272A1 (en) * | 2005-08-02 | 2009-09-03 | Dolby Laboratories Licensing Corporation | Controlling Spatial Audio Coding Parameters as a Function of Auditory Events |
WO2007016107A2 (en) * | 2005-08-02 | 2007-02-08 | Dolby Laboratories Licensing Corporation | Controlling spatial audio coding parameters as a function of auditory events |
TWI396188B (en) * | 2005-08-02 | 2013-05-11 | Dolby Lab Licensing Corp | Controlling spatial audio coding parameters as a function of auditory events |
US20090281812A1 (en) * | 2006-01-18 | 2009-11-12 | Lg Electronics Inc. | Apparatus and Method for Encoding and Decoding Signal |
US20110057818A1 (en) * | 2006-01-18 | 2011-03-10 | Lg Electronics, Inc. | Apparatus and Method for Encoding and Decoding Signal |
US20090099851A1 (en) * | 2007-10-11 | 2009-04-16 | Broadcom Corporation | Adaptive bit pool allocation in sub-band coding |
CN101141644B (en) * | 2007-10-17 | 2010-12-08 | 清华大学 | Encoding integration system and method and decoding integration system and method |
CN104040626A (en) * | 2012-01-13 | 2014-09-10 | 高通股份有限公司 | Multiple coding mode signal classification |
US20150223110A1 (en) * | 2014-02-05 | 2015-08-06 | Qualcomm Incorporated | Robust voice-activated floor control |
US9564136B2 (en) | 2014-03-06 | 2017-02-07 | Dts, Inc. | Post-encoding bitrate reduction of multiple object audio |
US9984692B2 (en) | 2014-03-06 | 2018-05-29 | Dts, Inc. | Post-encoding bitrate reduction of multiple object audio |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6697776B1 (en) | | Dynamic signal detector system and method |
US7203638B2 (en) | | Method for interoperation between adaptive multi-rate wideband (AMR-WB) and multi-mode variable bit-rate wideband (VMR-WB) codecs |
US7657427B2 (en) | | Methods and devices for source controlled variable bit-rate wideband speech coding |
JP6546897B2 (en) | | Method of performing coding for frame loss concealment for multi-rate speech / audio codecs |
US9053702B2 (en) | | Systems, methods, apparatus, and computer-readable media for bit allocation for redundant transmission |
AU2005246538B2 (en) | | Supporting a switch between audio coder modes |
US8069034B2 (en) | | Method and apparatus for encoding an audio signal using multiple coders with plural selection models |
US8032370B2 (en) | | Method, apparatus, system and software product for adaptation of voice activity detection parameters based on the quality of the coding modes |
KR100798668B1 (en) | | Method and apparatus for coding of unvoiced speech |
KR100395458B1 (en) | | Method for decoding an audio signal with transmission error correction |
US7054809B1 (en) | | Rate selection method for selectable mode vocoder |
EP2202726B1 (en) | | Method and apparatus for judging dtx |
KR100614496B1 (en) | | An apparatus for coding of variable bit-rate wideband speech and audio signals, and a method thereof |
KR20010080455A (en) | | Low bit-rate coding of unvoiced segments of speech |
WO2007140724A1 (en) | | A method and apparatus for transmitting and receiving background noise and a silence compressing system |
US20050143979A1 (en) | | Variable-frame speech coding/decoding apparatus and method |
Kovesi et al. | | A scalable speech and audio coding scheme with continuous bitrate flexibility |
Ahmadi et al. | | On the architecture, operation, and applications of VMR-WB: The new cdma2000 wideband speech coding standard |
Markovic | | Speech compression-recent advances and standardization |
KR20080091305A (en) | | Audio encoding with different coding models |
Somasundaram et al. | | Source Codec for Multimedia Data Hiding |
Beritelli et al. | | Intrastandard hybrid speech coding for adaptive IP telephony |
Babich et al. | | The new generation of coding techniques for wireless multimedia: a performance analysis and evaluation |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: CONEXANT SYSTEMS, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SU, HUAN-YU;FAYAD, GILLES G.;REEL/FRAME:011018/0616 Effective date: 20000731 |
AS | Assignment | Owner name: MINDSPEED TECHNOLOGIES, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CONEXANT SYSTEMS, INC.;REEL/FRAME:014568/0275 Effective date: 20030627 |
AS | Assignment | Owner name: CONEXANT SYSTEMS, INC., CALIFORNIA Free format text: SECURITY AGREEMENT;ASSIGNOR:MINDSPEED TECHNOLOGIES, INC.;REEL/FRAME:014546/0305 Effective date: 20030930 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FPAY | Fee payment | Year of fee payment: 4 |
FPAY | Fee payment | Year of fee payment: 8 |
AS | Assignment | Owner name: MINDSPEED TECHNOLOGIES, INC, CALIFORNIA Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:CONEXANT SYSTEMS, INC;REEL/FRAME:031494/0937 Effective date: 20041208 |
AS | Assignment | Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT Free format text: SECURITY INTEREST;ASSIGNOR:MINDSPEED TECHNOLOGIES, INC.;REEL/FRAME:032495/0177 Effective date: 20140318 |
AS | Assignment | Owner name: GOLDMAN SACHS BANK USA, NEW YORK Free format text: SECURITY INTEREST;ASSIGNORS:M/A-COM TECHNOLOGY SOLUTIONS HOLDINGS, INC.;MINDSPEED TECHNOLOGIES, INC.;BROOKTREE CORPORATION;REEL/FRAME:032859/0374 Effective date: 20140508 Owner name: MINDSPEED TECHNOLOGIES, INC., CALIFORNIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:032861/0617 Effective date: 20140508 |
FPAY | Fee payment | Year of fee payment: 12 |
AS | Assignment | Owner name: MINDSPEED TECHNOLOGIES, LLC, MASSACHUSETTS Free format text: CHANGE OF NAME;ASSIGNOR:MINDSPEED TECHNOLOGIES, INC.;REEL/FRAME:039645/0264 Effective date: 20160725 |
AS | Assignment | Owner name: MACOM TECHNOLOGY SOLUTIONS HOLDINGS, INC., MASSACH Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MINDSPEED TECHNOLOGIES, LLC;REEL/FRAME:044791/0600 Effective date: 20171017 |