CN1123866C - Dual subframe quantization of spectral magnitudes - Google Patents
- Publication number: CN1123866C
- Application number: CN98105557A
- Authority
- CN
- China
- Prior art keywords
- parameter
- vector
- subframes
- surplus
- subframe
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
- G10L19/135—Vector sum excited linear prediction [VSELP]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
Abstract
Speech is encoded into a 90 millisecond frame of bits for transmission across a satellite communication channel. A speech signal is digitized into digital speech samples that are then divided into subframes. Model parameters that include a set of spectral magnitude parameters that represent spectral information for the subframe are estimated for each subframe. Two consecutive subframes from the sequence of subframes are combined into a block and their spectral magnitude parameters are jointly quantized. The joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters from the previous block, computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters from both of the subframes within the block, and using vector quantizers to quantize the combined residual parameters into a set of encoded spectral bits. Redundant error control bits may be added to the encoded spectral bits from each block to protect the encoded spectral bits within the block from bit errors. The added redundant error control bits and encoded spectral bits from two consecutive blocks may be combined into a 90 millisecond frame of bits for transmission across a satellite communication channel.
Description
Technical field
The present invention relates to the encoding and decoding of speech.
Background
Speech encoding and decoding have a large number of applications and have been studied extensively. In general, speech coding, such as speech compression, seeks to reduce the data rate needed to represent a speech signal without substantially reducing the quality or intelligibility of the speech. Speech compression techniques may be implemented by a speech coder.
A speech coder generally includes an encoder and a decoder. The encoder generates a compressed bit stream from a digitized speech signal, such as the analog signal of a microphone after conversion by an analog-to-digital converter. The decoder converts the compressed bit stream back into a digital representation of speech suitable for playback through a digital-to-analog converter and a speaker. In many applications, the encoder and decoder are physically separated, and the bit stream is transmitted between them over a communication channel.
A key parameter of a speech coder is the amount of compression it achieves, which is measured by the bit rate of the bit stream it produces. The bit rate of the encoder is generally a function of the desired fidelity (i.e., speech quality) and the type of speech coder employed. Different types of coders have been designed to operate at high rates (greater than 8 kbps), mid rates (3–8 kbps) and low rates (less than 3 kbps). Recently, mid-rate and low-rate speech coders have received attention for a wide range of mobile communication applications (e.g., cellular telephony, satellite telephony, land mobile radio and in-flight telephony). These applications typically require high-quality speech and robustness to artifacts caused by acoustic noise and channel noise (e.g., bit errors).
A vocoder is a class of speech coder that is particularly well suited to mobile communications. A vocoder models speech as the response of a system to an excitation over short time intervals. Examples of vocoders include linear prediction vocoders, homomorphic vocoders, channel vocoders, the sinusoidal transform coder ("STC"), the multiband excitation ("MBE") vocoder, and the improved multiband excitation ("IMBE") vocoder. In these vocoders, speech is divided into short segments (typically 10–40 ms), and each segment is characterized by a set of model parameters. These parameters typically represent a few basic elements of each speech segment, such as the segment's pitch, voicing state, and spectral envelope. A vocoder may represent each of these parameters using one of a number of known methods. For example, the pitch may be represented as a pitch period, a fundamental frequency, or a long-term prediction delay. Similarly, the voicing state may be represented by one or more voiced/unvoiced decisions, a voicing probability measure, or a ratio of periodic to stochastic energy. The spectral envelope is often represented by an all-pole filter response, but may also be represented by a set of spectral magnitudes or other spectral measurements.
Since they permit a speech segment to be represented using only a small number of parameters, model-based speech coders, such as vocoders, can typically operate at low data rates. However, the quality of a model-based system depends on the accuracy of the underlying model. Accordingly, a high-fidelity model must be used if these speech coders are to achieve high speech quality.
The MBE speech model, developed by Griffin and Lim, has been shown to provide high-quality speech and to work well at low bit rates. This model uses a flexible voicing structure that allows it to produce more natural-sounding speech and makes it more robust to the presence of acoustic background noise. These properties have led to the MBE speech model being employed in a number of commercial mobile communication applications.
The MBE speech model represents a speech segment using a fundamental frequency, a set of binary voiced/unvoiced ("V/UV") metrics, and a set of spectral magnitudes. A fundamental advantage of the MBE model over more traditional models lies in its speech representation. The MBE model generalizes the traditional single V/UV decision per segment into a set of decisions, each representing the voicing state within a particular frequency band. This added flexibility allows the MBE model to better accommodate mixed-voicing sounds, such as some voiced fricatives, and allows speech corrupted by acoustic background noise to be represented more accurately. Extensive testing has shown that this generalization improves the quality and intelligibility of the speech.
The encoder of an MBE-based speech coder estimates a set of model parameters for each speech segment. The MBE model parameters comprise a fundamental frequency (the reciprocal of the pitch period), a set of V/UV metrics or decisions that characterize the voicing state, and a set of spectral magnitudes that characterize the spectral envelope. After estimating the MBE model parameters for each segment, the encoder quantizes the parameters to produce a frame of data bits. The encoder may optionally protect some of these bits with error correction/detection codes before interleaving and transmitting the resulting bit stream to a corresponding decoder.
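The three parameter groups estimated per segment can be collected in a small container. A hypothetical sketch; the field names, types and sizes are illustrative, not from the patent:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class MBEParams:
    """Model parameters estimated for one speech segment (subframe)."""
    fundamental_hz: float   # fundamental frequency (reciprocal of pitch period)
    voicing: np.ndarray     # one binary V/UV decision per frequency band
    magnitudes: np.ndarray  # one spectral magnitude per harmonic


# A 200 Hz fundamental sampled at 8 kHz has floor(4000 / 200) = 20 harmonics.
params = MBEParams(200.0, np.ones(8, dtype=bool), np.ones(20))
```

The number of magnitudes varies from segment to segment with the fundamental, which is one reason the quantizer described later predicts across blocks rather than assuming a fixed dimension.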
The decoder converts the received bit stream back into individual frames. As part of this conversion, the decoder may perform deinterleaving and error control decoding to detect and correct bit errors. The decoder then reconstructs the MBE model parameters from the frame of bits and uses these parameters to synthesize a speech signal that sounds perceptually similar to the original speech signal. The decoder may synthesize separate voiced and unvoiced components, and then add the voiced and unvoiced components to produce the final speech signal.
In MBE-based systems, the encoder uses spectral magnitudes to represent the spectral envelope at each harmonic of the estimated fundamental frequency. Typically, each harmonic is labeled as voiced or unvoiced depending on whether the frequency band containing the corresponding harmonic has been declared voiced or unvoiced. The encoder then estimates a spectral magnitude for each harmonic frequency. When a harmonic frequency has been declared voiced, the encoder may use a magnitude estimator that differs from the one used when the harmonic frequency has been declared unvoiced. At the decoder, the voiced and unvoiced harmonics are identified, and separate voiced and unvoiced components are synthesized using different procedures. The unvoiced component may be synthesized using a weighted overlap-add method to filter a white noise signal. The filter is set to zero in the frequency regions declared voiced, while elsewhere it matches the spectral magnitudes of the regions declared unvoiced. The voiced component is synthesized using a bank of tuned oscillators, with one oscillator assigned to each harmonic labeled voiced. The instantaneous amplitude, frequency and phase are interpolated to match the corresponding parameters of adjacent segments.
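The voiced synthesis just described, one tuned oscillator per voiced harmonic with amplitude and frequency interpolated across the segment, can be sketched as follows. This is an illustrative reduction under my own assumptions (linear interpolation, 160-sample subframe, no phase carry-over between segments), not the patent's implementation:

```python
import numpy as np


def synthesize_voiced(mags0, mags1, f0_0, f0_1, n=160, fs=8000):
    # Bank of tuned oscillators: one sinusoid per voiced harmonic, with
    # amplitude and frequency linearly interpolated from the previous
    # segment's values (mags0, f0_0) to the current ones (mags1, f0_1)
    # so adjacent segments join smoothly.
    out = np.zeros(n)
    for k, (a0, a1) in enumerate(zip(mags0, mags1), start=1):
        amp = np.linspace(a0, a1, n)
        freq_hz = np.linspace(k * f0_0, k * f0_1, n)
        phase = 2 * np.pi * np.cumsum(freq_hz) / fs  # integrate frequency
        out += amp * np.sin(phase)
    return out
```

A real synthesizer would carry each oscillator's phase across subframe boundaries and add the separately synthesized unvoiced component.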
MBE-based speech coders include the IMBE™ speech coder and the AMBE® speech coder. The AMBE® speech coder was developed as an improvement on earlier MBE-based techniques. It includes a more robust method of estimating the excitation parameters (the fundamental frequency and the V/UV decisions) that better tracks the variations and noise present in actual speech. The AMBE® speech coder uses a filter bank, typically comprising sixteen channels, and a non-linearity to produce a set of channel outputs from which the excitation parameters can be reliably estimated. The channel outputs are combined and processed to estimate the fundamental frequency, and the outputs within each of a number (e.g., eight) of voicing bands are then processed to estimate a V/UV decision (or other voicing metric) for each voicing band.
The AMBE® speech coder may also estimate the spectral magnitudes independently of the voicing decisions. To do so, the speech coder computes a fast Fourier transform ("FFT") for each windowed speech subframe and then averages the energy over frequency regions that are multiples of the estimated fundamental frequency. The approach may further include compensation to remove artifacts introduced into the estimated spectral magnitudes by the FFT sampling grid.
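The voicing-independent magnitude estimation might look roughly like this: a windowed FFT followed by energy averaging over bands centered on harmonics of the estimated fundamental. The window choice, band edges and FFT size are my assumptions, and the patent's compensation for FFT-sampling artifacts is omitted:

```python
import numpy as np


def estimate_magnitudes(frame, f0, fs=8000, nfft=256):
    # One magnitude per harmonic k: the RMS energy of the FFT bins that
    # fall inside the band [(k - 0.5) * f0, (k + 0.5) * f0].
    win = np.hanning(len(frame))
    spec = np.abs(np.fft.rfft(frame * win, nfft)) ** 2
    bin_hz = fs / nfft
    n_harm = int((fs / 2) // f0)
    mags = []
    for k in range(1, n_harm + 1):
        lo = int(round((k - 0.5) * f0 / bin_hz))
        hi = max(lo + 1, int(round((k + 0.5) * f0 / bin_hz)))
        mags.append(np.sqrt(spec[lo:hi].mean()))
    return np.array(mags)
```

Because the band energies are computed the same way whether a band is voiced or unvoiced, no voicing decision is needed before the magnitudes are available.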
The AMBE® speech coder may also include a phase synthesis component that regenerates the phase information used in the synthesis of voiced speech without explicitly transmitting that information from the encoder to the decoder. As in the case of the IMBE™ speech coder, random phase synthesis based on the V/UV decisions may be applied. Alternatively, the decoder may apply a smoothing kernel to the reconstructed spectral magnitudes to produce phase information that may be perceptually closer to that of the original speech than phase information produced by the random phase method.
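As a toy illustration of the smoothing-kernel idea, a phase for each harmonic can be derived by convolving the reconstructed log magnitudes with a small kernel. The kernel values and the scaling below are entirely hypothetical; the patent does not specify them here:

```python
import numpy as np


def regenerate_phases(log_mags, kernel=(-0.5, 0.0, 0.5)):
    # Hypothetical smoothing-kernel phase regeneration: an antisymmetric
    # kernel run over the log magnitudes yields one phase per harmonic,
    # so no phase bits need to be transmitted. Edge values are replicated
    # so the output has one phase per input magnitude.
    padded = np.pad(np.asarray(log_mags, float), 1, mode="edge")
    return np.convolve(padded, kernel, mode="valid") * np.pi
```

Because the phases are a deterministic function of the magnitudes, the encoder and decoder stay consistent without any side information.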
The techniques noted above are described in the following documents: Flanagan, Speech Analysis, Synthesis and Perception, Springer-Verlag, 1972, pp. 378–386 (describing a frequency-based speech analysis-synthesis system); Jayant et al., Digital Coding of Waveforms, Prentice-Hall, 1984 (describing speech coding in general); U.S. Patent No. 4,885,790 (describing a sinusoidal processing method); U.S. Patent No. 5,054,072 (describing a sinusoidal coding method); Almeida et al., "Nonstationary Modeling of Voiced Speech", IEEE TASSP, Vol. ASSP-31, No. 3, June 1983, pp. 664–667 (describing harmonic modeling and an associated coder); Almeida et al., "Variable-Frequency Synthesis: An Improved Harmonic Coding Scheme", IEEE Proc. ICASSP 84, pp. 27.5.1–27.5.4 (describing a polynomial voiced synthesis method); Quatieri et al., "Speech Transformations Based on a Sinusoidal Representation", IEEE TASSP, Vol. ASSP-34, No. 6, Dec. 1986, pp. 1449–1464 (describing an analysis-synthesis technique based on a sinusoidal representation); McAulay et al., "Mid-Rate Coding Based on a Sinusoidal Representation of Speech", Proc. ICASSP 85, pp. 945–948, Tampa, FL, March 26–29, 1985 (describing a sinusoidal transform speech coder); Griffin, "Multiband Excitation Vocoder", Ph.D. Thesis, M.I.T., 1987 (describing the multiband excitation (MBE) speech model and an 8000 bps MBE speech coder); Hardwick, "A 4.8 kbps Multi-Band Excitation Speech Coder", S.M. Thesis, M.I.T., May 1988 (describing a 4800 bps multiband excitation speech coder); Telecommunications Industry Association (TIA), "APCO Project 25 Vocoder Description", Version 1.3, July 15, 1993, IS102BABA (describing the 7.2 kbps IMBE™ speech coder of the APCO Project 25 standard); U.S. Patent No. 5,081,681 (describing IMBE™ random phase synthesis); U.S. Patent No. 5,247,579 (describing a channel error mitigation method and a formant enhancement method for MBE-based speech coders); U.S. Patent No. 5,226,084 (describing quantization and error mitigation methods for MBE-based speech coders); and U.S. Patent No. 5,517,511 (describing bit prioritization and FEC error control methods for MBE-based speech coders).
Summary of the invention
The purpose of the invention is to provide a new AMBE® speech coder for use in a satellite communication system to produce high-quality speech from a bit stream transmitted across a mobile satellite channel at a low data rate. The speech coder combines low data rate, high voice quality, and robustness to background noise and channel bit errors. The invention promises to advance the state of the art in speech coding for mobile satellite communications. The new speech coder achieves high performance through a new dual-subframe spectral magnitude quantizer that jointly quantizes the estimated spectral magnitudes of two consecutive subframes. The quantizer achieves fidelity comparable to prior art systems while using fewer bits to quantize the spectral magnitude parameters. The AMBE® speech coder is described generally in U.S. Patent Application No. 08/222,119, filed April 4, 1994 and entitled "ESTIMATION OF EXCITATION PARAMETERS"; U.S. Patent Application No. 08/392,188, filed February 22, 1995 and entitled "SPECTRAL REPRESENTATION FOR MULTI-BAND EXCITATION SPEECH CODERS"; and U.S. Patent Application No. 08/392,099, filed February 22, 1995 and entitled "SYNTHESIS OF SPEECH USING REGENERATED PHASE INFORMATION", all of which are incorporated by reference.
In one general aspect, the invention features a method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite channel. A speech signal is digitized into a sequence of digital speech samples, and the digital speech samples are divided into a sequence of subframes, each spanning a nominal time interval of 22.5 milliseconds, while a set of model parameters is estimated for each subframe. The model parameters for a subframe include a set of spectral magnitude parameters that represent the spectral information of the subframe. Two consecutive subframes from the sequence are combined into a block, and the spectral magnitude parameters of the two subframes are jointly quantized within the block. The joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters of the previous block, computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters of the two subframes within the block, and quantizing the combined residual parameters into a set of encoded spectral bits using vector quantizers. Redundant error control bits are then added to the encoded spectral bits of each block to protect the encoded spectral bits within the block against bit errors. Finally, the added redundant error control bits and encoded spectral bits of two consecutive blocks are combined into a 90 millisecond frame of bits for transmission across the satellite channel.
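The predict-then-residual step of the joint quantization can be sketched as below. The prediction gain (0.65) and the assumption of equal harmonic counts across blocks are mine; the coder interpolates the previous block's magnitudes when the counts differ:

```python
import numpy as np


def block_residuals(mags_a, mags_b, prev_quantized, gain=0.65):
    # Predicted magnitudes for both subframes of the current block are
    # derived from the quantized magnitudes of the previous block, scaled
    # by a gain below one so that prediction errors decay over time
    # instead of propagating indefinitely.
    # (gain=0.65 is an assumed value, not taken from the patent; equal
    # harmonic counts are assumed here for simplicity.)
    pred = gain * np.asarray(prev_quantized, float)
    return np.asarray(mags_a, float) - pred, np.asarray(mags_b, float) - pred
```

Because the decoder forms the same predictor from its own reconstructed magnitudes, only the residuals need to be quantized and transmitted.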
Embodiments of the invention may include one or more of the following features. Combining the residual parameters of the two subframes may include dividing the residual parameters of each subframe within the block into frequency blocks, performing a linear transformation on the residual parameters within each frequency block to produce a set of transformed residual coefficients for each subframe, grouping a minority of the transformed residual coefficients from all of the frequency blocks into a PRBA vector, and grouping the remaining transformed residual coefficients of each frequency block into a HOC vector for that frequency block. The PRBA vector of each subframe may be transformed to produce a PRBA transform vector, and the PRBA transform vectors may be combined by computing the vector sum and vector difference of the PRBA transform vectors of the two subframes within the block. Similarly, the two HOC vectors for a frequency block, one from each of the two subframes, may be combined by computing the vector sum and vector difference for that frequency block.
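A simplified sketch of the PRBA/HOC decomposition and the sum/difference combination described above. It keeps only the lowest DCT coefficient of each frequency block in the PRBA vector and uses equal-sized blocks, both simplifications of the scheme as described:

```python
import numpy as np


def dct_ii(x):
    # Plain (unnormalized) DCT-II, sufficient for illustration.
    n = len(x)
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ x


def prba_hoc(residuals, n_blocks=4):
    # Divide the residuals into contiguous frequency blocks, DCT each
    # block, gather the first coefficient of every block into the PRBA
    # vector, and keep the remaining coefficients as that block's HOCs.
    blocks = np.array_split(np.asarray(residuals, float), n_blocks)
    coeffs = [dct_ii(b) for b in blocks]
    prba = np.array([c[0] for c in coeffs])
    hocs = [c[1:] for c in coeffs]
    return prba, hocs


def sum_diff(a, b):
    # Combine the two subframes' vectors as a scaled sum and difference;
    # the original vectors are recoverable as (s + d) and (s - d).
    return 0.5 * (a + b), 0.5 * (a - b)
```

The sum vector captures what the two subframes share and the difference vector what changed between them, which is why they can be quantized with different bit allocations.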
The spectral magnitude parameters may represent log spectral magnitudes estimated under the multiband excitation ("MBE") speech model. The spectral magnitudes may be estimated from a computed spectrum independently of the voicing state. The predicted spectral magnitude parameters may be formed by applying a gain of less than unity to a linear interpolation of the quantized spectral magnitudes of the last subframe of the previous block.
The error control bits for each block may be generated using block codes that include Golay codes and Hamming codes. For example, the codes may include one [24,12] extended Golay code, three [23,12] Golay codes, and two [15,11] Hamming codes.
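Of the codes named above, the [15,11] Hamming code is simple enough to sketch completely. The construction below uses the classic parity-bit placement at power-of-two positions; it is a generic Hamming encoder, not the patent's specific bit mapping:

```python
def hamming_15_11_encode(data):
    # [15,11] Hamming code: parity bits sit at positions 1, 2, 4, 8
    # (1-indexed); parity bit p covers every position whose index has
    # bit p set, so any single-bit error yields its position as the
    # syndrome and can be corrected.
    assert len(data) == 11
    code = [0] * 16  # index 0 unused to keep 1-indexed positions
    it = iter(data)
    for pos in range(1, 16):
        if pos not in (1, 2, 4, 8):
            code[pos] = next(it)
    for p in (1, 2, 4, 8):
        for pos in range(1, 16):
            if (pos & p) and pos not in (1, 2, 4, 8):
                code[p] ^= code[pos]
    return code[1:]


def syndrome(codeword):
    # XOR of the (1-indexed) positions holding a 1; zero means no
    # detectable error, otherwise the value is the flipped position.
    s = 0
    for pos, bit in enumerate(codeword, start=1):
        if bit:
            s ^= pos
    return s
```

The Golay codes play the same role for the more important bits but correct up to three errors per codeword.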
The transformed residual coefficients may be computed for each frequency block by applying a discrete cosine transform ("DCT") followed by a linear 2×2 transformation of the two lowest-order DCT coefficients. Four frequency blocks may be used for this computation, with the length of each frequency block approximately proportional to the number of spectral magnitude parameters within the subframe.
The vector quantizers may include a three-way split vector quantizer using 8 bits plus 6 bits plus 7 bits for the PRBA vector, and a two-way split vector quantizer using 8 bits plus 6 bits for the PRBA vector difference. The frame of bits may include additional bits representing the error in the transformed residual coefficients introduced by the vector quantizers.
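A split vector quantizer of the kind described — 8 + 6 + 7 bits means three codebooks of 256, 64 and 128 entries searched independently over three sub-vectors — reduces to a nearest-neighbor search per sub-vector. The codebook contents and sub-vector dimensions below are arbitrary placeholders:

```python
import numpy as np


def split_vq(vec, codebooks):
    # Split vector quantization: partition the vector into sub-vectors,
    # one per codebook, and search each codebook independently for the
    # entry with minimum squared error. Searching k small codebooks is
    # exponentially cheaper than searching one codebook over the full
    # vector at the same total bit count.
    indices, start = [], 0
    for cb in codebooks:
        dim = cb.shape[1]
        sub = vec[start:start + dim]
        err = ((cb - sub) ** 2).sum(axis=1)
        indices.append(int(err.argmin()))
        start += dim
    return indices
```

The transmitted bits are just the concatenated codebook indices; the decoder recovers the sub-vectors by table lookup.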
In another general aspect, the invention features a system for encoding speech into a 90 millisecond frame of bits for transmission across a satellite channel. The system includes: a digitizer that converts a speech signal into a sequence of digital speech samples; a subframe generator that divides the digital speech samples into a sequence of subframes, each containing multiple digital speech samples; a model parameter estimator that estimates, for each subframe, a set of model parameters including a set of spectral magnitude parameters; a combiner that combines two consecutive subframes of the sequence into a block; and a dual-subframe spectral magnitude quantizer that jointly quantizes the parameters of the two subframes within the block. The joint quantization includes forming predicted spectral magnitude parameters from the quantized spectral magnitude parameters of the previous block, computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters of the two subframes within the block, and quantizing the combined residual parameters into a set of encoded spectral bits using vector quantizers. The system also includes an error-code encoder that adds redundant error control bits to the encoded spectral bits of each block to protect at least some of the encoded spectral bits within the block against bit errors, and a combiner that combines the added redundant error control bits and encoded spectral bits of two consecutive blocks into a 90 millisecond frame of bits for transmission across the satellite channel.
Yet another general aspect of the invention features a method of decoding speech from a 90 millisecond frame of bits encoded as described above. The decoding includes dividing the frame of bits into two blocks of bits, each block representing two subframes of speech. Error control decoding is applied to each block using the redundant error control bits within the block to produce error-decoded bits that are at least partially protected against bit errors. The error-decoded bits are used to jointly reconstruct the spectral magnitude parameters of the two subframes within each block. The joint reconstruction includes reconstructing the combined residual parameters using a set of vector quantizer codebooks, from which separate residual parameters are computed for each of the two subframes; forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters of the previous block; and adding the separate residual parameters to the predicted spectral magnitude parameters to produce the reconstructed spectral magnitude parameters of each subframe within the block. Digital speech samples are then synthesized for each subframe using its reconstructed spectral magnitude parameters.
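The decoder-side reconstruction mirrors the encoder's prediction step: form the same predictor from the previous block's reconstructed magnitudes and add each subframe's decoded residual. A minimal sketch, with an assumed prediction gain of 0.65 and equal harmonic counts:

```python
import numpy as np


def reconstruct_block(res_a, res_b, prev_reconstructed, gain=0.65):
    # Decoder mirror of the encoder's predictor: the same gain-scaled
    # copy of the previous block's reconstructed magnitudes is formed,
    # and each subframe's decoded residual is added back to it.
    # (gain=0.65 is an assumed value, matching the encoder sketch rather
    # than any figure stated in the patent.)
    pred = gain * np.asarray(prev_reconstructed, float)
    return pred + np.asarray(res_a, float), pred + np.asarray(res_b, float)
```

As long as encoder and decoder use identical predictor rules and codebooks, their states stay synchronized without any extra transmitted bits.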
Another aspect of the invention features a decoder for decoding speech from a 90 millisecond frame of bits received across a satellite channel. The decoder includes a divider that divides the frame of bits into two blocks of bits, each block representing two subframes of speech. An error control decoder performs error decoding on each block using the redundant error control bits contained in the block to produce error-decoded bits that are at least partially protected against bit errors. A dual-subframe spectral magnitude reconstructor jointly reconstructs the spectral magnitude parameters of the two subframes within a block, where the joint reconstruction includes reconstructing the combined residual parameters using a set of vector quantizer codebooks, from which separate residual parameters are computed for the two subframes; forming predicted spectral magnitude parameters from the reconstructed spectral magnitude parameters of the previous block; and adding the separate residual parameters to the predicted spectral magnitude parameters to produce the reconstructed spectral magnitude parameters of each subframe within the block. A synthesizer synthesizes digital speech samples for each subframe using the reconstructed spectral magnitude parameters of the subframe.
Description of drawings
Other features and advantages of the invention will be apparent from the following description, including the drawings, and from the claims.
Fig. 1 is a simplified block diagram of a satellite communication system.
Fig. 2 is a block diagram of a communication link of the system of Fig. 1.
Figs. 3 and 4 are block diagrams of the encoder and the decoder of the system of Fig. 1.
Fig. 5 is a general block diagram of components of the encoder of Fig. 3.
Fig. 6 is a flow chart of the voicing and tone detection function of the encoder.
Fig. 7 is a block diagram of the dual-subframe magnitude quantizer of the encoder of Fig. 5.
Fig. 8 is a block diagram of the mean vector quantizer of the magnitude quantizer of Fig. 7.
Embodiment
Embodiments of the invention are described in the context of a new AMBE® speech coder, or vocoder, for use in the IRIDIUM® mobile satellite communication system 30, shown in Fig. 1. IRIDIUM® is a global mobile satellite communication system composed of 66 low-earth-orbit satellites 40. IRIDIUM® provides voice communication through handheld or mobile user terminals 45 (such as mobile phones).
Referring to Fig. 2, a user terminal at the transmitting end of the link samples speech 50 at 8 kHz and digitizes it using a microphone 60 and an analog-to-digital (A/D) converter 70. The digitized speech signal is processed by the speech encoder 80 described below. A transmitter 90 then sends the signal over the communication link. At the other end of the link, a receiver 100 receives the signal and passes it to a decoder 110. The decoder converts the signal into a synthesized digital speech signal. A digital-to-analog (D/A) converter 120 then converts the synthesized digital speech signal into an analog speech signal, which is converted into audible speech 140 by a speaker 130.
The communication link carries 90 ms frames using time division multiple access (TDMA) burst transmission. Two different voice data rates are supported: a half-rate mode of 3467 bps (312 bits per 90 ms frame) and a full-rate mode of 6933 bps (624 bits per 90 ms frame). The bits of each frame are divided between speech coding and forward error correction ("FEC") coding to reduce the probability of the bit errors that often occur on a satellite channel.
Referring to Fig. 3, the speech coder of each terminal comprises an encoder 80 and a decoder 110. The encoder includes three main functional blocks: speech analysis 200, parameter quantization 210, and error correction coding 220. Similarly, as shown in Fig. 4, the decoder is divided into error correction decoding 230, parameter reconstruction 240 (i.e., inverse quantization), and speech synthesis 250.
The speech coder operates at two different data rates: a full rate of 4933 bps and a half rate of 2289 bps. These rates represent the speech, or source, bits and exclude the FEC bits. The FEC bits raise the total vocoder data rates of the full-rate and half-rate modes to 6933 bps and 3467 bps, respectively, as noted above. The system uses a speech frame size of 90 ms, with each frame divided into four 22.5 ms subframes. Speech analysis and synthesis are performed subframe by subframe, while quantization and FEC coding are performed on 45 ms quantization blocks, each comprising two subframes. Using 45 ms blocks for quantization and FEC coding results in 103 speech bits plus 53 FEC bits per block in the half-rate system, and 222 speech bits plus 90 FEC bits per block in the full-rate system. The split between speech bits and FEC bits can, however, be adjusted over a range with only a mild effect on performance. In the half-rate system, the speech bits can be adjusted over a range of 80 to 120, with the FEC bits adjusted correspondingly over a range of 76 to 36. Similarly, in the full-rate system the speech bits can be adjusted over a range of 180 to 260 as the FEC bits vary from 132 to 52. The speech bits and FEC bits of two quantization blocks combine to form a 90 ms frame.
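The rate arithmetic above can be checked directly. A minimal sketch (the frame duration and per-block bit counts are taken from the text; the function name is ours):

```python
# Each 90 ms frame holds two 45 ms quantization blocks.
FRAME_SEC = 0.090

def bits_per_second(bits_per_block: int) -> float:
    """Convert bits per 45 ms block to bits per second (two blocks per 90 ms frame)."""
    return 2 * bits_per_block / FRAME_SEC

# Half rate: 103 source bits + 53 FEC bits per block.
half_source = bits_per_second(103)       # source rate, ~2289 bps
half_total = bits_per_second(103 + 53)   # total rate, ~3467 bps

# Full rate: 222 source bits + 90 FEC bits per block.
full_source = bits_per_second(222)       # source rate, ~4933 bps
full_total = bits_per_second(222 + 90)   # total rate, ~6933 bps

print(round(half_source), round(half_total), round(full_source), round(full_total))
```

Rounding to whole bits per second reproduces the 2289/3467 and 4933/6933 bps figures quoted in the text.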
The encoder 80 first performs speech analysis 200. The first step of speech analysis is filter bank processing of each frame, followed by estimation of the MBE model parameters for the frame. This step includes dividing the input signal into overlapping 22.5 ms subframes using an analysis window. For each 22.5 ms subframe, an MBE subframe parameter estimator estimates a set of model parameters comprising a fundamental frequency (the inverse of the pitch period), a set of voiced/unvoiced (V/UV) decisions, and a set of spectral amplitudes. These parameters are produced using AMBE® techniques. The AMBE® speech coder is described generally in the following documents: U.S. Patent Application No. 08/222,119, filed April 4, 1994, entitled "Estimation of Excitation Parameters"; U.S. Patent Application No. 08/392,188, filed February 22, 1995, entitled "Spectral Representation for Multi-Band Excitation Speech Coders"; and U.S. Patent Application No. 08/392,099, filed February 22, 1995, entitled "Synthesis of Speech Using Regenerated Phase Information", all of which are incorporated by reference.
In addition, the full-rate vocoder includes a timeslice ID to help identify TDMA packets that arrive out of order at the receiver; the receiver uses this information to restore the packets to the correct order before decoding. The speech parameters, which together fully describe the speech signal, are passed to the quantizer 210 of the encoder for further processing.
Referring to Fig. 5, once subframe model parameters 300 and 305 have been estimated for two consecutive 22.5 ms subframes of a frame, a fundamental frequency and voicing quantizer 310 encodes the fundamental frequencies estimated for the two subframes into a sequence of fundamental frequency bits, and encodes the voiced/unvoiced (V/UV) decisions (or other voicing metrics) into a sequence of voicing bits.
In the described embodiment, the two fundamental frequencies are quantized and encoded with ten bits. Typically, the fundamental frequency estimates are limited to a range of approximately [0.008, 0.05], where 1.0 is the Nyquist frequency (8 kHz), and the fundamental frequency quantizer is limited to a similar range. Since the reciprocal of the quantized fundamental frequency of a given subframe is generally proportional to L, where L is the number of spectral amplitudes for that subframe (L = bandwidth / fundamental frequency), the most significant bits (MSBs) of the fundamental frequency are generally sensitive to bit errors and are therefore given high priority in the FEC coding.
The described embodiment encodes the voicing information for the two subframes with eight bits at half rate and with sixteen bits at full rate. The voicing quantizer uses the allocated bits to encode a binary voicing state (e.g., 1 = voiced, 0 = unvoiced) in each of eight selected voicing bands, where the voicing state is determined from the voicing metrics estimated during speech analysis. These voicing bits have moderate sensitivity to bit errors and are therefore assigned intermediate priority in the FEC coding.
In combiner 330, the fundamental frequency bits and voicing bits are combined with the quantized spectral amplitude bits from the dual-subframe amplitude quantizer 320, and forward error correction (FEC) coding is performed on the resulting 45 ms block. A 90 ms frame is then formed in combiner 340, which combines two consecutive 45 ms quantization blocks into a single frame 350.
The encoder contains an adaptive voice activity detector (VAD) that uses procedure 600 to classify each 22.5 ms subframe as voice, background noise, or tone. As shown in Fig. 6, the VAD algorithm distinguishes voice subframes from background noise using local information (step 605). If both subframes of a 45 ms block are classified as noise (step 610), the encoder quantizes the current background noise into a special noise block (step 615). When both 45 ms blocks forming a 90 ms frame are classified as noise, the system may elect not to transmit the frame to the decoder, and the decoder fills in the missing frame using previously received noise data. This voice-activated transmission technique improves system performance by transmitting only the necessary voice frames and occasional noise frames.
The encoder also supports detection and transmission of DTMF tones, call progress tones (e.g., dial, busy, and ringback), and single tones. The encoder checks each 22.5 ms subframe to determine whether the current subframe contains a valid tone signal. If a tone is detected in either of the two subframes of a 45 ms block (step 620), the encoder quantizes the detected tone parameters (amplitude and index) into a special tone block as shown in Table 1 (step 625), and performs FEC coding before transmitting the block to the decoder and proceeding with subsequent analysis. If no tone is detected, the block is quantized as a standard voice block, as described below (step 630).
Table 1: Bit representation in the tone block
Half rate | Full rate | ||
Bit positions b[·] | Value | Bit positions b[·] | Value |
0-3 4-9 10-12 13-14 15-19 20-27 28-35 36-43 . . | The single-tone index that the single-tone index that the single-tone index that 5 LSB of 3 | 0-7 8-15 16-18 19-20 21-25 26-33 34-41 42-49 . . | The single-tone index that the single-tone index that the single-tone index that 5 LSB of 3 |
84-91 92-99 100-102 | The single- | 194-201 202-209 210-221 | The single- |
The vocoder uses the VAD and tone detection to classify each 45 ms block as one of the following: a standard voice block, a special tone block, or a special noise block. When a 45 ms block is not classified as a special tone block, the voice and noise information of the subframe pair forming the block (as determined by the VAD) is quantized. The available bits (156 at half rate, 312 at full rate) are allocated among the model parameters and FEC coding as shown in Table 2, where the timeslice ID is a special parameter used by the full-rate receiver to determine the proper order of frames that are received out of order. After the bits used for the excitation parameters (fundamental frequency and voicing metrics), FEC coding, and timeslice ID are accounted for, 85 bits remain for the spectral amplitudes in the half-rate system and 183 bits in the full-rate system. To support the full-rate system with minimal additional complexity, the full-rate amplitude quantizer uses the same quantizer as the half-rate system, plus a scalar-quantized error quantizer that encodes the difference between the unquantized spectral amplitudes and the half-rate quantizer output.
Table 2: Bit allocation for a 45 ms voice or noise block
Vocoder parameter | Bits (half rate) | Bits (full rate) |
Fundamental frequency | 10 | 16 |
Voicing metrics | 8 | 16 |
Gain | 5+5=10 | 5+5+2×2=14 |
PRBA vector | 8+6+7+8+6=35 | 8+6+7+8+6+2×12=59 |
HOC vectors | 4×(7+3)=40 | 4×(7+3)+2×(9+9+9+8)=110 |
Timeslice ID | 0 | 7 |
FEC | 12+3×11+2×4=53 | 2×12+6×11=90 |
Total | 156 | 312 |
A dual-subframe quantizer is used to quantize the spectral amplitudes. The quantizer combines logarithmic companding, spectral prediction, discrete cosine transforms (DCTs), and vector and scalar quantization methods. Measured by fidelity per bit, it achieves high efficiency with moderate complexity. The quantizer can be viewed as a two-dimensional predictive transform coder.
Fig. 7 shows the dual-subframe amplitude quantizer, which receives inputs 1a and 1b from the MBE parameter estimators of two consecutive 22.5 ms subframes. Input 1a represents the spectral amplitudes of the odd-numbered 22.5 ms subframe, which is given the label 1; the number of amplitudes for subframe 1 is denoted L1. Input 1b represents the spectral amplitudes of the even-numbered 22.5 ms subframe, which is given the label 0; the number of amplitudes for subframe 0 is denoted L0.
Input 1a passes through a logarithmic compander 2a, which takes the base-2 logarithm of each of the L1 amplitudes contained in input 1a, producing another vector of L1 elements:
y[i] = log2(x[i]), i = 1, 2, …, L1
where y[i] denotes signal 3a. Compander 2b likewise takes the base-2 logarithm of each of the L0 amplitudes contained in input 1b, producing in a similar manner a vector of L0 elements:
y[i] = log2(x[i]), i = 1, 2, …, L0
where y[i] denotes signal 3b.
After companders 2a and 2b, mean calculators 4a and 4b compute the means 5a and 5b of each subframe. The mean, or gain value, represents the average speech level of the subframe. Within each frame, the two gain values 5a and 5b are determined by computing the mean of the log spectral amplitudes of each of the two subframes and adding an offset that depends on the number of harmonics in the subframe.
The mean of the log spectral amplitudes 3a is computed as
ȳ = (1/L1) × (y[1] + y[2] + … + y[L1])
where the output ȳ represents mean signal 5a. The mean 4b of the log spectral amplitudes 3b is computed similarly as
ȳ = (1/L0) × (y[1] + y[2] + … + y[L0])
where the output ȳ represents mean signal 5b.
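The companding and mean steps described above can be sketched as follows (a minimal illustration with toy amplitudes; the harmonic-count-dependent offset mentioned above is omitted, and the names are ours):

```python
import math

def compand(amplitudes):
    """Log-compand a subframe's spectral amplitudes: y[i] = log2(x[i])."""
    return [math.log2(x) for x in amplitudes]

def mean_log_amplitude(log_amps):
    """Mean of the log spectral amplitudes (signal 5a/5b, before any offset)."""
    return sum(log_amps) / len(log_amps)

subframe = [1.0, 2.0, 4.0, 8.0]   # toy amplitudes, powers of two for exactness
y = compand(subframe)             # [0.0, 1.0, 2.0, 3.0]
gain = mean_log_amplitude(y)      # 1.5
```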
In the mean vector quantizer (Fig. 8), each candidate vector (x1(n), x2(n)) in the Appendix A gain codebook is compared with the gain pair (z1, z2) using the squared distance
e(n) = [x1(n) − z1]² + [x2(n) − z2]², n = 0, 1, …, 31
and the vector in Appendix A that minimizes the squared distance e is selected to produce the last five bits of the output of block 6. These five bits from vector quantizer 840 are combined with the five bits of a five-bit uniform scalar quantizer by combiner 850. Combiner 850 outputs ten bits, which constitute the output of block 6; this output serves as an input to combiner 22 of Fig. 7, where it is labeled 21c.
Returning to the main signal path of the quantizer, the log-companded input signals 3a and 3b pass through combiners 7a and 7b, which subtract the predictor values 33a and 33b supplied by the feedback portion of the quantizer, generating a D1(1) signal 8a and a D1(0) signal 8b.
Next, signals 8a and 8b are each divided into four frequency blocks using the look-up table of Appendix O. For a given total number of amplitudes in the subframe being divided, the table gives the number of amplitudes assigned to each of the four frequency blocks. Since the total number of amplitudes in any subframe varies between a minimum of 9 and a maximum of 56, the table covers the same range. The lengths of the four frequency blocks are kept in the approximate ratio 0.2 : 0.225 : 0.275 : 0.3 to one another, while their sum equals the number of spectral amplitudes of the current subframe.
Each frequency block is then passed through a discrete cosine transform (DCT) 9a or 9b to efficiently decorrelate the data within the block. The first two DCT coefficients 10a or 10b of each frequency block are separated out and transformed by a 2×2 rotation operation 12a or 12b to produce transformed coefficients 13a or 13b. The transformed coefficients 13a and 13b are then passed through an 8-point DCT 14a or 14b to produce a PRBA vector 15a or 15b. The remaining DCT coefficients 11a and 11b of each frequency block form a set of four variable-length high-order coefficient (HOC) vectors.
As described above, after the frequency division each block is processed by a discrete cosine transformer 9a or 9b. The DCT block takes the number of input elements W and the values x(0), x(1), …, x(W−1) of each block and computes, up to a normalization constant,
y(k) = Σ_{i=0}^{W−1} x(i) cos(πk(2i+1)/(2W)), k = 0, 1, …, W−1.
The values of y(0) and y(1) (which form 10a) are separated from the remaining outputs y(2) through y(W−1) (which form 11a).
A 2×2 rotation operation 12a or 12b then converts the two-element input vector 10a or 10b, (x(0), x(1)), into a two-element output vector 13a or 13b, (y(0), y(1)), according to:
y(0) = x(0) + sqrt(2) × x(1), and
y(1) = x(0) − sqrt(2) × x(1).
Then an eight-point DCT is applied to the four two-element vectors coming from 13a or 13b, taken together as (x(0), x(1), …, x(7)). The output y(k) is the eight-element PRBA vector 15a or 15b.
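The rotation and eight-point DCT that produce a PRBA vector can be sketched as follows (the DCT normalization is our assumption, and the input values are toy numbers standing in for the first two DCT coefficients of each frequency block):

```python
import math

def rotate_2x2(x0: float, x1: float):
    """2x2 rotation 12a/12b applied to the first two DCT coefficients of a block."""
    return x0 + math.sqrt(2.0) * x1, x0 - math.sqrt(2.0) * x1

def dct(x):
    """Unnormalized DCT-II; the patent's exact normalization may differ."""
    W = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * W))
                for i in range(W))
            for k in range(W)]

# One rotated pair per frequency block: four pairs -> eight points.
pairs = [rotate_2x2(1.0, 0.5), rotate_2x2(0.8, -0.2),
         rotate_2x2(0.3, 0.1), rotate_2x2(-0.4, 0.0)]
points = [v for pair in pairs for v in pair]
prba = dct(points)          # eight-element PRBA vector
assert len(prba) == 8
```

Note that the k = 0 output of this DCT form is simply the sum of the eight inputs.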
Once the prediction and DCT of the individual subframe amplitudes are complete, the two PRBA vectors are quantized. First, a sum/difference transform 16 combines the two eight-element vectors into a sum vector and a difference vector. Specifically, the sum/difference operation 16 is performed on the two eight-element PRBA vectors 15a and 15b, producing one 16-element vector 17, where 15a and 15b are denoted x and y, respectively, and 17 is denoted z:
z(i) = x(i) + y(i), and
z(8+i) = x(i) − y(i), i = 0, 1, …, 7.
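A minimal sketch of the sum/difference transform 16 (function name is ours):

```python
def sum_difference(x, y):
    """Sum/difference transform 16: maps two 8-element PRBA vectors into one
    16-element vector z with z[i] = x[i] + y[i] and z[8+i] = x[i] - y[i]."""
    assert len(x) == len(y) == 8
    return [x[i] + y[i] for i in range(8)] + [x[i] - y[i] for i in range(8)]

z = sum_difference([1.0] * 8, [0.5] * 8)
# z[0:8] holds the sums (1.5 each), z[8:16] the differences (0.5 each)
```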
These vectors are then quantized with a split vector quantizer 20a. Elements 1-2, 3-4, and 5-7 of the sum vector are quantized with 8, 6, and 7 bits, respectively, and elements 1-3 and 4-7 of the difference vector with 8 and 6 bits. Element 0 of each vector is ignored because it is functionally equivalent to the quantized gain value.
The PRBA split vector quantizer 20a quantizes the PRBA sum and difference vector 17 to produce a quantized vector 21a. First, the two elements z(1) and z(2) form a two-dimensional vector to be quantized. Each candidate two-dimensional vector (composed of x1(n) and x2(n) in the Appendix B table, "PRBA sum [1,2] VQ codebook (8)") is compared with this vector using the squared distance
e(n) = [x1(n) − z(1)]² + [x2(n) − z(2)]², n = 0, 1, …, 255.
The vector in Appendix B that minimizes the squared distance e is selected to produce the first 8 bits of the output vector 21a.
Next, the two elements z(3) and z(4) form a two-dimensional vector to be quantized. Each candidate two-dimensional vector (composed of x1(n) and x2(n) in the Appendix C table, "PRBA sum [3,4] VQ codebook (6)") is compared with this vector using the squared distance
e(n) = [x1(n) − z(3)]² + [x2(n) − z(4)]², n = 0, 1, …, 63.
The vector in Appendix C that minimizes the squared distance e is selected to produce the next 6 bits of the output vector 21a.
Next, the three elements z(5), z(6), and z(7) form a three-dimensional vector to be quantized. Each candidate three-dimensional vector (composed of x1(n), x2(n), and x3(n) in the Appendix D table, "PRBA sum [5,7] VQ codebook (7)") is compared with this vector using the squared distance
e(n) = [x1(n) − z(5)]² + [x2(n) − z(6)]² + [x3(n) − z(7)]², n = 0, 1, …, 127.
The vector in Appendix D that minimizes the squared distance e is selected to produce the next 7 bits of the output vector 21a.
Next, the three elements z(9), z(10), and z(11) form a three-dimensional vector to be quantized. Each candidate three-dimensional vector (composed of x1(n), x2(n), and x3(n) in the Appendix E table, "PRBA difference [1,3] VQ codebook (8)") is compared with this vector using the squared distance
e(n) = [x1(n) − z(9)]² + [x2(n) − z(10)]² + [x3(n) − z(11)]², n = 0, 1, …, 255.
The vector in Appendix E that minimizes the squared distance e is selected to produce the next 8 bits of the output vector 21a.
Finally, the four elements z(12), z(13), z(14), and z(15) form a four-dimensional vector to be quantized. Each candidate four-dimensional vector (composed of x1(n), x2(n), x3(n), and x4(n) in the Appendix F table, "PRBA difference [4,7] VQ codebook (6)") is compared with this vector using the squared distance
e(n) = [x1(n) − z(12)]² + [x2(n) − z(13)]² + [x3(n) − z(14)]² + [x4(n) − z(15)]², n = 0, 1, …, 63.
The vector in Appendix F that minimizes the squared distance e is selected to produce the last 6 bits of the output vector 21a.
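Each of the codebook searches above is the same nearest-neighbor search under squared distance. A generic sketch, using a few two-element entries taken from the Appendix C table as the candidate set:

```python
def vq_search(codebook, target):
    """Return the index n of the codebook entry minimizing the squared
    distance e(n) = sum_i (x_i(n) - target_i)**2."""
    def sq_dist(entry):
        return sum((c - t) ** 2 for c, t in zip(entry, target))
    return min(range(len(codebook)), key=lambda n: sq_dist(codebook[n]))

# Three entries from the Appendix C value table (n = 3, 36, 51).
codebook = [(-424.0, -584.0), (135.0, -199.0), (478.0, 153.0)]
idx = vq_search(codebook, (120.0, -210.0))
print(idx)  # 1: the second entry is closest to the target
```

The index found this way is what gets written into the output bit field (8, 6, or 7 bits depending on the codebook size).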
Quantization of the HOC vectors is similar to that of the PRBA vectors. First, for each of the four frequency blocks, the HOC vectors of the corresponding two subframes are combined by a sum/difference transform 18, which produces one sum/difference vector 19 per frequency block.
The sum/difference operation is performed separately for each frequency block on the two HOC vectors 11a and 11b, producing a vector z_m:
J = max(B_m0, B_m1) − 2
K = min(B_m0, B_m1) − 2
z_m(i) = 0.5[x(i) + y(i)], 1 ≤ i ≤ K
z_m(i) = y(i) if L0 > L1, otherwise z_m(i) = x(i), K < i ≤ J
z_m(J+i) = 0.5[x(i) − y(i)], 0 ≤ i ≤ K
where B_m0 and B_m1 are the lengths of the m-th frequency block of subframe 0 and subframe 1, respectively, as listed in Appendix O, and z_m is determined for each frequency block (i.e., m equals 0 to 3). The J+K elements of the sum and difference vectors z_m of all four frequency blocks (m equals 0 to 3) are combined to form the HOC sum/difference vector 19.
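The per-block sum/difference combination can be sketched as follows. The index handling here is a simplified 0-based reading of the formulas above (the patent's exact ranges differ slightly), and the example vectors are toy values:

```python
def hoc_sum_difference(x, y):
    """Combine the HOC vectors x (subframe 1) and y (subframe 0) of one
    frequency block into a sum/difference vector z_m.  Simplified 0-based
    sketch: the paired leading elements are averaged, the unpaired tail of
    the longer vector is passed through, and the paired elements also
    contribute a scaled difference part."""
    K = min(len(x), len(y))          # paired length
    J = max(len(x), len(y))          # longer length
    longer = x if len(x) >= len(y) else y
    z = [0.5 * (x[i] + y[i]) for i in range(K)]   # averaged (sum) part
    z += [longer[i] for i in range(K, J)]         # unpaired tail
    z += [0.5 * (x[i] - y[i]) for i in range(K)]  # difference part
    return z                                       # J + K elements total

z_m = hoc_sum_difference([4.0, 2.0, 2.0], [2.0, 0.0])
print(z_m)  # [3.0, 1.0, 2.0, 1.0, 1.0]
```

As in the text, the output has J+K elements: the sum part, the unpaired coefficients of the longer vector, then the difference part.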
Because the HOC vectors vary in size, the sum and difference vectors also vary and may have different lengths. The vector quantization step handles this by ignoring all elements of each vector beyond the first four. The remaining elements are vector quantized, using seven bits for the sum vector and three bits for the difference vector. After vector quantization, the inverse of the sum/difference transform is applied to the quantized sum and difference vectors. Since this process is applied to all four frequency blocks, a total of 40 bits (4 × (7+3)) is used to vector quantize the HOC vectors of the two subframes.
The HOC split vector quantizer 20b quantizes the HOC sum and difference vectors 19 separately for each of the four frequency blocks. First, the vector z_m representing the m-th frequency block is compared with each candidate vector of the corresponding sum and difference codebooks in the appendices. Each codebook is identified by its frequency block and by whether it is a sum codebook or a difference codebook; thus Appendix G, "HOC sum 0 VQ codebook (7)", is the sum codebook for frequency block 0. The other codebooks are Appendix H ("HOC difference 0 VQ codebook (3)"), Appendix I ("HOC sum 1 VQ codebook (7)"), Appendix J ("HOC difference 1 VQ codebook (3)"), Appendix K ("HOC sum 2 VQ codebook (7)"), Appendix L ("HOC difference 2 VQ codebook (3)"), Appendix M ("HOC sum 3 VQ codebook (7)"), and Appendix N ("HOC difference 3 VQ codebook (3)"). The vector z_m of each frequency block is compared with each candidate vector of the corresponding sum codebook using a squared distance, where each candidate sum vector (composed of x1(n), x2(n), x3(n), and x4(n)) is evaluated as
e1(n) = [x1(n) − z_m(1)]² + [x2(n) − z_m(2)]² + [x3(n) − z_m(3)]² + [x4(n) − z_m(4)]²
and each candidate difference vector (composed of x1(m), x2(m), x3(m), and x4(m)) is evaluated as
e2(m) = [x1(m) − z_m(J)]² + [x2(m) − z_m(J+1)]² + [x3(m) − z_m(J+2)]² + [x4(m) − z_m(J+3)]²
where J and K are computed as described above. For each frequency block, the index n of the candidate sum vector in the corresponding sum codebook that minimizes the squared distance e1(n) is recorded with seven bits, and the index m of the candidate difference vector that minimizes the squared distance e2(m) is recorded with three bits. Combining these ten bits over all four frequency blocks forms the 40-bit HOC output 21b.
Block 22 multiplexes the quantized PRBA bits 21a, the quantized HOC bits 21b, and the quantized gain bits 21c to generate output bits 23. These bits 23 are the final output of the dual-subframe amplitude quantizer and are also supplied to the feedback portion of the quantizer.
The feedback portion of the dual-subframe quantizer, denoted block 24, performs the inverse of the functions performed within the large block labeled Q in the figure. Block 24 produces estimates 25a and 25b of D1(1) and D1(0) (signals 8a and 8b) from the quantized bits 23. If the large block labeled Q introduced no quantization error, these estimates would equal D1(1) and D1(0).
Predictor block 30 then interpolates and resamples the estimated amplitudes to generate L1 estimated amplitudes, after which the mean of the estimated amplitudes is subtracted from each of the L1 estimated amplitudes to generate the P1(1) output 31a. The input estimated amplitudes are likewise interpolated and resampled to produce L0 estimated amplitudes, and the mean of the estimated amplitudes is subtracted from each of the L0 estimated amplitudes to generate the P1(0) output 31b.
Once the encoder has quantized the model parameters for each 45 ms block, the quantized bits are prioritized, FEC coded, and interleaved before transmission. First, the quantized bits are prioritized in order of their estimated sensitivity to bit errors. Experiments show that the PRBA and HOC sum vectors are generally more sensitive to bit errors than the corresponding difference vectors, and that the PRBA sum vector is generally more sensitive than the HOC sum vectors. These relative sensitivities are exploited in a prioritization scheme: in general, the highest priority is assigned to the average fundamental frequency bits and average gain bits, followed by the PRBA sum bits and HOC sum bits, then the PRBA difference bits and HOC difference bits, and finally the few remaining bits.
A mixture of [24,12] extended Golay codes, [23,12] Golay codes, and [15,11] Hamming codes is then used to add high redundancy to the more sensitive bits, and low or no redundancy to the less sensitive bits. The half-rate system uses one [24,12] Golay code, followed by three [23,12] Golay codes and then two [15,11] Hamming codes, leaving the remaining 33 bits unprotected. The full-rate system uses two [24,12] Golay codes followed by six [23,12] Golay codes, leaving the remaining 126 bits unprotected. This allocation is designed to make efficient use of the limited number of bits available for FEC. The final step is to interleave the FEC-coded bits within each 45 ms block to disperse the effect of short burst errors. The interleaved bits of two consecutive 45 ms blocks are then combined into a 90 ms frame that forms the encoder output bit stream.
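The interleaving step can be illustrated with a simple row/column block interleaver (the patent does not specify this exact interleaver; it is shown only to illustrate how interleaving disperses a burst across several codewords):

```python
def interleave(bits, rows: int):
    """Simple block interleaver: write bits row by row, read column by column,
    so that a short burst of channel errors lands in different codewords
    after deinterleaving.  (Illustrative only; not the patent's interleaver.)"""
    assert len(bits) % rows == 0
    cols = len(bits) // rows
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(bits, rows: int):
    """Invert interleave(): restore the original row-by-row order."""
    cols = len(bits) // rows
    out = [None] * len(bits)
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = bits[c * rows + r]
    return out

data = list(range(12))          # stand-in for one block's FEC-coded bits
sent = interleave(data, rows=3)
assert deinterleave(sent, rows=3) == data
```

After interleaving, three consecutive channel positions come from three different rows, so a three-bit burst corrupts at most one bit per codeword.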
After the encoded bit stream is transmitted over the channel and received, a corresponding decoder reproduces high-quality speech from the encoded bit stream. The decoder first divides each 90 ms frame into two 45 ms quantization blocks. It then deinterleaves each block and performs error correction decoding to correct and/or detect certain likely bit error patterns. To achieve adequate performance over the mobile satellite channel, each error correction code is generally decoded up to its full error correction capability. Next, the decoder reassembles the quantized bits of the block from the FEC-decoded bits and reconstructs from them the model parameters representing the two subframes of the block.
The AMBE® decoder synthesizes a set of phases from the reconstructed log spectral amplitudes, which the speech synthesizer uses to produce natural-sounding speech. The use of synthesized phase information greatly reduces the data rate relative to a system that transmits this information, or its equivalent, directly from the encoder to the decoder. The decoder then applies spectral enhancement to the reconstructed spectral amplitudes to improve the perceptual quality of the speech signal. If the locally estimated channel parameters indicate the presence of uncorrectable bit errors, the decoder also performs bit error detection and smooths the reconstructed parameters. The enhanced and smoothed model parameters (fundamental frequency, V/UV decisions, spectral amplitudes, and synthesized phases) are used for speech synthesis.
The reconstructed parameters form the input to the decoder's speech synthesis algorithm, which interpolates smoothly between successive frames of model parameters to produce 22.5 ms speech segments. The synthesis algorithm synthesizes voiced speech with a bank of harmonic oscillators (or, at high frequencies, an FFT-based equivalent). This is added to the output of a weighted overlap-add algorithm that synthesizes the unvoiced speech. The sum forms the synthesized speech signal, which is output to a D/A converter for playback over a speaker. Although this synthesized speech signal may not be close to the original signal on a sample-by-sample basis, it sounds the same to a human listener.
Other embodiments are within the scope of the following claims.
Appendix A
Gain VQ codebook (5) value table
n | x1(n) | x2(n) |
0 | -6696 | 6699 |
1 | -5724 | 5641 |
2 | -4860 | 4854 |
3 | -3861 | 3824 |
4 | -3132 | 3091 |
5 | -2538 | 2630 |
6 | -2052 | 2088 |
7 | -1890 | 1491 |
8 | -1269 | 1627 |
9 | -1350 | 1003 |
10 | -756 | 1111 |
11 | -864 | 514 |
12 | -324 | 623 |
13 | -486 | 162 |
14 | -297 | -109 |
15 | 54 | 379 |
16 | 21 | -49 |
17 | 326 | 122 |
18 | 21 | -441 |
19 | 522 | -196 |
20 | 348 | -686 |
21 | 826 | -466 |
22 | 630 | -1005 |
23 | 1000 | -1323 |
24 | 1174 | -809 |
25 | 1631 | -1274 |
26 | 1479 | -1789 |
27 | 2088 | -1960 |
28 | 2566 | -2524 |
29 | 3132 | -3185 |
30 | 3958 | -3994 |
31 | 5546 | -5978 |
Appendix B
PRBA sum [1,2] VQ codebook (8) value table
Appendix C
PRBA sum [3,4] VQ codebook (6) value table
n | x1(n) | x2(n) |
0 | -1320 | -848 |
1 | -820 | -743 |
2 | -440 | -972 |
3 | -424 | -584 |
4 | -715 | -456 |
5 | -1155 | -335 |
6 | -627 | -243 |
7 | -402 | -183 |
8 | -165 | -459 |
9 | -385 | -378 |
10 | -160 | -716 |
11 | 77 | -594 |
12 | -198 | -277 |
13 | -204 | -115 |
14 | -6 | -362 |
15 | -22 | -173 |
16 | -841 | -86 |
17 | -1178 | 206 |
18 | -551 | 20 |
19 | -414 | 209 |
20 | -713 | 252 |
21 | -770 | 665 |
22 | -433 | 473 |
23 | -361 | 818 |
24 | -338 | 17 |
25 | -148 | 49 |
26 | -5 | -33 |
27 | -10 | 124 |
28 | -195 | 234 |
29 | -129 | 469 |
30 | 9 | 316 |
31 | -43 | 647 |
n | x1(n) | x2(n) |
32 | 203 | -961 |
33 | 184 | -397 |
34 | 370 | -550 |
35 | 358 | -279 |
36 | 135 | -199 |
37 | 135 | -5 |
38 | 277 | -111 |
39 | 444 | -92 |
40 | 661 | -744 |
41 | 593 | -355 |
42 | 1193 | -634 |
43 | 933 | -432 |
44 | 797 | -191 |
45 | 611 | -66 |
46 | 1125 | -130 |
47 | 1700 | -24 |
48 | 143 | 183 |
49 | 288 | 262 |
50 | 307 | 60 |
51 | 478 | 153 |
52 | 189 | 457 |
53 | 78 | 967 |
54 | 445 | 393 |
55 | 386 | 693 |
56 | 819 | 67 |
57 | 681 | 266 |
58 | 1023 | 273 |
59 | 1351 | 281 |
60 | 708 | 551 |
61 | 734 | 1016 |
62 | 983 | 618 |
63 | 1751 | 723 |
Appendix D
PRBA sum [5,7] VQ codebook (7) value table
Appendix E
PRBA difference [1,3] VQ codebook (8) value table
Appendix F
PRBA difference [4,7] VQ codebook (6) value table
n | x1(n) | x2(n) | x3(n) | x4(n) |
0 | -279 | -330 | -261 | 7 |
1 | -465 | -242 | -9 | 7 |
2 | -248 | -66 | -189 | 7 |
3 | -279 | -44 | 27 | 217 |
4 | -217 | -198 | -189 | -233 |
5 | -155 | -154 | -81 | -53 |
6 | -62 | -110 | -117 | 157 |
7 | 0 | -44 | -153 | -53 |
8 | -186 | -110 | 63 | -203 |
9 | -310 | 0 | 207 | -53 |
10 | -155 | -242 | 99 | 187 |
11 | -155 | -88 | 63 | 7 |
12 | -124 | -330 | 27 | -23 |
13 | 0 | -110 | 207 | -113 |
14 | -62 | -22 | 27 | 157 |
15 | -93 | 0 | 279 | 127 |
16 | -413 | 48 | -93 | -115 |
17 | -203 | 96 | -56 | -23 |
18 | -443 | 168 | -130 | 138 |
19 | -143 | 288 | -130 | 115 |
20 | -113 | 0 | -93 | -138 |
21 | -53 | 240 | -241 | -115 |
22 | -83 | 72 | -130 | 92 |
23 | -53 | 192 | -19 | -23 |
24 | -113 | 48 | 129 | -92 |
25 | -323 | 240 | 129 | -92 |
26 | -83 | 72 | 92 | 46 |
27 | -263 | 120 | 92 | 69 |
28 | -23 | 168 | 314 | -69 |
29 | -53 | 360 | 92 | -138 |
30 | -23 | 0 | -19 | 0 |
31 | 7 | 192 | 55 | 207 |
n | x1(n) | x2(n) | x3(n) | x4(n) |
32 | 7 | -275 | -296 | -45 |
33 | 63 | -209 | -72 | -15 |
34 | 91 | -253 | -8 | 225 |
35 | 91 | -55 | -40 | 45 |
36 | 119 | -99 | -72 | -225 |
37 | 427 | -77 | -72 | -135 |
38 | 399 | -121 | -200 | 105 |
39 | 175 | -33 | -104 | -75 |
40 | 7 | -99 | 24 | -75 |
41 | 91 | 11 | 88 | -15 |
42 | 119 | -165 | 152 | 45 |
43 | 35 | -55 | 88 | 75 |
44 | 231 | -319 | 120 | -105 |
45 | 231 | -55 | 184 | -165 |
46 | 259 | -143 | -8 | 15 |
47 | 371 | -11 | 152 | 45 |
48 | 60 | 71 | -63 | -55 |
49 | 12 | 159 | -63 | -241 |
50 | 60 | 71 | -21 | 69 |
51 | 60 | 115 | -105 | 162 |
52 | 108 | 5 | -357 | -148 |
53 | 372 | 93 | -231 | -179 |
54 | 132 | 5 | -231 | 100 |
55 | 180 | 225 | -147 | 7 |
56 | 36 | 27 | 63 | -148 |
57 | 60 | 203 | 105 | -24 |
58 | 108 | 93 | 189 | 100 |
59 | 156 | 335 | 273 | 69 |
60 | 204 | 93 | 21 | 38 |
61 | 252 | 159 | 63 | -148 |
62 | 180 | 5 | 21 | 224 |
63 | 348 | 269 | 63 | 69 |
Appendix G
HOC sum 0 VQ codebook (7-bit) value table
Appendix H
HOC difference 0 VQ codebook (3-bit) value table
n | x1(n) | x2(n) | x3(n) | x4(n) |
0 | -558 | -117 | 0 | 0 |
1 | -248 | 195 | 88 | -22 |
2 | -186 | -312 | -176 | -44 |
3 | 0 | 0 | 0 | 77 |
4 | 0 | -117 | 154 | -88 |
5 | 62 | 156 | -176 | -55 |
6 | 310 | -156 | -66 | 22 |
7 | 372 | 273 | 110 | 33 |
Appendix I
Appendix J
HOC difference 1 VQ codebook (3-bit) value table
n | x1(n) | x2(n) | x3(n) | x4(n) |
0 | -173 | -285 | 5 | 28 |
1 | -35 | 19 | -179 | 76 |
2 | -357 | 57 | 51 | -20 |
3 | -127 | 285 | 51 | -20 |
4 | 11 | -19 | 5 | -116 |
5 | 333 | -171 | -41 | 28 |
6 | 11 | -19 | 143 | 124 |
7 | 333 | 209 | -41 | -36 |
Appendix K
Appendix L
HOC difference 2 VQ codebook (3-bit) value table
n | x1(n) | x2(n) | x3(n) | x4(n) |
0 | -224 | -237 | 15 | -9 |
1 | -36 | -27 | -195 | -27 |
2 | -365 | 113 | 36 | 9 |
3 | -36 | 288 | -27 | -9 |
4 | 58 | 8 | 57 | 171 |
5 | 199 | -237 | 57 | -9 |
6 | -36 | 8 | 120 | -81 |
7 | 340 | 113 | -48 | -9 |
Appendix M
Appendix N
HOC difference 3 VQ codebook (3-bit) value table
n | x1(n) | x2(n) | x3(n) | x4(n) |
0 | -94 | -248 | 60 | 0 |
1 | 0 | -17 | -100 | -90 |
2 | -376 | -17 | 40 | 18 |
3 | -141 | 247 | -80 | 36 |
4 | 47 | -50 | -80 | 162 |
5 | 329 | -182 | 20 | -18 |
6 | 0 | 49 | 200 | 0 |
7 | 282 | 181 | -20 | -18 |
Appendix O
Frequency block size table
Total spectral magnitudes in subframe | Magnitudes in frequency block 1 | Magnitudes in frequency block 2 | Magnitudes in frequency block 3 | Magnitudes in frequency block 4 |
9 | 2 | 2 | 2 | 3 |
10 | 2 | 2 | 3 | 3 |
11 | 2 | 3 | 3 | 3 |
12 | 2 | 3 | 3 | 4 |
13 | 3 | 3 | 3 | 4 |
14 | 3 | 3 | 4 | 4 |
15 | 3 | 3 | 4 | 5 |
16 | 3 | 4 | 4 | 5 |
17 | 3 | 4 | 5 | 5 |
18 | 4 | 4 | 5 | 5 |
19 | 4 | 4 | 5 | 6 |
20 | 4 | 4 | 6 | 6 |
21 | 4 | 5 | 6 | 6 |
22 | 4 | 5 | 6 | 7 |
23 | 5 | 5 | 6 | 7 |
24 | 5 | 5 | 7 | 7 |
25 | 5 | 6 | 7 | 7 |
26 | 5 | 6 | 7 | 8 |
27 | 5 | 6 | 8 | 8 |
28 | 6 | 6 | 8 | 8 |
29 | 6 | 6 | 8 | 9 |
30 | 6 | 7 | 8 | 9 |
31 | 6 | 7 | 9 | 9 |
32 | 6 | 7 | 9 | 10 |
33 | 7 | 7 | 9 | 10 |
34 | 7 | 8 | 9 | 10 |
35 | 7 | 8 | 10 | 10 |
36 | 7 | 8 | 10 | 11 |
37 | 8 | 8 | 10 | 11 |
38 | 8 | 9 | 10 | 11 |
39 | 8 | 9 | 11 | 11 |
40 | 8 | 9 | 11 | 12 |
41 | 8 | 9 | 11 | 13 |
42 | 8 | 9 | 12 | 13 |
43 | 8 | 10 | 12 | 13 |
44 | 9 | 10 | 12 | 13 |
45 | 9 | 10 | 12 | 14 |
46 | 9 | 10 | 13 | 14 |
47 | 9 | 11 | 13 | 14 |
48 | 10 | 11 | 13 | 14 |
49 | 10 | 11 | 13 | 15 |
50 | 10 | 11 | 14 | 15 |
51 | 10 | 12 | 14 | 15 |
52 | 10 | 12 | 14 | 16 |
53 | 11 | 12 | 14 | 16 |
54 | 11 | 12 | 15 | 16 |
55 | 11 | 12 | 15 | 17 |
56 | 11 | 13 | 15 | 17 |
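The table above allocates each subframe's spectral magnitudes across four frequency blocks, with later blocks receiving slightly more magnitudes. A minimal sketch of such an allocation is below; the weight values and the round-then-absorb-remainder rule are illustrative assumptions, not the patent's exact table.

```python
def partition_magnitudes(total, ratios=(0.2, 0.225, 0.275, 0.3)):
    """Split `total` spectral magnitudes into four frequency blocks with
    lengths roughly proportional to `ratios` (assumed weights, not the
    patent's exact rule). The last block absorbs the rounding remainder
    so the block sizes always sum to `total`."""
    sizes = [int(round(total * r)) for r in ratios[:-1]]
    sizes.append(total - sum(sizes))  # remainder keeps the sum exact
    return sizes
```

For example, `partition_magnitudes(30)` yields `[6, 7, 8, 9]`, and the sizes sum to the subframe total by construction for any input in the table's 9-56 range.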
Claims (30)
1. A method of encoding speech into a 90 millisecond frame of bits for transmission across a satellite communication channel, the method comprising the steps of:
digitizing a speech signal into a sequence of digital speech samples;
dividing the digital speech samples into a sequence of subframes, each subframe comprising multiple digital speech samples;
estimating a set of model parameters for each subframe, wherein the model parameters include a set of spectral magnitude parameters representing the spectral information of the subframe;
combining two consecutive subframes from the sequence of subframes into a block;
jointly quantizing the spectral magnitude parameters of the two subframes within the block, wherein the joint quantization comprises forming predicted spectral magnitude parameters from previous quantized spectral magnitude parameters, computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters of the two subframes within the block, and quantizing the combined residual parameters into a set of encoded spectral bits using a plurality of vector quantizers;
adding redundant error control bits to the encoded spectral bits of each block, to protect at least some of the encoded spectral bits within the block against bit errors; and
combining the added redundant error control bits and encoded spectral bits of two consecutive blocks into a 90 millisecond frame of bits for transmission across the satellite channel.
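The quantization steps of claim 1 (predict, form residuals, combine the two subframes, quantize) can be sketched as follows. The gain value 0.65, the omission of interpolation, and the toy uniform scalar quantizer standing in for the patent's vector quantizers are all illustrative assumptions.

```python
def predict(prev_quantized, gain=0.65):
    """Predicted magnitudes: a gain of less than one applied to the
    previous block's quantized log magnitudes (interpolation omitted
    for brevity; 0.65 is an assumed value)."""
    return [gain * m for m in prev_quantized]

def residuals(magnitudes, predicted):
    """Residual = actual log magnitude minus its prediction."""
    return [m - p for m, p in zip(magnitudes, predicted)]

def quantize_block(sub1, sub2, prev_quantized, step=0.5):
    """Jointly code a two-subframe block: predict, form residuals for
    both subframes, combine them, then quantize (a toy uniform scalar
    quantizer stands in for the patent's vector quantizers)."""
    pred = predict(prev_quantized)
    combined = residuals(sub1, pred) + residuals(sub2, pred)
    return [int(round(r / step)) for r in combined]
```

Because only the residual (the part the prediction misses) is coded, slowly varying spectral envelopes cost very few bits.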
2. the method for claim 1, wherein the combination of the surplus parameter of two subframes further comprises in one:
Surplus parameter in each subframe is assigned in many frequency chunks;
Surplus parameter in each frequency chunks is carried out a linear transformation, to generate one group of conversion surplus coefficient of each subframe;
With the synthetic PRBA vector of the minority conversion surplus coefficient sets in all frequency chunks, and the conversion surplus coefficient sets of being left in each frequency chunks is synthesized the HOC vector of this frequency chunks;
Conversion PRBA vector to be generating a conversion PRBA vector, and compute vectors with difference with in conjunction with two conversion PRBA vectors in two subframes; With
Calculate the vector of each frequency chunks and poor, with two HOC vectors in conjunction with two subframes of this frequency chunks.
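The grouping and sum/difference combination of claim 2 can be sketched as below. Taking the two lowest-order coefficients per block for the PRBA vector and the 0.5 scaling on the sum and difference are assumed conventions, not quoted from the patent text.

```python
def group_prba_hoc(blocks):
    """Group the transformed residual coefficients of the four frequency
    blocks: the two lowest-order coefficients of every block form the
    8-element PRBA vector, the rest form one HOC vector per block
    (the 'two lowest' split is an assumption consistent with claim 8)."""
    prba = [c for blk in blocks for c in blk[:2]]
    hocs = [blk[2:] for blk in blocks]
    return prba, hocs

def combine_sum_diff(vec_a, vec_b):
    """Combine the two subframes' vectors into a sum vector and a
    difference vector; the 0.5 scaling is an assumed convention."""
    s = [0.5 * (a + b) for a, b in zip(vec_a, vec_b)]
    d = [0.5 * (a - b) for a, b in zip(vec_a, vec_b)]
    return s, d
```

The sum vector captures what the two subframes share and the difference vector what changed between them, so each can be given a bit allocation matched to its typical energy.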
3. The method of claim 1 or claim 2, wherein the spectral magnitude parameters represent log spectral magnitudes estimated for a Multi-Band Excitation speech model.
4. The method of claim 3, wherein the spectral magnitude parameters are estimated from a computed spectrum independently of the voicing state.
5. The method of claim 1 or claim 2, wherein the predicted spectral magnitude parameters are formed by applying a gain of less than one to a linear interpolation of the quantized spectral magnitudes of the last subframe of the previous block.
6. The method of claim 1 or claim 2, wherein the redundant error control bits for each block are formed from a plurality of block codes including Golay codes and Hamming codes.
7. The method of claim 6, wherein the block codes comprise one [24,12] extended Golay code, three [23,12] Golay codes, and two [15,11] Hamming codes.
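Claims 6 and 7 mix Golay and Hamming block codes. A textbook [15,11] Hamming encoder can be sketched as follows, with parity bits at the power-of-two positions 1, 2, 4, 8 (1-indexed); the patent's exact bit ordering is not specified here.

```python
from functools import reduce

def hamming_15_11_encode(data):
    """Encode 11 data bits into a 15-bit Hamming codeword. Each parity
    bit at position p in {1, 2, 4, 8} makes the XOR over all positions
    whose index has bit p set come out even."""
    assert len(data) == 11
    code = [0] * 16          # index 0 unused; positions 1..15
    bits = iter(data)
    for pos in range(1, 16):
        if pos not in (1, 2, 4, 8):
            code[pos] = next(bits)
    for p in (1, 2, 4, 8):
        code[p] = sum(code[i] for i in range(1, 16) if i & p) % 2
    return code[1:]

def hamming_15_11_syndrome(word):
    """XOR of the (1-indexed) positions of all set bits; zero means no
    detected error, otherwise the value names the flipped position."""
    return reduce(lambda a, b: a ^ b,
                  (i + 1 for i, bit in enumerate(word) if bit), 0)
```

A single flipped bit yields a nonzero syndrome equal to its position, so the decoder can correct it directly.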
8. The method of claim 2, wherein the transformed residual coefficients of each frequency block are computed using a discrete cosine transform (DCT) followed by a linear 2-by-2 transform on the two lowest-order DCT coefficients.
9. The method of claim 8, wherein four frequency blocks are used and the length of each frequency block is approximately proportional to the number of spectral magnitude parameters within the subframe.
10. The method of claim 2, wherein the vector quantizers comprise: a three-way split vector quantizer using 8 bits plus 6 bits plus 7 bits for the PRBA sum vector; and a two-way split vector quantizer using 8 bits plus 6 bits for the PRBA difference vector.
11. The method of claim 10, wherein the frame of bits includes additional bits representing the error in the transformed residual coefficients introduced by the vector quantizers.
12. The method of claim 1 or claim 2, wherein the subframes in the sequence of subframes have a nominal duration of 22.5 milliseconds each.
13. The method of claim 12, wherein the frame of bits consists of 312 bits in a half-rate mode and 624 bits in a full-rate mode.
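The split vector quantizers of claim 10 search each sub-vector against its own small codebook instead of one huge joint codebook. A minimal sketch with exhaustive nearest-neighbor search (the tiny codebooks in the usage note are toy values, not the appendix tables):

```python
def vq_search(codebook, vector):
    """Return the index of the codebook entry with minimum squared
    error against `vector` (exhaustive nearest-neighbor search)."""
    def sqerr(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vector))
    return min(range(len(codebook)), key=lambda i: sqerr(codebook[i]))

def split_vq(vector, splits, codebooks):
    """Split `vector` at the lengths in `splits` and quantize each piece
    with its own codebook, as in a two- or three-way split VQ."""
    indices, start = [], 0
    for length, book in zip(splits, codebooks):
        indices.append(vq_search(book, vector[start:start + length]))
        start += length
    return indices
```

For example, with `book1 = [[0, 0], [1, 1]]` and `book2 = [[0], [2]]`, `split_vq([0.9, 1.2, 1.9], [2, 1], [book1, book2])` picks the nearest entry in each sub-codebook. Splitting turns one 2^21-entry search into three searches of at most 256 entries.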
14. A method of decoding speech from a 90 millisecond frame of bits received across a satellite channel, the method comprising the steps of:
dividing the frame of bits into two blocks of bits, wherein each block of bits represents two subframes of speech;
applying error control decoding to each block using the redundant error control bits included within the block, to produce error-decoded bits that are at least in part protected against bit errors;
jointly reconstructing the spectral magnitude parameters of the two subframes within a block using the error-decoded bits, wherein the joint reconstruction comprises reconstructing a set of combined residual parameters using a plurality of vector quantizer codebooks and computing therefrom the separate residual parameters of the two subframes, forming predicted spectral magnitude parameters from previously reconstructed spectral magnitude parameters, and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters of each subframe within the block; and
synthesizing a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters of the subframe.
15. The method of claim 14, wherein computing the separate residual parameters of the two subframes from a set of combined residual parameters further comprises:
dividing the combined residual parameters into a plurality of frequency blocks for the block;
forming the transformed PRBA sum and difference vectors for the block;
forming the HOC sum and difference vectors for each frequency block from the combined residual parameters;
performing inverse sum and difference operations and an inverse transformation on the transformed PRBA sum and difference vectors, to form the PRBA vectors of the two subframes;
performing inverse sum and difference operations on the HOC sum and difference vectors, to form the HOC vectors of the two subframes for each frequency block; and
combining the PRBA vector and the HOC vectors of each frequency block for each subframe, to form the separate residual parameters of the two subframes within the block.
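The decoder's inverse sum and difference operation of claim 15 simply undoes the encoder-side combination. Assuming a 0.5-scaled sum/difference convention (an assumption, not quoted from the patent), the round trip is exact:

```python
def combine_sum_diff(vec_a, vec_b):
    """Encoder-side combination of the two subframes' vectors
    (0.5 scaling assumed)."""
    return ([0.5 * (a + b) for a, b in zip(vec_a, vec_b)],
            [0.5 * (a - b) for a, b in zip(vec_a, vec_b)])

def invert_sum_diff(s, d):
    """Decoder-side inverse: recover each subframe's vector from the
    sum vector `s` and difference vector `d`."""
    vec_a = [si + di for si, di in zip(s, d)]
    vec_b = [si - di for si, di in zip(s, d)]
    return vec_a, vec_b
```

Any residual distortion therefore comes from the vector quantizers themselves, not from the sum/difference step.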
16. The method of claim 14 or claim 15, wherein the reconstructed spectral magnitude parameters represent log spectral magnitudes for a Multi-Band Excitation speech model.
17. The method of claim 14 or claim 15, further comprising a decoder that uses the reconstructed spectral magnitude parameters to synthesize a set of phase parameters.
18. The method of claim 14 or claim 15, wherein the predicted spectral magnitude parameters are formed by applying a gain of less than one to a linear interpolation of the quantized spectral magnitudes of the last subframe of the previous block.
19. The method of claim 14 or claim 15, wherein the error control bits for each block are formed from a plurality of block codes including Golay codes and Hamming codes.
20. The method of claim 19, wherein the block codes comprise one [24,12] extended Golay code, three [23,12] Golay codes, and two [15,11] Hamming codes.
21. The method of claim 15, wherein the transformed residual coefficients of each frequency block are computed using a discrete cosine transform (DCT) followed by a linear 2-by-2 transform on the two lowest-order DCT coefficients.
22. The method of claim 21, wherein four frequency blocks are used and the length of each frequency block is approximately proportional to the number of spectral magnitude parameters within the subframe.
23. The method of claim 15, wherein the vector quantizer codebooks comprise: a three-way split vector quantizer codebook using 8 bits plus 6 bits plus 7 bits for the PRBA sum vector; and a two-way split vector quantizer codebook using 8 bits plus 6 bits for the PRBA difference vector.
24. The method of claim 23, wherein the frame of bits includes additional bits representing the error in the transformed residual coefficients introduced by the vector quantizer codebooks.
25. The method of claim 14 or claim 15, wherein the nominal duration of a subframe is 22.5 milliseconds.
26. The method of claim 25, wherein the frame of bits consists of 312 bits in a half-rate mode and 624 bits in a full-rate mode.
27. An encoder for encoding speech into a 90 millisecond frame of bits for transmission across a satellite channel, comprising:
a digitizer configured to convert a speech signal into a sequence of digital speech samples;
a subframe generator configured to divide the digital speech samples into a sequence of subframes, each subframe comprising a plurality of digital speech samples;
a model parameter estimator configured to estimate a set of model parameters for each subframe, wherein the model parameters include a set of spectral magnitude parameters representing the spectral information of the subframe;
a combiner configured to combine two consecutive subframes from the sequence of subframes into a block;
a dual-subframe spectral magnitude quantizer configured to jointly quantize the spectral magnitude parameters of the two subframes of the block, wherein the joint quantization comprises forming predicted spectral magnitude parameters from previous quantized spectral magnitude parameters, computing residual parameters as the difference between the spectral magnitude parameters and the predicted spectral magnitude parameters, combining the residual parameters of the two subframes of the block, and quantizing the combined residual parameters into a set of encoded spectral bits using a plurality of vector quantizers;
an error control encoder configured to add redundant error control bits to the encoded spectral bits of each block, to protect at least some of the encoded spectral bits within the block against bit errors; and
a combiner configured to combine the added redundant error control bits and encoded spectral bits of two consecutive blocks into a 90 millisecond frame of bits for transmission across the satellite channel.
28. The encoder of claim 27, wherein the dual-subframe spectral magnitude quantizer is configured to combine the residual parameters of the two subframes within the block by:
dividing the residual parameters of each subframe into a plurality of frequency blocks;
performing a linear transformation on the residual parameters within each frequency block to produce a set of transformed residual coefficients for each subframe;
grouping a minority of the transformed residual coefficients from all of the frequency blocks into a PRBA vector, and grouping the remaining transformed residual coefficients of each frequency block into an HOC vector for that frequency block;
transforming the PRBA vector to produce a transformed PRBA vector, and computing the vector sum and difference to combine the two transformed PRBA vectors of the two subframes; and
computing the vector sum and difference for each frequency block to combine the two HOC vectors of the two subframes for that frequency block.
29. A decoder for decoding speech from a 90 millisecond frame of bits received across a satellite channel, comprising:
a divider configured to divide the frame of bits into two blocks of bits, wherein each block of bits represents two subframes of speech;
an error control decoder configured to apply error control decoding to each block using the redundant error control bits included within the block, to produce error-decoded bits that are at least in part protected against bit errors;
a dual-subframe spectral magnitude reconstructor configured to jointly reconstruct the spectral magnitude parameters of the two subframes within a block, wherein the joint reconstruction comprises reconstructing a set of combined residual parameters using a plurality of vector quantizer codebooks and computing therefrom the separate residual parameters of the two subframes, forming predicted spectral magnitude parameters from previously reconstructed spectral magnitude parameters, and adding the separate residual parameters to the predicted spectral magnitude parameters to form the reconstructed spectral magnitude parameters of each subframe within the block; and
a synthesizer configured to synthesize a plurality of digital speech samples for each subframe using the reconstructed spectral magnitude parameters of the subframe.
30. The decoder of claim 29, wherein the dual-subframe spectral magnitude reconstructor is configured to compute the separate residual parameters of the two subframes from a set of combined residual parameters by:
dividing the combined residual parameters into a plurality of frequency blocks for the block;
forming the transformed PRBA sum and difference vectors for the block;
forming the HOC sum and difference vectors for each frequency block from the combined residual parameters;
performing inverse sum and difference operations and an inverse transformation on the transformed PRBA sum and difference vectors, to produce the PRBA vectors of the two subframes;
performing inverse sum and difference operations on the HOC sum and difference vectors, to produce the HOC vectors of the two subframes for each frequency block; and
combining the PRBA vector and the HOC vectors of each frequency block for each subframe, to produce the separate residual parameters of the two subframes of the block.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/818,137 US6131084A (en) | 1997-03-14 | 1997-03-14 | Dual subframe quantization of spectral magnitudes |
US818,137 | 1997-03-14 | ||
US818137 | 1997-03-14 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1193786A CN1193786A (en) | 1998-09-23 |
CN1123866C true CN1123866C (en) | 2003-10-08 |
Family
ID=25224767
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN98105557A Expired - Lifetime CN1123866C (en) | 1997-03-14 | 1998-03-13 | Dual subframe quantization of spectral magnitudes |
Country Status (8)
Country | Link |
---|---|
US (1) | US6131084A (en) |
JP (1) | JP4275761B2 (en) |
KR (1) | KR100531266B1 (en) |
CN (1) | CN1123866C (en) |
BR (1) | BR9803683A (en) |
FR (1) | FR2760885B1 (en) |
GB (1) | GB2324689B (en) |
RU (1) | RU2214048C2 (en) |
Families Citing this family (86)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6765904B1 (en) | 1999-08-10 | 2004-07-20 | Texas Instruments Incorporated | Packet networks |
US6269332B1 (en) * | 1997-09-30 | 2001-07-31 | Siemens Aktiengesellschaft | Method of encoding a speech signal |
US6199037B1 (en) * | 1997-12-04 | 2001-03-06 | Digital Voice Systems, Inc. | Joint quantization of speech subframe voicing metrics and fundamental frequencies |
AU730123B2 (en) * | 1997-12-08 | 2001-02-22 | Mitsubishi Denki Kabushiki Kaisha | Method and apparatus for processing sound signal |
US6182033B1 (en) * | 1998-01-09 | 2001-01-30 | At&T Corp. | Modular approach to speech enhancement with an application to speech coding |
US7392180B1 (en) * | 1998-01-09 | 2008-06-24 | At&T Corp. | System and method of coding sound signals using sound enhancement |
FR2784218B1 (en) * | 1998-10-06 | 2000-12-08 | Thomson Csf | LOW-SPEED SPEECH CODING METHOD |
WO2000022606A1 (en) * | 1998-10-13 | 2000-04-20 | Motorola Inc. | Method and system for determining a vector index to represent a plurality of speech parameters in signal processing for identifying an utterance |
JP2000308167A (en) * | 1999-04-20 | 2000-11-02 | Mitsubishi Electric Corp | Voice encoding device |
US6744757B1 (en) | 1999-08-10 | 2004-06-01 | Texas Instruments Incorporated | Private branch exchange systems for packet communications |
US6757256B1 (en) | 1999-08-10 | 2004-06-29 | Texas Instruments Incorporated | Process of sending packets of real-time information |
US6678267B1 (en) | 1999-08-10 | 2004-01-13 | Texas Instruments Incorporated | Wireless telephone with excitation reconstruction of lost packet |
US6804244B1 (en) | 1999-08-10 | 2004-10-12 | Texas Instruments Incorporated | Integrated circuits for packet communications |
US6801499B1 (en) * | 1999-08-10 | 2004-10-05 | Texas Instruments Incorporated | Diversity schemes for packet communications |
US6801532B1 (en) * | 1999-08-10 | 2004-10-05 | Texas Instruments Incorporated | Packet reconstruction processes for packet communications |
US7315815B1 (en) | 1999-09-22 | 2008-01-01 | Microsoft Corporation | LPC-harmonic vocoder with superframe structure |
US6377916B1 (en) * | 1999-11-29 | 2002-04-23 | Digital Voice Systems, Inc. | Multiband harmonic transform coder |
US7574351B2 (en) * | 1999-12-14 | 2009-08-11 | Texas Instruments Incorporated | Arranging CELP information of one frame in a second packet |
KR100383668B1 (en) * | 2000-09-19 | 2003-05-14 | 한국전자통신연구원 | The Speech Coding System Using Time-Seperated Algorithm |
US7116787B2 (en) * | 2001-05-04 | 2006-10-03 | Agere Systems Inc. | Perceptual synthesis of auditory scenes |
US7644003B2 (en) * | 2001-05-04 | 2010-01-05 | Agere Systems Inc. | Cue-based audio coding/decoding |
US7243295B2 (en) * | 2001-06-12 | 2007-07-10 | Intel Corporation | Low complexity channel decoders |
US20030135374A1 (en) * | 2002-01-16 | 2003-07-17 | Hardwick John C. | Speech synthesizer |
US7970606B2 (en) | 2002-11-13 | 2011-06-28 | Digital Voice Systems, Inc. | Interoperable vocoder |
US7634399B2 (en) * | 2003-01-30 | 2009-12-15 | Digital Voice Systems, Inc. | Voice transcoder |
US8359197B2 (en) | 2003-04-01 | 2013-01-22 | Digital Voice Systems, Inc. | Half-rate vocoder |
US6980933B2 (en) * | 2004-01-27 | 2005-12-27 | Dolby Laboratories Licensing Corporation | Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients |
DE102004007191B3 (en) | 2004-02-13 | 2005-09-01 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio coding |
DE102004007184B3 (en) | 2004-02-13 | 2005-09-22 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and apparatus for quantizing an information signal |
US7805313B2 (en) * | 2004-03-04 | 2010-09-28 | Agere Systems Inc. | Frequency-based coding of channels in parametric multi-channel coding systems |
US7668712B2 (en) | 2004-03-31 | 2010-02-23 | Microsoft Corporation | Audio encoding and decoding with intra frames and adaptive forward error correction |
US7522730B2 (en) * | 2004-04-14 | 2009-04-21 | M/A-Com, Inc. | Universal microphone for secure radio communication |
KR101037931B1 (en) * | 2004-05-13 | 2011-05-30 | 삼성전자주식회사 | Speech compression and decompression apparatus and method thereof using two-dimensional processing |
US8204261B2 (en) * | 2004-10-20 | 2012-06-19 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Diffuse sound shaping for BCC schemes and the like |
US7720230B2 (en) * | 2004-10-20 | 2010-05-18 | Agere Systems, Inc. | Individual channel shaping for BCC schemes and the like |
US7787631B2 (en) * | 2004-11-30 | 2010-08-31 | Agere Systems Inc. | Parametric coding of spatial audio with cues based on transmitted channels |
WO2006060279A1 (en) * | 2004-11-30 | 2006-06-08 | Agere Systems Inc. | Parametric coding of spatial audio with object-based side information |
US7761304B2 (en) * | 2004-11-30 | 2010-07-20 | Agere Systems Inc. | Synchronizing parametric coding of spatial audio with externally provided downmix |
US7903824B2 (en) * | 2005-01-10 | 2011-03-08 | Agere Systems Inc. | Compact side information for parametric coding of spatial audio |
EP1691348A1 (en) * | 2005-02-14 | 2006-08-16 | Ecole Polytechnique Federale De Lausanne | Parametric joint-coding of audio sources |
JP4849297B2 (en) * | 2005-04-26 | 2012-01-11 | ソニー株式会社 | Encoding apparatus and method, decoding apparatus and method, and program |
WO2006126858A2 (en) | 2005-05-26 | 2006-11-30 | Lg Electronics Inc. | Method of encoding and decoding an audio signal |
US7177804B2 (en) | 2005-05-31 | 2007-02-13 | Microsoft Corporation | Sub-band voice codec with multi-stage codebooks and redundant coding |
US7831421B2 (en) | 2005-05-31 | 2010-11-09 | Microsoft Corporation | Robust decoder |
US7707034B2 (en) | 2005-05-31 | 2010-04-27 | Microsoft Corporation | Audio codec post-filter |
US8494667B2 (en) * | 2005-06-30 | 2013-07-23 | Lg Electronics Inc. | Apparatus for encoding and decoding audio signal and method thereof |
WO2007004830A1 (en) | 2005-06-30 | 2007-01-11 | Lg Electronics Inc. | Apparatus for encoding and decoding audio signal and method thereof |
EP1913578B1 (en) | 2005-06-30 | 2012-08-01 | LG Electronics Inc. | Method and apparatus for decoding an audio signal |
ES2433316T3 (en) * | 2005-07-19 | 2013-12-10 | Koninklijke Philips N.V. | Multi-channel audio signal generation |
US7788107B2 (en) | 2005-08-30 | 2010-08-31 | Lg Electronics Inc. | Method for decoding an audio signal |
US7987097B2 (en) | 2005-08-30 | 2011-07-26 | Lg Electronics | Method for decoding an audio signal |
JP4859925B2 (en) | 2005-08-30 | 2012-01-25 | エルジー エレクトロニクス インコーポレイティド | Audio signal decoding method and apparatus |
JP5111375B2 (en) | 2005-08-30 | 2013-01-09 | エルジー エレクトロニクス インコーポレイティド | Apparatus and method for encoding and decoding audio signals |
KR100857113B1 (en) | 2005-10-05 | 2008-09-08 | 엘지전자 주식회사 | Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor |
US7672379B2 (en) | 2005-10-05 | 2010-03-02 | Lg Electronics Inc. | Audio signal processing, encoding, and decoding |
US8068569B2 (en) | 2005-10-05 | 2011-11-29 | Lg Electronics, Inc. | Method and apparatus for signal processing and encoding and decoding |
WO2007040349A1 (en) | 2005-10-05 | 2007-04-12 | Lg Electronics Inc. | Method and apparatus for signal processing |
US7696907B2 (en) | 2005-10-05 | 2010-04-13 | Lg Electronics Inc. | Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor |
US7646319B2 (en) | 2005-10-05 | 2010-01-12 | Lg Electronics Inc. | Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor |
US7751485B2 (en) | 2005-10-05 | 2010-07-06 | Lg Electronics Inc. | Signal processing using pilot based coding |
US7974713B2 (en) | 2005-10-12 | 2011-07-05 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Temporal and spatial shaping of multi-channel audio signals |
US7742913B2 (en) | 2005-10-24 | 2010-06-22 | Lg Electronics Inc. | Removing time delays in signal paths |
US7934137B2 (en) | 2006-02-06 | 2011-04-26 | Qualcomm Incorporated | Message remapping and encoding |
WO2007120023A1 (en) * | 2006-04-19 | 2007-10-25 | Samsung Electronics Co., Ltd. | Apparatus and method for supporting relay service in a multi-hop relay broadband wireless access communication system |
UA91827C2 (en) * | 2006-09-29 | 2010-09-10 | Общество С Ограниченной Ответственностью "Парисет" | Method of multi-component coding and decoding electric signals of different origin |
DE102006051673A1 (en) * | 2006-11-02 | 2008-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for reworking spectral values and encoders and decoders for audio signals |
JP5270566B2 (en) | 2006-12-07 | 2013-08-21 | エルジー エレクトロニクス インコーポレイティド | Audio processing method and apparatus |
KR101062353B1 (en) * | 2006-12-07 | 2011-09-05 | 엘지전자 주식회사 | Method for decoding audio signal and apparatus therefor |
US8036886B2 (en) * | 2006-12-22 | 2011-10-11 | Digital Voice Systems, Inc. | Estimation of pulsed speech model parameters |
JP4254866B2 (en) * | 2007-01-31 | 2009-04-15 | ソニー株式会社 | Information processing apparatus and method, program, and recording medium |
JP4708446B2 (en) * | 2007-03-02 | 2011-06-22 | パナソニック株式会社 | Encoding device, decoding device and methods thereof |
AU2008326957B2 (en) * | 2007-11-21 | 2011-06-30 | Lg Electronics Inc. | A method and an apparatus for processing a signal |
EP2229677B1 (en) | 2007-12-18 | 2015-09-16 | LG Electronics Inc. | A method and an apparatus for processing an audio signal |
US8195452B2 (en) * | 2008-06-12 | 2012-06-05 | Nokia Corporation | High-quality encoding at low-bit rates |
EP2410521B1 (en) * | 2008-07-11 | 2017-10-04 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio signal encoder, method for generating an audio signal and computer program |
EP2304722B1 (en) * | 2008-07-17 | 2018-03-14 | Nokia Technologies Oy | Method and apparatus for fast nearest-neighbor search for vector quantizers |
JP5603339B2 (en) * | 2008-10-29 | 2014-10-08 | ドルビー インターナショナル アーベー | Protection of signal clipping using existing audio gain metadata |
US9275644B2 (en) * | 2012-01-20 | 2016-03-01 | Qualcomm Incorporated | Devices for redundant frame coding and decoding |
US8737645B2 (en) * | 2012-10-10 | 2014-05-27 | Archibald Doty | Increasing perceived signal strength using persistence of hearing characteristics |
PT2959482T (en) | 2013-02-20 | 2019-08-02 | Fraunhofer Ges Forschung | Apparatus and method for encoding or decoding an audio signal using a transient-location dependent overlap |
EP2830058A1 (en) | 2013-07-22 | 2015-01-28 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Frequency-domain audio coding supporting transform length switching |
CN105723456B (en) * | 2013-10-18 | 2019-12-13 | 弗朗霍夫应用科学研究促进协会 | encoder, decoder, encoding and decoding method for adaptively encoding and decoding audio signal |
JP6366706B2 (en) * | 2013-10-18 | 2018-08-01 | フラウンホーファー−ゲゼルシャフト・ツール・フェルデルング・デル・アンゲヴァンテン・フォルシュング・アインゲトラーゲネル・フェライン | Audio signal coding and decoding concept using speech-related spectral shaping information |
RU2691122C1 (en) * | 2018-06-13 | 2019-06-11 | Ордена трудового Красного Знамени федеральное государственное бюджетное образовательное учреждение высшего образования "Московский технический университет связи и информатики" (МТУСИ) | Method and apparatus for companding audio broadcast signals |
US11270714B2 (en) | 2020-01-08 | 2022-03-08 | Digital Voice Systems, Inc. | Speech coding using time-varying interpolation |
US11990144B2 (en) | 2021-07-28 | 2024-05-21 | Digital Voice Systems, Inc. | Reducing perceived effects of non-voice data in digital speech |
Family Cites Families (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3706929A (en) * | 1971-01-04 | 1972-12-19 | Philco Ford Corp | Combined modem and vocoder pipeline processor |
US3982070A (en) * | 1974-06-05 | 1976-09-21 | Bell Telephone Laboratories, Incorporated | Phase vocoder speech synthesis system |
US3975587A (en) * | 1974-09-13 | 1976-08-17 | International Telephone And Telegraph Corporation | Digital vocoder |
US4091237A (en) * | 1975-10-06 | 1978-05-23 | Lockheed Missiles & Space Company, Inc. | Bi-Phase harmonic histogram pitch extractor |
US4422459A (en) * | 1980-11-18 | 1983-12-27 | University Patents, Inc. | Electrocardiographic means and method for detecting potential ventricular tachycardia |
ATE15415T1 (en) * | 1981-09-24 | 1985-09-15 | Gretag Ag | METHOD AND DEVICE FOR REDUNDANCY-REDUCING DIGITAL SPEECH PROCESSING. |
AU570439B2 (en) * | 1983-03-28 | 1988-03-17 | Compression Labs, Inc. | A combined intraframe and interframe transform coding system |
NL8400728A (en) * | 1984-03-07 | 1985-10-01 | Philips Nv | DIGITAL VOICE CODER WITH BASE BAND RESIDUCODING. |
US4583549A (en) * | 1984-05-30 | 1986-04-22 | Samir Manoli | ECG electrode pad |
US4622680A (en) * | 1984-10-17 | 1986-11-11 | General Electric Company | Hybrid subband coder/decoder method and apparatus |
US4885790A (en) * | 1985-03-18 | 1989-12-05 | Massachusetts Institute Of Technology | Processing of acoustic waveforms |
US5067158A (en) * | 1985-06-11 | 1991-11-19 | Texas Instruments Incorporated | Linear predictive residual representation via non-iterative spectral reconstruction |
US4879748A (en) * | 1985-08-28 | 1989-11-07 | American Telephone And Telegraph Company | Parallel processing pitch detector |
US4720861A (en) * | 1985-12-24 | 1988-01-19 | Itt Defense Communications A Division Of Itt Corporation | Digital speech coding circuit |
CA1299750C (en) * | 1986-01-03 | 1992-04-28 | Ira Alan Gerson | Optimal method of data reduction in a speech recognition system |
US4797926A (en) * | 1986-09-11 | 1989-01-10 | American Telephone And Telegraph Company, At&T Bell Laboratories | Digital speech vocoder |
US5054072A (en) * | 1987-04-02 | 1991-10-01 | Massachusetts Institute Of Technology | Coding of acoustic waveforms |
US5095392A (en) * | 1988-01-27 | 1992-03-10 | Matsushita Electric Industrial Co., Ltd. | Digital signal magnetic recording/reproducing apparatus using multi-level QAM modulation and maximum likelihood decoding |
US5023910A (en) * | 1988-04-08 | 1991-06-11 | At&T Bell Laboratories | Vector quantization in a harmonic speech coding arrangement |
US4821119A (en) * | 1988-05-04 | 1989-04-11 | Bell Communications Research, Inc. | Method and apparatus for low bit-rate interframe video coding |
US4979110A (en) * | 1988-09-22 | 1990-12-18 | Massachusetts Institute Of Technology | Characterizing the statistical properties of a biological signal |
JP3033060B2 (en) * | 1988-12-22 | 2000-04-17 | 国際電信電話株式会社 | Voice prediction encoding / decoding method |
JPH0782359B2 (en) * | 1989-04-21 | 1995-09-06 | 三菱電機株式会社 | Speech coding apparatus, speech decoding apparatus, and speech coding / decoding apparatus |
WO1990013112A1 (en) * | 1989-04-25 | 1990-11-01 | Kabushiki Kaisha Toshiba | Voice encoder |
US5036515A (en) * | 1989-05-30 | 1991-07-30 | Motorola, Inc. | Bit error rate detection |
US5307441A (en) * | 1989-11-29 | 1994-04-26 | Comsat Corporation | Wear-toll quality 4.8 kbps speech codec |
US5081681B1 (en) * | 1989-11-30 | 1995-08-15 | Digital Voice Systems Inc | Method and apparatus for phase synthesis for speech processing |
US5511073A (en) * | 1990-06-25 | 1996-04-23 | Qualcomm Incorporated | Method and apparatus for the formatting of data for transmission |
US5226108A (en) * | 1990-09-20 | 1993-07-06 | Digital Voice Systems, Inc. | Processing a speech signal with estimated pitch |
US5216747A (en) * | 1990-09-20 | 1993-06-01 | Digital Voice Systems, Inc. | Voiced/unvoiced estimation of an acoustic signal |
US5247579A (en) * | 1990-12-05 | 1993-09-21 | Digital Voice Systems, Inc. | Methods for speech transmission |
US5630011A (en) * | 1990-12-05 | 1997-05-13 | Digital Voice Systems, Inc. | Quantization of harmonic amplitudes representing speech |
US5226084A (en) * | 1990-12-05 | 1993-07-06 | Digital Voice Systems, Inc. | Methods for speech quantization and error correction |
US5233660A (en) * | 1991-09-10 | 1993-08-03 | At&T Bell Laboratories | Method and apparatus for low-delay celp speech coding and decoding |
EP0751496B1 (en) * | 1992-06-29 | 2000-04-19 | Nippon Telegraph And Telephone Corporation | Speech coding method and apparatus for the same |
US5596659A (en) * | 1992-09-01 | 1997-01-21 | Apple Computer, Inc. | Preprocessing and postprocessing for vector quantization |
AU5682494A (en) * | 1992-11-30 | 1994-06-22 | Digital Voice Systems, Inc. | Method and apparatus for quantization of harmonic amplitudes |
US5517511A (en) * | 1992-11-30 | 1996-05-14 | Digital Voice Systems, Inc. | Digital transmission of acoustic signals over a noisy communication channel |
JP2655046B2 (en) * | 1993-09-13 | 1997-09-17 | 日本電気株式会社 | Vector quantizer |
US5704003A (en) * | 1995-09-19 | 1997-12-30 | Lucent Technologies Inc. | RCELP coder |
US5696873A (en) * | 1996-03-18 | 1997-12-09 | Advanced Micro Devices, Inc. | Vocoder system and method for performing pitch estimation using an adaptive correlation sample window |
- 1997
  - 1997-03-14 US US08/818,137 patent/US6131084A/en not_active Expired - Lifetime
- 1998
  - 1998-03-13 CN CN98105557A patent/CN1123866C/en not_active Expired - Lifetime
  - 1998-03-13 JP JP06340098A patent/JP4275761B2/en not_active Expired - Lifetime
  - 1998-03-13 KR KR1019980008546A patent/KR100531266B1/en not_active IP Right Cessation
  - 1998-03-13 FR FR9803119A patent/FR2760885B1/en not_active Expired - Lifetime
  - 1998-03-13 BR BR9803683-1A patent/BR9803683A/en not_active Application Discontinuation
  - 1998-03-13 RU RU98104951/09A patent/RU2214048C2/en active
  - 1998-03-16 GB GB9805682A patent/GB2324689B/en not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
RU2214048C2 (en) | 2003-10-10 |
GB2324689B (en) | 2001-09-19 |
GB2324689A (en) | 1998-10-28 |
BR9803683A (en) | 1999-10-19 |
JPH10293600A (en) | 1998-11-04 |
KR19980080249A (en) | 1998-11-25 |
US6131084A (en) | 2000-10-10 |
FR2760885A1 (en) | 1998-09-18 |
GB9805682D0 (en) | 1998-05-13 |
CN1193786A (en) | 1998-09-23 |
KR100531266B1 (en) | 2006-03-27 |
FR2760885B1 (en) | 2000-12-29 |
JP4275761B2 (en) | 2009-06-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1123866C (en) | | Dual subframe quantization of spectral magnitudes |
CN1154283C (en) | | Coding method and apparatus, and decoding method and apparatus |
US7957963B2 | | Voice transcoder |
CN1158647C (en) | | Spectral magnitude quantization for a speech coder |
CN1132154C (en) | | Multi-channel signal encoding and decoding |
CN1136537C (en) | | Synthesis of speech using regenerated phase information |
JP4218134B2 (en) | | Decoding apparatus and method, and program providing medium |
JP4101957B2 (en) | | Joint quantization of speech parameters |
US8359197B2 | | Half-rate vocoder |
US7840402B2 | | Audio encoding device, audio decoding device, and method thereof |
JP2001222297A (en) | | Multi-band harmonic transform coder |
US8386267B2 | | Stereo signal encoding device, stereo signal decoding device and methods for them |
CN1432176A (en) | | Method and apparatus for predictively quantizing voiced speech |
CN1288557A (en) | | Decoding method and system comprising an adaptive postfilter |
CN1228867A (en) | | Method and apparatus for improving voice quality of tandemed vocoders |
CN1795495A (en) | | Audio encoding device, audio decoding device, audio encoding method, and audio decoding method |
CN104123946A (en) | | System and method for including an identifier with a packet associated with a speech signal |
CN1265217A (en) | | Method and apparatus for speech enhancement in a speech communication system |
JP2004287397A (en) | | Interoperable vocoder |
CN1334952A (en) | | Coded enhancement feature for improved performance in coding communication signals |
CN1200404C (en) | | Relative pulse position in code-excited linear predictive speech coding |
CN1192357C (en) | | Adaptive criterion for speech coding |
US20050228652A1 | | Fixed sound source vector generation method and fixed sound source codebook |
JP2005215502A (en) | | Encoding device, decoding device, and methods thereof |
Naitoh et al. | | Half-rate voice coding system for mobile radio |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| C14 | Grant of patent or utility model | |
| GR01 | Patent grant | |
| CX01 | Expiry of patent term | Granted publication date: 20031008 |