WO2000000963A1 - Codeur vocal - Google Patents
Codeur vocal (Speech coder)
- Publication number
- WO2000000963A1 WO2000000963A1 PCT/JP1999/003492 JP9903492W WO0000963A1 WO 2000000963 A1 WO2000000963 A1 WO 2000000963A1 JP 9903492 W JP9903492 W JP 9903492W WO 0000963 A1 WO0000963 A1 WO 0000963A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound source
- gain
- output
- mode
- audio signal
- Prior art date
Links
- 239000013598 vector Substances 0.000 claims abstract description 106
- 238000013139 quantization Methods 0.000 claims description 98
- 230000003044 adaptive effect Effects 0.000 claims description 75
- 238000001228 spectrum Methods 0.000 claims description 62
- 230000005236 sound signal Effects 0.000 claims description 60
- 238000004364 calculation method Methods 0.000 claims description 32
- 230000003595 spectral effect Effects 0.000 claims description 19
- 230000015572 biosynthetic process Effects 0.000 claims description 9
- 238000003786 synthesis reaction Methods 0.000 claims description 9
- 230000005284 excitation Effects 0.000 claims description 6
- 230000002194 synthesizing effect Effects 0.000 claims 1
- 238000000034 method Methods 0.000 description 15
- 230000004044 response Effects 0.000 description 12
- 238000010586 diagram Methods 0.000 description 10
- 238000004458 analytical method Methods 0.000 description 7
- 239000000284 extract Substances 0.000 description 5
- 230000001186 cumulative effect Effects 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 230000006866 deterioration Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000001308 synthesis method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
- G10L19/107—Sparse pulse excitation, e.g. by using algebraic codebook
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
Definitions
- the present invention relates to an audio encoding device, and more particularly to an audio encoding device for encoding an audio signal at a low bit rate with high quality.
- CELP is described, for example, in the paper "Code-excited linear prediction (CELP): High-quality speech at very low bit rates" by M. Schroeder and B. Atal (Proc. ICASSP, pp. 937-940, 1985).
- the transmitting side first extracts spectral parameters representing the spectral characteristics of the audio signal from the audio signal for each frame (for example, 20 ms) using linear prediction (LPC) analysis.
- LPC linear prediction
- the frame is further divided into subframes (for example, 5 ms), and the adaptive codebook parameters (a gain parameter and a delay parameter corresponding to the pitch period) are extracted for each subframe based on the past sound source signal.
- the pitch of the audio signal of the subframe is predicted by the adaptive codebook.
- an optimum sound source code vector is selected from a sound source codebook (vector quantization codebook) composed of predetermined types of noise signals, and the optimum gain is selected.
- a quantized source signal is obtained.
- the sound source code vector is selected so as to minimize the error power between the signal synthesized with the selected noise signal and the residual signal.
- an index representing the type of the selected code vector, the gain, the spectrum parameters, and the adaptive codebook parameters are combined and transmitted by a multiplexer unit. The description of the receiving side is omitted.
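The analysis-by-synthesis selection of a sound source code vector and its gain described above can be sketched as follows (a minimal NumPy illustration; the codebook contents, the use of the filter impulse response, and all function names are assumptions for exposition, not the patent's implementation):

```python
import numpy as np

def search_codebook(target, h, codebook):
    """Pick the codevector (with its closed-form optimal gain) whose
    synthesized contribution best matches the target residual signal.

    target   : residual after spectral/pitch prediction, length N
    h        : impulse response of the weighted synthesis filter
    codebook : array (K, N) of candidate excitation vectors
    """
    best = (None, None, np.inf)
    for k, c in enumerate(codebook):
        syn = np.convolve(h, c)[: len(target)]     # filtered codevector H*c_k
        e = syn @ syn
        if e == 0.0:
            continue
        corr = target @ syn
        gain = corr / e                             # optimal gain in closed form
        err = target @ target - corr * corr / e     # error power at that gain
        if err < best[2]:
            best = (k, gain, err)
    return best  # (index, gain, error power)
```

With the optimal gain substituted, minimizing the error power reduces to maximizing the normalized correlation, which is why such searches are tractable per subframe.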
- the sound source signal is represented by a plurality of pulses, and the position of each pulse is represented by a predetermined number of bits and transmitted.
- the amplitude of each pulse is limited to +1.0 or -1.0. Therefore, according to the method described in Reference 3, the amount of calculation for the pulse search can be significantly reduced.
- bit rates of 8 kb/s and above can provide good sound quality, but at lower bit rates the sound quality of the background-noise portion of the coded speech deteriorates markedly, especially when background noise is superimposed on the speech.
- the reason is as follows.
- the sound source signal is represented by a combination of a plurality of pulses. In vowel sections of speech, the pulses concentrate around the pitch pulse that starts each pitch period, so the speech signal can be represented efficiently with a small number of pulses.
- for a random signal such as background noise, however, pulses must be generated at random positions, so it is difficult to represent the noise well with a small number of pulses. If the bit rate is lowered and the number of pulses reduced, sound quality for background noise therefore deteriorates rapidly.
- an object of the present invention is to solve the above-mentioned problems and to provide a speech coding apparatus that requires a relatively small amount of computation even at low bit rates and, in particular, suffers less degradation of sound quality for background noise. Disclosure of the Invention
- a speech encoding device includes: a spectrum parameter calculation unit that receives a speech signal, obtains spectrum parameters, and quantizes them;
- an adaptive codebook unit that obtains a delay and a gain from past quantized sound source signals using an adaptive codebook, predicts the speech signal, and obtains a residual;
- a discrimination unit that extracts a feature of the speech signal and determines a mode;
- a sound source quantization unit that, when the output of the discrimination unit is a predetermined mode, represents the sound source signal by a combination of a plurality of non-zero pulses, has a codebook for collectively quantizing the amplitudes or polarities of the pulses, searches combinations of the code vectors stored in the codebook and a plurality of shift amounts that shift the pulse positions, and outputs the combination of code vector and shift amount that minimizes distortion with respect to the input speech; and
- a multiplexer unit that combines and outputs the output of the spectrum parameter calculation unit, the output of the discrimination unit, the output of the adaptive codebook unit, and the output of the sound source quantization unit.
- a speech encoding device includes: a discrimination unit that extracts a feature from the speech signal and determines a mode;
- a sound source quantization unit that, when the output of the discrimination unit is a predetermined mode, represents the sound source signal by a combination of a plurality of non-zero pulses, has a codebook for collectively quantizing the amplitudes or polarities of the pulses, generates the pulse positions according to a predetermined rule, and selects the code vector that minimizes distortion with respect to the input speech; and
- a multiplexer unit that combines and outputs the output of the spectrum parameter calculation unit, the output of the discrimination unit, the output of the adaptive codebook unit, and the output of the sound source quantization unit.
- a speech encoding device includes: a discrimination unit that extracts a feature from the speech signal and determines a mode;
- a sound source quantization unit that, when the output of the discrimination unit is a predetermined mode, represents the sound source signal by a combination of a plurality of non-zero pulses, has a codebook for collectively quantizing the amplitudes or polarities of the pulses and a gain codebook for quantizing the gain, searches combinations of the code vectors stored in the codebook, a plurality of shift amounts that shift the pulse positions, and the gain code vectors stored in the gain codebook, and outputs the combination of code vector, shift amount, and gain code vector that minimizes distortion with respect to the input speech; and
- a multiplexer unit that combines and outputs the output of the spectrum parameter calculation unit, the output of the discrimination unit, the output of the adaptive codebook unit, and the output of the sound source quantization unit.
- a speech encoding apparatus includes: a discrimination unit that extracts a feature from the speech signal and determines a mode;
- a sound source quantization unit that, when the output of the discrimination unit is a predetermined mode, has a codebook for collectively quantizing the amplitudes or polarities of the pulses and a gain codebook for quantizing the gain, generates the pulse positions according to a predetermined rule, and outputs the combination of code vector and gain code vector that minimizes distortion with respect to the input speech; and
- a multiplexer unit that combines and outputs the output of the spectrum parameter calculation unit, the output of the discrimination unit, the output of the adaptive codebook unit, and the output of the sound source quantization unit.
- FIG. 2 is a block diagram showing the configuration of a second embodiment of the present invention
- FIG. 3 is a block diagram showing a configuration of a third embodiment of the present invention.
- FIG. 4 is a block diagram showing a configuration of a fourth embodiment of the present invention.
- FIG. 5 is a block diagram showing a configuration of a fifth embodiment of the present invention. BEST MODE FOR CARRYING OUT THE INVENTION
- a mode discrimination circuit extracts a feature amount from the speech signal and discriminates the mode based on the feature amount.
- the sound source quantization circuit (350 in FIG. 1) collectively quantizes the amplitudes or polarities of a plurality of pulses using a codebook (351, 352 in FIG. 1).
- the search combines each code vector stored in the codebook with each of a plurality of shift amounts that temporally shift the predetermined pulse positions, and selects the combination of code vector and shift amount that minimizes distortion with respect to the input speech.
- the gain quantization circuit (365 in FIG. 1) quantizes the gains of the adaptive codebook and the sound source using a gain codebook (380 in FIG. 1).
- the multiplexer unit (400 in FIG. 1) combines and outputs the output of the spectrum parameter quantization unit (210 in FIG. 1), the output of the mode discrimination unit (800 in FIG. 1), the output of the adaptive codebook circuit (500 in FIG. 1), the output of the sound source quantization unit (350 in FIG. 1), and the output of the gain quantization circuit.
- the demultiplexer unit (510 in FIG. 5) separates the code sequence input from the input terminal and outputs, separately, the spectrum parameters, the adaptive codebook delay, the adaptive codebook and sound source gains, the amplitude or polarity code vector as the sound source information, and the code representing the pulse positions.
- the mode discrimination unit (530 in FIG. 5) discriminates the mode using the past quantized gain in the adaptive codebook.
- FIG. 1 is a block diagram showing the configuration of one embodiment of the speech encoding device of the present invention.
- the frame division circuit 110 divides the audio signal into frames (for example, 20 ms), and the subframe division circuit 120 divides the frame signal into subframes (for example, 5 ms) shorter than the frame.
- well-known methods such as LPC (Linear Predictive Coding) analysis or Burg analysis can be used to calculate the spectrum parameters.
- here, the Burg analysis is used. For details of the Burg analysis, see Nakamizo, "Signal Analysis and System Identification" (Corona, 1988), pp. 82-87 (hereinafter Reference 4). The description of Reference 4 is incorporated herein by reference.
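The Burg analysis referred to in Reference 4 can be sketched as follows (a standard lattice-recursion implementation in NumPy; a simplified illustration, not the circuit's actual code):

```python
import numpy as np

def burg_lpc(x, order):
    """Estimate linear prediction coefficients a = [1, a1, ..., ap] by the
    Burg (maximum-entropy) method: each stage picks the reflection
    coefficient minimizing the summed forward + backward error power."""
    f = np.asarray(x, dtype=float).copy()   # forward prediction errors
    b = f.copy()                            # backward prediction errors
    a = np.array([1.0])
    for _ in range(order):
        fp, bp = f[1:], b[:-1]
        k = -2.0 * (fp @ bp) / (fp @ fp + bp @ bp)   # reflection coefficient
        a = np.concatenate([a, [0.0]])
        a = a + k * a[::-1]                 # Levinson-style coefficient update
        f, b = fp + k * bp, bp + k * fp     # update error sequences
    return a
```

On a synthetic AR process of matching order, the recovered coefficients approach the true generating coefficients as the frame length grows.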
- the conversion from linear prediction coefficients to LSP is described in a paper by Sugamura et al.
- the LSP of the fourth subframe is output to spectrum parameter quantization circuit 210.
- the spectrum parameter quantization circuit 210 efficiently quantizes the LSP parameter of a predetermined subframe and outputs a quantization value that minimizes the distortion of the following equation (1).
- LSP(i), QLSP(i)_j, and W(i) are the i-th LSP before quantization, the i-th element of the j-th code vector after quantization, and the weighting coefficient, respectively.
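The weighted-distortion search of equation (1) can be illustrated as follows (a minimal sketch; the codebook and weights are placeholders):

```python
import numpy as np

def quantize_lsp(lsp, weights, codebook):
    """Select the codebook entry minimizing the weighted distortion of
    equation (1): D_j = sum_i W(i) * (LSP(i) - QLSP(i)_j)**2."""
    d = ((lsp - codebook) ** 2 * weights).sum(axis=1)  # D_j for every entry j
    j = int(np.argmin(d))
    return j, codebook[j], float(d[j])
```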
- (see Japanese Patent Application Laid-Open No. Hei 5-6199; Japanese Patent Application No. 3-155049)
- the spectrum parameter quantization circuit 210 restores the LSP parameters of the first to fourth subframes based on the LSP parameters quantized in the fourth subframe.
- the spectrum parameter quantization circuit 210 linearly interpolates the quantized LSP parameters of the fourth subframe of the current frame and of the fourth subframe of the previous frame to recover the LSPs of the first to third subframes.
- the spectrum parameter quantization circuit 210 selects the code vector that minimizes the error power between the LSP before quantization and the LSP after quantization, and then restores the LSPs of the first to fourth subframes by linear interpolation. To further improve performance, the circuit can instead select a plurality of candidate code vectors that reduce the error power, evaluate the cumulative distortion for each candidate, and select the pair of candidate and interpolated LSP that minimizes the cumulative distortion.
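The subframe interpolation described above can be sketched as follows (the uniform linear weighting schedule is an assumption for illustration):

```python
import numpy as np

def interpolate_lsp(prev_q, curr_q, n_sub=4):
    """Linearly interpolate quantized LSPs between the 4th subframe of the
    previous frame (prev_q) and the 4th subframe of the current frame
    (curr_q), recovering LSPs for subframes 1..n_sub of the current frame."""
    out = []
    for i in range(1, n_sub + 1):
        w = i / n_sub                 # subframe n_sub reproduces curr_q exactly
        out.append((1.0 - w) * prev_q + w * curr_q)
    return np.stack(out)
```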
- Reference 10 Japanese Patent Application No. 5-8737
- the response signal calculation circuit 240 receives the linear prediction coefficients α_i for each subframe from the spectrum parameter calculation circuit 200, and receives the coefficients restored by quantization and interpolation from the spectrum parameter quantization circuit 210.
- the response signal x z (n) is expressed by the following equation.
- N indicates the subframe length.
- γ is a weighting coefficient that controls the amount of perceptual weighting, and takes the same value as in equation (7) below.
- s_w(n) and p(n) denote, respectively, the output signal of the weighting signal calculation circuit and the output signal of the denominator filter of the first term on the right-hand side of equation (7) described below.
- the impulse response calculation circuit 310 calculates the impulse response h_w(n) of the perceptual weighting filter, whose transfer function H_w(z) is given by equation (6) below, for a predetermined number of points L, and outputs it to the adaptive codebook circuit 500 and the sound source quantization circuit 350.
- the mode discriminating circuit 800 uses the output signal of the sub-frame dividing circuit 120 to extract a feature amount and discriminate between voiced and unvoiced for each sub-frame.
- a pitch prediction gain can be used as a feature.
- the mode discrimination circuit 800 compares the pitch prediction gain obtained for each subframe with a predetermined threshold, and determines that the subframe is voiced if the pitch prediction gain exceeds the threshold, and unvoiced otherwise.
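The voiced/unvoiced decision based on pitch prediction gain can be sketched as follows (the single-tap predictor and the threshold value are illustrative assumptions, not values from the patent):

```python
import numpy as np

def pitch_prediction_gain_db(x, t):
    """Prediction gain (dB) of a one-tap pitch predictor with lag t:
    G = 10*log10(P / E), where P is the signal power and E is the power
    left after removing the best-gain prediction from x[n-t]."""
    cur, past = x[t:], x[:-t]
    p = cur @ cur
    e = p - (cur @ past) ** 2 / (past @ past)   # residual power at optimal gain
    return 10.0 * np.log10(p / max(e, 1e-12))

def discriminate_mode(x, lag, threshold_db=3.5):
    """Voiced if the pitch prediction gain exceeds a threshold
    (3.5 dB is an illustrative choice)."""
    return "voiced" if pitch_prediction_gain_db(x, lag) > threshold_db else "unvoiced"
```

A strongly periodic subframe at the correct lag yields a large gain, while white-noise-like material yields a gain near 0 dB, which is what makes a simple threshold workable.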
- the mode discrimination circuit 800 outputs voiced / unvoiced discrimination information to the sound source quantization circuit 350, the gain quantization circuit 365, and the multiplexer 400.
- the symbol * denotes a convolution operation
- adaptive codebook circuit 500 performs pitch prediction according to the following equation (10), and outputs prediction residual signal e w (n) to sound source quantization circuit 350.
- e_w(n) = x'_w(n) - β · v(n - T) * h_w(n), where β and T are the adaptive codebook gain and delay, v(n) is the past sound source signal, and h_w(n) is the perceptually weighted impulse response
- the sound source quantization circuit 350 receives the voiced / unvoiced discrimination information from the mode discrimination circuit 800, and switches between voiced and unvoiced pulses.
- for voiced speech, a B-bit amplitude codebook or polarity codebook is provided for collectively quantizing the amplitudes of M pulses at a time.
- here, the polarity codebook is used.
- this polarity codebook is stored in the sound source codebook 351 for the voiced case and in the sound source codebook 352 for the unvoiced case.
- the sound source quantization circuit 350 reads the polarity code vector from the sound source codebook 351 and fits the position to each code vector.
- it selects the combination of code vector and positions that minimizes the distortion D_k of equation (11).
- h w (n) is the auditory weighted impulse response.
- s_wk(m_i) is calculated as the second term of the sum on the right-hand side of equation (11), that is, the sum of g'_ik · h_w(n - m_i).
- the sound source quantization circuit 350 outputs the index representing the code vector to the multiplexer 400. Further, the sound source quantization circuit 350 quantizes the position of the pulse with a predetermined number of bits, and outputs an index representing the position to the multiplexer 400.
- the positions of the pulses are determined at fixed intervals, and the shift amount for shifting the position of the entire pulse is determined.
- the sound source quantization circuit 350 can use four shift amounts (shift 0, shift 1, shift 2, and shift 3), shifting one sample at a time. In this case, the circuit quantizes the shift amount with two bits and transmits it.
- the sound source quantization circuit 350 reads the polarity code vectors from the polarity codebook 352 and, for every shift amount, searches all combinations of shift amount and code vector, selecting the combination of shift amount δ(j) and code vector g_k that minimizes the distortion D_{k,j} of equation (15) below.
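The joint search over polarity code vectors and shift amounts can be sketched as follows (grid spacing, codebook size, and names are illustrative; for simplicity the distortion is evaluated in the waveform domain at the optimal gain, rather than through the weighted synthesis filter of equation (15)):

```python
import numpy as np

def search_pulses(target, grid, polarity_cb, shifts=(0, 1, 2, 3)):
    """Jointly search polarity codevectors and whole-pattern shift amounts.

    target      : signal to match, length n
    grid        : base pulse positions at fixed intervals, e.g. [0, 4, 8, ...]
    polarity_cb : array (K, M) of +/-1 patterns for the M pulses
    shifts      : candidate shifts of the entire pulse pattern (2 bits -> 4)
    """
    n = len(target)
    best = (None, None, np.inf)
    for j in shifts:
        pos = [p + j for p in grid if p + j < n]
        for k, pol in enumerate(polarity_cb):
            v = np.zeros(n)
            v[pos] = pol[: len(pos)]                 # unit pulses with polarity
            e = v @ v
            if e == 0.0:
                continue
            corr = target @ v
            err = target @ target - corr * corr / e  # distortion at optimal gain
            if err < best[2]:
                best = (k, j, err)
    return best  # (codevector index, shift amount, distortion)
```

Because only K × 4 sparse candidates are evaluated, the search stays cheap compared with free per-pulse position coding.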
- the sound source quantization circuit 350 outputs to the multiplexer 400 an index representing the selected vector and a code representing the shift amount.
- a codebook for quantizing the amplitude of a plurality of pulses may be learned and stored in advance using an audio signal.
- codebook learning methods are described, for example, in the paper by Linde et al., "An algorithm for vector quantizer design" (IEEE Trans. Commun., pp. 84-95, January 1980: hereinafter Reference 12). Reference 12 is incorporated herein by reference.
- the information on the amplitude and position in the voiced / unvoiced state is output to the gain quantization circuit 365.
- the gain quantization circuit 365 receives the amplitude and position information from the sound source quantization circuit 350, and the voiced/unvoiced discrimination information from the mode discrimination circuit 800.
- the gain quantization circuit 365 reads the gain code vectors from the gain codebook 380 and, for the selected amplitude or polarity code vector and positions, selects the gain code vector that minimizes the distortion D_k of equation (16) below.
- the gain quantization circuit 365 simultaneously vector-quantizes both the gain of the adaptive codebook and the gain of the sound source expressed in pulses.
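The simultaneous vector quantization of the adaptive codebook gain β and the sound source gain G can be sketched as follows (a simplified waveform-domain version of the equation (16) criterion; the codebook contents and names are placeholders):

```python
import numpy as np

def quantize_gains(target, adaptive_contrib, pulse_contrib, gain_cb):
    """Pick (beta_k, G_k) from a 2-D gain codebook minimizing
    || target - beta_k * adaptive_contrib - G_k * pulse_contrib ||^2."""
    best_k, best_d = -1, np.inf
    for k, (beta, g) in enumerate(gain_cb):
        r = target - beta * adaptive_contrib - g * pulse_contrib
        d = r @ r
        if d < best_d:
            best_k, best_d = k, d
    return best_k, best_d
```

Quantizing both gains as one 2-D vector, rather than scalar-quantizing each separately, lets the codebook exploit the correlation between the pitch and pulse contributions.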
- when the discrimination information indicates voiced, the gain quantization circuit 365 obtains the gain code vector that minimizes D_k of equation (16).
- β_k and G_k are the k-th code vector in the two-dimensional gain codebook stored in the gain codebook 380.
- the gain quantization circuit 365 outputs an index representing the selected gain code vector to the multiplexer 400.
- when the discrimination information indicates unvoiced, the gain quantization circuit 365 likewise searches for a gain code vector.
- the gain quantization circuit 365 outputs an index representing the selected gain code vector to the multiplexer 400.
- the weighted signal calculation circuit 360 receives the voiced / unvoiced discrimination information and the respective indexes, and reads the corresponding code vector from the index. In the case of voiced, the weighting signal calculation circuit 360 calculates the driving sound source signal V based on the following equation (18).
- v (n) is output to adaptive codebook circuit 500.
- the weighting signal calculation circuit 360 obtains the driving sound source signal V (n) based on the following equation (19).
- v(n) is output to the adaptive codebook circuit 500.
- the weighting signal calculation circuit 360 uses the output parameters of the spectrum parameter calculation circuit 200 and of the spectrum parameter quantization circuit 210 to calculate the weighted signal s_w(n) for each subframe by equation (20) below, and outputs it to the response signal calculation circuit 240.
- FIG. 2 is a block diagram showing the configuration of the second embodiment of the present invention.
- the operation of the sound source quantization circuit 355 differs from that of the first embodiment: in the second embodiment, when the voiced/unvoiced discrimination information indicates unvoiced, positions generated according to a predetermined rule are used as the pulse positions.
- the positions of a predetermined number M1 of pulses are generated by the random number generation circuit 600; that is, the M1 numbers generated by the random number generator 600 are taken as pulse positions. The M1 positions thus generated are output to the sound source quantization circuit 355.
- the sound source quantization circuit 355 performs the same operation as the sound source quantization circuit 350 in FIG. 1 when the discrimination information is voiced, and performs the sound source quantization for the position output from the random number generation circuit 600 when the discrimination information is unvoiced.
- the sound source quantization circuit 356 computes the distortion of equation (21) below for all combinations of the code vectors of the sound source codebook 352 and the shift amounts of the pulse positions, selects a plurality of combinations in increasing order of D_{k,j}, and outputs them to the gain quantization circuit 366.
- the gain quantization circuit 366 quantizes the gain using the gain codebook 380 for each of the combinations output by the sound source quantization circuit 356, and selects the combination of shift amount, sound source code vector, and gain code vector that minimizes D_{k,j} of equation (22) below.
- FIG. 4 is a block diagram showing a configuration of a fourth embodiment of the present invention.
- when the voiced/unvoiced discrimination information indicates unvoiced, the sound source quantization circuit 357 collectively quantizes the amplitudes or polarities of the pulses using the sound source codebook 352, based on the pulse positions generated by the random number generator 600, and outputs all code vectors or a plurality of code vector candidates to the gain quantization circuit 367.
- the gain quantization circuit 367 quantizes the gain of each candidate output from the sound source quantization circuit 357 using the gain codebook 380, and outputs the combination of code vector and gain code vector that minimizes the distortion.
- FIG. 5 is a block diagram showing a configuration of a fifth embodiment of the present invention.
- the demultiplexer 510 separates the code sequence input from the input terminal 500 and outputs, separately, the spectrum parameters, the adaptive codebook delay, the adaptive codebook and sound source gains, the amplitude or polarity code vector as the sound source information, and the code indicating the pulse positions.
- the gain decoding circuit 510 uses the gain codebook 380 to decode and output the adaptive codebook and the gain of the sound source.
- the adaptive codebook circuit 520 decodes the delay and the gain of the adaptive code vector, and generates an adaptive codebook reproduction signal using the synthesis filter input signal in the past subframe.
- the mode discriminating circuit 530 uses the adaptive codebook gain decoded in the past subframe, compares it with a predetermined threshold value, and judges whether the current subframe is voiced or unvoiced.
- the voiceless discrimination information is output to the sound source signal restoration circuit 540.
- the sound source signal restoration circuit 540 receives the voiced/unvoiced discrimination information; when voiced, it decodes the pulse positions, reads the code vector from the sound source codebook 351, assigns the amplitudes or polarities, and generates a predetermined number of pulses per subframe to recover the sound source signal.
- when unvoiced, the sound source signal restoration circuit 540 restores the sound source signal by generating pulses from the predetermined pulse positions, the shift amount, and the amplitude or polarity code vector.
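The decoder-side restoration of the sound source signal can be sketched as follows (signal lengths and names are illustrative):

```python
import numpy as np

def restore_excitation(n, positions, polarities, g_pulse, adaptive_vec, g_adaptive):
    """Rebuild the driving excitation: gain-scaled +/-1 pulses at the decoded
    positions plus the gain-scaled adaptive-codebook vector."""
    v = np.zeros(n)
    for p, s in zip(positions, polarities):
        v[p] += g_pulse * s              # place each pulse with its polarity
    v += g_adaptive * adaptive_vec       # add the pitch-predictive component
    return v
```

The result is then fed through the synthesis filter to reproduce the speech, mirroring the encoder's analysis-by-synthesis model.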
- the spectrum parameter decoding circuit 570 decodes the spectrum parameter and outputs it to the synthesis filter circuit 560.
- the adder 550 adds the adaptive codebook output signal and the output signal of the sound source signal restoration circuit 540, and outputs the result to the synthesis filter circuit 560.
- the synthesis filter circuit 560 receives the output of the adder 550, reproduces the sound, and outputs it from the terminal 580.
- as described above, the mode is determined based on the past quantized gain of the adaptive codebook, and in a predetermined mode the amplitudes or polarities of a plurality of pulses are quantized collectively.
- a search combines each code vector stored in the codebook with each of a plurality of shift amounts that temporally shift the predetermined pulse positions, minimizing the distortion with respect to the input speech.
- alternatively, the search combines each code vector, each of the plurality of shift amounts, and each gain code vector stored in a gain codebook for quantizing the gains; since the combination of code vector, shift amount, and gain code vector that minimizes the distortion is selected, background noise can be encoded satisfactorily even at a low bit rate.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Mathematical Physics (AREA)
- Pure & Applied Mathematics (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
Claims
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/720,767 US6973424B1 (en) | 1998-06-30 | 1999-06-29 | Voice coder |
EP99957654A EP1093230A4 (en) | 1998-06-30 | 1999-06-29 | speech |
CA002336360A CA2336360C (en) | 1998-06-30 | 1999-06-29 | Speech coder |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP18517998 | 1998-06-30 | ||
JP10/185179 | 1998-06-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2000000963A1 true WO2000000963A1 (fr) | 2000-01-06 |
Family
ID=16166231
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP1999/003492 WO2000000963A1 (fr) | 1998-06-30 | 1999-06-29 | Codeur vocal |
Country Status (4)
Country | Link |
---|---|
US (1) | US6973424B1 (ja) |
EP (1) | EP1093230A4 (ja) |
CA (1) | CA2336360C (ja) |
WO (1) | WO2000000963A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002071394A1 (en) * | 2001-03-07 | 2002-09-12 | Nec Corporation | Sound encoding apparatus and method, and sound decoding apparatus and method |
JP2003532149A (ja) * | 2000-04-24 | 2003-10-28 | クゥアルコム・インコーポレイテッド | 音声発話を予測的に量子化するための方法および装置 |
JP6996185B2 (ja) | 2017-09-15 | 2022-01-17 | 富士通株式会社 | 発話区間検出装置、発話区間検出方法及び発話区間検出用コンピュータプログラム |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8306813B2 (en) * | 2007-03-02 | 2012-11-06 | Panasonic Corporation | Encoding device and encoding method |
US20090319261A1 (en) * | 2008-06-20 | 2009-12-24 | Qualcomm Incorporated | Coding of transitional speech frames for low-bit-rate applications |
US20090319263A1 (en) * | 2008-06-20 | 2009-12-24 | Qualcomm Incorporated | Coding of transitional speech frames for low-bit-rate applications |
US8768690B2 (en) * | 2008-06-20 | 2014-07-01 | Qualcomm Incorporated | Coding scheme selection for low-bit-rate applications |
US8862465B2 (en) * | 2010-09-17 | 2014-10-14 | Qualcomm Incorporated | Determining pitch cycle energy and scaling an excitation signal |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05281999A (ja) * | 1992-04-02 | 1993-10-29 | Sharp Corp | 巡回符号帳を用いる音声符号化装置 |
JPH09179593A (ja) * | 1995-12-26 | 1997-07-11 | Nec Corp | 音声符号化装置 |
JPH10133696A (ja) * | 1996-10-31 | 1998-05-22 | Nec Corp | 音声符号化装置 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4220819A (en) * | 1979-03-30 | 1980-09-02 | Bell Telephone Laboratories, Incorporated | Residual excited predictive speech coding system |
JP3114197B2 (ja) | 1990-11-02 | 2000-12-04 | NEC Corporation | Speech parameter encoding method |
JP3151874B2 (ja) | 1991-02-26 | 2001-04-03 | NEC Corporation | Speech parameter coding system and apparatus |
JP3143956B2 (ja) | 1991-06-27 | 2001-03-07 | NEC Corporation | Speech parameter coding system |
JP2746039B2 (ja) | 1993-01-22 | 1998-04-28 | NEC Corporation | Speech coding system |
JP3144284B2 (ja) * | 1995-11-27 | 2001-03-12 | NEC Corporation | Speech coding device |
US6393391B1 (en) * | 1998-04-15 | 2002-05-21 | Nec Corporation | Speech coder for high quality at low bit rates |
JPH10124091A (ja) | 1996-10-21 | 1998-05-15 | Matsushita Electric Ind Co Ltd | 音声符号化装置および情報記憶媒体 |
US6148282A (en) * | 1997-01-02 | 2000-11-14 | Texas Instruments Incorporated | Multimodal code-excited linear prediction (CELP) coder and method using peakiness measure |
1999
- 1999-06-29 CA CA002336360A patent/CA2336360C/en not_active Expired - Fee Related
- 1999-06-29 US US09/720,767 patent/US6973424B1/en not_active Expired - Fee Related
- 1999-06-29 WO PCT/JP1999/003492 patent/WO2000000963A1/ja active Application Filing
- 1999-06-29 EP EP99957654A patent/EP1093230A4/en not_active Withdrawn
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05281999A (ja) * | 1992-04-02 | 1993-10-29 | Sharp Corp | Speech coding device using a cyclic codebook |
JPH09179593A (ja) * | 1995-12-26 | 1997-07-11 | Nec Corp | Speech coding device |
JPH10133696A (ja) * | 1996-10-31 | 1998-05-22 | Nec Corp | Speech coding device |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003532149A (ja) * | 2000-04-24 | 2003-10-28 | Qualcomm Incorporated | Method and apparatus for predictively quantizing voiced speech |
US8660840B2 (en) | 2000-04-24 | 2014-02-25 | Qualcomm Incorporated | Method and apparatus for predictively quantizing voiced speech |
WO2002071394A1 (en) * | 2001-03-07 | 2002-09-12 | Nec Corporation | Sound encoding apparatus and method, and sound decoding apparatus and method |
US7680669B2 (en) | 2001-03-07 | 2010-03-16 | Nec Corporation | Sound encoding apparatus and method, and sound decoding apparatus and method |
JP6996185B2 (ja) | 2017-09-15 | 2022-01-17 | Fujitsu Limited | Speech segment detection device, speech segment detection method, and computer program for speech segment detection |
Also Published As
Publication number | Publication date |
---|---|
CA2336360C (en) | 2006-08-01 |
CA2336360A1 (en) | 2000-01-06 |
EP1093230A1 (en) | 2001-04-18 |
US6973424B1 (en) | 2005-12-06 |
EP1093230A4 (en) | 2005-07-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP3180762B2 (ja) | Speech coding device and speech decoding device | |
JPH0353300A (ja) | Speech coding device | |
JPH0990995A (ja) | Speech coding device | |
JPH09281998A (ja) | Speech coding device | |
JP3582589B2 (ja) | Speech coding device and speech decoding device | |
JP3266178B2 (ja) | Speech coding device | |
JP3558031B2 (ja) | Speech decoding device | |
WO2000000963A1 (fr) | Speech coder | |
JP3531780B2 (ja) | Speech coding method and decoding method | |
JP2001318698A (ja) | Speech coding device and speech decoding device | |
JP3319396B2 (ja) | Speech coding device and speech coding/decoding device | |
JP3003531B2 (ja) | Speech coding device | |
JP3299099B2 (ja) | Speech coding device | |
JP3144284B2 (ja) | Speech coding device | |
JP2001142499A (ja) | Speech coding device and speech decoding device | |
JP2853170B2 (ja) | Speech coding/decoding system | |
JP3006790B2 (ja) | Speech coding/decoding method and device therefor | |
JP3063087B2 (ja) | Speech coding/decoding device, speech coding device, and speech decoding device | |
JP3845316B2 (ja) | Speech coding device and speech decoding device | |
JP3092654B2 (ja) | Signal coding device | |
JP3563400B2 (ja) | Speech decoding device and speech decoding method | |
JPH0291697A (ja) | Speech coding/decoding system and device therefor | |
JPH0291699A (ja) | Speech coding/decoding system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A1; Designated state(s): CA JP US |
| AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): DE FI FR GB NL SE |
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| WWE | Wipo information: entry into national phase | Ref document number: 1999957654; Country of ref document: EP |
| ENP | Entry into the national phase | Ref document number: 2336360; Country of ref document: CA |
| WWE | Wipo information: entry into national phase | Ref document number: 09720767; Country of ref document: US |
| WWP | Wipo information: published in national office | Ref document number: 1999957654; Country of ref document: EP |