EP0662682A2 - Speech signal coding - Google Patents
- Publication number
- EP0662682A2 (application number EP94120542A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- sound source
- short time
- time interval
- speech signal
- code
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
- G10L19/135—Vector sum excited linear prediction [VSELP]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
- G10L2019/0013—Codebook search algorithms
Definitions
- the present invention relates to speech signal coding, and more particularly, to a method and speech signal coding system for encoding a speech signal with a search region of a sound source code book being limited.
- the encoding process is performed on the transmitting side in accordance with the following procedure.
- a short term prediction is performed.
- a speech signal is divided into a plurality of frames of, for example, 20 ms and then a spectrum parameter indicative of the frequency characteristic of the speech signal is extracted from the speech signal for every frame (This process is referred to as "short term prediction").
- LPC (Linear Predictive Coding)
- the spectrum parameter obtained in the short term prediction is used as coefficients for a synthetic filter in the encoding and decoding processes.
- a long term prediction is performed.
- Each frame is divided into N subframes each having a shorter time interval of, for example, 5 ms.
- a pitch parameter (delay) L indicative of a long term correlation (a pitch correlation) and a gain is determined for every subframe based on the spectrum parameter and a sound source signal obtained from previous sound source codes.
- the long term prediction thus performed is referred to as "an adaptive code book". This is the problem of minimizing a least square error between the speech signal and a synthetic signal h[n] * r[n - L] (the symbol "*" indicates a convolution operation) obtained from a signal r[n - L] which is obtained by delaying a previous sound source signal r[n] by L, as shown in the following equation (1).
- p[n] is a speech signal
- β is an amplitude or gain
- h[n] is an impulse response of a synthetic filter determined based on the result of the short term prediction
- r[n] is a signal for a previous sound source code
- L is a delay value. More specifically, the delay value L and the gain β are determined such that the value V_L, obtained by dividing the square of the cross correlation C_L between the speech signal and the synthetic signal by the self-correlation G_L of the synthetic signal, is maximized while the delay value L is varied in a range of, for example, 20 to 147, which is considered to cover the basic frequency range of speech, as shown in the following equations (2) to (5).
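The delay search described by equations (2) to (5) can be sketched as follows. This is a minimal illustrative reconstruction, not the patent's implementation: the excitation buffer layout, the periodic extension of the past excitation for delays shorter than the subframe, and all names are assumptions.

```python
import numpy as np

def long_term_prediction(p, r, h, l_min=20, l_max=147):
    """Adaptive code book search, a sketch of equations (1)-(5).

    p : target speech samples for the current subframe
    r : buffer of past sound source (excitation) samples, ending at
        the start of the current subframe
    h : impulse response of the weighted synthetic filter
    Returns the delay L and gain beta maximising V_L = C_L**2 / G_L.
    """
    n = len(p)
    best_L, best_V, best_beta = l_min, -np.inf, 0.0
    for L in range(l_min, min(l_max, len(r)) + 1):
        start = len(r) - L
        # r[n - L]: repeat the last L samples periodically when L < n
        delayed = np.array([r[start + (k % L)] for k in range(n)])
        synth = np.convolve(delayed, h)[:n]     # h[n] * r[n - L]
        C = float(np.dot(p, synth))             # cross correlation C_L
        G = float(np.dot(synth, synth))         # self-correlation G_L
        if G > 0.0 and C * C / G > best_V:
            best_V, best_L, best_beta = C * C / G, L, C / G
    return best_L, best_beta
```

With a unit-impulse filter and a target that is an exact copy of the excitation 40 samples back, the search recovers L = 40 with unit gain, matching the maximisation of V_L described above.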
- values of V_j shown in the following equations (7) to (10) are calculated over all the sound source codes while the index j of the sound source code is varied from 0 to 2^B − 1, and then j and γ are determined by finding the j for which V_j takes a maximum value.
- γ = C_j / G_j (10). That is, the search is not performed for a limited search region of sound source codes but over the whole search region of sound source codes.
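The exhaustive search over all 2^B sound source codes can be sketched as follows; this is an illustrative reconstruction under the same criterion V_j = C_j^2 / G_j, and the array shapes and names are assumptions, not the patent's implementation:

```python
import numpy as np

def search_codebook(target, codebook, h):
    """Exhaustive sound source code book search, a sketch of (7)-(10).

    target   : difference signal for the subframe (after removing the
               adaptive code book contribution)
    codebook : array of shape (2**B, N), one candidate sound source per row
    h        : impulse response of the weighted synthetic filter
    Returns the index j and gain gamma = C_j / G_j maximising C_j**2 / G_j.
    """
    n = len(target)
    best_j, best_V, best_gamma = 0, -np.inf, 0.0
    for j, code in enumerate(codebook):
        synth = np.convolve(code, h)[:n]        # filtered candidate
        C = float(np.dot(target, synth))        # cross correlation C_j
        G = float(np.dot(synth, synth))         # self-correlation G_j
        if G > 0.0 and C * C / G > best_V:
            best_V, best_j, best_gamma = C * C / G, j, C / G
    return best_j, best_gamma
```

The cost of this loop grows with 2^B, which is the motivation for the limited search region introduced later in the document.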
- the determined index j of a sound source code, gain γ, pitch parameter (delay) L, gain β and spectrum parameter are transmitted.
- VSELP Vector Sum Excited LPC
- This type of speech signal coding system is disclosed in detail in Laid-Open Japanese Patent Application JP-A-Hei2-502135 (the reference 4).
- in this speech signal coding system, the cross correlation C_j between a speech signal and a synthetic signal produced based on previous sound source codes and the self-correlation G_j can be calculated from the following equations (12) to (15) by using the special structure of the sound source code book. As a result, the amount of calculation can be reduced to M/N.
- G_u = G_j + β_uv · R_v (12)
- u and j are sound source codes which are different from each other in only one bit.
- v indicates the position of the differing bit, and u is produced from a Gray code (reflected binary code) represented by the following equation (16).
- u = i ⊕ (i >> 1) (16) where i is an integer in a range of 0 to 2^M − 1, ⊕ represents an exclusive OR, and >> represents a right shift.
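Equation (16) is the standard Gray code construction; a short sketch below verifies the one-bit-difference property that the incremental update of G_j relies on (the helper names are my own):

```python
def gray(i: int) -> int:
    """Equation (16): u = i XOR (i >> 1), the reflected binary (Gray) code."""
    return i ^ (i >> 1)

def differs_in_one_bit(a: int, b: int) -> bool:
    """True when a and b differ in exactly one bit position."""
    d = a ^ b
    return d != 0 and (d & (d - 1)) == 0

# Consecutive Gray codes differ in exactly one bit, which is what lets
# the self-correlation G be updated incrementally while scanning the
# 2**M sound source codes instead of being recomputed from scratch.
```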
- the speech signal coding system includes a buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100.
- a FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy.
- the reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter.
- An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter.
- An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360.
- the interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GPSO gain code searching circuit 330, which will all be described later in detail.
- a dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes.
- the acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracting circuit 185.
- the influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240.
- a subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190.
- the adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230.
- the base vector table 310 stores base vectors, various combinations of which represent the sound source codes of a sound source code book. Each of the base vectors is assigned an index.
- the VSELP sound source code searching circuit 340 searches the base vector table 310 for an optimal sound source code using all the base vectors in response to the interpolated α parameter from the interpolating circuit 160, the subtracted signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
- the searched sound source code is supplied to the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230, and the indexes for the base vectors used in the searched sound source code are supplied to the multiplexer 250.
- a GPSO gain code book 280 is a code book which stores gains of the sound source of the adaptive code book and the sound source code of the sound source code book.
- the GPSO gain code searching circuit 330 searches for each of the gains of the sound source of the adaptive code book and the sound source code, using the difference signal from the subtracter 185, the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 340, and the pitch parameter from the adaptive code book 190.
- the searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
- the sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340, and the gains from the searching circuit 330. It produces a sound source signal for the subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain term and adding the multiplied results to each other, and stores the result as a previous sound source signal.
- the stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190.
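The production of the sound source signal described above (gain-scaled adaptive code book contribution plus gain-scaled sound source code) can be sketched in one line; the function name is my own:

```python
import numpy as np

def produce_sound_source(adaptive_vec, code_vec, beta, gamma):
    """Sketch of the sound source signal producing circuit 230: scale
    the adaptive code book contribution by its gain beta and the sound
    source code by its gain gamma, then add them to form the excitation
    for the subframe."""
    return beta * np.asarray(adaptive_vec, dtype=float) \
         + gamma * np.asarray(code_vec, dtype=float)
```

The resulting vector is what is fed back into the adaptive code book as the "previous sound source signal" for the next subframe.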
- the weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization.
- the multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GPSO gain code searching circuit 330.
- In a speech signal coding system of the CELP type, a very large amount of calculation is required, so that a digital signal processor must operate with a clock signal having a frequency as high as 30 MHz in order to implement an apparatus. As a result, the power consumption of the digital signal processor increases remarkably. In the case of a portable telephone, which cannot have a large battery, the battery would be exhausted after about 45 minutes of operation.
- An object of the present invention is to provide a method of encoding a speech signal with good quality in spite of a low bit rate, while allowing communication for a longer time.
- Another object of the present invention is to provide a speech signal coding system for achieving the method.
- the method of encoding a speech signal includes the steps of: providing a sound source code book for a plurality of sound source codes; providing a gain code book storing gain codes; performing a short term prediction for each of long time intervals of a speech signal to produce a first parameter; performing a long term prediction for each of short time intervals of the each long time interval of a difference signal between the speech signal and a signal obtained based on the first parameter and previous sound source codes to produce a second parameter; designating a search region of the sound source code book; searching the designated search region of the sound source code book for a sound source code optimal to said each short time interval of the difference signal to determine a third parameter for the optimal sound source code; searching the gain code book to determine a fourth parameter for gain of the second and third parameters; and outputting a combination of the first to fourth parameters.
- a speech signal coding system in which a speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction
- the speech signal coding system includes a sound source code book for a plurality of sound source codes, a designating section for designating a search region, and a searching section for searching the designated search region of the sound source code book for an optimal sound source code for a current short time interval, the speech signal being divided into portions for a plurality of long time intervals, each long time interval being divided into a plurality of short time intervals.
- the search region of a sound source code for a current subframe is designated and limited to a range of a predetermined fixed number of sound source codes before and after the sound source code for the subframe immediately preceding the current subframe, for example, the range of 10 sound source codes in each of the directions before and after the previous subframe sound source code.
- when a noise sequence is used as a sound source code book, it could be considered that the current subframe sound source code is very similar to the previous subframe sound source code. Therefore, even if the search region is limited to a predetermined range with respect to the previous subframe sound source code, a sound source code optimal for the current subframe could be found in the limited search region.
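The limitation of the search region to a fixed window around the previous subframe's code index can be sketched as follows (the function name, the code book size, and the default half-width are illustrative assumptions):

```python
def designate_search_region(prev_index, book_size, half_width=10):
    """Sketch of the search region designating circuit 200: limit the
    search to half_width sound source codes on each side of the index
    used for the immediately preceding subframe, clipped to the code
    book boundaries."""
    lo = max(0, prev_index - half_width)
    hi = min(book_size - 1, prev_index + half_width)
    return range(lo, hi + 1)
```

Only the indices in the returned range are then evaluated by the sound source code searching circuit, reducing the search from 2^B codes to at most 2 × half_width + 1.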
- the speech signal coding system includes a buffer circuit 110, a subframe dividing circuit 120, an LPC analyzing circuit 130, a parameter quantizing circuit (Q) 140, a parameter inversely quantizing circuit (Q ⁇ 1) 150, a parameter interpolating circuit 160, an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a subtracter 185, an adaptive code book 190, a search region designating circuit 200, a sound source searching circuit 210, a gain code searching circuit 220, a sound source signal producing circuit 230, a weighting and synthesizing circuit 240, a multiplexer 250, a noise sequence 270 as a sound source code book, and a gain code book 280.
- the buffer circuit 110 receives a speech signal inputted via a terminal 100 and stores it therein.
- the LPC analyzing circuit 130 performs speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to extract LPC coefficients (an α parameter) as a spectrum parameter.
- the parameter quantizing circuit 140 quantizes the LPC coefficients as the α parameter and the quantized coefficients are supplied to the multiplexer 250 as a spectrum parameter.
- the inversely quantizing circuit 150 performs an inverse quantization processing for the quantized coefficients supplied from the quantizing circuit 140 to produce or recover an α parameter, i.e., converts the quantized LPC coefficients into the α parameter.
- the parameter interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the inversely quantizing circuit 150.
- the interpolated α parameter is supplied to various circuits such as the acoustic sensibility weighting circuit 170, the influence signal generating circuit 180, the weighting and synthesizing circuit 240, the adaptive code book 190, the sound source code searching circuit 210 and the gain code searching circuit 220, which will all be described later in detail.
- the dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes.
- the acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to the subtracter circuit 185.
- the influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240.
- the subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the sound source code searching circuit 210, and the gain code searching circuit 220.
- the adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the sound source code searching circuit 210, the gain code searching circuit 220 and the sound source signal producing circuit 230.
- the sound source code book 270 stores a noise sequence as the sound source codes. Each of the sound source codes is assigned with an index.
- the search region designating circuit 200 receives and stores the identifier, i.e., the index, of the sound source code for the subframe immediately preceding the current subframe, and designates the search region for the next subframe based on the stored identifier for the previous subframe in response to the pitch parameter from the adaptive code book 190. In this embodiment, the designating circuit 200 designates as the search region a region of 20 sound source codes before and after the sound source code for the previous subframe.
- the sound source code searching circuit 210 searches the designated search region of the sound source code book 270 for a sound source code optimal for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
- the optimal searched sound source code is supplied to the gain code searching circuit 220, the sound source signal producing circuit 230 and the search region designating circuit 200 and the identifier of the searched sound source code is supplied to the multiplexer 250.
- the gain code book 280 is a code book which stores gains of the sound source of the adaptive code book and the sound source code of the sound source code book.
- the gain code searching circuit 220 searches the gain code book 280 for each of the gains of the sound source of the adaptive code book and the sound source code searched by the searching circuit 210, using the difference signal from the subtracter 185, the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 210, and the pitch parameter from the adaptive code book 190.
- the searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
- the sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 210, and the gains from the searching circuit 220. It produces a sound source signal for the current subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain term and adding the multiplied results to each other, and stores the result as the sound source signal for a previous subframe.
- the stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190.
- the weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization.
- the multiplexer 250 outputs via a terminal 260 a combination of a code sequence of the spectrum parameter from the quantizing circuit 140 (for a short term prediction), a code sequence of the pitch parameter from the adaptive code book 190 (a long term prediction), a code sequence of the sound source code from the sound source code searching circuit 210, and two code sequences of the gains from the gain code searching circuit 220.
- a learned sound source code book 300 is used in place of the sound source code book 270 and stores vectors of a plurality of sound source codes having a subframe length obtained through the vector quantization.
- the learned sound source code book 300 is known to a person skilled in the art and is disclosed in the references 2 and 3. Because the learned sound source code book 300 is used in the second embodiment, it is not ensured that a sound source code adjacent to the sound source code for the previous subframe has a high similarity to that previous subframe sound source code. Therefore, a similar code table 290 is prepared in the embodiment.
- the similar code table 290 is a table for storing, for each of the learned sound source codes of the learned sound source code book 300, the identifiers of a predetermined number of sound source codes similar to that code, for example, 10 sound source codes in this embodiment, in descending order of similarity.
- the search region designating circuit 205 is similar to the search region designating circuit 200 in the first embodiment and refers to the similar code table 290 based on the identifier of the learned sound source code for the previous subframe to designate the search region of the learned sound source code book 300.
- the search region designating circuit 205 receives and stores the identifier, i.e., the index, of the learned sound source code for the subframe immediately preceding the current subframe, and refers to the similar code table 290 to designate the search region of the learned sound source code book 300 for the next subframe based on the stored identifier for the previous subframe in response to the pitch parameter from the adaptive code book 190.
- the designating circuit 205 designates as the search region a region of 10 learned sound source codes written in the similar code table 290 with respect to the learned sound source code for the previous subframe.
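A similar code table of this kind can be prepared offline from the learned code book itself. The sketch below uses normalised correlation as the similarity measure, which is an assumption — the patent does not specify how similarity is computed — and all names are illustrative:

```python
import numpy as np

def build_similar_code_table(codebook, k=10):
    """Sketch of the similar code table 290: for every learned sound
    source code, store the identifiers of the k most similar codes
    (here measured by normalised correlation, an assumed metric) in
    descending order of similarity."""
    norm = codebook / np.linalg.norm(codebook, axis=1, keepdims=True)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)          # exclude the code itself
    order = np.argsort(-sim, axis=1)        # descending similarity
    return order[:, :k]

def designate_region(similar_table, prev_index):
    """Search region for the current subframe: the codes listed in the
    table for the previous subframe's code, plus that code itself."""
    return [prev_index] + list(similar_table[prev_index])
```

Because the table is built once, the per-subframe cost of designating the region is a single table lookup.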
- the sound source code searching circuit 210 searches the designated search region of the learned sound source code book 300 for a sound source code optimal for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
- the optimal searched sound source code is supplied to the gain code searching circuit 220 and the sound source signal producing circuit 230, and the identifier of the searched sound source code is supplied to the multiplexer 250 and the search region designating circuit 205.
- the speech signal coding system includes a buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100.
- a FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy. The reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter.
- An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter.
- An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360.
- the interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GPSO gain code searching circuit 330, which will all be described later in detail.
- a dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes.
- the acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracting circuit 185.
- the influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240.
- a subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the VSELP sound source code searching circuit 340 and the GPSO gain code searching circuit 330.
- the adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal for a previous subframe supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230.
- a sound source code book includes the gray code table 290 and the base vector table 310.
- the gray code table 290 stores a plurality of gray codes.
- the base vector table 310 stores base vectors, various combinations of which represent the sound source codes of the sound source code book. Each of the base vectors is assigned an index.
- a gray code search region designating circuit 195 designates as the search region 10 gray codes before and after the gray code for the subframe immediately preceding the current subframe.
- a VSELP sound source code searching circuit 340 refers to the gray code table 290 and the base vector table 310 to produce sound source codes in combination of the base vectors and determine an optimal sound source code.
- the VSELP sound source code searching circuit 340 searches for an optimal sound source code within the designated search region of gray codes in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
- the searching processing of the optimal sound source code is disclosed in the reference 4.
- the searched sound source code is supplied to the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230 and the index of the gray code for the optimal sound source code is supplied to the multiplexer 250 and the gray code designating circuit 195.
- a GPSO gain code book 320 is a code book which stores gains obtained by converting the gains of the adaptive code book 190 and the sound source code into two parameters GS and P0 and performing a two-dimensional vector quantization on the two parameters.
- the search operation of the GPSO gain code searching circuit 330 is disclosed in detail in, for example, the TIA Recommendation, TIA Technical Subcommittee TR45.3, "Digital Cellular Standards, Baseline Text for Speech Coder and Speech Decoder".
- the searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
- the sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340, and the gains from the searching circuit 330. It produces a sound source signal for the subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain term and adding the multiplied results to each other, and stores the result as a previous sound source signal.
- the stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190.
- the weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization.
- the multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GPSO gain code searching circuit 330.
- the search region is always designated based on the sound source code for a subframe immediately before the current subframe.
- the whole region of the sound source code book may be designated as the search region for the first subframe of the speech signal and a limited region thereof may be designated for the second subframe and the subsequent subframes.
- the sound source codes for a plurality of subframes previous to the current subframe may be used to determine a sound source code for the current subframe.
- the whole region and a limited region of the sound source code book may be alternatively designated as the search region for each of the continuous subframes.
- the sound source code book searching circuit is of one stage.
- the same advantage could be obtained.
- the search region to be designated is fixed.
- the search region may be varied based on another data such as the spectrum parameter or pitch parameter.
- the search region is changed based on the spectrum parameter.
- the first to third embodiments are modified such that the quantized spectrum parameter is supplied to the search region designating circuit 200, 205, or 195, as shown in Figs. 2, 3 and 4 by the dashed lines.
- the search region is limited to 10 sound source codes when the quantized spectrum parameter is in a range of 0 to 2^(B-2) − 1, 15 sound source codes when it is in a range of 2^(B-2) to 2^(B-1) − 1, 20 sound source codes when it is in a range of 2^(B-1) to 2^(B-1) + 2^(B-2) − 1, and 25 sound source codes when it is in a range of 2^(B-1) + 2^(B-2) to 2^B − 1.
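The mapping from the quantized spectrum parameter to the search region size splits the parameter's range of 0 to 2^B − 1 into four equal quarters; since the sizes 10, 15, 20, 25 step by 5 per quarter, it can be sketched in one expression (the function name is my own):

```python
def region_size(q, B):
    """Sketch of the modification: map a quantized spectrum parameter q
    (0 <= q < 2**B) to a search region size of 10, 15, 20, or 25 sound
    source codes, one size per quarter of the parameter range."""
    quarter = 2 ** (B - 2)          # width of each of the four ranges
    return 10 + 5 * (q // quarter)  # 10, 15, 20, 25 by quarter index
```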
- the search region can be flexibly varied based on the quantized spectrum parameter, so that the search region can be optimized. The same applies to the pitch parameter.
Abstract
In a speech signal coding system, the speech signal is divided into a plurality of frames and each frame is divided into a plurality of subframes. The speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction. The speech signal coding system includes a sound source code book (270; 300; 295, 310) for a plurality of sound source codes, a designating section (200, 290, 195) for designating a search region, and a searching section (210, 340) for searching the designated search region of the sound source code book for an optimal sound source code for a current subframe.
Description
- The present invention relates to speech signal coding, and more particularly, to a method and speech signal coding system for encoding a speech signal with a search region of a sound source code book being limited.
- Recently, telephones using a radio frequency band, such as automobile telephones and cordless telephones, have come into wide use. In such telephones, a high efficiency speech signal encoding technique by which a speech signal can be efficiently digitized and compressed is as important a problem as the effective use of the frequency bandwidth, the downsizing of the antenna, and low power consumption. Since only a narrow frequency bandwidth is available in a radio frequency band, it is important to develop a system for encoding a speech signal at a low bit rate. As a system in which a speech signal can be encoded at a bit rate as low as 4 to 8 Kbps, there is known, for example, the Code Excited Linear Predictive Coding (CELP) system which is described in a paper (the reference 1) by M. Schroeder and B. S. Atal, entitled "Code-excited linear prediction: High quality speech at low bit rates", (ICASSP Proc. 85, pp. 937-940, 1985).
- In this system, the encoding process is performed on the transmitting side in accordance with the following procedure. First, a short term prediction is performed. For this purpose, a speech signal is divided into a plurality of frames of, for example, 20 ms and then a spectrum parameter indicative of the frequency characteristic of the speech signal is extracted from the speech signal for every frame (This process is referred to as "short term prediction"). Linear Predictive Coding (LPC) is used for the short term prediction in many cases. The spectrum parameter obtained in the short term prediction is used as coefficients for a synthetic filter in the encoding and decoding processes.
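As an illustration of the short term prediction step, the following sketch extracts LPC coefficients from a frame by the autocorrelation method with the Levinson-Durbin recursion. This is a minimal sketch only: the autocorrelation method, the function names, and the toy AR(2) demo are illustrative assumptions, not taken from the patent.

```python
import random

def autocorr(x, lag):
    # Biased autocorrelation r[0..lag] of the analysis frame x
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n for k in range(lag + 1)]

def levinson_durbin(r, order):
    # Solve the normal equations for the prediction (alpha) coefficients.
    # Returns (a, e) with a[0] = 1 and e the residual (prediction error) energy.
    a = [1.0] + [0.0] * order
    e = r[0]
    for i in range(1, order + 1):
        k = -sum(a[j] * r[i - j] for j in range(i)) / e   # reflection coefficient
        a_prev = a[:]
        for j in range(1, i + 1):
            a[j] = a_prev[j] + k * a_prev[i - j]
        e *= (1.0 - k * k)
    return a, e

# Demo: estimate the predictor of a synthetic AR(2) signal
random.seed(1)
x = [0.0, 0.0]
for _ in range(20000):
    x.append(0.6 * x[-1] - 0.2 * x[-2] + random.gauss(0.0, 1.0))
a, e = levinson_durbin(autocorr(x[2:], 2), 2)
```

For this AR(2) signal the recursion recovers a vector close to [1, -0.6, 0.2], and the residual energy is smaller than the signal energy, which is the point of the short term prediction.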
- Next, a long term prediction is performed. Each frame is divided into N subframes each having a shorter time interval of, for example, 5 ms. A pitch parameter (delay) L indicative of a long term correlation (a pitch correlation) and a gain are determined for every subframe based on the spectrum parameter and a sound source signal obtained from previous sound source codes. The long term prediction thus performed is referred to as "an adaptive code book". This is the problem of a least square error between the speech signal and a synthetic signal:

E(L, β) = Σn ( p[n] − β·(h * r)[n − L] )²   (1)

where p[n] is a speech signal, β is an amplitude or gain, h[n] is an impulse response of a synthetic filter determined based on the result of the short term prediction, r[n] is a signal for a previous sound source code, L is a delay value, and the summation Σn extends over the samples n of the subframe. More specifically, there are determined the delay value L and the gain β for which a value VL, obtained by dividing the square of the cross correlation CL between the speech signal and the synthetic signal by the self-correlation GL of the synthetic signal, is maximum while the delay value L is varied in the range of, for example, 20 to 147, which is considered to cover the basic frequency range of speech, as shown in the following equations (2) to (5):

VL = CL² / GL   (2)

CL = Σn p[n]·(h * r)[n − L]   (3)

GL = Σn ( (h * r)[n − L] )²   (4)

β = CL / GL   (5)
Next, there are determined a code vector ej (ej corresponds to a sound source code and j is an index indicative of the code vector ej) and a gain γ for which the error power between a synthetic signal h[n] * ej[n], obtained based on code vectors ej[n] extracted from a noise sequence (sound source code book) prepared in advance, and a difference signal d[n] after the long term prediction is minimized. This corresponds to a least square error problem for the synthetic signal h[n] * ej[n] produced from the code vectors ej[n] and the difference signal d[n], as shown in the following equation (6), similar to the long term prediction:

E(j, γ) = Σn ( d[n] − γ·(h * ej)[n] )²   (6)

where the summation Σn extends over the samples n of the subframe, and the cross correlation Cj and the self-correlation Gj are given by

Cj = Σn d[n]·(h * ej)[n]   (7)

Gj = Σn ( (h * ej)[n] )²   (8)

The index j which maximizes

Vj = Cj² / Gj   (9)

is selected, and the gain is given by

γ = Cj / Gj   (10)
That is, the search is not performed for a limited search region of sound source codes but for the whole search region of sound source codes. - The determined index j of a sound source code, gain γ, pitch parameter (delay) L, gain β and spectrum parameter are transmitted.
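The exhaustive search over the whole code book described above can be sketched as follows. The random-noise code book, the subframe length, and all names here are illustrative assumptions, not the patent's own data.

```python
import random

def filter_subframe(h, e):
    # Truncated convolution h * e over the subframe length
    return [sum(h[k] * e[i - k] for k in range(min(i + 1, len(h))))
            for i in range(len(e))]

def codebook_search(d, codebook, h):
    # Exhaustive search maximizing Cj^2 / Gj; the gain is gamma = Cj / Gj.
    best_j, best_gamma, best_v = -1, 0.0, -1.0
    for j, e in enumerate(codebook):
        s = filter_subframe(h, e)
        c = sum(di * si for di, si in zip(d, s))   # cross correlation Cj
        g = sum(si * si for si in s)               # self-correlation Gj
        if g > 0.0 and c * c / g > best_v:
            best_j, best_gamma, best_v = j, c / g, c * c / g
    return best_j, best_gamma

# Demo: a difference signal proportional to one filtered code vector
random.seed(2)
codebook = [[random.gauss(0.0, 1.0) for _ in range(40)] for _ in range(64)]
h = [1.0, 0.5]
d = [3.0 * s for s in filter_subframe(h, codebook[17])]
j, gamma = codebook_search(d, codebook, h)
```

Every one of the 2^B code vectors is filtered and scored, which is exactly the cost that the limited-region search of the invention avoids.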
- Also, there is proposed, as one of the speech signal coding systems, a learned sound source code book which is produced by performing vector quantization on the difference signal after the short term prediction and the long term prediction are performed. This system is disclosed in the Laid Open Japanese Patent Applications (JP-A-hei3-243998: the reference 2 and JP-A-hei3-243999: the reference 3).
- The system in which a code vector eI[n] (I is an index of a sound source code) is represented by a linear sum of as few as M base vectors Vm[n] (m is an index of a base vector, and M is at most about 9), as shown in the following equation (11), is referred to as the Vector Sum Excited LPC (VSELP) system and is standardized as a full rate speech signal coding system for automobile telephones in North America and Japan:

eI[n] = Σm ϑIm·Vm[n], m = 1, ..., M   (11)
where ϑIm is 1 when the m-th bit of a sound source code having the index I is 1 and -1 when the m-th bit thereof is 0. This type of speech signal coding system is disclosed in detail in the Laid Open Japanese Patent Application (JP-A-hei2-502135: the reference 4). In this speech signal coding system, the cross correlation Cj between a speech signal and a synthetic signal produced based on previous sound source codes, and the self-correlation Gj, can be calculated from the following equations (12) to (15) by using the special structure of the sound source code book. As a result, the amount of calculation can be reduced to M/N:

fu[n] = fj[n] + 2·ϑuv·qv[n]   (12)

Cu = Cj + 2·ϑuv·Σn d[n]·qv[n]   (13)

Gu = Gj + 4·ϑuv·Σn fj[n]·qv[n] + 4·Σn qv[n]²   (14)

Vu = Cu² / Gu   (15)

where qm[n] = h[n] * Vm[n] is a filtered base vector and fj[n] = Σm ϑjm·qm[n] is the synthetic signal for the sound source code with index j.
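The vector-sum construction of equation (11) can be sketched as follows; the base vectors used here are arbitrary placeholders, not those of the standardized code book.

```python
def vselp_code_vector(index, base_vectors):
    # e_I[n] = sum_m theta_Im * V_m[n], where theta_Im is +1 if bit m of
    # the index I is 1, and -1 if it is 0.
    M = len(base_vectors)
    n = len(base_vectors[0])
    theta = [1.0 if (index >> m) & 1 else -1.0 for m in range(M)]
    return [sum(theta[m] * base_vectors[m][i] for m in range(M)) for i in range(n)]
```

With M base vectors, the table of 2^M code vectors is never stored explicitly; each code vector is formed on the fly from the sign pattern of its index bits.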
In the above equations (12) to (15), u and j are sound source codes which differ from each other in only one bit, and v denotes the position of the bit in which they differ. u is produced from a Gray code (an alternative binary code) represented by the following equation (16):

u = i ⊕ (i >> 1)   (16)

where i is an integer in a range of 0 to 2^M − 1, ⊕ represents an exclusive OR, and >> represents a shift in the right direction. The search for the sound source code is performed sequentially for the Gray codes u corresponding to i = 0 to 2^M − 1 when the number of base vectors is M. That is, the search is performed for all the Gray codes without limiting the range. - Next, a speech signal coding system of the VSELP type, as a representative one of the conventional systems, will be described below with reference to Fig. 1. The speech signal coding system includes a
buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100. A FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy. The reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter. An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter. An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360. The interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GPSO gain code searching circuit 330, which will all be described later in detail. - A dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes. The acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracting circuit 185. The influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240. The subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190. The adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230. - The base vector table 310 stores base vectors, various combinations of which represent sound source codes of a sound source code book. Each of the base vectors is assigned with an index. The VSELP sound source code searching circuit 340 searches the base vector table 310 for an optimal sound source code using all the base vectors in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The searched sound source code is supplied to the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230, and the indexes for the base vectors used in the searched sound source code are supplied to the multiplexer 250. - A GSPO gain code book 280 is a code book which stores gains of the sound source of the adaptive code book and the sound source code of the sound source code book. The GPSO gain code searching circuit 330 searches for each of the gains of the sound source of the adaptive code book and the sound source code from the difference signal from the subtracter 185 based on the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 340, and the pitch parameter from the adaptive code book 190. The searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230. - The sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340 and the gains from the searching circuit 330, produces a sound source signal for the subframe by multiplying a gain term with each of the sound sources of the adaptive code book and the sound source code book and by adding the multiplied results to each other, and stores it as a previous sound source signal. The stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190. The weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization. The multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GPSO gain code searching circuit 330. - The above mentioned processing is performed for each subframe and the speech signal is encoded.
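The Gray-code ordering used for the sequential code search in the VSELP system just described can be sketched as follows; successive codes differ in exactly one bit, which is what permits the recursive update of the correlations Cj and Gj.

```python
def gray(i):
    # Equation (16): u = i XOR (i >> 1)
    return i ^ (i >> 1)

M = 4
codes = [gray(i) for i in range(2 ** M)]
```

The sequence is a permutation of all 2^M indices, so the full search visits every code once while changing only one bit per step.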
- In the speech signal coding system of the CELP type, since a very large amount of calculation is required, a digital signal processor must operate with a clock signal having a frequency as high as 30 MHz in order to implement the apparatus. As a result, the power consumption of the digital signal processor increases remarkably. In the case of a portable telephone, which cannot have a large battery, the battery would be exhausted after about 45 minutes of operation.
- An object of the present invention is to provide a method of encoding a speech signal with a good quality in spite of a low bit rate, while allowing communication for a longer time.
- Another object of the present invention is to provide a speech signal coding system for achieving the method.
- In order to achieve the above object, the method of encoding a speech signal according to the present invention includes the steps of:
providing a sound source code book for a plurality of sound source codes;
providing a gain code book storing gain codes;
performing a short term prediction for each of long time intervals of a speech signal to produce a first parameter;
performing a long term prediction, for each of the short time intervals of each long time interval, on a difference signal between the speech signal and a signal obtained based on the first parameter and previous sound source codes, to produce a second parameter;
designating a search region of the sound source code book;
searching the designated search region of the sound source code book for a sound source code optimal to said each short time interval of the difference signal to determine a third parameter for the optimal sound source code;
searching the gain code book to determine a fourth parameter for gain of the second and third parameters; and
outputting a combination of the first to fourth parameters. - In order to achieve another aspect of the present invention, there is provided a speech signal coding system in which a speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction. The speech signal coding system includes a sound source code book for a plurality of sound source codes, a designating section for designating a search region, and a searching section for searching the designated search region of the sound source code book for an optimal sound source code for a current short time interval, the speech signal being divided into portions for a plurality of long time intervals, and each long time interval being divided into a plurality of short time intervals.
- Fig. 1 is a block diagram showing a conventional speech signal coding system;
- Fig. 2 is a block diagram showing a speech signal coding system according to a first embodiment of the present invention;
- Fig. 3 is a block diagram showing a speech signal coding system according to a second embodiment of the present invention; and
- Fig. 4 is a block diagram showing a speech signal coding system according to a third embodiment of the present invention.
- A speech signal coding system according to the present invention will be described below in detail with reference to the accompanying drawings.
- In the first embodiment of the present invention, the search region of a sound source code for a current subframe is designated and limited to a range of a predetermined fixed number of sound source codes before and after the sound source code for the subframe immediately before the current subframe, for example, a range of 10 sound source codes in each direction before and after the previous subframe sound source code. In the first embodiment, because a noise sequence is used as the sound source code book, it could be considered that the current subframe sound source code is very similar to the previous subframe sound source code. Therefore, even if the search region is limited to a predetermined range with respect to the previous subframe sound source code, a sound source code optimal to the current subframe could be found in the limited search region.
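The region designation of the first embodiment can be sketched as follows. The wrap-around behaviour at the ends of the code book is an assumption for illustration, since the patent does not specify what happens near the first and last indices.

```python
def designate_search_region(prev_index, book_size, half_width=20):
    # Indices of the sound source codes within +/- half_width of the
    # code chosen for the previous subframe, wrapping around the ends
    # of the code book (assumed behaviour).
    return [(prev_index + k) % book_size
            for k in range(-half_width, half_width + 1)]
```

With a 1024-entry code book only 41 of the 1024 codes are scored per subframe, which is the source of the reduction in calculation.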
- The speech signal coding system according to a first embodiment of the present invention will be described below with reference to Fig. 2. In Fig. 2, the speech signal coding system includes a
buffer circuit 110, a subframe dividing circuit 120, an LPC analyzing circuit 130, a parameter quantizing circuit (Q) 140, a parameter inversely quantizing circuit (Q⁻¹) 150, a parameter interpolating circuit 160, an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a subtracter 185, an adaptive code book 190, a search region designating circuit 200, a sound source searching circuit 210, a gain code searching circuit 220, a sound source signal producing circuit 230, a weighting and synthesizing circuit 240, a multiplexer 250, a noise sequence 270 as a sound source code book, and a gain code book 280. - In the speech signal coding system according to the first embodiment of the present invention, the
buffer circuit 110 receives a speech signal inputted via a terminal 100 and stores it therein. The LPC analyzing circuit 130 performs speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to extract LPC coefficients (an α parameter) as a spectrum parameter. The parameter quantizing circuit 140 quantizes the LPC coefficients as the α parameter and the quantized coefficients are supplied to the multiplexer 250 as a spectrum parameter. The inversely quantizing circuit 150 performs an inverse quantization processing for the quantized coefficients supplied from the quantizing circuit 140 to produce or recover an α parameter, i.e., converts the quantized LPC coefficients into the α parameter. The parameter interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the inversely quantizing circuit 150. The interpolated α parameter is supplied to various circuits such as the acoustic sensibility weighting circuit 170, the influence signal generating circuit 180, the weighting and synthesizing circuit 240, the adaptive code book 190, the sound source code searching circuit 210 and the gain code searching circuit 220, which will all be described later in detail. - The dividing
circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes. The acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to the subtracter 185. The influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240. The subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the sound source code searching circuit 210, and the gain code searching circuit 220. The adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the sound source code searching circuit 210, the gain code searching circuit 220 and the sound source signal producing circuit 230. - The sound
source code book 270 stores a noise sequence as the sound source codes. Each of the sound source codes is assigned with an index. The search region designating circuit 200 receives the identifier, i.e., the index of the sound source code for the subframe immediately before the current subframe, stores it, and designates the search region for the next subframe based on the stored identifier for the previous subframe in response to the pitch parameter from the adaptive code book 190. In this embodiment, the designating circuit 200 designates as the search region a region of 20 sound source codes before and after the sound source code for the previous subframe. The sound source code searching circuit 210 searches the designated search region of the sound source code book 270 for an optimal sound source code for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The optimal searched sound source code is supplied to the gain code searching circuit 220, the sound source signal producing circuit 230 and the search region designating circuit 200, and the identifier of the searched sound source code is supplied to the multiplexer 250. - The
gain code book 280 is a code book which stores gains of the sound source of the adaptive code book and the sound source code of the sound source code book. The gain code searching circuit 220 searches the gain code book 280 for each of the gains of the sound source of the adaptive code book and the sound source code searched by the searching circuit 210, from the difference signal from the subtracter 185, based on the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 210, and the pitch parameter from the adaptive code book 190. The searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230. - The sound source
signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 210 and the gains from the searching circuit 220, produces a sound source signal for the current subframe by multiplying a gain term with each of the sound sources of the adaptive code book and the sound source code book and by adding the multiplied results to each other, and stores it as the sound source signal for a previous subframe. The stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190. The weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization. The multiplexer 250 outputs via a terminal 260 a combination of a code sequence of the spectrum parameter from the quantizing circuit 140 (for a short term prediction), a code sequence of the pitch parameter from the adaptive code book 190 (for a long term prediction), a code sequence of the sound source code from the sound source code searching circuit 210, and two code sequences of the gains from the gain code searching circuit 220. - Next, the speech signal coding system according to the second embodiment of the present invention will be described below with reference to Fig. 3. The same components as those in Fig. 2 are assigned with the same reference numerals and their description will be omitted. The different components and the operation will be described.
- In the second embodiment, a learned sound
source code book 300 is used in place of the sound source code book 270 and stores vectors of a plurality of sound source codes having a subframe length obtained through vector quantization. The learned sound source code book 300 is known to a person skilled in the art and is disclosed in the references 2 and 3. Because the learned sound source code book 300 is used in the second embodiment, it is not ensured that a sound source code adjacent to the sound source code for the previous subframe has a high similarity to it. Therefore, a similar code table 290 is prepared in this embodiment. The similar code table 290 is a table which stores, for each of the learned sound source codes of the learned sound source code book 300, the identifiers of a predetermined number of sound source codes similar to it, for example, 10 sound source codes in this embodiment, in order of higher similarity. The search region designating circuit 205 is similar to the search region designating circuit 200 in the first embodiment and refers to the similar code table 290 based on the identifier of the learned sound source code for the previous subframe to designate the search region of the learned sound source code book 300. - The search
region designating circuit 205 receives the identifier, i.e., the index of the learned sound source code for the subframe immediately before the current subframe, stores it, and refers to the similar code table 290 to designate the search region of the learned sound source code book 300 for the next subframe based on the stored identifier for the previous subframe in response to the pitch parameter from the adaptive code book 190. In this embodiment, the designating circuit 205 designates as the search region the region of 10 learned sound source codes written in the similar code table 290 with respect to the learned sound source code for the previous subframe. The sound source code searching circuit 210 searches the designated search region of the learned sound source code book 300 for an optimal sound source code for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The optimal searched sound source code is supplied to the gain code searching circuit 220 and the sound source signal producing circuit 230, and the identifier of the searched sound source code is supplied to the multiplexer 250 and the search region designating circuit 205. - Next, the speech signal coding system according to the third embodiment of the present invention will be described below with reference to Fig. 4.
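The similar code table used in the second embodiment above can be sketched as follows. Similarity is measured here by the normalized inner product; this measure, and the function name, are assumptions for illustration, since the patent only specifies that the identifiers are stored in order of higher similarity.

```python
import math

def build_similar_code_table(codebook, table_size=10):
    # For each code vector, store the indices of the table_size most
    # similar other vectors, by normalized cross correlation.
    # Code vectors are assumed to be nonzero.
    def sim(a, b):
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return sum(x * y for x, y in zip(a, b)) / (na * nb)
    table = []
    for i, ci in enumerate(codebook):
        others = sorted(((sim(ci, cj), j) for j, cj in enumerate(codebook)
                         if j != i), reverse=True)
        table.append([j for _, j in others[:table_size]])
    return table

# Demo with a tiny code book of nonzero 2-dimensional vectors
demo_book = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
demo_table = build_similar_code_table(demo_book, table_size=2)
```

The table is computed once, offline, so the per-subframe cost of the designating circuit is a single table lookup.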
- The speech signal coding system according to the third embodiment includes a
buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100. A FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy. The reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter. An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter. An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360. The interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GPSO gain code searching circuit 330, which will all be described later in detail. - A dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes. The acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracting circuit 185. The influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240. The subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the VSELP sound source code searching circuit 340 and the GPSO gain code searching circuit 330. The adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal for a previous subframe supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230. - A sound source code book includes the Gray code table 290 and the base vector table 310. The Gray code table 290 stores a plurality of Gray codes. The base vector table 310 stores base vectors, various combinations of which represent sound source codes of the sound source code book. Each of the base vectors is assigned with an index. A Gray code search
region designating circuit 195 designates as the search region the 10 Gray codes before and after the Gray code for the subframe immediately before the current subframe. The VSELP sound source code searching circuit 340 refers to the Gray code table 290 and the base vector table 310 to produce sound source codes as combinations of the base vectors and to determine an optimal sound source code. In other words, the VSELP sound source code searching circuit 340 searches for an optimal sound source code within the designated search region, using all the base vectors, in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The searching processing for the optimal sound source code is disclosed in the reference 4. The searched sound source code is supplied to the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230, and the index of the Gray code for the optimal sound source code is supplied to the multiplexer 250 and the Gray code designating circuit 195. - A GSPO
gain code book 320 is a code book which stores gains obtained by converting the gains of the adaptive code book 190 and of the sound source code into two parameters GS and PO and performing a two-dimensional vector quantization on the two parameters. The search operation of the GPSO gain code searching circuit 330 is disclosed in detail in, for example, the TIA Recommendation in the U.S.A., TIA Technical Subcommittee TR.45.3, "Digital Cellular Standards, Baseline Text for Speech coder and Speech decoder". The searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230. - The sound source
signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340 and the gains from the searching circuit 330, produces a sound source signal for the subframe by multiplying a gain term with each of the sound sources of the adaptive code book and the sound source code book and by adding the multiplied results to each other, and stores it as a previous sound source signal. The stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190. The weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization. The multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GPSO gain code searching circuit 330. - In the speech signal coding system of any of the first, second and third embodiments, the search region is always designated based on the sound source code for the subframe immediately before the current subframe. However, the whole region of the sound source code book may be designated as the search region for the first subframe of the speech signal and a limited region thereof may be designated for the second and subsequent subframes. Also, the sound source codes for a plurality of subframes previous to the current subframe may be used to determine a sound source code for the current subframe. Further, the whole region and a limited region of the sound source code book may be alternately designated as the search region for consecutive subframes.
- In addition, in the above embodiments, the sound source code book searching circuit has a single stage. However, the same advantage can be obtained even if a multistage sound source code book searching circuit is used.
- Further, the description uses the LPC analyzing circuit in the first and second embodiments and the FLAT analyzing circuit in the third embodiment. However, it is apparent that the same advantage can be obtained with other spectrum parameters such as PARCOR coefficients and cepstrum coefficients.
- Furthermore, in the above embodiments, the search region to be designated is fixed. However, the search region may be varied based on other data such as the spectrum parameter or pitch parameter. Such a modification will be described below. In this modification, the search region is changed based on the spectrum parameter. The first to third embodiments are modified such that the quantized spectrum parameter is supplied to the search region designating circuit.
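The variable search region of this modification can be sketched as below. The window-sizing formula, the normalisation of the controlling parameter to [0, 1], and every constant are hypothetical stand-ins for the dependence on the quantized spectrum (or pitch) parameter described above.

```python
def search_region(prev_index, codebook_size, param, base_width=8, scale=16):
    """Return (lo, hi) index bounds of the search region.

    The window is centred on the previous subframe's code index and its
    width grows with `param`, a normalised value in [0, 1] standing in
    for the quantized spectrum (or pitch) parameter.  All constants are
    illustrative, not taken from the patent.
    """
    width = base_width + int(scale * max(0.0, min(1.0, param)))
    lo = max(0, prev_index - width)
    hi = min(codebook_size, prev_index + width + 1)
    return lo, hi
```

A larger `param` thus widens the designated region, while the window is always clipped to the bounds of the sound source code book.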
Claims (13)
- A speech signal coding system in which a speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction, said speech signal coding system comprising:
a sound source code book (270; 300; 290 and 310) storing a plurality of sound source codes;
designating means (200; 205, 290; 195) for designating a search region;
searching means (210; 340) for searching said designated search region of said sound source code book for an optimal sound source code for a current short time interval, the speech signal being divided into portions for a plurality of long time intervals, each long time interval being divided into a plurality of short time intervals. - A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval a fixed range with respect to at least one of sound source codes for the short time intervals before the current short time interval.
- A speech signal coding system according to claim 2, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval the fixed range with respect to the sound source code for the short time interval immediately before the current short time interval.
- A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) alternatively designates as said search region for the current short time interval one of a whole of said plurality of sound source codes and a fixed range with respect to the sound source code for the short time interval immediately before the current short time interval.
- A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval a variable range determined based on the long term prediction parameter code with respect to at least one of sound source codes for the short time intervals before the current short time interval.
- A speech signal coding system according to claim 5, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval the variable range determined based on the long term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
- A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) alternatively designates as said search region for the current short time interval one of a whole of said plurality of sound source codes and a variable range determined based on the long term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
- A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval a variable range determined based on the short term prediction parameter code with respect to at least one of sound source codes for the short time intervals before the current short time interval.
- A speech signal coding system according to claim 8, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval the variable range determined based on the short term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
- A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) alternatively designates as said search region for the current short time interval one of a whole of said plurality of sound source codes and a variable range determined based on the short term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
- A speech signal coding system according to claim 1, wherein said sound source code book (270) is a noise sequence.
- A speech signal coding system according to claim 1, wherein said sound source code book (300) is a learned sound source code book, and
wherein said designating means (205, 290) includes:
a table (290) for storing indication data indicative of learned sound source codes similar to each of said plurality of learned sound source codes; and
means (205) for designating the search region based on the indication data. - A speech signal coding system according to claim 1, wherein said sound source code book includes a gray code table (290) and a vector table (310) storing a plurality of base vectors, and
wherein said designating means (195) designates the search region in said gray code table with respect to the sound source code for at least one of the short time intervals before the current short time interval, and
wherein said search means (340) searches said search region of said gray code table (290) for gray codes, refers to said vector table (310) based on the searched gray codes to determine sound source codes, and determines said optimal sound source code for the current short time interval.
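For illustration, a search of the kind recited in the last claim might look like the sketch below: the search region is the set of gray codes within a small number of bit flips of the previous subframe's code, and each code's bits select the signs of the base vectors of the vector table (in the manner of VSELP). The neighbourhood radius, the sign-selection mapping, and the squared-error criterion are assumptions, not the claimed system.

```python
import numpy as np

def gray_neighbors(code, nbits, radius=1):
    """Gray codes within `radius` bit flips of `code` (the code for the
    previous short time interval) -- a hypothetical search region."""
    region = {code}
    for _ in range(radius):
        region |= {c ^ (1 << b) for c in region for b in range(nbits)}
    return sorted(region)

def code_to_source(code, basis):
    """Map a gray code to a sound source vector: each bit selects the
    sign of one base vector, and the signed base vectors are summed."""
    signs = [1.0 if (code >> i) & 1 else -1.0 for i in range(len(basis))]
    return sum(s * v for s, v in zip(signs, np.asarray(basis, float)))

def search(prev_code, basis, target, radius=1):
    """Pick the code in the restricted region whose sound source vector
    is closest (squared error) to the target signal."""
    return min(gray_neighbors(prev_code, len(basis), radius),
               key=lambda c: float(np.sum((code_to_source(c, basis) - target) ** 2)))
```

Because successive gray codes differ in one bit, limiting the region to few-bit-flip neighbours keeps consecutive subframe sound sources similar while shrinking the search.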
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP337160/93 | 1993-12-28 | ||
JP5337160A JPH07199994A (en) | 1993-12-28 | 1993-12-28 | Speech encoding system |
Publications (1)
Publication Number | Publication Date |
---|---|
EP0662682A2 true EP0662682A2 (en) | 1995-07-12 |
Family
ID=18306013
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP94120542A Withdrawn EP0662682A2 (en) | 1993-12-28 | 1994-12-23 | Speech signal coding |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP0662682A2 (en) |
JP (1) | JPH07199994A (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3593839B2 (en) * | 1997-03-28 | 2004-11-24 | ソニー株式会社 | Vector search method |
CN108962225B (en) * | 2018-06-27 | 2020-10-23 | 西安理工大学 | Multi-scale self-adaptive voice endpoint detection method |
Also Published As
Publication number | Publication date |
---|---|
JPH07199994A (en) | 1995-08-04 |
Legal Events

Date | Code | Title | Description |
---|---|---|---|
| PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): DE FR GB |
| STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN |
19970113 | 18W | Application withdrawn | Withdrawal date: 19970113 |