EP0662682A2 - Speech signal coding - Google Patents

Speech signal coding Download PDF

Info

Publication number
EP0662682A2
Authority
EP
European Patent Office
Prior art keywords
sound source
short time
time interval
speech signal
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP94120542A
Other languages
German (de)
French (fr)
Inventor
Keiichi C/O Nec Corporation Funaki
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Publication of EP0662682A2 publication Critical patent/EP0662682A2/en
Withdrawn legal-status Critical Current

Links

Images

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08: Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12: Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters, the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G10L19/135: Vector sum excited linear prediction [VSELP]
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001: Codebooks
    • G10L2019/0013: Codebook search algorithms

Definitions

  • the present invention relates to speech signal coding, and more particularly, to a method and speech signal coding system for encoding a speech signal with a search region of a sound source code book being limited.
  • the encoding process is performed on the transmitting side in accordance with the following procedure.
  • a short term prediction is performed.
  • a speech signal is divided into a plurality of frames of, for example, 20 ms and then a spectrum parameter indicative of the frequency characteristic of the speech signal is extracted from the speech signal for every frame (This process is referred to as "short term prediction").
  • LPC (Linear Predictive Coding)
  • the spectrum parameter obtained in the short term prediction is used as coefficients for a synthetic filter in the encoding and decoding processes.
  • a long term prediction is performed.
  • Each frame is divided into N subframes each having a shorter time interval of, for example, 5 ms.
  • a pitch parameter (delay) L indicative of a long term correlation (a pitch correlation) and a gain is determined for every subframe based on the spectrum parameter and a sound source signal obtained from previous sound source codes.
  • the long term prediction thus performed is referred to as "an adaptive code book". This is the problem of minimizing the least square error between the speech signal and a synthetic signal β·h[n] * r[n - L] (the symbol "*" indicates a convolution operation) obtained from a signal r[n - L], which is obtained by delaying a previous sound source signal r[n] by L, as shown in the following equation (1), where:
  • p[n] is a speech signal,
  • β is an amplitude or gain,
  • h[n] is an impulse response of a synthetic filter determined based on the result of the short term prediction,
  • r[n] is a signal for a previous sound source code, and
  • L is a delay value. More specifically, the delay value L and the gain β are determined such that the value V_L = C_L² / G_L, obtained by dividing the square of the cross correlation C_L between the speech signal and the synthetic signal by the self-correlation G_L of the synthetic signal, is maximum while the delay value L is varied in the range of, for example, 20 to 147, which is considered to cover the fundamental frequency range of speech, as shown in the following equations (2) to (5).
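The delay search described above can be sketched as follows. This is a hedged Python illustration, not the patent's implementation: the identity filter, the buffer lengths, and the repetition of segments for delays shorter than the subframe are simplifying assumptions.

```python
import numpy as np

def adaptive_codebook_search(p, h, r, l_min=20, l_max=147):
    """Find the delay L and gain beta maximizing V_L = C_L**2 / G_L.

    p : target speech subframe (length N)
    h : impulse response h[n] of the (weighted) synthesis filter
    r : previous sound source (excitation) signal; index len(r)-L is
        L samples into the past
    """
    N = len(p)
    best = (None, 0.0, -np.inf)             # (L, beta, V_L)
    for L in range(l_min, l_max + 1):
        seg = r[len(r) - L:][:N]            # delayed excitation r[n - L]
        if len(seg) < N:                    # delay shorter than the subframe:
            seg = np.resize(seg, N)         # repeat the short segment
        s = np.convolve(h, seg)[:N]         # synthetic signal h[n] * r[n - L]
        C = float(np.dot(p, s))             # cross correlation C_L
        G = float(np.dot(s, s))             # self-correlation G_L
        if G > 0 and C * C / G > best[2]:
            best = (L, C / G, C * C / G)    # gain beta = C_L / G_L
    return best

np.random.seed(0)
r = np.random.randn(200)                    # fake past excitation
h = np.array([1.0])                         # identity filter for the sketch
p = 0.8 * r[160:200]                        # target: delay 40, gain 0.8
L, beta, V = adaptive_codebook_search(p, h, r)
```

Because the target here is an exactly delayed and scaled copy of the past excitation, the search recovers the delay 40 and the gain 0.8.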
  • values of V_j shown in the following equations (7) to (10) are calculated over all the sound source codes while the index j of the sound source code is varied from 0 to 2^B - 1, and then j and the gain γ are determined by finding the V_j that takes a maximum value.
  • γ = C_j / G_j   (10) That is, the search is performed not over a limited search region of sound source codes but over the whole search region of sound source codes.
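The exhaustive search over all 2^B sound source codes can be sketched in the same style, with V_j = C_j²/G_j evaluated per code and the gain taken as C_j/G_j; the function name and the random noise codebook are illustrative assumptions.

```python
import numpy as np

def full_codebook_search(target, h, codebook):
    """Exhaustive search: evaluate V_j = C_j**2 / G_j for every sound source
    code j and keep the maximizing index with its gain gamma = C_j / G_j."""
    best_j, best_gain, best_V = 0, 0.0, -np.inf
    for j, code in enumerate(codebook):
        s = np.convolve(h, code)[:len(target)]  # synthetic signal for code j
        C = float(np.dot(target, s))            # cross correlation C_j
        G = float(np.dot(s, s))                 # self-correlation G_j
        if G > 0 and C * C / G > best_V:
            best_j, best_gain, best_V = j, C / G, C * C / G
    return best_j, best_gain

np.random.seed(1)
book = np.random.randn(64, 40)              # 2**6 random noise codes (B = 6)
h = np.array([1.0])
target = 0.5 * book[17]                     # target matches code 17, gain 0.5
j, gamma = full_codebook_search(target, h, book)
```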
  • the determined index j of the sound source code, the gain γ, the pitch parameter (delay) L, the gain β, and the spectrum parameter are transmitted.
  • VSELP (Vector Sum Excited LPC)
  • This type of speech signal coding system is disclosed in detail in Japanese Laid-Open Patent Application JP-A-Hei-2-502135 (reference 4).
  • In this speech signal coding system, the cross correlation C_j between a speech signal and a synthetic signal produced based on previous sound source codes, and the self-correlation G_j, can be calculated from the following equations (12) to (15) by using the special structure of the sound source code book. As a result, the amount of calculation can be reduced to M/N.
  • G_u = G_j + θ_uv · R_v   (12)
  • where u and j are sound source codes which differ from each other in only one bit.
  • v indicates the position of the differing bit, and u is produced from a Gray code (reflected binary code) represented by the following equation (16).
  • u = i ⊕ (i >> 1)   (16) where i is an integer in the range of 0 to 2^M - 1, ⊕ represents an exclusive OR, and >> represents a shift in the right direction.
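Equation (16) is the standard binary-to-Gray conversion. The snippet below verifies the property the recursive update relies on: consecutive indices i produce codes u that differ in exactly one bit, so C and G for each code can be updated from the previous code instead of being recomputed from scratch.

```python
def gray(i):
    """Gray code of i: u = i XOR (i >> 1), per equation (16)."""
    return i ^ (i >> 1)

M = 7                                       # M basis vectors -> 2**M codes
codes = [gray(i) for i in range(2 ** M)]
# consecutive codes differ in exactly one bit
one_bit_steps = all(bin(a ^ b).count("1") == 1
                    for a, b in zip(codes, codes[1:]))
```

The Gray sequence is also a permutation of 0 .. 2^M - 1, so every sound source code is still visited exactly once.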
  • the speech signal coding system includes a buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100.
  • a FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy.
  • the reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter.
  • An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter.
  • An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360.
  • the interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GPSO gain code searching circuit 330, which will all be described later in detail.
  • a dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes.
  • the acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracting circuit 185.
  • the influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240.
  • a subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190.
  • the adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230.
  • the base vector table 310 stores base vectors, various combinations of which represent sound source codes of a sound source code book. Each of the base vectors is assigned an index.
  • the VSELP sound source code searching circuit 340 searches the base vector table 310 for an optimal sound source code using all the base vectors, in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
  • the searched sound source code is supplied to the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230, and the indexes of the base vectors used in the searched sound source code are supplied to the multiplexer 250.
  • a GPSO gain code book 280 is a code book which stores gains for the sound source of the adaptive code book and for the sound source code of the sound source code book.
  • the GPSO gain code searching circuit 330 searches for each of the gains of the sound source of the adaptive code book and of the sound source code, from the difference signal from the subtracter 185, based on the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 340, and the pitch parameter from the adaptive code book 190.
  • the searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
  • the sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340, and the gains from the searching circuit 330, produces a sound source signal for the subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain and adding the products, and stores the result as a previous sound source signal.
  • the stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190.
  • the weighting and synthesizing circuit 240 is a circuit for performing weighted synthesis.
  • the multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GPSO gain code searching circuit 330.
  • In the speech signal coding system of the CELP type, because a very large amount of calculation is required, a digital signal processor must operate with a clock signal having a frequency as high as 30 MHz in order to implement an apparatus. As a result, the power consumption of the digital signal processor increases remarkably. In the case of a portable telephone, which cannot have a large battery, the battery would be exhausted after about 45 minutes of operation.
  • An object of the present invention is to provide a method of encoding a speech signal with good quality despite a low bit rate, while allowing communication for a longer time.
  • Another object of the present invention is to provide a speech signal coding system for achieving the method.
  • the method of encoding a speech signal includes the steps of: providing a sound source code book for a plurality of sound source codes; providing a gain code book storing gain codes; performing a short term prediction for each of long time intervals of a speech signal to produce a first parameter; performing a long term prediction for each of short time intervals of the each long time interval of a difference signal between the speech signal and a signal obtained based on the first parameter and previous sound source codes to produce a second parameter; designating a search region of the sound source code book; searching the designated search region of the sound source code book for a sound source code optimal to said each short time interval of the difference signal to determine a third parameter for the optimal sound source code; searching the gain code book to determine a fourth parameter for gains of the second and third parameters; and outputting a combination of the first to fourth parameters.
  • a speech signal coding system in which a speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction
  • the speech signal coding system includes a sound source code book for a plurality of sound source codes, a designating section for designating a search region, and a searching section for searching the designated search region of the sound source code book for an optimal sound source code for a current short time interval, the speech signal being divided into portions for a plurality of long time intervals, and each long time interval being divided into a plurality of short time intervals.
  • the search region of a sound source code for a current subframe is designated and limited to a range of a predetermined fixed number of sound source codes before and after the sound source code for the subframe immediately previous to the current subframe, for example, a range of 10 sound source codes in each direction before and after the previous subframe's sound source code.
  • Since a noise sequence is used as the sound source code book, the sound source code for the current subframe can be considered very similar to that for the previous subframe. Therefore, even if the search region is limited to a predetermined range around the previous subframe's sound source code, a sound source code optimal for the current subframe could be found in the limited search region.
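The region-limiting step can be sketched as follows; a minimal Python illustration in which the function name and the clamping at the code book boundaries are assumptions (the text does not specify boundary handling).

```python
def designate_search_region(prev_index, codebook_size, half_width=10):
    """Return the indexes of the sound source codes within half_width
    positions on each side of the previous subframe's code."""
    lo = max(0, prev_index - half_width)
    hi = min(codebook_size - 1, prev_index + half_width)
    return list(range(lo, hi + 1))

region = designate_search_region(500, 1024, half_width=10)   # 21 candidates
edge = designate_search_region(3, 1024, half_width=10)       # clamped at 0
```

The subsequent codebook search then evaluates only the indexes in `region` instead of all 2^B codes, which is the source of the reduction in calculation.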
  • the speech signal coding system includes a buffer circuit 110, a subframe dividing circuit 120, an LPC analyzing circuit 130, a parameter quantizing circuit (Q) 140, a parameter inversely quantizing circuit (Q⁻¹) 150, a parameter interpolating circuit 160, an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a subtracter 185, an adaptive code book 190, a search region designating circuit 200, a sound source searching circuit 210, a gain code searching circuit 220, a sound source signal producing circuit 230, a weighting and synthesizing circuit 240, a multiplexer 250, a noise sequence 270 as a sound source code book, and a gain code book 280.
  • the buffer circuit 110 receives a speech signal inputted via a terminal 100 and stores it therein.
  • the LPC analyzing circuit 130 performs speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to extract LPC coefficients (an α parameter) as a spectrum parameter.
  • the parameter quantizing circuit 140 quantizes the LPC coefficients as the α parameter and the quantized coefficients are supplied to the multiplexer 250 as a spectrum parameter.
  • An inversely quantizing circuit 150 performs an inverse quantization processing for the quantized coefficients supplied from the quantizing circuit 140 to produce or recover an α parameter, i.e., converts the quantized LPC coefficients into the α parameter.
  • the parameter interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the inversely quantizing circuit 150.
  • the interpolated α parameter is supplied to various circuits such as the acoustic sensibility weighting circuit 170, the influence signal generating circuit 180, the weighting and synthesizing circuit 240, the adaptive code book 190, the sound source code searching circuit 210 and the gain code searching circuit 220, which will all be described later in detail.
  • the dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes.
  • the acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to the subtracter circuit 185.
  • the influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240.
  • the subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the sound source code searching circuit 210, and the gain code searching circuit 220.
  • the adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the sound source code searching circuit 210, the gain code searching circuit 220 and the sound source signal producing circuit 230.
  • the sound source code book 270 stores a noise sequence as the sound source codes. Each of the sound source codes is assigned with an index.
  • the search region designating circuit 200 receives the identifier, i.e., the index, of the sound source code for the subframe immediately before the current subframe, stores it, and designates the search region for the next subframe based on the stored identifier in response to the pitch parameter from the adaptive code book 190. In this embodiment, the designating circuit 200 designates as the search region a region of 20 sound source codes before and after the sound source code for the previous subframe.
  • the sound source code searching circuit 210 searches the designated search region of the sound source code book 270 for an optimal sound source code for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
  • the optimal searched sound source code is supplied to the gain code searching circuit 220, the sound source signal producing circuit 230 and the search region designating circuit 200 and the identifier of the searched sound source code is supplied to the multiplexer 250.
  • the gain code book 280 is a code book which stores gains of the sound source of the adaptive code book and the sound source code of the sound source code book.
  • the gain code searching circuit 220 searches the gain code book 280 for each of the gains of the adaptive code book sound source and of the sound source code searched by the searching circuit 210, from the difference signal from the subtracter 185, based on the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 210, and the pitch parameter from the adaptive code book 190.
  • the searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
  • the sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 210, and the gains from the searching circuit 220, produces a sound source signal for the current subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain and adding the products, and stores the result as the sound source signal for a previous subframe.
  • the stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190.
  • the weighting and synthesizing circuit 240 is a circuit for performing weighted synthesis.
  • the multiplexer 250 outputs via a terminal 260 a combination of a code sequence of the spectrum parameter from the quantizing circuit 140 (for a short term prediction), a code sequence of the pitch parameter from the adaptive code book 190 (a long term prediction), a code sequence of the sound source code from the sound source code searching circuit 210, and two code sequences of the gains from the gain code searching circuit 220.
  • In the second embodiment, a learned sound source code book 300 is used in place of the sound source code book 270 and stores vectors of a plurality of sound source codes having a subframe length, obtained through vector quantization.
  • the learned sound source code book 300 is known to a person skilled in the art and is disclosed in references 2 and 3. Because the learned sound source code book 300 is used in the second embodiment, it is not ensured that a sound source code adjacent to the sound source code for the previous subframe has a high similarity to that previous sound source code. Therefore, a similar code table 290 is prepared in this embodiment.
  • the similar code table 290 is a table storing the identifiers of a predetermined number of sound source codes, for example, 10 sound source codes in this embodiment, similar to each of the learned sound source codes of the learned sound source code book 300, in order of decreasing similarity.
  • the search region designating circuit 205 is similar to the search region designating circuit 200 in the first embodiment and refers to the similar code table 290 based on the identifier of the learned sound source code for the previous subframe to designate the search region of the learned sound source code book 300.
  • the search region designating circuit 205 receives the identifier, i.e., the index, of the learned sound source code for the subframe immediately before the current subframe, stores it, and refers to the similar code table 290 to designate the search region of the learned sound source code book 300 for the next subframe based on the stored identifier, in response to the pitch parameter from the adaptive code book 190.
  • the designating circuit 205 designates as the search region the region of the 10 learned sound source codes written in the similar code table 290 with respect to the learned sound source code for the previous subframe.
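The similar code table can be precomputed offline from the learned code book. The sketch below assumes normalized correlation (cosine similarity) as the similarity measure, which the text does not specify; the function name is illustrative.

```python
import numpy as np

def build_similar_code_table(codebook, n_similar=10):
    """For every learned code vector, precompute the indexes of the n_similar
    most similar other codes, ordered by decreasing similarity."""
    vecs = np.asarray(codebook, dtype=float)
    norms = np.linalg.norm(vecs, axis=1)
    sim = (vecs @ vecs.T) / np.outer(norms, norms)  # normalized correlation
    np.fill_diagonal(sim, -np.inf)                  # a code is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :n_similar]

book = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
table = build_similar_code_table(book, n_similar=2)
```

At encoding time, row `table[prev_index]` then gives the limited search region for the current subframe directly, with no distance computation in the search loop.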
  • the sound source code searching circuit 210 searches the designated search region of the learned sound source code book 300 for an optimal sound source code for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
  • the optimal searched sound source code is supplied to the gain code searching circuit 220 and the sound source signal producing circuit 230, and the identifier of the searched sound source code is supplied to the multiplexer 250 and the search region designating circuit 205.
  • the speech signal coding system includes a buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100.
  • a FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy. The reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter.
  • An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter.
  • An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360.
  • the interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GPSO gain code searching circuit 330, which will all be described later in detail.
  • a dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes.
  • the acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracting circuit 185.
  • the influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240.
  • a subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the VSELP sound source code searching circuit 340 and the GPSO gain code searching circuit 330.
  • the adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal for a previous subframe supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230.
  • a sound source code book includes the gray code table 295 and the base vector table 310.
  • the gray code table 295 stores a plurality of gray codes.
  • the base vector table 310 stores base vectors, various combinations of which represent sound source codes of the sound source code book. Each of the base vectors is assigned an index.
  • a gray code search region designating circuit 195 designates as the search region the 10 gray codes before and after the gray code for the subframe immediately before the current subframe.
  • a VSELP sound source code searching circuit 340 refers to the gray code table 295 and the base vector table 310 to produce sound source codes as combinations of the base vectors and to determine an optimal sound source code.
  • the VSELP sound source code searching circuit 340 searches the designated search region for an optimal sound source code using the base vectors of the base vector table 310, in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190.
  • the searching processing of the optimal sound source code is disclosed in the reference 4.
  • the searched sound source code is supplied to the GPSO gain code searching circuit 330 and the sound source signal producing circuit 230, and the index of the gray code for the optimal sound source code is supplied to the multiplexer 250 and the gray code designating circuit 195.
  • a GPSO gain code book 320 is a code book which stores gains obtained by converting the gains of the adaptive code book 190 and of the sound source code into two parameters, GS and P0, and performing a two-dimensional vector quantization on the two parameters.
  • the search operation of the GPSO gain code searching circuit 330 is disclosed in detail in, for example, the U.S. TIA Recommendation, TIA Technical Subcommittee TR-45.3, "Digital Cellular Standards, Baseline Text for Speech Coder and Speech Decoder".
  • the searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
  • the sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340, and the gains from the searching circuit 330, produces a sound source signal for the subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain and adding the products, and stores the result as a previous sound source signal.
  • the stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190.
  • the weighting and synthesizing circuit 240 is a circuit for performing weighted synthesis.
  • the multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GPSO gain code searching circuit 330.
  • In the above embodiments, the search region is always designated based on the sound source code for the subframe immediately before the current subframe.
  • However, the whole region of the sound source code book may be designated as the search region for the first subframe of the speech signal, and a limited region thereof may be designated for the second and subsequent subframes.
  • Alternatively, the sound source codes for a plurality of subframes previous to the current subframe may be used to determine the sound source code for the current subframe.
  • The whole region and a limited region of the sound source code book may also be designated alternately as the search region for consecutive subframes.
  • Although the sound source code book searching circuit is of one stage in the above embodiments, the same advantage could be obtained with other searching circuit configurations.
  • In the above embodiments, the search region to be designated is fixed in size.
  • However, the search region may be varied based on other data such as the spectrum parameter or pitch parameter.
  • Suppose, for example, that the search region is changed based on the spectrum parameter.
  • For this, the first to third embodiments are modified such that the quantized spectrum parameter is supplied to the search region designating circuit 200, 205, or 195, as shown by the dashed lines in Figs. 2, 3 and 4.
  • the search region is limited to 10 sound source codes when the quantized spectrum parameter is in the range of 0 to 2^(B-2) - 1, 15 sound source codes when it is in the range of 2^(B-2) to 2^(B-1) - 1, 20 sound source codes when it is in the range of 2^(B-1) to 2^(B-1) + 2^(B-2) - 1, and 25 sound source codes when it is in the range of 2^(B-1) + 2^(B-2) to 2^B - 1.
  • the search region can be flexibily varied based on the quantized spectrum parameter, so that the search region can be optimized. The same matter-can be applied to the pitch parameter.
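The four ranges above can be sketched as follows. This is an illustrative sketch only; the function name is assumed, and only the range boundaries and region sizes come from the text.

```python
def region_size_from_spectrum(q, B):
    """Map a B-bit quantized spectrum parameter q (0 .. 2^B - 1) to the
    number of sound source codes in the search region, following the
    four ranges given above."""
    quarter = 2 ** (B - 2)
    if q < quarter:          # 0 .. 2^(B-2) - 1
        return 10
    if q < 2 * quarter:      # 2^(B-2) .. 2^(B-1) - 1
        return 15
    if q < 3 * quarter:      # 2^(B-1) .. 2^(B-1) + 2^(B-2) - 1
        return 20
    return 25                # 2^(B-1) + 2^(B-2) .. 2^B - 1
```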

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Abstract

In a speech signal coding system, the speech signal is divided into a plurality of frames and each frame is divided into a plurality of subframes. The speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction. The speech signal coding system includes a sound source code book (270; 300; 295, 310) storing a plurality of sound source codes, a designating section (200, 290, 195) for designating a search region, and a searching section (210, 340) for searching the designated search region of the sound source code book for a sound source code optimal for a current subframe.

Description

    Background of the Invention Field of the Invention
  • The present invention relates to speech signal coding, and more particularly, to a method and speech signal coding system for encoding a speech signal with a search region of a sound source code book being limited.
  • Description of the Related Art
  • Recently, telephones such as automobile telephones and cordless telephones using a radio frequency band have come into wide use. In such telephones, a high efficiency speech signal encoding technique, by which a speech signal can be efficiently digitized and compressed, is as important a problem as the effective use of the frequency bandwidth, the downsizing of the antenna, and low power consumption. Since only a narrow frequency bandwidth is available in the radio frequency band, it is important to develop a system for encoding a speech signal at a low bit rate. As a system in which a speech signal can be encoded at a bit rate as low as 4 to 8 kbps, there is known, for example, the Code Excited Linear Predictive Coding (CELP) system described in the paper by M. Schroeder and B. S. Atal, entitled "Code-excited linear prediction: High quality speech at low bit rates" (Proc. ICASSP '85, pp. 937-940, 1985; reference 1).
  • In this system, the encoding process is performed on the transmitting side in accordance with the following procedure. First, a short term prediction is performed. For this purpose, a speech signal is divided into a plurality of frames of, for example, 20 ms and then a spectrum parameter indicative of the frequency characteristic of the speech signal is extracted from the speech signal for every frame (This process is referred to as "short term prediction"). Linear Predictive Coding (LPC) is used for the short term prediction in many cases. The spectrum parameter obtained in the short term prediction is used as coefficients for a synthetic filter in the encoding and decoding processes.
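The short term prediction step can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the frame length, the prediction order, and the direct linear solve (instead of the Levinson-Durbin recursion usually used in real coders) are assumptions.

```python
import numpy as np

def lpc_coefficients(frame, order=10):
    """Solve the LPC normal equations for one frame by the
    autocorrelation method (a direct solve, for brevity)."""
    n = len(frame)
    # autocorrelation r[0] .. r[order]
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz system R a = r[1:], R[i][j] = r[|i - j|]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])
```

For example, a decaying exponential x[n] = 0.5^n satisfies x[n] = 0.5·x[n-1], so a first-order analysis recovers a coefficient close to 0.5.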
  • Next, a long term prediction is performed. Each frame is divided into N subframes each having a shorter time interval of, for example, 5 ms. A pitch parameter (delay) L indicative of a long term correlation (a pitch correlation) and a gain are determined for every subframe based on the spectrum parameter and a sound source signal obtained from previous sound source codes. The long term prediction thus performed is referred to as "an adaptive code book". This is a least square error problem between the speech signal and a synthetic signal h[n] * r[n - L] (the symbol "*" indicates a convolution operation) obtained from a signal r[n - L], which is obtained by delaying a previous sound source signal r[n] by L, as shown in the following equation (1):

    EL = Σn ( p[n] - β·h[n]*r[n - L] )²   (1)

    where p[n] is a speech signal, β is an amplitude or gain, h[n] is an impulse response of a synthetic filter determined based on the result of the short term prediction, r[n] is a signal for a previous sound source code, and L is a delay value. More specifically, there are determined the delay value L and the gain β for which the value VL, obtained by dividing the square of the cross correlation CL between the speech signal and the synthetic signal by the self-correlation GL of the synthetic signal, is maximum while the delay value L is varied in the range of, for example, 20 to 147, which is considered to cover the basic frequency range of speech, as shown in the following equations (2) to (5):

    CL = Σn p[n]·( h[n]*r[n - L] )   (2)

    GL = Σn ( h[n]*r[n - L] )²   (3)

    VL = CL² / GL   (4)

    β = CL / GL   (5)
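The delay search of equations (2) to (5) can be sketched as follows. This is an illustrative sketch, not the patent's implementation; the history-buffer layout, the handling of delays shorter than the subframe (repeating the lag), and the use of a plain convolution are assumptions.

```python
import numpy as np

def adaptive_codebook_search(p, r, h, delay_range=(20, 148)):
    """Find the delay L and gain beta maximizing V_L = C_L**2 / G_L.

    p: weighted speech signal for the subframe (length N)
    r: history buffer of the previous sound source (excitation) signal
    h: impulse response of the weighted synthesis filter
    """
    N = len(p)
    best_L, best_V, beta = None, 0.0, 0.0
    for L in range(*delay_range):
        seg = r[len(r) - L:len(r) - L + N]     # previous excitation r[n - L]
        if len(seg) < N:                       # L < N: repeat the lag (assumed)
            seg = np.resize(r[len(r) - L:], N)
        synth = np.convolve(seg, h)[:N]        # h[n] * r[n - L]
        C = float(np.dot(p, synth))            # cross correlation, eq. (2)
        G = float(np.dot(synth, synth))        # self-correlation, eq. (3)
        if G > 0.0 and C * C / G > best_V:     # criterion V_L, eq. (4)
            best_L, best_V, beta = L, C * C / G, C / G   # gain, eq. (5)
    return best_L, beta
```

For a pulse train with period 40 samples, the search returns L = 40 with unit gain.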


       Next, there are determined a code vector ej (ej corresponds to a sound source code and j is an index indicative of the code vector ej) and a gain γ for which the error power between a synthetic signal h[n] * ej[n], obtained from a code vector ej[n] extracted from a noise sequence (sound source code book) prepared in advance, and a difference signal d[n] obtained after the long term prediction is minimum. This corresponds to a least square error problem between the synthetic signal h[n] * ej[n] produced from the code vector ej[n] and the difference signal d[n], as shown in the following equation (6), similar to the long term prediction:

    Ej = Σn ( d[n] - γ·h[n]*ej[n] )²   (6)

    where d[n] = p[n] - β·h[n]*r[n - L], d[n] is the speech signal after the long term prediction is performed, γ is an amplitude or gain, j is an index of a sound source code in the sound source code book, and ej is the sound source code specified by the index j. More specifically, in a case where a B-bit sound source code book is used, the values Vj given by the following equations (7) to (10) are calculated over all the sound source codes while the index j of the sound source code is varied from 0 to 2^B - 1, and then j and γ are determined by finding the Vj taking the maximum value:

    Cj = Σn d[n]·( h[n]*ej[n] )   (7)

    Gj = Σn ( h[n]*ej[n] )²   (8)

    Vj = Cj² / Gj   (9)

    γ = Cj / Gj   (10)

    That is, the search is performed not over a limited region of the sound source codes but over the whole region of the sound source codes.
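The exhaustive search of equations (7) to (10) can be sketched as follows. This is an illustrative sketch under assumed data layouts (one code vector per row), not the patent's implementation.

```python
import numpy as np

def sound_source_code_search(d, h, codebook):
    """Scan all 2^B code vectors and keep the index j maximizing
    V_j = C_j**2 / G_j, returning j and the gain gamma.

    d: difference signal after the long term prediction (length N)
    h: impulse response of the weighted synthesis filter
    codebook: array of shape (2**B, N)
    """
    best_j, best_V, gamma = -1, -1.0, 0.0
    for j, e in enumerate(codebook):
        synth = np.convolve(e, h)[:len(d)]   # h[n] * e_j[n]
        C = float(np.dot(d, synth))          # eq. (7)
        G = float(np.dot(synth, synth))      # eq. (8)
        if G > 0.0 and C * C / G > best_V:   # eq. (9)
            best_j, best_V, gamma = j, C * C / G, C / G  # eq. (10)
    return best_j, gamma
```

If the difference signal is an exact scaled copy of one code vector, that code and its scale factor are recovered.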
  • The determined index j of a sound source code, gain γ, pitch parameter (delay) L, gain β and spectrum parameter are transmitted.
  • Also, there is proposed, as one of the speech signal coding systems, a learned sound source code book which is produced by performing vector quantization on the difference signal after the short term prediction and the long term prediction are performed. This system is disclosed in the Laid Open Japanese Patent Applications JP-A-Hei3-243998 (reference 2) and JP-A-Hei3-243999 (reference 3).
  • The system in which a code vector eI[n] (I is an index of a sound source code) is represented by a linear sum of a small number M of base vectors vm[n] (m is an index of a base vector; M is as small as about 9), as shown in the following equation (11), is referred to as the Vector Sum Excited LPC (VSELP) system and has been standardized as the full rate speech signal coding system for automobile telephones in North America and Japan:

    eI[n] = Σm ϑIm·vm[n]   (m = 1, ..., M)   (11)

    where ϑIm is 1 when the m-th bit of a sound source code having the index I is 1 and -1 when the m-th bit thereof is 0. This type of speech signal coding system is disclosed in detail in the Laid Open Japanese Patent Application JP-A-Hei2-502135 (reference 4). In this speech signal coding system, the cross correlation Cj between a speech signal and a synthetic signal produced based on previous sound source codes and the self-correlation Gj can be calculated recursively from the following equations (12) to (15) by using the special structure of the sound source code book. As a result, the amount of calculation can be reduced to M/N.

    Gu = Gi + ϑuv·Rv   (12)

    (Equations (13) to (15), which define Rv and give the corresponding recursive update of Cu, appear only as images in the source text and are not reproduced here.)

    In the above equations (12) to (15), u and i are sound source codes which differ from each other in only one bit, v denotes the position of the differing bit, and u is produced from a Gray code (reflected binary code) represented by the following equation (16):

    u = i ⊕ (i >> 1)   (16)

    where i is an integer in the range of 0 to 2^M - 1, ⊕ represents an exclusive OR, and >> represents a right shift. The search for a sound source code is performed sequentially over the Gray codes u corresponding to i = 0 to 2^M - 1 when the number of base vectors is M. That is, the search is performed over all the Gray codes without limiting the range.
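Equations (11) and (16) can be sketched together as follows. This is an illustrative sketch: the bit ordering of ϑIm (least significant bit as the lowest m) and the array layout are assumptions, and the incremental updates of equations (12) to (15) are replaced here by direct computation.

```python
import numpy as np

def vselp_code_vector(I, basis):
    """Equation (11): code vector e_I as a signed sum of M base vectors.
    Bit m of index I selects theta_Im = +1 (bit set) or -1 (bit clear)."""
    M = basis.shape[0]
    theta = np.array([1.0 if (I >> m) & 1 else -1.0 for m in range(M)])
    return theta @ basis

def gray_sequence(M):
    """Equation (16): the scan order u = i XOR (i >> 1), i = 0 .. 2^M - 1.
    Successive codes differ in exactly one bit, which is what makes the
    recursive updates of equations (12) to (15) possible."""
    return [i ^ (i >> 1) for i in range(2 ** M)]
```

The one-bit-difference property of the Gray sequence is easy to verify: every pair of consecutive codes differs in a single bit position.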
  • Next, a speech signal coding system of the VSELP type, as a representative conventional system, will be described below with reference to Fig. 1. The speech signal coding system includes a buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100. A FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy. The reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter. An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter. An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360. The interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GSPO gain code searching circuit 330, all of which will be described later in detail.
  • A dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes. The acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracter 185. The influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240. The subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190. The adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GSPO gain code searching circuit 330 and the sound source signal producing circuit 230.
  • The base vector table 310 stores base vectors, various combinations of which represent the sound source codes of a sound source code book. Each of the base vectors is assigned an index. The VSELP sound source code searching circuit 340 searches the base vector table 310 for an optimal sound source code using all the base vectors in response to the interpolated α parameter from the interpolating circuit 160, the subtracted signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The searched sound source code is supplied to the GSPO gain code searching circuit 330 and the sound source signal producing circuit 230, and the indexes of the base vectors used in the searched sound source code are supplied to the multiplexer 250.
  • A GSPO gain code book 280 is a code book which stores gains for the sound source of the adaptive code book and the sound source code of the sound source code book. The GSPO gain code searching circuit 330 searches for each of the gains of the sound source of the adaptive code book and of the sound source code from the subtracted signal from the subtracter 185, based on the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 340, and the pitch parameter from the adaptive code book 190. The searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
  • The sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340 and the gains from the searching circuit 330, produces a sound source signal for the subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain term and adding the multiplied results to each other, and stores the result as a previous sound source signal. The stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190. The weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization. The multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GSPO gain code searching circuit 330.
  • The above mentioned processing is performed for each subframe and the speech signal is encoded.
  • Since the speech signal coding system of the CELP type requires a very large amount of calculation, a digital signal processor must operate with a clock signal having a frequency as high as 30 MHz in order to implement such an apparatus. As a result, the power consumption of the digital signal processor increases remarkably. In the case of a portable telephone, which cannot have a large battery, the battery would be exhausted after only about 45 minutes of operation.
  • Summary of the Invention
  • An object of the present invention is to provide a method of encoding a speech signal with good quality in spite of a low bit rate while allowing communication for a longer time.
  • Another object of the present invention is to provide a speech signal coding system for achieving the method.
  • In order to achieve an aspect of the present invention, the method of encoding a speech signal according to the present invention includes the steps of:
       providing a sound source code book for a plurality of sound source codes;
       providing a gain code book storing gain codes;
       performing a short term prediction for each of long time intervals of a speech signal to produce a first parameter;
       performing a long term prediction for each of short time intervals of the each long time interval of a difference signal between the speech signal and a signal obtained based on the first parameter and previous sound source codes to produce a second parameter;
       designating a search region of the sound source code book;
       searching the designated search region of the sound source code book for a sound source code optimal to said each short time interval of the difference signal to determine a third parameter for the optimal sound source code;
       searching the gain code book to determine a fourth parameter for gain of the second and third parameters; and
       outputting a combination of the first to fourth parameters.
  • In order to achieve another aspect of the present invention, there is provided a speech signal coding system in which a speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction. The speech signal coding system includes a sound source code book for a plurality of sound source codes, a designating section for designating a search region, and a searching section for searching the designated search region of the sound source code book for an optimal sound source code for a current short time interval, the speech signal being divided into portions for a plurality of long time intervals, each long time interval being divided into a plurality of short time intervals.
  • Brief Description of the Drawings
    • Fig. 1 is a block diagram showing a conventional speech signal coding system;
    • Fig. 2 is a block diagram showing a speech signal coding system according to a first embodiment of the present invention;
    • Fig. 3 is a block diagram showing a speech signal coding system according to a second embodiment of the present invention; and
    • Fig. 4 is a block diagram showing a speech signal coding system according to a third embodiment of the present invention.
    Description of the Preferred Embodiments
  • A speech signal coding system according to the present invention will be described below in detail with reference to the accompanying drawings.
  • In the first embodiment of the present invention, the search region of a sound source code for a current subframe is designated and limited to a range of a predetermined fixed number of sound source codes before and after the sound source code for the subframe immediately before the current subframe, for example, a range of 10 sound source codes in each direction from the previous subframe's sound source code. In the first embodiment, because a noise sequence is used as a sound source code book, it could be considered that the current subframe's sound source code is very similar to the previous subframe's sound source code. Therefore, even if the search region is limited to a predetermined range around the previous subframe's sound source code, an optimal sound source code for the current subframe could be found in the limited search region.
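This search region designation can be sketched as follows. The sketch is illustrative: the function and parameter names are assumptions, and clipping at the code book ends (rather than wrap-around, which the text does not specify) is assumed.

```python
def designate_search_region(prev_index, book_size, half_width=10):
    """Return the candidate code indices for the current subframe:
    half_width sound source codes on each side of the code chosen for
    the immediately preceding subframe, plus that code itself, clipped
    to the bounds of the code book."""
    lo = max(0, prev_index - half_width)
    hi = min(book_size - 1, prev_index + half_width)
    return list(range(lo, hi + 1))
```

With half_width = 10, at most 21 of the code book's entries are searched per subframe instead of all of them, which is the source of the computational saving.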
  • The speech signal coding system according to a first embodiment of the present invention will be described below with reference to Fig. 2. In Fig. 2, the speech signal coding system includes a buffer circuit 110, a subframe dividing circuit 120, an LPC analyzing circuit 130, a parameter quantizing circuit (Q) 140, a parameter inversely quantizing circuit (Q⁻¹) 150, a parameter interpolating circuit 160, an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a subtracter 185, an adaptive code book 190, a search region designating circuit 200, a sound source searching circuit 210, a gain code searching circuit 220, a sound source signal producing circuit 230, a weighting and synthesizing circuit 240, a multiplexer 250, a noise sequence 270 as a sound source code book, and a gain code book 280.
  • In the speech signal coding system according to the first embodiment of the present invention, the buffer circuit 110 receives a speech signal inputted via a terminal 100 and stores it therein. The LPC analyzing circuit 130 performs speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to extract LPC coefficients (an α parameter) as a spectrum parameter. The parameter quantizing circuit 140 quantizes the LPC coefficients as the α parameter and the quantized coefficients are supplied to the multiplexer 250 as a spectrum parameter. The inversely quantizing circuit 150 performs an inverse quantization processing for the quantized coefficients supplied from the quantizing circuit 140 to recover an α parameter, i.e., converts the quantized LPC coefficients into the α parameter. The parameter interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the inversely quantizing circuit 150. The interpolated α parameter is supplied to various circuits such as the acoustic sensibility weighting circuit 170, the influence signal generating circuit 180, the weighting and synthesizing circuit 240, the adaptive code book 190, the sound source code searching circuit 210 and the gain code searching circuit 220, all of which will be described later in detail.
  • The dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes. The acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to the subtracter 185. The influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240. The subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the sound source code searching circuit 210, and the gain code searching circuit 220. The adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the sound source code searching circuit 210, the gain code searching circuit 220 and the sound source signal producing circuit 230.
  • The sound source code book 270 stores a noise sequence as the sound source codes. Each of the sound source codes is assigned an index. The search region designating circuit 200 receives and stores the identifier, i.e., the index, of the sound source code for the subframe immediately before the current subframe, and designates the search region for the next subframe based on the stored identifier for the previous subframe in response to the pitch parameter from the adaptive code book 190. In this embodiment, the designating circuit 200 designates as the search region a region of 10 sound source codes on each side of the sound source code for the previous subframe. The sound source code searching circuit 210 searches the designated search region of the sound source code book 270 for the sound source code optimal for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The searched optimal sound source code is supplied to the gain code searching circuit 220, the sound source signal producing circuit 230 and the search region designating circuit 200, and the identifier of the searched sound source code is supplied to the multiplexer 250.
  • The gain code book 280 is a code book which stores gains for the sound source of the adaptive code book and the sound source code of the sound source code book. The gain code searching circuit 220 searches the gain code book 280 for each of the gains of the sound source of the adaptive code book and of the sound source code searched by the searching circuit 210, from the difference signal from the subtracter 185, based on the interpolated α parameter from the interpolating circuit 160, the searched sound source code from the searching circuit 210, and the pitch parameter from the adaptive code book 190. The searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
  • The sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 210 and the gains from the searching circuit 220, produces a sound source signal for the current subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain term and adding the multiplied results to each other, and stores the result as the sound source signal for a previous subframe. The stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190. The weighting and synthesizing circuit 240 is a circuit for performing weighting synthesization. The multiplexer 250 outputs via a terminal 260 a combination of a code sequence of the spectrum parameter from the quantizing circuit 140 (for the short term prediction), a code sequence of the pitch parameter from the adaptive code book 190 (for the long term prediction), a code sequence of the sound source code from the sound source code searching circuit 210, and two code sequences of the gains from the gain code searching circuit 220.
  • Next, the speech signal coding system according to the second embodiment of the present invention will be described below with reference to Fig. 3. The same components as those in Fig. 2 are assigned the same reference numerals and their description will be omitted. The different components and their operation will be described.
  • In the second embodiment, a learned sound source code book 300 is used in place of the sound source code book 270 and stores vectors of a plurality of sound source codes having a subframe length obtained through vector quantization. The learned sound source code book 300 is known to a person skilled in the art and is disclosed in the references 2 and 3. Because the learned sound source code book 300 is used in the second embodiment, it is not ensured that a sound source code whose index is adjacent to that of the sound source code for the previous subframe has a high similarity to that sound source code. Therefore, a similar code table 290 is prepared in this embodiment. The similar code table 290 is a table which stores, for each of the learned sound source codes of the learned sound source code book 300, the identifiers of a predetermined number of similar sound source codes, for example, 10 sound source codes in this embodiment, in order of higher similarity. The search region designating circuit 205 is similar to the search region designating circuit 200 in the first embodiment and refers to the similar code table 290 based on the identifier of the learned sound source code for the previous subframe to designate the search region of the learned sound source code book 300.
  • The search region designating circuit 205 receives and stores the identifier, i.e., the index, of the learned sound source code for the subframe immediately before the current subframe, and refers to the similar code table 290 to designate the search region of the learned sound source code book 300 for the next subframe based on the stored identifier for the previous subframe in response to the pitch parameter from the adaptive code book 190. In this embodiment, the designating circuit 205 designates as the search region the region of the 10 learned sound source codes written in the similar code table 290 with respect to the learned sound source code for the previous subframe. The sound source code searching circuit 210 searches the designated search region of the learned sound source code book 300 for the sound source code optimal for the current subframe in response to the interpolated α parameter from the interpolating circuit 160, the difference signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The searched optimal sound source code is supplied to the gain code searching circuit 220 and the sound source signal producing circuit 230, and the identifier of the searched sound source code is supplied to the multiplexer 250 and the search region designating circuit 205.
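The construction of a similar code table can be sketched as follows. The sketch is illustrative only: the patent does not specify the similarity measure, so Euclidean distance between code vectors is assumed, and the function and parameter names are assumptions.

```python
import numpy as np

def build_similar_code_table(codebook, n_similar=10):
    """For each learned code vector, precompute the indices of the
    n_similar most similar other codes, in order of similarity
    (smallest Euclidean distance first, as an assumed measure)."""
    table = {}
    for i, c in enumerate(codebook):
        dist = np.linalg.norm(codebook - c, axis=1)
        dist[i] = np.inf                      # a code is not its own neighbour
        table[i] = [int(k) for k in np.argsort(dist)[:n_similar]]
    return table
```

At coding time the designating circuit then needs only a table lookup: the search region for the current subframe is `table[prev_index]`.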
  • Next, the speech signal coding system according to the third embodiment of the present invention will be described below with reference to Fig. 4.
  • The speech signal coding system according to the third embodiment includes a buffer circuit 110 which is a circuit for storing a speech signal inputted via a terminal 100. A FLAT analyzing circuit 350 is a circuit for performing speech analysis for each of a plurality of frames of the speech signal stored in the buffer circuit 110 to calculate reflection coefficients and frame energy. The reflection coefficients and frame energy are quantized and supplied to a multiplexer 250 as a spectrum parameter. An α parameter converting circuit 360 performs an inverse quantization processing for the reflection coefficients supplied from the FLAT analyzing circuit 350 to produce an α parameter, i.e., converts the reflection coefficients into the α parameter. An interpolating circuit 160 interpolates an α parameter between subframes from the α parameters supplied from the converting circuit 360. The interpolated α parameter is supplied to various circuits such as an acoustic sensibility weighting circuit 170, an influence signal generating circuit 180, a weighting and synthesizing circuit 240, an adaptive code book 190, a VSELP sound source code searching circuit 340, and a GSPO gain code searching circuit 330, all of which will be described later in detail.
  • A dividing circuit 120 connected to the buffer circuit 110 divides each of the plurality of frames of the speech signal stored in the buffer circuit 110 into a plurality of subframes. The acoustic sensibility weighting circuit 170 performs acoustic sensibility weighting for each subframe of the speech signal using the interpolated α parameter and supplies the weighted subframes of the speech signal to a subtracter 185. The influence signal generating circuit 180 generates an influence signal as a zero input response of a weighted synthetic filter using the interpolated α parameter and a weighted synthetic signal from the weighting and synthesizing circuit 240. The subtracter 185 subtracts the influence signal supplied from the influence signal generating circuit 180 from the acoustic sensibility weighted subframe of the speech signal and supplies a difference signal as the subtracting result to the adaptive code book 190, the VSELP sound source code searching circuit 340 and the GSPO gain code searching circuit 330. The adaptive code book 190 performs a long term prediction for each subframe of the difference signal using the interpolated α parameter and a sound source signal for a previous subframe supplied from the sound source signal producing circuit 230 to determine an optimal delay. The determined delay is supplied to the multiplexer 250, the VSELP sound source code searching circuit 340, the GSPO gain code searching circuit 330 and the sound source signal producing circuit 230.
  • A sound source code book includes the Gray code table 290 and the base vector table 310. The Gray code table 290 stores a plurality of Gray codes. The base vector table 310 stores base vectors, various combinations of which represent the sound source codes of the sound source code book. Each of the base vectors is assigned an index. A Gray code search region designating circuit 195 designates as the search region the 10 Gray codes before and after the Gray code for the subframe immediately before the current subframe. The VSELP sound source code searching circuit 340 refers to the Gray code table 290 and the base vector table 310 to produce sound source codes as combinations of the base vectors and to determine an optimal sound source code. In other words, the VSELP sound source code searching circuit 340 searches the designated search region for an optimal sound source code in response to the interpolated α parameter from the interpolating circuit 160, the subtracted signal from the subtracter 185, and the result of the long term prediction in the adaptive code book 190. The searching processing for the optimal sound source code is disclosed in the reference 4. The searched sound source code is supplied to the GSPO gain code searching circuit 330 and the sound source signal producing circuit 230, and the index of the Gray code for the optimal sound source code is supplied to the multiplexer 250 and the Gray code designating circuit 195.
  • A GPSO gain code book 320 is a code book which stores gains obtained by converting the gains of the adaptive code book 190 and the sound source code book into two parameters GS and PO and performing a two-dimensional vector quantization on the two parameters. The search operation of the GPSO gain code searching circuit 330 is disclosed in detail in, for example, the U.S. TIA Recommendation, TIA Technical Subcommittee TR-45.3, "Digital Cellular Standards, Baseline Text for Speech coder and Speech decoder". The searched gains are supplied to the multiplexer 250 and the sound source signal producing circuit 230.
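At its core, the two-dimensional gain quantization is a nearest-neighbour search over a table of (GS, PO) pairs. The sketch below uses a plain squared-error criterion for illustration; the actual GPSO search in the TIA baseline text minimizes a weighted synthesis error, so this is a deliberate simplification with assumed names.

```python
def search_gain_code(gs_target, po_target, gain_codebook):
    """Index of the nearest (GS, PO) entry in a two-dimensional
    gain code book, by squared error (illustrative criterion)."""
    best_i, best_d = 0, float("inf")
    for i, (gs, po) in enumerate(gain_codebook):
        d = (gs - gs_target) ** 2 + (po - po_target) ** 2
        if d < best_d:
            best_i, best_d = i, d
    return best_i
```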
  • The sound source signal producing circuit 230 receives the pitch parameter from the adaptive code book 190, the sound source code from the searching circuit 340 and the gains from the searching circuit 330, produces a sound source signal for the subframe by multiplying each of the sound sources of the adaptive code book and the sound source code book by its gain term and adding the multiplied results to each other, and stores it as a previous sound source signal. The stored sound source signal is supplied to the weighting and synthesizing circuit 240 and the adaptive code book 190. The weighting and synthesizing circuit 240 is a circuit for performing weighting synthesis. The multiplexer 250 outputs via a terminal 260 a combination of code sequences from the FLAT analyzing circuit 350, the adaptive code book 190, the VSELP sound source code searching circuit 340, and the GPSO gain code searching circuit 330.
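The production step above reduces to scaling the two contributions by their gains and summing them sample by sample. A minimal sketch, with all names assumed:

```python
def produce_excitation(adaptive_vec, code_vec, g_adaptive, g_code):
    """Subframe sound source signal: each contribution scaled by its
    gain term and the results added sample by sample."""
    return [g_adaptive * a + g_code * c
            for a, c in zip(adaptive_vec, code_vec)]
```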
  • In the speech signal coding system of any of the first, second and third embodiments, the search region is always designated based on the sound source code for the subframe immediately before the current subframe. However, the whole region of the sound source code book may be designated as the search region for the first subframe of the speech signal and a limited region thereof may be designated for the second and subsequent subframes. Also, the sound source codes for a plurality of subframes previous to the current subframe may be used to determine a sound source code for the current subframe. Further, the whole region and a limited region of the sound source code book may be alternately designated as the search region for successive subframes.
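One of the variants just described — a full search on the first subframe of a frame, a limited window around the previous code thereafter — can be sketched as below. Clamping at the code book edges, rather than wrapping, is one possible choice; names and defaults are illustrative.

```python
def search_region_for_subframe(subframe_index, prev_code, codebook_size, width=10):
    """Whole code book for the first subframe (or when no previous code
    exists); otherwise a window of `width` codes around the previous code."""
    if subframe_index == 0 or prev_code is None:
        return list(range(codebook_size))
    lo = max(0, prev_code - width)
    hi = min(codebook_size, prev_code + width + 1)
    return list(range(lo, hi))
```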
  • In addition, in the above embodiments, the sound source code book searching circuit is of one stage. However, even if a multistage sound source code book searching circuit is used, the same advantage can be obtained.
  • Further, the description has been given using the LPC analyzing circuit in the first and second embodiments and the FLAT analyzing circuit in the third embodiment. However, it is apparent that the same advantage can be obtained with other spectrum parameters such as PARCOR coefficients and cepstrum coefficients.
  • Furthermore, in the above embodiments, the search region to be designated is fixed. However, the search region may be varied based on other data such as the spectrum parameter or the pitch parameter. Such a modification will be described below. In this modification, the search region is changed based on the spectrum parameter. The first to third embodiments are modified such that the quantized spectrum parameter is supplied to the search region designating circuit 200, 205, or 195, as shown in Figs. 2, 3 and 4 by the dashed lines. When each of the sound source codes of the sound source code book is of B bits, the codes lie in a range of 0 to 2^B - 1. In this case, for instance, the search region is limited to 10 sound source codes when the quantized spectrum parameter is in a range of 0 to 2^(B-2) - 1, 15 sound source codes when the quantized spectrum parameter is in a range of 2^(B-2) to 2^(B-1) - 1, 20 sound source codes when the quantized spectrum parameter is in a range of 2^(B-1) to 2^(B-1) + 2^(B-2) - 1, and 25 sound source codes when the quantized spectrum parameter is in a range of 2^(B-1) + 2^(B-2) to 2^B - 1. In this manner, the search region can be flexibly varied based on the quantized spectrum parameter, so that the search region can be optimized. The same scheme can be applied to the pitch parameter.
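The quarter-range thresholds above amount to a small lookup: which quarter of [0, 2^B) the quantized spectrum parameter falls into selects a window of 10, 15, 20 or 25 codes. B and the widths follow the text; everything else (names, the integer-valued parameter) is illustrative.

```python
def region_width(q_param, bits):
    """Search-region size (10/15/20/25 sound source codes) chosen by the
    quarter of the range [0, 2**bits) containing the quantized spectrum
    parameter, per the modification described in the text."""
    quarter = 2 ** bits // 4
    return (10, 15, 20, 25)[min(q_param // quarter, 3)]
```

For B = 4 bits the quarters are [0, 4), [4, 8), [8, 12), [12, 16), mapping to widths 10, 15, 20 and 25 respectively.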

Claims (13)

  1. A speech signal coding system in which a speech signal is represented by data including a code of a long term prediction parameter of the speech signal, a code of a short term prediction parameter thereof, and a quantized code indicative of a difference signal between a signal for short term prediction and a signal for long term prediction, said speech signal coding system comprising:
       a sound source code book (270; 300; 290 and 310) for storing a plurality of sound source codes;
       designating means (200; 205, 290; 195) for designating a search region;
       searching means (210; 340) for searching said designated search region of said sound source code book for an optimal sound source code for a current short time interval, the speech signal being divided into portions for a plurality of long time intervals, each long time interval being divided into a plurality of short time intervals.
  2. A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval a fixed range with respect to at least one of sound source codes for the short time intervals before the current short time interval.
  3. A speech signal coding system according to claim 2, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval the fixed range with respect to the sound source code for the short time interval immediately before the current short time interval.
  4. A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) alternatively designates as said search region for the current short time interval one of a whole of said plurality of sound source codes and a fixed range with respect to the sound source code for the short time interval immediately before the current short time interval.
  5. A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval a variable range determined based on the long term prediction parameter code with respect to at least one of sound source codes for the short time intervals before the current short time interval.
  6. A speech signal coding system according to claim 5, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval the variable range determined based on the long term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
  7. A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) alternatively designates as said search region for the current short time interval one of a whole of said plurality of sound source codes and a variable range determined based on the long term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
  8. A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval a variable range determined based on the short term prediction parameter code with respect to at least one of sound source codes for the short time intervals before the current short time interval.
  9. A speech signal coding system according to claim 8, wherein said designating means (200; 205, 290; 195) designates as said search region for the current short time interval the variable range determined based on the short term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
  10. A speech signal coding system according to claim 1, wherein said designating means (200; 205, 290; 195) alternatively designates as said search region for the current short time interval one of a whole of said plurality of sound source codes and a variable range determined based on the short term prediction parameter code with respect to the sound source code for the short time interval immediately before the current short time interval.
  11. A speech signal coding system according to claim 1, wherein said sound source code book (270) is a noise sequence.
  12. A speech signal coding system according to claim 1, wherein said sound source code book (300) is a learned sound source code book, and
       wherein said designating means (205, 290) includes:
       a table (290) for storing indication data indicative of the learned sound source codes similar to each of said plurality of learned sound source codes; and
       means (205) for designating the search region based on the indication data.
  13. A speech signal coding system according to claim 1, wherein said sound source code book includes a gray code table (290) and a vector table (310) storing a plurality of base vectors, and
    wherein said designating means (195) designates the search region in said gray code table with respect to the sound source code for at least one of the short time intervals before the current short time interval, and
    wherein said search means (340) searches said search region of said gray code table (290) for gray codes, refers to said vector table (310) based on the searched gray codes to determine sound source codes, and determines said optimal sound source code for the current short time interval.
EP94120542A 1993-12-28 1994-12-23 Speech signal coding Withdrawn EP0662682A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP337160/93 1993-12-28
JP5337160A JPH07199994A (en) 1993-12-28 1993-12-28 Speech encoding system

Publications (1)

Publication Number Publication Date
EP0662682A2 true EP0662682A2 (en) 1995-07-12

Family

ID=18306013

Family Applications (1)

Application Number Title Priority Date Filing Date
EP94120542A Withdrawn EP0662682A2 (en) 1993-12-28 1994-12-23 Speech signal coding

Country Status (2)

Country Link
EP (1) EP0662682A2 (en)
JP (1) JPH07199994A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3593839B2 (en) * 1997-03-28 2004-11-24 ソニー株式会社 Vector search method
CN108962225B (en) * 2018-06-27 2020-10-23 西安理工大学 Multi-scale self-adaptive voice endpoint detection method

Also Published As

Publication number Publication date
JPH07199994A (en) 1995-08-04


Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): DE FR GB

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Withdrawal date: 19970113