CA1336841C - Multi-pulse type coding system - Google Patents
Multi-pulse type coding system
- Publication number
- CA1336841C
- Authority
- CA
- Canada
- Prior art keywords
- parameters
- speech signal
- lpc
- frame period
- search frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
A multi-pulse type coding system in which the update period of the LPC parameters used for searching multi-pulses (excitation pulses) is shortened. The system includes an LPC analyzer for producing LPC parameters indicative of the spectral envelope of a speech signal for each search frame period; an interpolator for producing interpolated LPC parameters during the search frame period in response to the LPC parameters; a filtering processor for calculating the correlation between the speech signal and an impulse response signal associated with the LPC parameters by filtering the speech signal in accordance with the interpolated parameters; and a multi-pulse calculator for calculating multi-pulses in response to the correlation. The invention provides a multi-pulse type coding system which is capable of keeping a practically sufficient quality even when a lower bit rate, e.g., a bit rate of less than 8 Kb/s, is applied for coding.
Description
MULTI-PULSE TYPE CODING SYSTEM
BACKGROUND OF THE INVENTION
The present invention relates to a multi-pulse type coding system and, more particularly, to a multi-pulse type coding system for coding a speech signal at a low bit rate (a low transmission rate).
Efficient coding of an input speech signal is broadly classified into two methods. One is a spectral coding method which codes the spectral structure of the speech signal, and the other is a waveform coding method which codes the waveform of the speech signal itself. The spectral coding method is capable of coding a speech signal at a remarkably low bit rate, e.g., 4.8 Kb/s, but degrades the quality of the replica speech waveform. On the other hand, the waveform coding method is capable of realizing a replica speech signal of relatively higher quality. However, the coding bit rate of the waveform coding method is generally higher than that of the spectral coding method.
In the waveform coding method, an input speech signal is whitized so as to improve coding efficiency. This whitizing operation flattens the spectral structure of the speech signal. Information on the spectral structure of the speech is therefore separately required for reproducing the speech signal. In the waveform coding method, generally speaking, the spectral structure of the speech signal is transmitted by utilizing the spectral coding method.
In the waveform coding method, when a whitized speech signal is coded, the amount of information after coding depends upon the degree of whitizing. More specifically, the higher the degree of whitizing, the more the amount of information necessary for coding the whitized speech signal can be reduced.
Multi-pulse type coding is known as one of the more efficient waveform coding methods. In multi-pulse type coding, the spectral structure of the speech signal is expressed by a set of LPC parameters. The whitized speech signal, on the other hand, is expressed by a plurality of excitation pulses (multi-pulses) characterized by their amplitudes and their positions within a frame period. Such multi-pulse type coding is disclosed in U.S. Patents No. 4,282,405; No. 4,472,832 and No. 4,701,954, for example.
One problem in multi-pulse type coding is to reduce the amount of arithmetic necessary for searching the multi-pulses. As a solution to this problem, there is known a method of searching the multi-pulses through correlation calculation. In this method, the search of the multi-pulses is performed by considering correlations between a filtered impulse response waveform derived from the LPC parameters and the speech signal. Therefore, it is necessary to hold the LPC parameters fixed over a period sufficiently exceeding the duration of an impulse response. Accordingly, the LPC parameters have been conventionally updated every 20 msecs, for example.
In order to precisely express the spectral structure of a speech signal, it is empirically known that a shorter period, e.g., about 5 msecs, is preferable for updating the LPC parameters. However, for the aforementioned reason, the updating period of the LPC parameters has to be set at about 20 msecs in multi-pulse type coding, which limits the expressiveness of the spectral structure. As a result, the coding efficiency is limited to a coding bit rate of about 8 Kb/s if the coding quality is to be maintained. Namely, when multi-pulse type coding at a coding rate of less than 8 Kb/s is applied, the coding quality cannot be retained and may be inferior to that of the spectral coding method.
SUMMARY OF THE INVENTION
An object of the present invention is to provide a multi-pulse type coding system which is capable of keeping a practically sufficient quality even when a lower bit rate, e.g., a bit rate of less than 8 Kb/s, is applied for coding.
According to the present invention, there is provided a multi-pulse type coding system for coding a speech signal into a plurality of pulse signals, comprising: means for producing one set of parameters indicative of a spectral envelope of said speech signal for each search frame period; means, coupled to said parameter producing means and responsive to a plurality of sets of parameters, each of said sets of parameters belonging to adjacent search frame periods, for producing a plurality of sets of interpolated parameters during each search frame period; means for extracting a segmented speech signal from said speech signal and for delivering said segmented speech signal in backward time sequence, said segmented speech signal having a period corresponding to said each search frame period; means for filtering said segmented speech signal in accordance with a filtering characteristic defined by said set of parameters and said plurality of sets of interpolated parameters during said each search frame period and for directly producing a cross-correlation signal representative of a transition of cross-correlation between said segmented speech signal and an impulse response defined by said set of parameters and said sets of interpolated parameters during said each search frame period, said filtering characteristic being varied during said each search frame period in accordance with a backward time sequence of said set of parameters and said plurality of interpolated parameters, said segmented speech signal being received in backward time sequence; and means for generating said plurality of pulse signals in response to said cross-correlation signal.
According to another aspect of the invention there is provided a multi-pulse type coding system for coding a speech signal into a plurality of pulse signals, comprising: LPC analyzer means for producing one set of LPC parameters indicative of a spectral envelope of the speech signal for each search frame period; interpolation means responsive to said one set of LPC parameters and another set of LPC parameters for an adjacent search frame period for producing a plurality of interpolated LPC parameters during said search frame period; means for receiving said speech signal in backward time sequence; filter means for backwardly filtering said speech signal under control of said LPC parameters and said interpolated LPC parameters to produce cross-correlation between said speech signal and an impulse response defined by said LPC parameters and said interpolated LPC parameters, said cross-correlation being representative of a correlative transition during said each search frame period, a filtering characteristic of said filtering means being varied during said each search frame period in accordance with a backward time sequence of said LPC parameters and said interpolated LPC parameters, said speech signal being received in backward time sequence; and search means for searching said plurality of pulse signals in response to said cross-correlation.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a block diagram showing one embodiment of a multi-pulse type coding system according to the present invention;
Fig. 2 is a diagram for explaining a frame extraction operation in the coding system of the present invention;
Figs. 3(a) and 3(b) are explanatory diagrams showing a backward filtering processor according to the present invention; and
Figs. 4(a) to 4(f) are diagrams for explaining the operations of an impulse response unit in the embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
First of all, the summary of the present invention will be described in the following.
According to the present invention, in a multi-pulse type coding at a relatively low bit rate, an updating period of a spectral envelope parameter is set to be shorter than a frame period for searching the multi-pulses in order to enhance the expressiveness of the spectral envelope.
Accordingly, respective spectral envelope parameters can be obtained for individual plural blocks in one frame period so that the spectral envelope information can be expressed more reliably. In other words, a low bit rate coding for a narrow frequency band can be accomplished while applying the multi-pulse type coding.
The embodiment of the present invention will be described with reference to Figs. 1 to 4, hereinafter.
As shown in Fig. 1, the multi-pulse type coding system of the present invention comprises an LPC analyzing unit 1, a backward processing unit 2, a waveform coding unit 3, a waveform decoding unit 4 and an LPC synthesizer 5.
These individual circuit components will be described in detail in the following.
A digitized input speech signal 100 is supplied to the LPC analyzing unit 1 and the backward processing unit 2.
The LPC analyzing unit 1 comprises a first waveform extractor 11 and an LPC analyzer 12. In this unit 1, the frame period for producing the LPC parameters to be transmitted and for searching multi-pulses is set to 20 msecs. However, as shown in Fig. 2, the first waveform extractor 11 segments the input speech signal into overlapping periods of 30 msecs, for example, and supplies the segmented speech signal to the LPC analyzer 12. As a result, the LPC analyzer 12 produces LPC parameters A associated with each frame period of 20 msecs, as shown in Fig. 2, and delivers an LPC parameter signal 101 indicative of the LPC parameters A to a K-quantization decoder 13 in the backward processing unit 2.
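The framing arithmetic described above can be sketched as follows. This is only an illustration: the 8 KHz sampling frequency is the one quoted later in the description, while the centering of each 30 msec analysis window on its 20 msec frame is an assumption (Fig. 2 is not reproduced here), and the names `ms_to_samples` and `analysis_windows` are hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # sampling frequency quoted later in the description
FRAME_MS = 20           # frame period for LPC transmission and pulse search
WINDOW_MS = 30          # overlapped LPC analysis window


def ms_to_samples(ms, rate_hz=SAMPLE_RATE_HZ):
    """Convert a duration in milliseconds to a sample count."""
    return rate_hz * ms // 1000


def analysis_windows(num_samples):
    """Return (frame_start, window_start, window_end) for each complete frame.

    Assumption: each 30 ms window is centered on its 20 ms frame, so it
    reaches 5 ms into each neighboring frame and is clipped at the signal edges.
    """
    frame = ms_to_samples(FRAME_MS)     # 160 samples
    window = ms_to_samples(WINDOW_MS)   # 240 samples
    margin = (window - frame) // 2      # 40 samples of overlap on each side
    result = []
    for start in range(0, num_samples - frame + 1, frame):
        result.append((start,
                       max(0, start - margin),
                       min(num_samples, start + frame + margin)))
    return result


# Three frames (60 ms of speech) and their overlapping analysis windows.
windows = analysis_windows(ms_to_samples(60))
```

Under these assumptions one frame holds 160 samples and one analysis window 240 samples, so adjacent windows share 80 samples.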
The backward processing unit 2 comprises the K-quantization decoder 13, a K-interpolator 14, a K-α-converter 15, a temporary memory 16, a second waveform extractor 17 and a backward filtering processor 18.
In this unit 2, the LPC parameter signal 101 is supplied to the K-quantization decoder 13, and the quantized LPC parameter signal 104 delivered from the decoder 13 is outputted to a multiplexer 23 in the waveform coding unit 3 to be transmitted. Moreover, the LPC parameter signal thus quantized and decoded is supplied from the decoder 13 to the K-interpolator 14.
The K-interpolator 14 produces a plurality of interpolated LPC parameters during each frame period of 20 msecs on the basis of two successively supplied LPC parameters A. In this interpolator 14, as shown in Fig. 2, three interpolated LPC parameters B, C and D are produced from the two LPC parameters of two adjacent frame periods and, thus, four LPC parameters A, B, C and D are obtained during each frame period. Generally, the LPC parameters include a plurality of coefficient data associated with respective orders and, therefore, the interpolating calculation, using a linear interpolation method, for example, is in practice performed separately for each coefficient. Various other interpolation methods can also be applied in the interpolating calculation and, further, it is possible to produce interpolated LPC parameters from more than two LPC parameters, i.e., from more than two frame periods.
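A minimal sketch of the per-coefficient linear interpolation follows. The placement of the sets within the frame is an assumption made for illustration: the transmitted set A is taken as the first block and B, C and D are spaced evenly toward the A of the next frame (the actual arrangement is the one of Fig. 2). The function name `interpolate_frame` is hypothetical.

```python
def interpolate_frame(k_a, k_next_a, blocks=4):
    """Return `blocks` parameter sets covering one frame.

    k_a      : K parameters A of the current frame (one value per order)
    k_next_a : K parameters A of the adjacent (next) frame
    Assumption: set 0 equals A and sets 1..blocks-1 (B, C, D) move linearly
    toward the next frame's A; each order is interpolated independently.
    """
    sets = []
    for b in range(blocks):
        w = b / blocks  # interpolation weight 0, 1/4, 2/4, 3/4
        sets.append([(1.0 - w) * x + w * y for x, y in zip(k_a, k_next_a)])
    return sets


# Second-order example: A = [0.0, 0.8], next A = [0.4, 0.0].
a, b, c, d = interpolate_frame([0.0, 0.8], [0.4, 0.0])
```

With four sets per 20 msec frame, the spectral envelope is effectively updated every 5 msecs, which is the period the background section calls preferable.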
The plurality of LPC parameters A, B, C and D from the K-interpolator 14 are supplied through the K-α-converter 15 to the temporary memory 16.
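The description does not spell out the conversion performed by the converter 15. A common way to turn K (reflection or PARCOR) parameters into α (linear prediction) coefficients is the step-up recursion, sketched below as an assumption rather than as the patent's circuit. The sign convention (prediction of s[n] as the sum of α_j · s[n-j]) is also an assumption, and `k_to_alpha` is a hypothetical name.

```python
def k_to_alpha(k):
    """Convert K parameters to alpha coefficients by the step-up recursion.

    For each order i (1-based):
        alpha_i(i) = k_i
        alpha_j(i) = alpha_j(i-1) - k_i * alpha_{i-j}(i-1),  j = 1 .. i-1
    Assumed convention: the predictor is s_hat[n] = sum_j alpha_j * s[n-j].
    """
    alpha = []
    for i, k_i in enumerate(k, start=1):
        prev = alpha
        # prev holds alpha_1..alpha_{i-1} of the previous order (0-based list).
        alpha = [prev[j] - k_i * prev[i - 2 - j] for j in range(i - 1)] + [k_i]
    return alpha


alpha_example = k_to_alpha([0.5, -0.2])  # approximately [0.6, -0.2]
```

Interpolating in the K domain and converting afterwards is a natural design choice, because a set of K parameters with magnitudes below one always yields a stable synthesizing filter, and linear interpolation between two such sets preserves that property.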
Next, the backward filtering processor 18, which equivalently produces the correlation between the speech signal and an impulse response associated with the LPC parameters, will be described.
The first step of the multi-pulse search is to determine the correlation between the input speech signal and the impulse response of an LPC synthesizing filter constructed on the basis of the result of the LPC analysis of the input speech signal. For this first step, products are calculated between the value of the input speech signal at a certain time and the values of the impulse response signal of the filter at the individual points (i.e., slots into which a predetermined block is split) of the predetermined block. The sum of these products is then calculated over the predetermined block. This sum is the correlation signal between the input speech signal and the impulse response.
Conventionally, the aforementioned calculation requires a great amount of arithmetic operation. Moreover, if the coefficients of the LPC synthesizing filter are frequently updated during the impulse response, the impulse response has to be calculated at each of the 160 sample points in order to compute the correlation during one frame period, in a case where a sampling frequency of 8 KHz and a frame period of 20 msecs are applied. Therefore, the arithmetic amount increases further. This increase of the arithmetic amount is what has prevented the updating period of the LPC parameters from being made shorter than the frame period in the prior art. This problem is solved in the present invention by using a filtering operation, instead of the impulse response, to compute the correlation.
It is assumed that the impulse response of an LPC synthesizing filter is indicated by $I_i$ ($i = 0, 1, 2, \ldots$), that the output at a time point $j$ corresponding to the filter input "1" at a time point $j-k$ is expressed by $I_k$, and that the output corresponding to a filter input $S_k$ is expressed by $I_k \cdot S_k$. When the filter inputs $S_0, S_1, S_2, \ldots, S_k, \ldots$ are applied at time points $j, j-1, j-2, \ldots, j-k, \ldots$, the filter output $B_j$ at the time point $j$ is expressed by the following formula (1):

$$B_j = \sum_{k=0}^{N} I_k \cdot S_k \qquad (1)$$

This formula (1) implies that the correlation between the speech waveform samples $S_0, S_1, S_2, \ldots, S_k, \ldots$ and the filtered impulse response $I_i$ can be determined as the output of an IIR filter. In this case, the input order of the speech waveform samples to the filter is directed backward, i.e., from a future sample to a past sample. Further, it is quite apparent according to this method that the filter output $B_{j-1}$ at the time point $j-1$ is outputted continuously as a filter output after the output $B_j$, and that the arithmetic amount does not increase even if the filter coefficients are updated midway.
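The equivalence stated by formula (1) can be checked numerically. The sketch below assumes a fixed all-pole synthesizing filter of the form y[n] = x[n] + Σ α_j · y[n-j] with zero initial state; the function names are hypothetical. It compares the correlation computed directly from the impulse response with the output obtained by feeding the time-reversed speech frame through the same filter.

```python
def synthesis_filter(x, alpha):
    """All-pole (IIR) filter y[n] = x[n] + sum_j alpha[j-1] * y[n-j], zero state."""
    y = []
    for n, x_n in enumerate(x):
        acc = x_n
        for j, a_j in enumerate(alpha, start=1):
            if n - j >= 0:
                acc += a_j * y[n - j]
        y.append(acc)
    return y


def correlation_direct(speech, alpha):
    """Correlation phi[m] = sum_k I_k * s[m+k], using the explicit impulse response."""
    n = len(speech)
    impulse_response = synthesis_filter([1.0] + [0.0] * (n - 1), alpha)
    return [sum(impulse_response[k] * speech[m + k] for k in range(n - m))
            for m in range(n)]


def correlation_backward(speech, alpha):
    """Same correlation obtained by filtering the time-reversed frame."""
    return synthesis_filter(speech[::-1], alpha)[::-1]


speech = [1.0, -2.0, 0.5, 3.0, 0.0, -1.0]
alpha = [0.6, -0.2]
direct = correlation_direct(speech, alpha)
backward = correlation_backward(speech, alpha)
```

The direct form needs on the order of N² multiplications per frame plus the impulse response itself, whereas the backward filter needs only one filter step per sample, which is why changing the coefficients midway adds no cost.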
Referring back to Fig. 1, the temporary memory 16 stores the LPC parameters including the interpolated parameters. The LPC parameters 103 for each frame period are read out in the reverse sequence order, as shown in Fig. 3(a), from the memory 16 and supplied to the backward filtering processor 18 and to an impulse response arithmetic circuit 24 and an autocorrelation arithmetic circuit 25 in the waveform coding unit 3.
In response to the input digitized speech signal 100, on the other hand, a second waveform extractor 17 extracts each segmented signal of the frame period of 20 msecs, as shown in Fig. 2, in synchronism with the operation of the first waveform extractor 11. In this case, the segmented speech signal is delivered from the extractor 17 to the backward filtering processor 18 in the reverse time direction in synchronism with the operation of the processor 18.
The backward filtering processor 18 is constructed, as shown in Fig. 3, of an LPC synthesizing filter which is controlled by the LPC parameters 103 for each frame period. As described above, the LPC parameters 103 are inputted in the backward manner (i.e., with the leading and trailing ends of the signal reversed). Likewise, the input speech signal for each frame period delivered from the second waveform extractor 17 is inputted in the backward manner to the backward filtering processor 18. Here, the relation between the LPC parameters A, B, C and D during one frame period and the input speech signal of one frame period is shown in Fig. 3(a). In this way, a correlation signal 102 representative of the correlation between the impulse response of the LPC synthesizing filter and the input speech signal is obtained for each frame period and supplied to a temporary memory 19 in the waveform coding unit 3.
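The backward filtering with coefficients that change during the frame can be sketched as below. Several points are assumptions made for illustration: the frame is split into equal blocks, one per parameter set; the sets are given as α coefficients in forward order (A, B, C, D) and are therefore visited as D, C, B, A; and the filter starts each frame from a zero state. The name `backward_filter_frame` is hypothetical.

```python
def backward_filter_frame(speech, alpha_sets):
    """Filter one frame backward in time with block-wise varying coefficients.

    speech     : frame samples in forward time order
    alpha_sets : coefficient sets in forward order, one per equal-length block
    Returns the correlation signal in forward time order.
    Assumption: zero filter state at the start (i.e., the end) of the frame.
    """
    n = len(speech)
    block = max(1, n // len(alpha_sets))
    state = [0.0] * len(alpha_sets[0])   # most recent outputs, newest first
    out_reversed = []
    for t, x in enumerate(reversed(speech)):
        forward_index = n - 1 - t
        # The set belonging to the block that contains this sample.
        alpha = alpha_sets[min(forward_index // block, len(alpha_sets) - 1)]
        y = x + sum(a * s for a, s in zip(alpha, state))
        state = [y] + state[:-1]
        out_reversed.append(y)
    return out_reversed[::-1]


# A pulse at the end of a 4-sample frame, with a first-order filter per block.
constant = backward_filter_frame([0.0, 0.0, 0.0, 1.0], [[0.5]] * 4)
varying = backward_filter_frame([0.0, 0.0, 0.0, 1.0],
                                [[0.0], [0.0], [0.5], [0.5]])
```

Each sample costs one filter step regardless of how often the coefficient set changes, which is the property the description relies on.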
Next, the waveform coding unit 3 will be described. This coding unit 3 is composed of the temporary memory 19, a maximum value searching circuit 20, an amplitude normalizer 21, a pulse quantizer 22, the multiplexer 23, the impulse response arithmetic circuit 24, the autocorrelation arithmetic circuit 25 and a compensator 26.
When the correlation signal 102 of one frame is stored in the temporary memory 19, as shown in Fig. 4(a), it is supplied to the maximum value searching circuit 20, in which the amplitude and the position in the frame period associated with the maximum value of the correlation signal are searched, as shown in Fig. 4(b). As a result, a position signal 117 is supplied to the impulse response arithmetic circuit 24, the autocorrelation arithmetic circuit 25 and the compensator 26, and an amplitude signal 116 is supplied to the amplitude normalizer 21.
The impulse response arithmetic circuit 24 receives the LPC parameters 103 shown in Fig. 4(d), and the position signal 117, in the normal (forward) order, as indicated by an arrow in Fig. 4(e), so that the impulse response of the corresponding LPC synthesizing filter is calculated. The autocorrelation arithmetic circuit 25 receives the LPC parameters 103 shown in Fig. 4(d), the impulse response signal obtained by the impulse response arithmetic circuit 24 and shown in Fig. 4(c), and the position signal 117 in the backward order as shown in Fig. 4(f), and the autocorrelation is calculated under backward processing of the autocorrelation filter and the position signal 117, so that the autocorrelation signal is obtained and supplied to the amplitude normalizer 21 and the compensator 26.
On the other hand, the amplitude signal 116 and the autocorrelation signal are supplied to the amplitude normalizer 21. In the amplitude normalizer 21, the amplitude signal 116 is normalized such that the maximum value of the autocorrelation signal becomes equal to the quantized and decoded amplitude of the amplitude signal 116, and supplied to the pulse quantizer 22 and the compensator 26. The normalized amplitude signal and the position signal 117 are quantized in the pulse quantizer 22. Moreover, the multi-pulse signal 111 which shows the maximum pulse position and its amplitude is supplied to the multiplexer 23.
The autocorrelation signal delivered from the autocorrelation arithmetic circuit 25, the quantized and decoded amplitude signal delivered from the amplitude normalizer 21, and the position signal 117 are supplied to the compensator 26. Upon reception of those signals, the compensator 26 generates an autocorrelation signal having the determined maximum amplitude and the determined position in the frame period.
On the other hand, the correlation signal stored in the temporary memory 19 is read out to the compensator 26, the aforementioned autocorrelation signal having the same maximum amplitude and the same position is subtracted from that correlation signal, and the result is returned to the temporary memory 19. Next, the correlation signal stored in the temporary memory 19 is read out and supplied to the maximum value searching circuit 20 so that the multi-pulse signal having the second maximum amplitude is obtained from the maximum value searching circuit 20. This procedure is continued until the number of the multi-pulses reaches a predetermined value or until the amplitude of a detected pulse becomes smaller than a predetermined amplitude, so that the multi-pulse signal 111 indicative of a plurality of multi-pulses is completely inputted to the multiplexer 23.
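The search loop of the maximum value searching circuit 20 and the compensator 26 can be sketched as a greedy procedure. This is a simplified illustration, not the circuit itself: a single position-independent autocorrelation sequence is assumed (the embodiment computes it per pulse position because the filter varies within the frame), quantization of the pulse is omitted, and `search_pulses` is a hypothetical name.

```python
def search_pulses(corr, autocorr, max_pulses, min_amplitude=0.0):
    """Greedy multi-pulse search on a correlation signal.

    corr          : correlation between speech and impulse response (one frame)
    autocorr      : autocorrelation of the impulse response, autocorr[0] at lag 0
                    (assumed identical for every pulse position)
    max_pulses    : stop when this many pulses have been found
    min_amplitude : stop when a detected pulse is not larger than this value
    Returns a list of (position, amplitude) pairs.
    """
    residual = list(corr)
    pulses = []
    while len(pulses) < max_pulses:
        # Position of the maximum magnitude of the remaining correlation.
        pos = max(range(len(residual)), key=lambda m: abs(residual[m]))
        # Normalize by the lag-0 autocorrelation to get the pulse amplitude.
        amp = residual[pos] / autocorr[0]
        if abs(amp) <= min_amplitude:
            break
        pulses.append((pos, amp))
        # Subtract the contribution of this pulse from the correlation signal.
        for m in range(len(residual)):
            lag = abs(m - pos)
            if lag < len(autocorr):
                residual[m] -= amp * autocorr[lag]
    return pulses


found = search_pulses([0.0, 2.0, 1.0, 0.0], [2.0, 1.0], max_pulses=2)
```

Both stopping rules of the description appear in the loop: the pulse count limit in the `while` condition and the amplitude threshold in the `break`.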
The multiplexer 23 receives the LPC parameter signal 104 and the multi-pulse signal 111 and multiplexes them. The resultant multiplexed signal 105 is outputted from the multiplexer 23 and transmitted to the waveform decoding unit 4 through a transmission line.
Next, the waveform decoding unit 4 will be described. The waveform decoding unit 4 is composed of a demultiplexer 31, a pulse decoder 32, a K-decoder 33, a K-interpolator 34, and a K-α-converter 35. When a multiplexed signal 105 is inputted to the demultiplexer 31 from the waveform coding unit 3, the demultiplexer 31 outputs both an LPC parameter signal 114 corresponding to the LPC parameter signal 104 and a multi-pulse signal 121 corresponding to the multi-pulse signal 111.
The LPC parameter signal 114 delivered from the demultiplexer 31 is decoded by the K-decoder 33, and the decoded signal is inputted to the K-interpolator 34. This K-interpolator 34 interpolates the LPC parameter signal of one frame like the aforementioned K-interpolator 14, and the interpolated LPC parameter signal is converted by the K-α-converter 35 into a converted LPC parameter signal 107 and supplied to the LPC synthesizer 5. On the other hand, the multi-pulse signal 121 delivered from the demultiplexer 31 is decoded by the pulse decoder 32 into a decoded multi-pulse signal 106, which is then outputted to the LPC synthesizer 5.
The multi-pulse signal 106 is inputted to the LPC synthesizer 5 and filtered under control of the LPC parameter signal 107 so that a decoded digitized output speech signal 108 is outputted.
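The decoder side can be sketched in the same spirit. The sketch assumes that the decoded pulses are placed into an otherwise zero excitation frame, that the frame is split into equal blocks with one α coefficient set each (in forward order), and that the synthesizing filter starts from a zero state; a practical decoder would normally carry the filter state over from the previous frame. The name `decode_frame` is hypothetical.

```python
def decode_frame(pulses, alpha_sets, frame_len):
    """Reconstruct one frame of speech from multi-pulses and LPC coefficients.

    pulses     : list of (position, amplitude) pairs for this frame
    alpha_sets : alpha coefficient sets in forward order, one per block
    frame_len  : number of samples in the frame (e.g., 160 at 8 KHz, 20 msecs)
    Assumption: zero filter state at the start of the frame.
    """
    # Build the excitation: zeros except at the pulse positions.
    excitation = [0.0] * frame_len
    for pos, amp in pulses:
        excitation[pos] += amp

    block = max(1, frame_len // len(alpha_sets))
    state = [0.0] * len(alpha_sets[0])   # most recent outputs, newest first
    speech = []
    for n, x in enumerate(excitation):
        alpha = alpha_sets[min(n // block, len(alpha_sets) - 1)]
        y = x + sum(a * s for a, s in zip(alpha, state))
        state = [y] + state[:-1]
        speech.append(y)
    return speech


decoded = decode_frame([(0, 1.0), (2, 1.0)], [[0.5], [0.5]], frame_len=4)
```

Here the synthesizer runs forward in time, in contrast to the backward filtering used on the coding side to obtain the correlation signal.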
In the embodiment, a plurality of the LPC parameters during one frame period are produced by interpolating the two LPC parameters of adjacent frame periods so as to enhance the expressiveness of the spectral envelope of the input speech signal.
Alternatively, it is also possible to obtain a plurality of the LPC parameters during one frame period by performing a plurality of LPC analyses in one frame period. In this case, high speed arithmetic operation is required in circuit components such as the waveform extractor and the LPC analyzer of Fig. 1. When this alternative method is applied, in the block diagram of Fig. 1 showing the structure of the embodiment, the K-interpolators 14 and 34 can be omitted and their input and output terminals are connected directly.
However, the LPC analyzing unit 1 has to perform the LPC analysis once for each of the segments obtained by dividing one frame period, so as to obtain a plurality of the LPC parameters for one frame period. Otherwise, the operations are similar to those of the embodiment except that the LPC parameter signal passing through the K-quantization decoder 13, the multiplexer 23, the demultiplexer 31 and the K-decoder 33 is updated several times within one frame period.
As has been described in detail hereinbefore, according to the present invention, when the speech signal is to be coded into the multi-pulses, a plurality of the LPC parameters are produced for one frame period in order to attain accurate spectral envelope information. As a result, low bit rate coding is realized while keeping practically sufficient coding quality.
BACKGROUND OF THE INVENTION
The present invention relates to a multi-pulse type coding system and, more particularly, to a multi-pulse type coding system for coding a speech signal at a low bit rate (a low transmission rate).
Efficient coding of an input speech signal is classified into two large methods. One is a spectral coding method which codes a spectral structure of the speech signal, and the other is a waveform coding method which codes a waveform of the speech signal itself. The spectral coding method is capable of transforming a speech signal at a remarkably low bit rate, e.g., 4.8 Kb/s, but degrates the quality of a replica speech waveform. On the other hand, the waveform coding method is capable of realizing a replica speech signal of relatively higher quality. However, the coding bit rate according to the waveform coding method is generally higher than that by the spectral coding method.
In the waveform coding method, an input speech signal is whitized so as to improve coding efficiency. This whitizing operation performs flattening a spectral structure of the speech signal. Information on the spectral speech structure is, otherwise, required for reproducing the speech signal. In the waveform coding '-'''Y.
- - 2 _ 1 33684 ~
method, generally speaking, the spectral structure of the speech signal is transmitted by utilizing the spectral coding method.
In the waveform coding method, when a whitized speech signal is coded, an amount of information after coding depènds upon a degree of whitizing. For the higher degree of whitizing, more specifically, the amount of information necessary for coding the whitized speech signal can be reduced the more.
Multi-pulse type coding is known as one of more efficient waveform coding methods. In the multi-pulse type coding, the spectral structure of the speech signal is expressed by a set of LPC parameters. On the other hand, the whitized speech signal is additionally expressed by a plurality of excitation pulses (multi-pulses) featured by their amplitudes and their position during a frame period. Such multi-pulse type coding is disclosed in U.S.
Patents No. 4,282,405; No. 4,472,832 and No. 4,701,954;
for example.
One subject in the multi-pulse type coding is to reduce an arithmetic amount necessary for searching the multi-pulses. As a solution for this subject, there is known a method of searching the multi-pulses through correlation calculation. In this method, the search of the multi-pulses is performed by considering correlations between a filtered impulse response waveform derived from `` - 3 - 1 336841 the LPC parameters and the speech signal. Therefore, it is necessary to determine LPC parameters in a period sufficiently exceeding a duration time of an impulse response. Accordingly, the LPC parameters have been conventionally updated every 20 msecs, for example.
In order to precisely express the spectral structure of a speech signal, it is empirically known that a shorter period, e.g., about 5 msecs is preferable for updating the LPC parameters. However, for the aforementioned reason, the updating period of the LPC parameters has to be set at about 20 msecs in the multi-pulse type coding, causing limitation of expressiveness of the spectral structure.
As a result, the coding efficiency is limited to a coding bit rate of about 8 Kb/s to maintain the coding quality.
Namely,when the multi-pulse type coding of a coding rate less than 8 Kb/s is applied, the coding quality cannot be retained but may be inferior to that by the spectral coding method.
SUMMARY OF THE INVENTION
An object of the present invention is to provide a multi-pulse type coding sys.em which is capable of keeping a practically sufficient quality even when a lower bit rate, e.g., a bit rate less than 8 Kb/s is applied for coding.
According to the present invention, there is provided 1 33684 t a multi-pulse type coding system for coding a speech signal into a plurality of pulse signals, comprisingS means for producing one set of parameters indicative of a spectral envelope of said speech signal for each search frame period; means, coupled to said parameter producing means and responsive to a plurality of sets of parameters, each of said sets of parameters belonging to adjacent search frame periods, for producing a plurality of sets of interpolated parameters during each search frame period; means for extracting a segmented speech signal from said speech signal and for delivering said segmented speech signal in backward time sequence, said segmented speech signal having a period corresponding to said each search frame period; means for filtering said segmented speech signal in accordance with a filtering characteristic defined by said set of parameters and said plurality of sets of interpolated parameters during said each search frame period and for directly producing a cross-correlation signal representative of a transition of cross-correlation between said segmented speech signal and an impulse response defined by said set of parameters and said sets of interpolated parameters during said each search frame period, said filtering characteristic being varied during said each search frame period in accordance with a backward time sequence of said set of parameters and said plurality of interpolated parameters, said segmented speech signal being received in backward time sequence;
and means for generating said plurality of pulse signals in response to said cross-correlation signal.
According to another aspect of the invention there is provided a multi-pulse type coding system for coding a speech signal into a plurality of pulse signals, comprising: LPC analyzer means for producing one set of LPC parameters indicative of a spectral envelope of the speech signal for each search frame period; interpolation means responsive to said one set of LPC parameters and another set of LPC parameters for an adjacent search frame period for producing a plurality of interpolated LPC parameters during said search frame period; means for receiving said speech signal in backward time sequence; filter means for backwardly filtering said speech signal under control of said LPC parameters and said interpolated LPC parameters to produce cross-correlation between said speech signal and an impulse response defined by said LPC parameters of said interpolated LPC parameters, said cross-correlation being representative of a correlative transition during said each search frame period, a filtering characteristic of said filtering means being varied during said each search frame period in accordance with a backward time sequence of said LPC parameters and said interpolated LPC parameters, said speech signal being received in backward time sequence; and search means for searching said plurality of pulse signals in response to said cross-correlation.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a block diagram showing one embodiment of a multi-pulse type coding system according to the present invention;
Fig. 2 is a diagram for explaining a frame extraction operation in the coding system of the present invention;
Figs. 3(a) and 3(b) are explanatory diagrams showing a backward filtering processor according to the present invention;
and Figs. 4(a) to 4(f) are diagrams for explaining the operations of an impulse response unit in the embodiment of the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
First of all, the summary of the present invention will be described in the following.
According to the present invention, in a multi-pulse type coding at a relatively low bit rate, an updating period of a spectral envelope parameter is set to be shorter than a frame period for searching the multi-pulses in order to enhance the expressiveness of the spectral envelope.
Accordingly, respective spectral envelope parameters can be obtained for individual plural blocks in one frame period so that the spectral envelope information can be expressed more reliably. In other words, a low bit rate coding for a narrow frequency band can be accomplished while applying the multi-pulse type coding.
The embodiment of the present invention will be described with reference to Figs. 1 to 4, hereinafter.
As shown in Fig. 1, the multi-pulse type coding system of the present invention comprises an LPC analyzing unit 1, a backward processing unit 2, a waveform coding unit 3, a waveform decoding unit 4 and an LPC synthesizer 5.
These individual circuit components will be described in detail in the following.
A digitized input speech signal 100 is supplied to the LPC analyzing unit 1 and the backward processing unit 2.
The LPC analyzing unit 1 comprises a first waveform extractor 11 and an LPC analyzer 12. In this unit 1, the frame period for producing the LPC parameters to be transmitted and for searching the multi-pulses is set to 20 msecs. However, as shown in Fig. 2, the first waveform extractor 11 segments the input speech signal into overlapping periods of 30 msecs, for example, and supplies the segmented speech signal to the LPC analyzer 12. As a result, the LPC analyzer 12 produces LPC parameters A associated with each frame period of 20 msecs, as shown in Fig. 2, and delivers an LPC parameter signal 101 indicative of the LPC parameters A to a K-quantization decoder 13 in the backward processing unit 2.
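The framing described above can be sketched as follows. This is only an illustration, not the circuit of the embodiment: the function name `segment_for_lpc`, the centring of each 30 msec window on its 20 msec frame, and the zero padding at the signal edges are assumptions. Only the 20 msec frame, the 30 msec window and the 8 kHz sampling frequency mentioned later in the description come from the text.

```python
def segment_for_lpc(signal, fs=8000, frame_ms=20, window_ms=30):
    """Return one overlapping analysis window per search frame (hypothetical helper)."""
    hop = fs * frame_ms // 1000        # 160 samples per 20 msec frame
    win = fs * window_ms // 1000       # 240 samples per 30 msec window
    pad = (win - hop) // 2             # assumed: window centred on its frame
    windows = []
    for start in range(0, len(signal) - hop + 1, hop):
        lo = start - pad
        # Zero-pad where the window extends past either end of the signal.
        w = [signal[i] if 0 <= i < len(signal) else 0.0
             for i in range(lo, lo + win)]
        windows.append(w)
    return windows
```

Each returned window would feed one LPC analysis, so consecutive windows share 80 samples under these assumptions.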
The backward processing unit 2 comprises the K-quantization decoder 13, a K-interpolator 14, a K-α converter 15, a temporary memory 16, a second waveform extractor 17 and a backward filtering processor 18.
In this unit 2, the LPC parameter signal 101 is supplied to the K-quantization decoder 13, and the quantized LPC parameter signal 104 delivered from the decoder 13 is output to a multiplexer 23 in the waveform coding unit 3 to be transmitted. Moreover, the LPC parameter signal thus quantized and decoded is supplied from the decoder 13 to the K-interpolator 14.
The K-interpolator 14 produces a plurality of interpolated LPC parameters during each frame period of 20 msecs on the basis of two successively supplied LPC parameters A. In this interpolator 14, as shown in Fig. 2, three interpolated LPC parameters B, C and D are produced from two adjacent LPC parameters of two adjacent frame periods and, thus, four respective LPC parameters A, B, C and D are obtained during each frame period. Generally, the LPC parameters include a plurality of coefficient data associated with respective orders and, therefore, the interpolating calculation, using a linear interpolation method, for example, is performed in practice for the respective coefficient data. Various other interpolation methods can also be applied to the interpolating calculation and, further, it is possible to produce interpolated LPC parameters from more than two LPC parameters, i.e., from more than two frame periods.
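As a rough sketch of the linear interpolation mentioned above, the following routine blends each coefficient of two adjacent parameter sets separately. The function name `interpolate_k`, the equal spacing of the blend weights, and the choice to interpolate toward the following frame are assumptions made for illustration; the description only states that B, C and D are derived from the parameters of two adjacent frames.

```python
def interpolate_k(k_a, k_next, n_sets=4):
    """Return n_sets parameter sets: k_a itself, then linear blends toward k_next."""
    sets = []
    for m in range(n_sets):
        w = m / n_sets                 # weights 0, 1/4, 2/4, 3/4 (assumed spacing)
        # Interpolate each coefficient (one per order) independently.
        sets.append([(1.0 - w) * a + w * b for a, b in zip(k_a, k_next)])
    return sets
```

With `n_sets=4` the first returned set plays the role of A and the remaining three play the roles of B, C and D.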
The plurality of LPC parameters A, B, C and D from the K-interpolator 14 are supplied through the K-α converter 15 to the temporary memory 16.
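The description does not say how the K-α converter 15 works internally. One common way to turn K (reflection) parameters into α (predictor) coefficients is the step-up recursion, sketched below under an assumed sign convention in which the synthesis filter is 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. The function name `k_to_alpha` is hypothetical.

```python
def k_to_alpha(k):
    """Step-up recursion: reflection coefficients -> predictor coefficients a_1..a_p.

    Assumed convention: A(z) = 1 + sum_i a_i z^-i; other texts flip the sign.
    """
    a = []
    for m, km in enumerate(k, start=1):
        # a_new[i] = a[i] + km * a[m - i] for i = 1..m-1, then a_new[m] = km
        a = [a[i] + km * a[m - 2 - i] for i in range(m - 1)] + [km]
    return a
```

Interpolating in the K domain and converting afterwards, as the embodiment does, would keep each interpolated set stable whenever both endpoint sets have all |k| < 1, since a linear blend stays inside that range.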
Next, the backward filtering processor 18, which equivalently produces the correlation between the speech signal and an impulse response associated with the LPC parameters, will be described.
The first step of the multi-pulse search is to determine the correlation between the input speech signal and an impulse response of an LPC synthesizing filter, which is based upon the result of the LPC analysis of the input speech signal. For this first step, products are calculated between the value of the input speech signal at a certain time and the values at the individual points (i.e., slots split from a predetermined block) of the predetermined block of the impulse response signal of the filter, which is constructed on the basis of the result of the LPC analysis of the input speech signal.
These products are then summed over the predetermined block. This sum is the correlation signal between the input speech signal and the impulse response.
Conventionally, the aforementioned calculation requires a large number of arithmetic operations. Moreover, if the coefficients of the LPC synthesizing filter are updated frequently within the span of the impulse response, the impulse response must be calculated at each of the 160 sample points of one frame period in order to compute the correlation, assuming a sampling frequency of 8 kHz and a frame period of 20 msecs. The amount of arithmetic therefore increases further. This increase is what prevented the prior art from making the updating period of the LPC parameters shorter than the frame period. This problem is solved in the present invention by using a filtering operation instead of the impulse response to compute the correlation.
It is assumed that the impulse response of an LPC synthesizing filter is indicated by $I_i$ ($i = 0, 1, 2, \ldots$). The output at a time point $j$ corresponding to the filter input "1" at a time point $j-k$ is then expressed by $I_k$, and the output corresponding to a filter input $S_k$ is expressed by $I_k \cdot S_k$. When the filter inputs $S_0, S_1, S_2, \ldots, S_k$, and so on are applied at time points $j, j-1, j-2, \ldots, j-k$, and so on, the filter output $B_j$ at the time point $j$ is expressed by the following formula (1):

$$B_j = \sum_{k=0}^{N} I_k \cdot S_k \qquad (1)$$

This formula (1) implies that the correlation between the speech waveform samples $S_0, S_1, S_2, \ldots, S_k$, and so on and the impulse response $I_i$ can be determined as an output of an IIR filter. In this case, the input order of the speech waveform samples to the filter is directed backward, i.e., from a future sample to a past sample. Further, it is quite apparent according to this method that the filter output $B_{j-1}$ at the time point $j-1$ is output continuously as a filter output after the output $B_j$ and that the amount of arithmetic does not increase even if the filter coefficients are updated midway.
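The equivalence stated by formula (1) can be checked numerically with a small sketch. The code below compares a direct cross-correlation against a single pass of the time-reversed frame through an all-pole filter. The function names, the zero initial filter state, and the sign convention y[n] = x[n] - Σ a_i y[n-i] are assumptions for illustration, and the coefficients are held fixed here for simplicity.

```python
def synth_filter(x, a):
    """All-pole filter 1/A(z): y[n] = x[n] - sum_i a[i] * y[n-1-i], zero initial state."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for i, ai in enumerate(a):
            if n - 1 - i >= 0:
                acc -= ai * y[n - 1 - i]
        y.append(acc)
    return y


def correlation_direct(s, a):
    """phi[j] = sum_k I[k] * s[j + k], with I the filter's impulse response."""
    n = len(s)
    impulse = synth_filter([1.0] + [0.0] * (n - 1), a)
    return [sum(impulse[k] * s[j + k] for k in range(n - j)) for j in range(n)]


def correlation_backward(s, a):
    """Same values from one filtering pass over the time-reversed frame."""
    return synth_filter(s[::-1], a)[::-1]
```

The direct form costs on the order of n² multiply-accumulates per frame, whereas the backward filtering pass costs on the order of n times the filter order, which is the saving the description relies on.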
Referring back to Fig. 1, the temporary memory 16 stores the LPC parameters including the interpolated parameters. The LPC parameters 103 for each frame period are read out in the reverse sequence order, as shown in Fig. 3(a), from the memory 16 and supplied to the backward filtering processor 18 and to an impulse response arithmetic circuit 24 and an autocorrelation arithmetic circuit 25 in the waveform coding unit 3.
In response to the input digitized speech signal 100, on the other hand, a second waveform extractor 17 extracts each segmented signal of the frame period of 20 msecs, as shown in Fig. 2, in synchronism with the operation of the first waveform extractor 11. In this case, the segmented speech signal is delivered from the extractor 17 to the backward filtering processor 18 in the reverse time direction in synchronism with the operation of the processor 18.
The backward filtering processor 18 is constructed, as shown in Fig. 3, of an LPC synthesizing filter which is controlled by the LPC parameters 103 for each frame period.
As described above, the LPC parameters 103 are inputted in the backward manner (i.e., while having the leading and trailing ends of the signal reversed). On the other hand, the input speech signal for each frame period delivered from the second waveform extractor 17 is inputted in the backward manner to the backward filtering processor 18.
Here, the relation between the LPC parameters A, B, C and D during one frame period and the input speech signal of one frame period is shown in Fig. 3(a). In this way, a correlation signal 102 representative of the correlation between the impulse response of the LPC synthesizing filter and the input speech signal is obtained for each frame period and supplied to a temporary memory 19 in the waveform coding unit 3.
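A minimal sketch of the backward filtering with coefficients that change inside the frame is given below. The function name `backward_filter_frame`, the equal-length sub-blocks for A, B, C and D, the zero filter state at the start of each frame, and the sign convention of the filter are all assumptions; the description only fixes that both the speech samples and the parameter sets are fed in reverse time order.

```python
def backward_filter_frame(frame, coeff_sets):
    """Filter one frame in reverse time, switching coefficient sets per sub-block.

    coeff_sets is given in forward order (A, B, C, D) and consumed in reverse.
    """
    n = len(frame)
    block = n // len(coeff_sets)           # assumed: equal-length sub-blocks
    rev = frame[::-1]
    y = []
    for m, xm in enumerate(rev):
        fwd_index = n - 1 - m              # position of this sample in forward time
        a = coeff_sets[min(fwd_index // block, len(coeff_sets) - 1)]
        acc = xm
        for i, ai in enumerate(a):
            if m - 1 - i >= 0:
                acc -= ai * y[m - 1 - i]
        y.append(acc)
    return y[::-1]                         # back to forward time order
```

Switching the coefficient set costs nothing beyond a table lookup per sample, which matches the remark that the amount of arithmetic does not grow when the coefficients are updated midway.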
Next, the waveform coding unit 3 will be described. This coding unit 3 is composed of the temporary memory 19, a maximum value searching circuit 20, an amplitude normalizer 21, a pulse quantizer 22, the multiplexer 23, the impulse response arithmetic circuit 24, the autocorrelation arithmetic circuit 25 and a compensator 26.
When the correlation signal 102 of one frame is stored in the temporary memory 19, as shown in Fig. 4(a), it is supplied to the maximum value searching circuit 20, in which the amplitude and the position in the frame period associated with the maximum value of the correlation signal are searched, as shown in Fig. 4(b). As a result, a position signal 117 is supplied to the impulse response arithmetic circuit 24, the autocorrelation arithmetic circuit 25 and the compensator 26, and an amplitude signal 116 is supplied to the amplitude normalizer 21.
The impulse response arithmetic circuit 24 receives the LPC parameters 103 shown in Fig. 4(d), and the position signal 117, in the normal (forward) order, as indicated by an arrow in Fig. 4(e), so that the impulse response of the corresponding LPC synthesizing filter is calculated. The autocorrelation arithmetic circuit 25 receives the LPC parameters 103 shown in Fig. 4(d), the impulse response signal obtained by the impulse response arithmetic circuit 24 and shown in Fig. 4(c), and the position signal 117 in the backward order as shown in Fig. 4(f). The autocorrelation is calculated under backward processing of the autocorrelation filter and the position signal 117, so that the autocorrelation signal is obtained and supplied to the amplitude normalizer 21 and the compensator 26.
On the other hand, the amplitude signal 116 and the autocorrelation signal are supplied to the amplitude normalizer 21. In the amplitude normalizer 21, the amplitude signal 116 is normalized such that the maximum value of the autocorrelation signal becomes equal to the quantized and decoded amplitude of the amplitude signal 116, and supplied to the pulse quantizer 22 and the compensator 26. The normalized amplitude signal and the position signal 117 are quantized in the pulse quantizer 22. Moreover, the multi-pulse signal 111 which shows the maximum pulse position and its amplitude is supplied to the multiplexer 23.
The autocorrelation signal delivered from the autocorrelation arithmetic circuit 25, the quantized and decoded amplitude signal delivered from the amplitude normalizer 21, and the position signal 117 are supplied to the compensator 26. Upon receiving those signals, the compensator 26 generates an autocorrelation signal whose maximum amplitude and position in the frame period are thereby determined.
On the other hand, the correlation signal stored in the temporary memory 19 is read out to the compensator 26, and the aforementioned autocorrelation having the same maximum amplitude and the same position is subtracted from that correlation signal and the result is returned to the temporary memory 19. Next, the correlation signal stored in the temporary memory 19 is read out and supplied to the maximum value searching circuit 20 so that the multi-pulse signal having the second maximum amplitude is obtained from the maximum value searching circuit 20. This procedure is continued until the number of the multi-pulses reaches a predetermined value or until the amplitude of a detected pulse becomes smaller than a predetermined amplitude, so that the multi-pulse signal 111 indicative of a plurality of multi-pulses is completely inputted to the multiplexer 23.
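The search loop formed by the maximum value searching circuit 20, the amplitude normalizer 21 and the compensator 26 can be approximated by the greedy procedure below. This is a simplified sketch: it assumes a single, position-independent autocorrelation `r` of the impulse response, whereas the embodiment recomputes the autocorrelation for each pulse position from the time-varying parameters, and it omits the pulse quantization. The function name and arguments are hypothetical.

```python
def multipulse_search(phi, r, max_pulses, min_amp=0.0):
    """Greedy pulse search on one frame of cross-correlation.

    phi: cross-correlation between speech and impulse response
    r:   autocorrelation of the impulse response, with r[0] > 0
    Returns a list of (position, amplitude) pairs.
    """
    phi = list(phi)                        # working copy, like temporary memory 19
    pulses = []
    while len(pulses) < max_pulses:
        # Locate the sample with the largest magnitude.
        pos = max(range(len(phi)), key=lambda j: abs(phi[j]))
        amp = phi[pos] / r[0]              # normalize by the autocorrelation peak
        if abs(amp) < min_amp:
            break                          # pulse too small: stop searching
        pulses.append((pos, amp))
        # Subtract the scaled autocorrelation centred on the found position.
        for j in range(len(phi)):
            lag = abs(j - pos)
            if lag < len(r):
                phi[j] -= amp * r[lag]
    return pulses
```

Both stopping conditions named in the text appear here: the pulse count limit `max_pulses` and the amplitude threshold `min_amp`.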
The multiplexer 23 receives the LPC parameter signal 104 and the multi-pulse signal 111 and multiplexes them.
The resultant multiplexed signal 105 is outputted from the multiplexer 23 and transmitted to the waveform decoding unit 4 through a transmission line.
Next, the waveform decoding unit 4 will be described. The waveform decoding unit 4 is composed of a demultiplexer 31, a pulse decoder 32, a K-decoder 33, a K-interpolator 34, and a K-α converter 35. When a multiplexed signal 105 is inputted to the demultiplexer 31 from the waveform coding unit 3, the demultiplexer 31 outputs both an LPC parameter signal 114 corresponding to the LPC parameter signal 104 and a multi-pulse signal 121 corresponding to the multi-pulse signal 111.
The LPC parameter signal 114 delivered from the demultiplexer 31 is decoded by the K-decoder 33, and the decoded signal is inputted to the K-interpolator 34. This K-interpolator 34 interpolates the LPC parameter signal of one frame like the aforementioned K-interpolator 14, and the resulting LPC parameter signal is converted by the K-α converter 35 into a converted LPC parameter signal 107 and supplied to the LPC synthesizer 5. On the other hand, the multi-pulse signal 121 delivered from the demultiplexer 31 is decoded by the pulse decoder 32 into a decoded multi-pulse signal 106, which is then outputted to the LPC synthesizer 5.
The multi-pulse signal 106 is inputted to the LPC synthesizer 5 and controlled in accordance with the LPC parameter signal 107 so that a decoded output digitized speech signal 108 is outputted.
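On the decoding side, the LPC synthesizer 5 can be pictured as an all-pole filter driven by the decoded pulses. The sketch below is an assumption-laden illustration: the function name `synthesize_frame`, the equal-length sub-blocks, the sign convention of the filter, and the way the filter state is carried between frames are not specified in the description.

```python
def synthesize_frame(pulses, coeff_sets, frame_len, state=None):
    """Rebuild one frame of speech from (position, amplitude) pulses.

    coeff_sets: predictor coefficient sets in forward order (e.g. A, B, C, D)
    state:      previous outputs, most recent first (zeros if omitted)
    Returns (output samples, updated state).
    """
    excitation = [0.0] * frame_len
    for pos, amp in pulses:
        excitation[pos] += amp             # place each pulse in the excitation
    p = len(coeff_sets[0])
    hist = list(state) if state is not None else [0.0] * p
    block = frame_len // len(coeff_sets)   # assumed: equal-length sub-blocks
    out = []
    for n, x in enumerate(excitation):
        a = coeff_sets[min(n // block, len(coeff_sets) - 1)]
        y = x - sum(ai * h for ai, h in zip(a, hist))
        out.append(y)
        hist = ([y] + hist)[:p]            # keep the p most recent outputs
    return out, hist
```

Passing the returned state into the next call would let the filter memory run continuously across frame boundaries.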
In the embodiment, a plurality of the LPC parameters during one frame period are produced by interpolating the two LPC parameters of adjacent frame periods so as to enhance the expressiveness of the spectral envelope of the input speech signal.
Alternatively, it is also possible to obtain a plurality of the LPC parameters during one frame period by performing a plurality of LPC analyses in one frame period. In this case, high speed arithmetic operation is required in circuit components such as the waveform extractor and the LPC analyzer of Fig. 1. When this alternative method is applied, the K-interpolators 14 and 34 in the block diagram of Fig. 1 can be omitted and their input and output terminals connected directly.
However, the LPC analyzing unit 1 has to perform the LPC analysis once during each of the segmented portions provided by dividing one frame period in order to obtain a plurality of the LPC parameters for one frame period. Otherwise, the operations are similar to those of the embodiment except that the LPC parameter signal passing through the K-quantization decoder 13, the multiplexer 23, the demultiplexer 31 and the K-decoder 33 is updated several times in one frame period.
As has been described in detail hereinbefore, according to the present invention, when the speech signal is to be coded into the multi-pulses, in order to attain accurate spectral envelope information, a plurality of the LPC parameters are produced for one frame period.
As a result, a low bit rate coding is realized while keeping practically sufficient coding quality.
Claims (5)
THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. A multi-pulse type coding system for coding a speech signal into a plurality of pulse signals, comprising: means for producing one set of parameters indicative of a spectral envelope of said speech signal for each search frame period; means, coupled to said parameter producing means and responsive to a plurality of sets of parameters, each of said sets of parameters belonging to adjacent search frame periods, for producing a plurality of sets of interpolated parameters during each search frame period; means for extracting a segmented speech signal from said speech signal and for delivering said segmented speech signal in backward time sequence, said segmented speech signal having a period corresponding to said each search frame period; means for filtering said segmented speech signal in accordance with a filtering characteristic defined by said set of parameters and said plurality of sets of interpolated parameters during said each search frame period and for directly producing a cross-correlation signal representative of a transition of cross-correlation between said segmented speech signal and an impulse response defined by said set of parameters and said sets of interpolated parameters during said each search frame period, said filtering characteristic being varied during said each search frame period in accordance with a backward time sequence of said set of parameters and said plurality of interpolated parameters, said segmented speech signal being received in backward time sequence;
and means for generating said plurality of pulse signals in response to said cross-correlation signal.
2. A multi-pulse type coding system as claimed in claim 1, further comprising means for transmitting said parameters delivered from said parameter producing means and information representative of said plurality of pulse signals.
3. A multi-pulse type coding system as claimed in claim 1, wherein said parameters are LPC coefficients.
4. A multi-pulse type coding system as claimed in claim 1, wherein said means for producing a plurality of sets of parameters includes interpolating means for producing a plurality of interpolated parameters during said search frame period from said parameters associated with two adjacent search frame periods.
5. A multi-pulse type coding system for coding a speech signal into a plurality of pulse signals, comprising: LPC analyzer means for producing one set of LPC parameters indicative of a spectral envelope of the speech signal for each search frame period; interpolation means responsive to said one set of LPC parameters and another set of LPC parameters for an adjacent search frame period for producing a plurality of interpolated LPC parameters during said search frame period; means for receiving said speech signal in backward time sequence; filter means for backwardly filtering said speech signal under control of said LPC parameters and said interpolated LPC parameters to produce cross-correlation between said speech signal and an impulse response defined by said LPC parameters of said interpolated LPC parameters, said cross-correlation being representative of a correlative transition during said each search frame period, a filtering characteristic of said filtering means being varied during said each search frame period in accordance with a backward time sequence of said LPC parameters and said interpolated LPC parameters, said speech signal being received in backward time sequence; and search means for searching said plurality of pulse signals in response to said cross-correlation.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP87669/1987 | 1987-04-08 | | |
JP8766987 | 1987-04-08 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
CA1336841C (en) | 1995-08-29 |
Family
ID=13921347
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA 563492 Expired - Lifetime CA1336841C (en) | 1987-04-08 | 1988-04-07 | Multi-pulse type coding system |
Country Status (3)
Country | Link |
---|---|
AU (1) | AU617993B2 (en) |
CA (1) | CA1336841C (en) |
GB (1) | GB2205469B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2084323C (en) * | 1991-12-03 | 1996-12-03 | Tetsu Taguchi | Speech signal encoding system capable of transmitting a speech signal at a low bit rate |
US5351338A (en) * | 1992-07-06 | 1994-09-27 | Telefonaktiebolaget L M Ericsson | Time variable spectral analysis based on interpolation for speech coding |
RU2464649C1 (en) * | 2011-06-01 | 2012-10-20 | Корпорация "САМСУНГ ЭЛЕКТРОНИКС Ко., Лтд." | Audio signal processing method |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NL8302985A (en) * | 1983-08-26 | 1985-03-18 | Philips Nv | MULTIPULSE EXCITATION LINEAR PREDICTIVE VOICE CODER. |
NL8500843A (en) * | 1985-03-22 | 1986-10-16 | Koninkl Philips Electronics Nv | MULTIPULS EXCITATION LINEAR-PREDICTIVE VOICE CODER. |
JPH0738118B2 (en) * | 1987-02-04 | 1995-04-26 | 日本電気株式会社 | Multi-pulse encoder |
-
1988
- 1988-04-07 CA CA 563492 patent/CA1336841C/en not_active Expired - Lifetime
- 1988-04-08 GB GB8808318A patent/GB2205469B/en not_active Expired - Lifetime
- 1988-04-08 AU AU14402/88A patent/AU617993B2/en not_active Expired
Also Published As
Publication number | Publication date |
---|---|
GB2205469A (en) | 1988-12-07 |
GB2205469B (en) | 1991-04-03 |
AU617993B2 (en) | 1991-12-12 |
GB8808318D0 (en) | 1988-05-11 |
AU1440288A (en) | 1988-10-13 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US4220819A (en) | Residual excited predictive speech coding system | |
US4701954A (en) | Multipulse LPC speech processing arrangement | |
US4797925A (en) | Method for coding speech at low bit rates | |
CA2160749C (en) | Speech coding apparatus, speech decoding apparatus, speech coding and decoding method and a phase amplitude characteristic extracting apparatus for carrying out the method | |
US4472832A (en) | Digital speech coder | |
US3624302A (en) | Speech analysis and synthesis by the use of the linear prediction of a speech wave | |
US4821324A (en) | Low bit-rate pattern encoding and decoding capable of reducing an information transmission rate | |
US4912764A (en) | Digital speech coder with different excitation types | |
US6600798B2 (en) | Reduced complexity signal transmission system | |
US4945565A (en) | Low bit-rate pattern encoding and decoding with a reduced number of excitation pulses | |
USRE32580E (en) | Digital speech coder | |
EP0374941B1 (en) | Communication system capable of improving a speech quality by effectively calculating excitation multipulses | |
EP0784846B1 (en) | A multi-pulse analysis speech processing system and method | |
US5267317A (en) | Method and apparatus for smoothing pitch-cycle waveforms | |
US5202953A (en) | Multi-pulse type coding system with correlation calculation by backward-filtering operation for multi-pulse searching | |
US5797119A (en) | Comb filter speech coding with preselected excitation code vectors | |
CA1336841C (en) | Multi-pulse type coding system | |
US4873723A (en) | Method and apparatus for multi-pulse speech coding | |
US5809456A (en) | Voiced speech coding and decoding using phase-adapted single excitation | |
CA1308193C (en) | Multi-pulse coding system | |
JP3255190B2 (en) | Speech coding apparatus and its analyzer and synthesizer | |
EP0162585B1 (en) | Encoder capable of removing interaction between adjacent frames | |
EP0537948B1 (en) | Method and apparatus for smoothing pitch-cycle waveforms | |
US5734790A (en) | Low bit rate speech signal transmitting system using an analyzer and synthesizer with calculation reduction | |
JPH043879B2 (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MKEX | Expiry |