EP0405548B1 - Method and apparatus for speech coding (Verfahren und Einrichtung zur Sprachcodierung)

Info

Publication number
EP0405548B1
EP0405548B1 EP90112351A EP90112351A EP0405548B1 EP 0405548 B1 EP0405548 B1 EP 0405548B1 EP 90112351 A EP90112351 A EP 90112351A EP 90112351 A EP90112351 A EP 90112351A EP 0405548 B1 EP0405548 B1 EP 0405548B1
Authority
EP
European Patent Office
Prior art keywords
vector
code book
vectors
signal
white noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP90112351A
Other languages
English (en)
French (fr)
Other versions
EP0405548A3 (en)
EP0405548A2 (de)
Inventor
Tomohiko Taniguchi
Yoshinori Tanaka
Yasuji Ohta
Fumio Amano
Shigeyuki Unagami
Akira Sasama
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP1168645A (JPH0333900A)
Priority claimed from JP1195302A (JPH03101800A)
Application filed by Fujitsu Ltd
Publication of EP0405548A2
Publication of EP0405548A3
Application granted
Publication of EP0405548B1
Anticipated expiration
Expired - Lifetime

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0003Backward prediction of gain
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0004Design or structure of the codebook
    • G10L2019/0005Multi-stage vector quantisation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L2019/0001Codebooks
    • G10L2019/0011Long term prediction filters, i.e. pitch estimation
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93Discriminating between voiced and unvoiced parts of speech signals

Definitions

  • The present invention relates to a system for speech coding and an apparatus for the same, and more particularly to a system and apparatus for high quality speech coding that use vector quantization for data compression of speech signals.
  • Vector quantization is a well-known technique in which predictive filtering is applied to the signal vectors of a code book to prepare reproduced signals, and the error powers between the reproduced signals and an input speech signal are evaluated to determine the index of the signal vector with the smallest error.
  • Figure 1 shows an example of a system for high quality speech coding using vector quantization.
  • This system is known as the code excited LPC (CELP) system.
  • A code book 10 is preset with 2^m patterns of residual signal vectors, each produced from N samples of a white noise signal and thus corresponding to an N-dimensional vector (in this case, shape vectors showing the phase, hereinafter referred to simply as vectors).
  • the vectors are normalized so that the power of N samples (N being, for example 40) becomes a fixed value.
  • Vectors read out from the code book 10 by the command of the evaluating circuit 16 are given a gain by a multiplier unit 11, then converted to reproduced signals through two adaptive prediction units, i.e., a pitch prediction unit 12 which eliminates the long term correlation of the speech signals and a linear prediction unit 13 which eliminates the short term correlation of the same.
  • the reproduced signals are compared with digital speech signals of the N samples input from a terminal 15 in a subtractor 14 and the errors are evaluated by the evaluating circuit 16.
  • the evaluating circuit 16 selects the vector of the code book 10 giving the smallest power of the error and determines the gain of the multiplier unit 11 and a pitch prediction coefficient of the pitch prediction unit 12.
  • the linear prediction unit 13 uses the linear prediction coefficient found from the current frame sample values by a linear prediction analysis unit 18 in a linear difference equation as filter tap coefficients.
  • the pitch prediction unit 12 uses the pitch prediction coefficient and pitch frequency of the input speech signal found by a pitch prediction analysis unit 31 through a reverse linear prediction filter 30 as filter parameters.
  • the index of the optimum vector in the code book 10, the gain of the multiplier unit 11, and the parameters for constituting the prediction units are multiplexed by a multiplexer circuit 17 and become coded information.
  • The pitch period of the pitch prediction unit 12 is, for example, 40 to 167 samples; each of the possible pitch periods is evaluated and the optimum period is chosen.
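
The exhaustive pitch period search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's circuit: the helper name `search_pitch_lag` is hypothetical, the raw past excitation is used directly as the pitch prediction vector (the linear prediction filtering of Fig. 1 is left out), and the least-squares criterion is assumed.

```python
import random

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def search_pitch_lag(target, history, min_lag=40, max_lag=167):
    """Try every lag and keep the one whose delayed past excitation best
    matches the target frame (hypothetical helper, not from the patent).

    Assumes len(target) <= min_lag and len(history) >= max_lag, so each
    candidate vector lies entirely inside the past excitation.
    """
    n = len(target)
    best_lag, best_b, best_err = None, 0.0, dot(target, target)
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - lag
        p = history[start:start + n]           # past excitation delayed by `lag`
        energy = dot(p, p)
        if energy == 0.0:
            continue
        b = dot(target, p) / energy            # least-squares pitch coefficient
        err = dot(target, target) - b * dot(target, p)  # remaining error power
        if err < best_err:
            best_lag, best_b, best_err = lag, b, err
    return best_lag, best_b, best_err

# Synthetic check: a past excitation that repeats every 50 samples.
rng = random.Random(0)
period = [rng.gauss(0.0, 1.0) for _ in range(50)]
history = period * 4                            # 200 past samples
target = period[:40]                            # the next frame continues the pattern
lag, b, err = search_pitch_lag(target, history)
```

Because lags of 100 and 150 samples reproduce the same vector, the strict comparison keeps the shortest matching lag; a real coder may break such ties differently.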
  • the transmission function of the linear prediction unit 13 is determined by linear predictive coding (LPC) analysis of the input speech signal.
  • the evaluating circuit 16 searches through the code book 10 and determines the index giving the smallest error power between the input speech signal and residual signal.
  • the index of the code book 10 which is determined, that is, the phase of the residual vector, the gain of the multiplier unit 11, that is, the amplitude of the residual vector, the frequency and coefficient of the pitch prediction unit 12, and the coefficients of the linear prediction unit 13 are transmitted multiplexed by the multiplexer circuit 17.
  • a vector is read out from a code book 20 having the same construction as the code book 10, in accordance with the index, gain, and prediction unit parameters obtained by demultiplexing by the demultiplexer circuit 19 and is given a gain by a multiplier unit 21, then a reproduced speech signal is obtained by prediction by the prediction units 22 and 23.
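
The conventional search loop of Fig. 1 can be sketched as below. This is a simplified illustration, not the patented apparatus: the pitch prediction unit is omitted, the filter coefficients in `a` are made-up values, and the names `make_codebook`, `lpc_synthesize`, and `search_codebook` are hypothetical.

```python
import math
import random

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def make_codebook(m, n, seed=0):
    """2**m white-noise shape vectors of dimension n, each scaled so that
    the power of its n samples is a fixed value (here a mean square of 1)."""
    rng = random.Random(seed)
    book = []
    for _ in range(2 ** m):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scale = math.sqrt(n / dot(v, v))
        book.append([x * scale for x in v])
    return book

def lpc_synthesize(excitation, a):
    """All-pole synthesis y[n] = x[n] + sum_k a[k] * y[n-1-k], zero initial state."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * y[n - 1 - k]
        y.append(acc)
    return y

def search_codebook(target, codebook, a):
    """Return (index, gain, error power) of the vector that best reproduces target."""
    best = None
    for index, shape in enumerate(codebook):
        c = lpc_synthesize(shape, a)            # reproduced signal for unit gain
        energy = dot(c, c)
        if energy == 0.0:
            continue
        gain = dot(target, c) / energy          # least-squares gain for this vector
        err = dot(target, target) - gain * dot(target, c)
        if best is None or err < best[2]:
            best = (index, gain, err)
    return best

codebook = make_codebook(m=4, n=40)
a = [0.9, -0.2]                                 # illustrative stable filter taps
target = [2.5 * y for y in lpc_synthesize(codebook[3], a)]
index, gain, err = search_codebook(target, codebook, a)
```

In the synthetic check the target is built from entry 3 with a gain of 2.5, so the search should recover that index and gain with an error power near zero.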
  • The code book 10 is comprised of white noise, and the pitch prediction unit 12 gives periodicity at the pitch frequencies; the decision on the phase of the code book 10, the gain (amplitude) of the multiplier unit 11, and the pitch frequency (phase) and pitch prediction coefficient (amplitude) of the prediction unit 12 is made equivalently as shown in Fig. 3.
  • the processing for reproducing the vector of the code book 10 by the pitch prediction unit and linear prediction units for identification of the input signal, considered in terms of the vectors may be considered processing for the identification, by subtraction and evaluation by a subtractor 50, of a target vector X obtained by removing from the input signal S of one frame input from a terminal 40, by a subtractor 41, the effects of the previous frame S0 stored in a previous frame storage 42, with a vector X′ obtained by adding by an adder 49 a code vector gC obtained by applying linear prediction to a vector selected from a code book 10 by a linear prediction unit 44 (corresponding to the linear prediction unit 13 of Fig.
  • Speech signals include voiced speech sounds and unvoiced speech sounds, which are characterized in that the respective drive source signals (sound sources) are periodic pulses or white noise with no periodicity.
  • pitch prediction and linear prediction were applied to the vectors of the code book comprised of white noise as a sound source and the pitch periodicity of the voiced speech sounds was created by the pitch prediction unit 12.
  • The pitch periodicity generated by the pitch prediction unit was created by giving a delay to the past sound source series by pitch prediction analysis, and the past sound source series was a series of white noise originally obtained by reading code vectors from a code book; it was therefore difficult to create a pulse series corresponding to the sound source of a voiced speech sound. The effect of this was particularly large in the transitional state from an unvoiced speech sound to a voiced speech sound, where high frequency noise was included in the reproduced speech, resulting in a deterioration of the quality.
  • the present invention as defined in the appended independent claims has as its object, in a CELP type speech coding system and apparatus wherein a gain is given to a code vector obtained by applying linear prediction to white noise of a code book and a pitch prediction vector obtained by applying linear prediction to a residual signal of a preceding frame given a delay corresponding to the pitch frequency, a reproduced signal is generated from the same, and the reproduced signal is used to identify the input speech signal, the creation of a pulse series corresponding to the sound source of a voiced speech sound and the accurate identification and coding for even a pulse-like sound source of a voiced speech sound so as to improve the quality of the reproduced speech.
  • a system for speech coding of the CELP type wherein a reproduced signal is generated from a code vector obtained by applying linear prediction to a vector of a residual signal of white noise of a code book and a pitch prediction vector obtained by applying linear prediction to a residual signal of a preceding frame given a delay corresponding to a pitch frequency, the error between the reproduced signal and an input speech signal is evaluated, the vector giving the smallest error is sought, and the input speech signal is encoded accordingly
  • the system for speech coding characterized in that in addition to the code vector and pitch prediction vector, use is made of a residual signal vector of an impulse having a predetermined relationship with the vectors of the white noise code book, variable gains are given to at least the code vector and an impulse vector obtained by applying linear prediction to the vector of the residual signal of the impulse, then the vectors are added to form a reproduced signal and the reproduced signal is used to identify the input speech signal.
  • an apparatus for speech coding characterized by being provided with a pitch frequency delay circuit giving a delay corresponding to a pitch frequency to a vector of a preceding residual signal, a first code book storing a plurality of vectors of residual signals of white noise, an impulse generating circuit generating an impulse having a predetermined relationship with the vectors of the residual signals of the white noise stored in the first code book, linear prediction circuits connected to the pitch frequency delay circuit, the first code book, and the impulse generating circuit, a variable gain circuit for giving a variable gain to vectors output from the linear prediction circuits connected to at least the first code book and the impulse generating circuit, a first addition circuit for adding the outputs of the variable gain circuit and producing a reproduced composite vector, an input speech signal input unit, a second addition circuit for adding the reproduced composite vector and the vector of the input speech signal, and an evaluating circuit for evaluating the output of the second addition circuit and identifying the input speech signal from the vector of the reproduced signal.
  • the basic constitution of the speech coding system of the present invention is that of a conventionally known CELP type speech coding system wherein in addition to the code vector and pitch prediction vector, use is made of a residual signal vector of an impulse having a predetermined relationship with the vectors of the white noise code book, variable gains are given to at least the code vector and an impulse vector obtained by applying linear prediction to the vector of the residual signal of the impulse, then the vectors are added to form a reproduced signal and the reproduced signal is used to identify the input speech signal.
  • the present invention is constituted by a conventionally known system wherein a synchronous pulse serving as a sound source for voiced speech sounds is introduced and a pulse-like sound source of voiced speech sounds is created by the use of a residual signal vector of an impulse having a predetermined relationship with the vectors of the white noise code book.
  • the vector of the residual signal of the white noise and the vector of the residual signal of the impulse are added while varying the amplitude components of the two vectors so as to reproduce a composite vector, so it is possible to accurately identify and code not only the white noise-like sound source of unvoiced speech sounds, but also the periodic pulse series sound source of voiced speech sounds and thereby to improve the quality of the reproduced signal.
  • the residual signal vector of the impulse used in the present invention may be an impulse vector having a predetermined relationship with the residual vectors of white noise stored in the first code book 10, specifically, may be one corresponding to one residual vector of white noise stored in the first code book. Further, the one impulse vector may be one corresponding to one of the predetermined sample positions, i.e., predetermined pulse positions, of a white noise residual vector in the first code book. More specifically, as mentioned later, the impulse vector may be one corresponding to a main element pulse position in the white noise residual vector or, as a simpler method, the impulse vector may be one corresponding to the maximum amplitude pulse position of the white noise residual vector.
  • the impulse residual vector used in the present invention may be one formed by separation from a white noise residual vector stored in the first code book. Further, for that purpose, use may be made of a second code book for storing command information for separating this from the white noise residual vector stored in the first code book. Also, the second code book may store preformed impulse vectors.
  • the second code book preferably is of the same size as the first code book.
  • FIG. 5 is a block diagram of an embodiment of a speech coding system of the present invention.
  • portions the same as in Fig. 1 are given the same reference numerals and explanations of the same are omitted.
  • Figure 5 shows the constitution of the transmission side.
  • In the code book 10 are stored 2^m patterns of N-dimensional vectors of residual signals formed by white noise, as in the past.
  • The impulse vectors from the code book 60 are supplied through a multiplier unit 61 to an adder 62, where they are added to the vectors of white noise supplied from the code book 10 through the multiplier unit 11, and the result is supplied to a pitch prediction unit 12.
  • An evaluating circuit 16 searches through the code books 10 and 60 and determines the vector giving the smallest error signal power between the input speech signal and the reproduced signal from the linear prediction unit 13.
  • the index of the code book 10 decided on, that is, the phase-1 of the residual vector of the white noise, the index of the code book 60, that is, the phase-2 of the residual vector of the impulse, and the gains of the multiplier units 11 and 61, i.e., the amplitude-1 and amplitude-2 of the residual vectors, the frequency and coefficient of the pitch prediction unit 12 as in the past, and the coefficient of the linear prediction unit 13 are transmitted multiplexed by a multiplexer circuit 65.
  • the transmitted multiplexed signal is demultiplexed by the demultiplexer circuit 66.
  • Code books 20 and 70 have the same constitutions as the code books 10 and 60. From the code books 20 and 70 are read out the vectors indicated by the indexes (phase-1 and phase-2). These are passed through the multiplier units 21 and 71, then added by the adder 72 and reproduced by the pitch prediction unit 22 and further the linear prediction unit 23.
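
The receiving-side chain just described (read two vectors by index, apply the two gains, add, then pitch prediction and linear prediction) might look like the following sketch. The function names, the single-tap pitch predictor, and the tiny demo values are assumptions made for illustration only.

```python
def pitch_synthesize(excitation, lag, b, history):
    """Long-term synthesis v[n] = x[n] + b * v[n - lag].

    `history` holds past excitation, most recent sample last, and must
    contain at least `lag` samples.
    """
    buf = list(history)
    out = []
    for x in excitation:
        v = x + b * buf[-lag]
        buf.append(v)
        out.append(v)
    return out

def lpc_synthesize(excitation, a):
    """All-pole synthesis y[n] = x[n] + sum_k a[k] * y[n-1-k], zero initial state."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * y[n - 1 - k]
        y.append(acc)
    return y

def decode_frame(white_book, impulse_book, index1, index2, g1, g2,
                 lag, b, a, history):
    """Rebuild one frame from the demultiplexed parameters (hypothetical helper)."""
    c1 = white_book[index1]                     # white noise vector (phase-1)
    c2 = impulse_book[index2]                   # impulse vector (phase-2)
    mixed = [g1 * p + g2 * q for p, q in zip(c1, c2)]   # amplitude-1, amplitude-2
    return lpc_synthesize(pitch_synthesize(mixed, lag, b, history), a)

# Tiny hand-checkable frame (real frames use N of about 40 and lags of 40 to 167).
white_book = [[0.5, 0.5, 0.0, 0.0]]
impulse_book = [[0.5, 0.0, 0.0, 0.0]]
frame = decode_frame(white_book, impulse_book, 0, 0, g1=2.0, g2=-1.0,
                     lag=2, b=0.5, a=[0.5], history=[0.0, 0.0])
```

The demo values are chosen so that every intermediate result is exactly representable, which makes the chain easy to verify by hand.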
  • Figure 6 shows an example of the circuit constitution for realizing the above embodiment according to the speech coding system of the present invention.
  • In Fig. 6, portions the same as in Fig. 3 are given the same reference numerals and explanations thereof are omitted.
  • A vector of a residual signal of white noise from a first code book 43 is subjected to prediction by a linear prediction unit 44 and multiplied by a gain g1 by a multiplier unit 45, one example of a variable gain circuit, to obtain a white noise code vector g1C1.
  • The vectors of residual signals of impulses from a second code book 80 are subjected to prediction by a linear prediction unit 81 and multiplied by a gain g2 by a multiplier unit 82, similarly an example of a variable gain circuit, to obtain an impulse code vector g2C2.
  • The above-mentioned code vectors g1C1 and g2C2 and a pitch prediction vector bP output from a multiplier unit 48 are added by adders 49 and 83 to give a composite vector X″.
  • The error E between the composite vector X″ output by the adder 83 and the target vector is evaluated by an evaluating circuit 51.
  • Figure 7 illustrates the vector operation mentioned above.
  • To determine the most suitable code vector and pitch prediction vector, one may find the amplitudes g1, g2, and b by equations (5), (6), and (7) for all combinations of the phases C1, C2, and P of the three vectors and search for the set of amplitudes and phases g1, g2, b, C1, C2, and P giving the smallest error signal power.
  • The phase of the impulse code vector C2 corresponds unconditionally to the phase of the white noise code vector C1, so to determine the optimum drive source vector, one may find the b, g1, and g2 for which the partial derivatives of the error power with respect to b, g1, and g2 are 0.
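
Equations (5) to (7) are not reproduced in this text. As a hedged sketch, assuming the amplitudes are the least-squares solution for a fixed set of phases P, C1, and C2, they can be obtained by solving the 3x3 normal equations; `solve_linear` and `optimal_gains` are hypothetical helper names and the vectors are made-up examples.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def solve_linear(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def optimal_gains(target, vectors):
    """Gains minimizing |target - sum_i gain_i * vectors[i]|^2, plus that error power.

    Assumes the vectors are linearly independent.
    """
    A = [[dot(u, v) for v in vectors] for u in vectors]   # Gram matrix
    y = [dot(target, v) for v in vectors]
    gains = solve_linear(A, y)
    recon = [sum(g * v[n] for g, v in zip(gains, vectors))
             for n in range(len(target))]
    err = sum((t - r) ** 2 for t, r in zip(target, recon))
    return gains, err

P = [1.0, 2.0, 0.0, 1.0]     # pitch prediction vector (after linear prediction)
C1 = [0.0, 1.0, 1.0, 0.0]    # white noise code vector
C2 = [0.0, 0.0, 1.0, 0.0]    # impulse code vector
X = [0.5 * p + 1.2 * c1 - 0.7 * c2 for p, c1, c2 in zip(P, C1, C2)]
(b, g1, g2), err = optimal_gains(X, [P, C1, C2])
```

Repeating this for each candidate set of phases and keeping the set with the smallest `err` corresponds to the search described in the text.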
  • Figure 8 shows the case of establishment of an impulse vector at a pulse position showing the maximum amplitude in the white noise residual vector, with respect to the impulse vectors and the white noise residual vectors stored in the first code book in the present invention.
  • The first code book 10 is provided with a table 90 sharing a common index i (the table corresponds to the second code book), which stores the positions of the elements (samples) with the maximum amplitude in each of the white noise vector patterns of the code book 10.
  • the white noise vector and maximum amplitude position read out from the code book 10 and the table 90 respectively in accordance with the search pattern indexes entering from the evaluating circuit 16 through a terminal 91 are supplied to an impulse separating circuit 92 where, as shown in Fig.
  • the sum of the white noise vector and the impulse vector output by the impulse separating circuit 92 becomes the same as the original white noise vector of the code book 10, so when the amplitude ratio g1/g2 of the multiplier units 11 and 61 is "1", use may be made of the original white noise and when it is "0" use may be made of the complete impulse.
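
A small sketch of the table lookup and impulse separation follows. It assumes, as the text implies, that the separated white noise vector is the original with the maximum-amplitude element set to 0, so that the two parts sum back to the original; the function names are hypothetical.

```python
def max_amplitude_position(vector):
    """Index of the element with the largest absolute value (one table entry)."""
    return max(range(len(vector)), key=lambda n: abs(vector[n]))

def separate_impulse(vector, position):
    """Split a white noise vector into (remainder, impulse) at `position`."""
    impulse = [0.0] * len(vector)
    impulse[position] = vector[position]
    remainder = list(vector)
    remainder[position] = 0.0
    return remainder, impulse

def mix(remainder, impulse, g1, g2):
    """Weighted sum g1 * remainder + g2 * impulse."""
    return [g1 * r + g2 * p for r, p in zip(remainder, impulse)]

codebook = [[0.2, -0.9, 0.4, 0.1]]              # one illustrative entry
table = [max_amplitude_position(v) for v in codebook]   # shares the index i
remainder, impulse = separate_impulse(codebook[0], table[0])
same_as_original = mix(remainder, impulse, 1.0, 1.0)    # ratio g1/g2 = 1
pure_impulse = mix(remainder, impulse, 0.0, 1.0)        # ratio g1/g2 = 0
```

Storing only the position in the table, rather than a second set of full vectors, is what allows the impulse to be derived from the first code book at both ends.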
  • Since the white noise vector and the impulse vector are added while varying the gains applied to the amplitudes of their respective elements, it is possible to accurately identify and code not only the white noise-like sound source of unvoiced speech sounds, but also the periodic pulse series sound source of voiced speech sounds, a problem in the past, and thereby to vastly improve the quality of the reproduced speech.
  • the first addition circuit is formed by an adder 49 and an adder 83, but the first addition circuit may be formed by a single unit instead of the adders 49 and 83.
  • Another embodiment of the speech coding system of the present invention is shown in Fig. 10.
  • In Fig. 6, provision was made of a code book comprised of fixed impulses generated in accordance with only predetermined pulse positions of the vectors in the code book 10, but even if the input speech signal is identified by adding the vector based on the fixed impulses to the conventional pitch prediction vector and white noise vector, optimal identification cannot necessarily be performed. This is because, as shown in Fig. 6, linear prediction is applied even to the impulse vector, so there is a distortion in space.
  • In this embodiment, the principle of which is shown in Fig. 10, instead of using fixed impulse vectors, the phase difference between the white noise vector C1 after application of linear prediction 44 and the vector obtained by applying linear prediction to the impulse by the main element pulse position detection circuit 90 is evaluated, whereby the position of the main element pulse is detected.
  • the main element impulse is generated at this position by the impulse generating unit 91.
  • the three vectors, i.e., the pitch prediction vector P, the white noise code vector C1, and the main element impulse vector are added and the composite vector is used to identify the input speech signal S.
  • FIG. 11 is a block diagram of the third embodiment of the present invention.
  • the third embodiment differs from the embodiment of Fig. 5 only in that it uses a main element pulse position detection circuit 110 instead of an impulse code book 60.
  • the main element pulse position detection circuit 110 extracts the position of the main element pulse for the vectors of the white noise code book 10, the main element pulse generated at that position is multiplied by the gain (amplitude) component by the multiplier unit 61, one type of variable gain circuit, then is added to the white noise read out from the code book 10 as in the past and multiplied by the gain by the multiplier unit 11, also one type of variable gain circuit, and reproduction is performed by the pitch prediction unit 12 and the linear prediction unit 13.
  • the coding information may be, like with Fig. 5, the white noise code index (phase) and gain (amplitude), the amplitude of the main element impulse, and the parameters for constructing the prediction units (pitch frequency, pitch prediction coefficient, linear prediction coefficient) transmitted multiplexed by the multiplexer circuit 65.
  • the receiving side may be similarly provided with a main element pulse position detection circuit 120 and the speech signal reproduced based on the parameters demultiplexed at the demultiplexer circuit 66.
  • Since the sound source signal is generated by adding the white noise and the impulse, it is possible to accurately generate not only a white noise-like sound source of unvoiced speech sounds, but also a periodic pulse series sound source of voiced speech sounds by control of the amplitude components, and it is therefore possible to improve the quality of the reproduced speech.
  • Figure 12 shows an embodiment of the main element pulse position detection circuit 110 used in the above-mentioned embodiment.
  • a linear prediction unit 111 which applies linear prediction to N number of impulse vectors (these may be generated also from a separately provided memory) with different pulse positions
  • a maximum value detection unit 113 which detects the maximum value of the phase difference calculated by the phase difference calculation unit 112, and an impulse generating circuit 114 which decides on the position of the main element pulse by the maximum value detected by the maximum value detection unit 113 and generates an impulse at the position of the main element pulse.
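
The detection circuit can be approximated in a few lines. This sketch makes an interpretive assumption: the "phase difference" is measured by the cosine of the angle between the filtered white noise vector and each filtered unit impulse, and the position with the largest absolute cosine (the closest alignment) is taken as the main element pulse position. The function names and filter values are illustrative.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def lpc_synthesize(excitation, a):
    """All-pole synthesis y[n] = x[n] + sum_k a[k] * y[n-1-k], zero initial state."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * y[n - 1 - k]
        y.append(acc)
    return y

def main_pulse_position(shape, a):
    """Position whose filtered unit impulse is best aligned with the filtered
    white noise vector (assumes `shape` is not all zeros)."""
    c1 = lpc_synthesize(shape, a)
    best_pos, best_cos = 0, -1.0
    for pos in range(len(shape)):
        unit = [0.0] * len(shape)
        unit[pos] = 1.0
        h = lpc_synthesize(unit, a)             # impulse after linear prediction
        cos_t = abs(dot(c1, h)) / math.sqrt(dot(c1, c1) * dot(h, h))
        if cos_t > best_cos:
            best_pos, best_cos = pos, cos_t
    return best_pos

# With no filtering the result coincides with the maximum-amplitude position.
pos_plain = main_pulse_position([0.2, -0.9, 0.4, 0.1], [])
# A vector that is itself a single pulse keeps its position after filtering.
pos_filtered = main_pulse_position([0.0, 0.0, 1.0, 0.0, 0.0, 0.0], [0.9])
```

Because the comparison is made after linear prediction, the selected position can differ from the simple maximum-amplitude position when the filter is not trivial.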
  • an impulse code vector generated by a code book or table etc. at a position corresponding to the position of predetermined pulses of the white noise code vector is added and the identification performed by this composite vector of three vectors, so it is possible to create not only a sound source of unvoiced speech sounds, but also a pulse-like sound source of voiced speech sounds and possible to improve the quality of the reproduced speech. Further, by separating the vector of the residual signal of the impulse from the vector of the residual signal of the white noise, it is possible to increase the effect of data compression.
  • the fourth embodiment of the present invention constitutes the conventional CELP type speech coding system wherein the vector of the residual signal of the white noise and the vector of the residual signal of the impulse are added by a ratio based on the strength of the pitch correlation of the input speech signal obtained by pitch prediction so as to obtain a composite vector.
  • the composite vector is reproduced to obtain a reproduced signal and the error of that with the input speech signal is evaluated.
  • Since the vector of the residual signal of the white noise and the vector of the residual signal of the impulse are added in a ratio based on the strength of the pitch correlation of the input speech signal and the composite vector is reproduced, it is possible to accurately identify and code not only the white noise-like sound source of unvoiced speech sounds, but also the periodic pulse series sound source of voiced speech sounds and thereby to improve the quality of the reproduced speech.
  • FIG. 13 is a block diagram of the fourth embodiment of the system of the present invention.
  • portions the same as Fig. 1 are given the same reference numerals and explanations thereof are omitted.
  • Provision is made of a table 60 in the code book 10, in which are stored 2^m patterns of N-order vectors of residual signals of white noise.
  • In this table 60 are stored the positions of the elements (samples) of maximum amplitude for each of the 2^m patterns of vectors in the code book 10.
  • the white noise vector read out from the code book 10 in accordance with the search pattern index from the evaluating circuit 16 is supplied to the impulse generating unit 61 and the weighting and addition circuit 62, while the maximum amplitude position read out from the table is supplied to the impulse generating unit 61.
  • the impulse generating unit 61 picks out the element of the maximum amplitude position from in the white noise vector as shown in Fig. 14(A) and generates an impulse vector as shown in Fig. 14(B) with the remaining N-1 elements all made 0 and supplies the impulse vector to the weighting and addition circuit 62.
  • The weighting and addition circuit 62 multiplies the white noise vector and the impulse vector by the weightings sin θ and cos θ supplied from the later mentioned pitch correlation calculation unit 63, then performs the addition.
  • the composite vector obtained here is supplied to the multiplier unit 11.
  • The pitch correlation calculation unit 63 finds the phase difference θ between the later mentioned pitch prediction vector and the vector of the input speech signal to obtain the pitch correlation (weighting) cos θ and the weighting sin θ.
  • the evaluating circuit 16 searches through the code book 10 and decides on the index giving the smallest error signal power.
  • The index of the code book 10 decided on, that is, the phase of the residual vector of the white noise, the gain of the multiplier unit 11, that is, the amplitude of the residual vector, the frequency and coefficients (the amplitude ratio and cos θ) of the pitch prediction unit 12 as in the past, and the coefficient of the linear prediction unit 13 are transmitted multiplexed by the multiplexer circuit 17.
  • the gain is preferably variable.
  • the transmitted multiplexed signal is demultiplexed by the demultiplexer circuit 19.
  • the code book 20 and the table 70 are each of the same construction as the code book 10 and the table 60.
  • the vector and maximum amplitude position indicated by the respective indexes (phases) are read out from the code book 20 and the table 70.
  • the impulse generating unit 71 generates an impulse vector in the same way as the impulse generating unit 61 on the coding unit side and supplies the same to the weighting circuit 72.
  • The weighting circuit 72 prepares the weighting sin θ from the pitch correlation (weighting) cos θ, which is one of the coefficients (the amplitude ratio and cos θ) of the pitch prediction unit 12 that were transmitted and demultiplexed. With these, the white noise vector and the impulse vector are weighted and added and the composite vector is supplied to the multiplier 21. Reproduction is performed at the pitch prediction unit 22 and the linear prediction unit 23.
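
The pitch-correlation weighting of this embodiment can be sketched as follows. The clamping of cos θ and the choice of the non-negative root for sin θ are assumptions of this example, and the function names are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(u, v))

def pitch_correlation_weights(target, pitch_vector):
    """Return (cos θ, sin θ) for the angle between the input speech vector
    and the pitch prediction vector."""
    denom = math.sqrt(dot(target, target) * dot(pitch_vector, pitch_vector))
    cos_t = dot(target, pitch_vector) / denom if denom > 0.0 else 0.0
    cos_t = max(-1.0, min(1.0, cos_t))          # guard against rounding
    sin_t = math.sqrt(1.0 - cos_t * cos_t)      # assumption: non-negative root
    return cos_t, sin_t

def impulse_from(white):
    """Keep only the maximum-amplitude element; the other N-1 elements are 0."""
    pos = max(range(len(white)), key=lambda n: abs(white[n]))
    impulse = [0.0] * len(white)
    impulse[pos] = white[pos]
    return impulse

def weighted_excitation(white, cos_t, sin_t):
    """sin θ times the white noise vector plus cos θ times the impulse vector."""
    impulse = impulse_from(white)
    return [sin_t * w + cos_t * p for w, p in zip(white, impulse)]

white = [0.3, -0.8, 0.1]
P = [1.0, 0.0, 0.0]                             # pitch prediction vector
cos_v, sin_v = pitch_correlation_weights([2.0, 0.0, 0.0], P)   # strongly periodic
cos_u, sin_u = pitch_correlation_weights([0.0, 1.0, 0.0], P)   # no pitch correlation
voiced = weighted_excitation(white, cos_v, sin_v)
unvoiced = weighted_excitation(white, cos_u, sin_u)
```

Only cos θ needs to be transmitted, since the receiving side can derive sin θ from it, which is consistent with the statement that the amount of transmitted information does not increase.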
  • The circuit construction of the speech coding system of the above embodiment may be expressed as shown in Fig. 16.
  • In Fig. 16, portions the same as in Fig. 2 are given the same reference numerals and explanations thereof are omitted.
  • The vector of the residual signal of the white noise from the code book 43 is subjected to prediction by the linear prediction unit 44 and multiplied by the weighting sin θ by the multiplier unit 80, one type of variable gain circuit, to obtain a white noise code vector.
  • The vector of the residual signal of the impulse generated from the white noise vector at the impulse generating unit 81 is subjected to prediction by the linear prediction unit 82 and multiplied by the weighting cos θ by the multiplier 83, one type of variable gain circuit, to obtain an impulse code vector.
  • These are added by the adder 84 and further multiplied by the gain g (amplitude of the code vector) at the multiplier unit 45 to give the code vector gC.
  • This code vector gC is added by the adder 49 to the pitch prediction vector bP output from the multiplier unit 48, and the composite vector X″ is obtained.
  • The error E between the composite vector X″ output by the adder 50 and the target vector X is evaluated by the evaluating circuit 51.
  • Figure 17 illustrates this vector operation.
  • The code vector gC changes from white noise to an impulse in accordance with the weightings cos θ and sin θ, but the phases P and C and the amplitudes b and g of the pitch prediction vector bP and the code vector gC can be determined in the same way as in the past, with no change to the process of identifying the input.
  • The amplitude component b of the pitch prediction vector bP is simply the prediction coefficient b of the pitch prediction unit. Its value can be found by setting the code vector gC to "0" in the above-mentioned speech signal analysis and identifying the input signal using only the pitch prediction vector (equations (8) and (9)).
  • As shown in equation (10), the pitch prediction coefficient b is the product of the pitch correlation cos θ and the amplitude ratio ⁇ between the target vector X and the pitch prediction vector P.
  • Because the white noise vector and the impulse vector are added with the amplitudes of their respective elements controlled, it is possible to accurately identify and code not only the white-noise-like sound source of unvoiced speech sounds but also the periodic pulse-series sound source of voiced speech sounds, which was a problem in the past, and thereby to greatly improve the quality of the reproduced speech.
  • With the speech coding system of this embodiment, it is possible to accurately identify and code not only the sound source of unvoiced speech sounds but also the pulse-like sound source of voiced speech sounds, which was not possible in the past, and thus to improve the quality of the reproduced signal. Further, the amount of information transmitted does not increase, which makes the system very practical.
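
The weighting and search steps described above can be sketched roughly in Python as follows. This is an illustration under stated assumptions, not the implementation of the embodiment. All function names are hypothetical. The impulse is assumed to have unit amplitude at the maximum-amplitude position of the white noise vector. The gain g is assumed to be the least-squares gain against the target after the pitch contribution bP is removed. The linear prediction unit is assumed to be an all-pole filter with the sign convention shown in the comment.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def lpc_synthesize(excitation, a):
    # Assumed all-pole filter: y[n] = x[n] - sum_k a[k] * y[n-1-k]
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y

def impulse_from_noise(noise):
    # Assumption: one unit pulse at the maximum-|amplitude| position.
    pos = max(range(len(noise)), key=lambda i: abs(noise[i]))
    pulse = [0.0] * len(noise)
    pulse[pos] = 1.0
    return pulse

def pitch_correlation(target, pitch_vec):
    # cos(theta) between the target vector X and the pitch prediction vector P.
    denom = math.sqrt(dot(target, target) * dot(pitch_vec, pitch_vec))
    return dot(target, pitch_vec) / denom if denom > 0 else 0.0

def code_vector(noise, a, cos_theta):
    # C = sin(theta) * (white noise code vector) + cos(theta) * (impulse code vector)
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta ** 2))
    w = lpc_synthesize(noise, a)
    p = lpc_synthesize(impulse_from_noise(noise), a)
    return [sin_theta * wi + cos_theta * pi for wi, pi in zip(w, p)]

def encode_frame(target, pitch_vec, codebook, a):
    # b = (amplitude ratio) * cos(theta), found with the code vector set to 0.
    cos_t = pitch_correlation(target, pitch_vec)
    pp = dot(pitch_vec, pitch_vec)
    ratio = math.sqrt(dot(target, target) / pp) if pp > 0 else 0.0
    b = ratio * cos_t
    resid = [x - b * p for x, p in zip(target, pitch_vec)]
    best = None
    for idx, noise in enumerate(codebook):
        c = code_vector(noise, a, cos_t)
        cc = dot(c, c)
        g = dot(resid, c) / cc if cc > 0 else 0.0   # assumed least-squares gain
        err = sum((r - g * ci) ** 2 for r, ci in zip(resid, c))
        if best is None or err < best[4]:
            best = (idx, g, b, cos_t, err)
    return best  # (index, g, b, cos_theta, squared error)
```

In this sketch a strongly voiced frame (cos θ near 1) yields a code vector dominated by the impulse, and an unvoiced frame (cos θ near 0) yields one dominated by white noise. Only the index, g, and the pitch parameters need to be sent, because a decoder could rebuild sin θ from cos θ as `code_vector` does.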

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Claims (20)

  1. A CELP speech coding system in which a reproduced signal is generated from a code vector obtained by applying linear prediction to a vector of a white noise residual signal of a code book and from a pitch prediction vector obtained by applying linear prediction to a residual signal of a preceding frame that has been given a delay corresponding to a pitch frequency, the error between the reproduced signal and an input speech signal is evaluated, the code vector giving the smallest error is sought, and the input speech signal is coded accordingly, the speech coding system being characterized in that, in addition to the code vector and the pitch prediction vector, use is made of a residual signal vector of an impulse having a predetermined relationship with the vectors of the white noise code book, variable gains are given to at least the code vector and an impulse vector obtained by applying linear prediction to the vector of the residual signal of the impulse, the vectors are then added to form a reproduced signal, and the reproduced signal is used to identify the input speech signal.
  2. A speech coding system according to claim 1, characterized in that the respective residual signal vectors of the impulses having a predetermined relationship with the vectors of the white noise code book correspond to the vectors of the white noise code book.
  3. A speech coding system according to claim 2, characterized in that the vectors of the residual signals of the impulses correspond exactly to predetermined pulse positions in the vectors of the white noise code book.
  4. A speech coding system according to claim 2, characterized in that the vectors of the residual signals of the impulses correspond to pulse positions of maximum amplitude in the vectors of the white noise code book.
  5. A speech coding system according to claim 2, characterized in that the vectors of the residual signals of the impulses, corresponding to a single position selected from either the predetermined pulse positions in the vectors of the white noise code book or the pulse positions of maximum amplitude, are stored in a separately provided code book.
  6. A speech coding system according to claim 4, characterized in that the vectors of the residual signals of the impulses, corresponding to a single position selected from either the predetermined pulse positions in the vectors of the white noise code book or the pulse positions of maximum amplitude, are stored in a separately provided code book.
  7. A speech coding system according to claim 1, characterized in that the residual signal vector of the impulse having a predetermined relationship with the vectors of the white noise code book is the main element pulse in the vectors of the white noise code book.
  8. A speech coding system according to claim 1, characterized in that the residual signal vector of the white noise and the vector of the residual signal of the impulse are adjusted by a predetermined coefficient derived from a vector of the input speech signal and a pitch prediction vector obtained by applying linear prediction to a residual signal of a preceding frame, and in that the error is evaluated.
  9. A speech coding system according to claim 8, characterized in that the residual signal vector of the white noise and the vector of the residual signal of the impulse are weighted with a predetermined coefficient derived from a vector of the input speech signal and a pitch prediction vector obtained by applying linear prediction to a residual signal of a preceding frame, and in that the error is evaluated.
  10. A speech coding system according to claim 9, characterized in that the residual signal vector of the white noise and the vector of the residual signal of the impulse are added in a ratio corresponding to the intensity of a pitch correlation obtained by applying linear prediction to the vector of the input speech signal and the vector of the residual signal of the preceding frame, the composite vector is reproduced, and the error between the resulting reproduced signal and the vector of the input speech signal is evaluated.
  11. A speech coding system according to claim 10, characterized in that the pitch correlation is a function of an angle.
  12. A speech coding system according to claim 1, characterized in that the vector of the residual signal of the impulse is separated from the vector of the white noise residual signal.
  13. An apparatus for coding speech, characterized in that it is provided with: a pitch frequency delay circuit which applies a delay corresponding to a pitch frequency to a vector of a preceding residual signal; a first code book which stores a plurality of vectors of white noise residual signals; an impulse generating circuit which generates an impulse having a predetermined relationship with the vectors of the white noise residual signals stored in the first code book; linear prediction circuits connected to the pitch frequency delay circuit, the first code book, and the impulse generating circuit; a variable gain circuit for giving a variable gain to the vectors output from at least the linear prediction circuits connected to the first code book and the impulse generating circuit; a first adder circuit for adding the output signals of the variable gain circuit and generating a reproduced composite vector; an input unit for the input speech signal; a second adder circuit for adding the reproduced composite vector and the vector of the input speech signal; and an evaluation circuit for evaluating the output signal of the second adder circuit and identifying the input speech signal from the vector of the reproduced signal.
  14. An apparatus for coding speech according to claim 13, characterized in that the first adder circuit comprises a first adder which adds only the output signals of the linear prediction circuits connected to the pitch frequency delay circuit and the first code book, and a second adder which adds the output signals of the linear prediction circuit connected to the impulse generating circuit.
  15. An apparatus for coding speech according to claim 13, characterized in that the impulse generating circuit is controlled by a main element pulse position detection circuit which receives as its input the output signal of the linear prediction circuit connected to the first code book.
  16. An apparatus for coding speech according to claim 15, characterized in that the main element pulse position detection circuit has a function of extracting the pulse position which gives the smallest phase error between an output vector from the linear prediction circuit connected to the first code book and a vector obtained by applying linear prediction to a pulse corresponding to sampling instants of the residual signal vectors stored in the first code book.
  17. An apparatus for coding speech according to claim 13, characterized in that the impulse generating circuit comprises a second code book which stores a plurality of impulses corresponding to the plurality of white noise residual signal vectors stored in the first code book.
  18. An apparatus for coding speech according to claim 17, characterized in that the second code book stores the ordinal positions which indicate the maximum pulses in the white noise residual signal vectors stored in the first code book.
  19. An apparatus for coding speech according to claim 17, characterized in that the impulse generating circuit comprises an impulse separation circuit which separates the impulses from the white noise residual signal vectors stored in the first code book.
  20. An apparatus for coding speech according to claim 13, characterized in that, for generating a reproduced vector from the output signals of the first code book and the impulse generating circuit through the linear prediction circuit and the variable gain circuit, a weighting circuit is provided to control the linear prediction circuit and the variable gain circuit, and in that the weighting circuit is connected to a pitch correlation calculation circuit which receives as its input a pitch prediction vector obtained by applying linear prediction to a vector of an input signal and a residual signal vector of a preceding frame.
EP90112351A 1989-06-28 1990-06-28 Verfahren und Einrichtung zur Sprachcodierung Expired - Lifetime EP0405548B1 (de)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
JP16618089 1989-06-28
JP166180/89 1989-06-28
JP1168645A JPH0333900A (ja) 1989-06-30 1989-06-30 音声符号化方式
JP168645/89 1989-06-30
JP195302/89 1989-07-27
JP1195302A JPH03101800A (ja) 1989-06-28 1989-07-27 音声符合化方式

Publications (3)

Publication Number Publication Date
EP0405548A2 EP0405548A2 (de) 1991-01-02
EP0405548A3 EP0405548A3 (en) 1991-08-28
EP0405548B1 true EP0405548B1 (de) 1994-11-17

Family

ID=27322637

Family Applications (1)

Application Number Title Priority Date Filing Date
EP90112351A Expired - Lifetime EP0405548B1 (de) 1989-06-28 1990-06-28 Verfahren und Einrichtung zur Sprachcodierung

Country Status (3)

Country Link
EP (1) EP0405548B1 (de)
CA (1) CA2019801C (de)
DE (1) DE69014156T2 (de)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7363220B2 (en) 1997-12-24 2008-04-22 Mitsubishi Denki Kabushiki Kaisha Method for speech coding, method for speech decoding and their apparatuses

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FI98104C (fi) * 1991-05-20 1997-04-10 Nokia Mobile Phones Ltd Menetelmä herätevektorin generoimiseksi ja digitaalinen puhekooderi
SG43128A1 (en) * 1993-06-10 1997-10-17 Oki Electric Ind Co Ltd Code excitation linear predictive (celp) encoder and decoder
ES2663269T3 (es) 2007-06-11 2018-04-11 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Codificador de audio para codificar una señal de audio que tiene una porción similar a un impulso y una porción estacionaria

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7363220B2 (en) 1997-12-24 2008-04-22 Mitsubishi Denki Kabushiki Kaisha Method for speech coding, method for speech decoding and their apparatuses
US7383177B2 (en) 1997-12-24 2008-06-03 Mitsubishi Denki Kabushiki Kaisha Method for speech coding, method for speech decoding and their apparatuses
US7742917B2 (en) 1997-12-24 2010-06-22 Mitsubishi Denki Kabushiki Kaisha Method and apparatus for speech encoding by evaluating a noise level based on pitch information
US7747432B2 (en) 1997-12-24 2010-06-29 Mitsubishi Denki Kabushiki Kaisha Method and apparatus for speech decoding by evaluating a noise level based on gain information
US7747441B2 (en) 1997-12-24 2010-06-29 Mitsubishi Denki Kabushiki Kaisha Method and apparatus for speech decoding based on a parameter of the adaptive code vector
US7747433B2 (en) 1997-12-24 2010-06-29 Mitsubishi Denki Kabushiki Kaisha Method and apparatus for speech encoding by evaluating a noise level based on gain information
US7937267B2 (en) 1997-12-24 2011-05-03 Mitsubishi Denki Kabushiki Kaisha Method and apparatus for decoding
US8190428B2 (en) 1997-12-24 2012-05-29 Research In Motion Limited Method for speech coding, method for speech decoding and their apparatuses
US8352255B2 (en) 1997-12-24 2013-01-08 Research In Motion Limited Method for speech coding, method for speech decoding and their apparatuses
US8447593B2 (en) 1997-12-24 2013-05-21 Research In Motion Limited Method for speech coding, method for speech decoding and their apparatuses
US8688439B2 (en) 1997-12-24 2014-04-01 Blackberry Limited Method for speech coding, method for speech decoding and their apparatuses
US9263025B2 (en) 1997-12-24 2016-02-16 Blackberry Limited Method for speech coding, method for speech decoding and their apparatuses

Also Published As

Publication number Publication date
EP0405548A3 (en) 1991-08-28
CA2019801A1 (en) 1990-12-28
EP0405548A2 (de) 1991-01-02
CA2019801C (en) 1994-05-31
DE69014156D1 (de) 1994-12-22
DE69014156T2 (de) 1995-05-11

Similar Documents

Publication Publication Date Title
US5261027A (en) Code excited linear prediction speech coding system
US6393392B1 (en) Multi-channel signal encoding and decoding
EP0890943B1 (de) Einrichtung zur Sprachkodierung und -dekodierung
US5323486A (en) Speech coding system having codebook storing differential vectors between each two adjoining code vectors
JP3346765B2 (ja) 音声復号化方法及び音声復号化装置
EP0476614B1 (de) Sprachkodierungs- und Dekodierungssystem
JP3094908B2 (ja) 音声符号化装置
EP0751494A1 (de) System zur kodierung von tonsignalen
CA2091754C (en) Method of, and system for, coding analogue signals
EP0704836B1 (de) Vorrichtung zur Vektorquantisierung
EP0462559B1 (de) System zur Sprachcodierung und -decodierung
CA2440820A1 (en) Sound encoding apparatus and method, and sound decoding apparatus and method
EP0405548B1 (de) Verfahren und Einrichtung zur Sprachcodierung
JPH02287399A (ja) ベクトル量子化制御方式
US5884252A (en) Method of and apparatus for coding speech signal
EP0484339B1 (de) Digitaler sprachcodierer mit verbesserter sprachqualität unter anwendung einer vektoranregungsquelle
EP0729133B1 (de) Bestimmung der Verstärkung für die Signalperiode bei der Kodierung eines Sprachsignales
JP3299099B2 (ja) 音声符号化装置
JP3088204B2 (ja) コード励振線形予測符号化装置及び復号化装置
JP3010654B2 (ja) 圧縮符号化装置及び方法
JP3010655B2 (ja) 圧縮符号化装置及び方法、並びに復号装置及び方法
JP3092654B2 (ja) 信号符号化装置
JP3192051B2 (ja) 音声符号化装置
JPH03101800A (ja) 音声符合化方式
JPH05265492A (ja) コード励振線形予測符号化器及び復号化器

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): DE FR GB

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): DE FR GB

17P Request for examination filed

Effective date: 19910926

17Q First examination report despatched

Effective date: 19940208

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB

REF Corresponds to:

Ref document number: 69014156

Country of ref document: DE

Date of ref document: 19941222

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20090626

Year of fee payment: 20

Ref country code: GB

Payment date: 20090624

Year of fee payment: 20

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20100627

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20100627

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20100628

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20090611

Year of fee payment: 20