MXPA96005179A

MXPA96005179A - A system and method of voice processing with multiple pulse analysis

Info

Publication number: MXPA96005179A
Application number: MXPA/A/1996/005179A
Authority: MX
Inventors: Bialik Leon; Flomen Felix
Original assignee: Audiocodes Ltd
Priority date: 1994-04-29
Filing date: 1996-10-28
Publication date: 1998-10-30

Abstract

The present invention relates to a voice processing system comprising: a short-term analyzer connected to an input line and an output line, wherein, in response to a speech signal on said input line, said short-term analyzer generates the short-term characteristics of said input speech signal; a target vector generator for generating a target vector from at least said input speech signal and, optionally, said short-term characteristics; and a multiple pulse analyzer connected to an output line of said target vector generator, wherein said multiple pulse analyzer generates a plurality of sequences of equal-amplitude, variable-sign, variably spaced pulses, each of the sequences having a different amplitude value, the pulses within each sequence having equal amplitudes but variable signs, the multiple pulse analyzer outputting a signal corresponding to the sequence of equal-amplitude, variable-sign, variably spaced pulses which, in accordance with a maximum likelihood criterion, most closely represents the target vector.

Description

SYSTEM AND METHOD OF VOICE PROCESSING WITH MULTIPLE PULSE ANALYSIS

FIELD OF THE INVENTION

The present invention relates generally to speech processing systems, and in particular to multiple pulse analysis systems.

BACKGROUND OF THE INVENTION

Speech signal processing is well known in the art and is often used to compress an input speech signal, either for storage or for transmission. It typically involves dividing the input speech signal into frames and then analyzing each frame to determine its components, which are then stored or transmitted.

Typically, the frame analyzer determines the short-term and long-term characteristics of the speech signal. It can also determine one or both of the short-term and long-term components, or "contributions", of the speech signal. For example, linear prediction coefficient (LPC) analysis provides the short-term characteristics and contribution, and pitch analysis and prediction provide the long-term characteristics and contribution.

Typically, the long-term contribution, the short-term contribution, both or neither are subtracted from the input frame, leaving a target vector whose shape has to be characterized. This characterization can be performed with multiple pulse analysis (MPA), which is described in detail in chapter 6.4.2 of the book Digital Speech Processing, Synthesis and Recognition by Sadaoki Furui, Marcel Dekker, Inc., New York, NY, 1989. The book is incorporated herein by reference.

In MPA, the target vector, which is formed of a multiplicity of samples, is modeled by a plurality of equal-amplitude pulses (or spikes) of variable location and variable sign (positive and negative). To select each pulse, a pulse is placed at each sample location and the effect of the pulse, defined by passing the pulse through a filter defined by the LPC coefficients, is determined.
The pulse that most closely matches the target vector is selected and its effect is removed from the target vector, thus generating a new target vector. The process continues until a predetermined number of pulses has been found. For storage or transmission purposes, the result of the MPA analysis is a set of pulse locations and a quantized gain value.

The gain is typically determined from the first pulse that is found and is then used for the remaining pulses. Unfortunately, the gain of the first pulse is not always indicative of the overall gain of the target vector, and the result is not always very accurate.

SUMMARY OF THE PRESENT INVENTION

It is therefore an object of the present invention to provide an improved speech processing system.

In one embodiment of the present invention, the system includes a short-term analyzer, a target vector generator, and a maximum likelihood quantization (MLQ) multi-pulse analysis unit. The short-term analyzer determines the short-term characteristics of an input speech signal. The target vector generator generates a target vector from at least the input signal. The MLQ multi-pulse analysis unit typically determines an initial gain level for the multi-pulse sequence and performs single-gain MPA several times, each time with a different gain level. The gain levels lie in a range above and below the initial gain level. The resulting pulses can be positive or negative.

As in other maximum likelihood applications, the quality of the result is measured, in this case by minimizing the energy of an error vector defined as the difference between the target vector and an estimated vector produced by filtering the single-gain pulse sequence through a perceptual weighting filter. The pulse sequence that minimizes the energy of the error vector, together with its corresponding gain level (or gain level index), is then provided as the output signal of the MLQ multi-pulse analysis unit.
In an alternative embodiment, the system includes a long-term prediction analyzer and replaces the MLQ multi-pulse analysis unit with a pulse train multi-pulse analysis unit. In this embodiment, the pulse train multi-pulse analysis unit uses a pitch distance from the long-term analyzer to create a train of equal-amplitude, same-sign pulses, each a pitch distance from the previous pulse of the train. The pulse train multi-pulse analysis unit then outputs a signal representing the sequence of pulse trains, which may include positive and negative trains, that best represents the target vector.

In a further alternative embodiment, the system includes an MLQ multi-pulse analysis unit that combines the functions of the two previous embodiments. In other words, a range of gains is provided and, for each gain, a sequence of pulse trains is found. The sequence that most closely matches the target vector is provided as the output signal.

In a still further embodiment, the outputs of the maximum likelihood and pulse train multi-pulse analysis units are compared, and the sequence that most closely matches the target vector is provided as the output signal.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be more fully understood and appreciated from the following detailed description taken in conjunction with the drawings, in which:

Fig. 1 is a block diagram illustration of a first embodiment of the speech processing system of the present invention;
Fig. 2 is a flow chart illustration of the operation of a multi-pulse maximum likelihood quantization (MP-MLQ) block of Fig. 1;
Figs. 3A and 3B are graphical illustrations useful in understanding the operations of Fig. 2;
Figs. 4A and 4B are graphical illustrations of pulse trains and of multiple pulse analysis using pulse trains, respectively;
Fig. 5 is a block diagram illustration of a second embodiment of the speech processing system of the present invention, using pulse trains;
Fig. 6 is a flow chart illustration of the operations of the pulse train multi-pulse analysis unit of Fig. 5; and
Fig. 7 is a block diagram illustration of a third embodiment, which compares the outputs of the systems of Figs. 1 and 5.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Reference is now made to Figs. 1, 2, 3A and 3B, which illustrate a first embodiment of the present invention. The speech processing system of the present invention includes at least a short-term prediction analyzer 10, a long-term prediction analyzer 12, a target vector generator 13 and a multi-pulse maximum likelihood quantization (MP-MLQ) unit 14.

The short-term prediction analyzer 10 receives, on input line 16, an input frame of a speech signal formed of a multiplicity of digitized speech samples. Typically, there are 240 speech samples per frame, and the frame is often separated into a plurality of subframes, typically four subframes of 60 samples each. The input frame can be a frame of an original speech signal or a processed version thereof.

From the input frame received on input line 16, the short-term prediction analyzer 10 produces, on output line 17, the short-term characteristics of the input frame. In one embodiment, the analyzer 10 performs linear prediction analysis to produce linear prediction coefficients (LPCs) that characterize the input frame.

For the purposes of the present invention, the analyzer 10 can perform any type of LPC analysis. For example, the LPC analysis can be carried out as described in chapter 6.4.2 of the book Digital Speech Processing, Synthesis and Recognition, as follows: a Hamming window is applied to a window of 180 samples centered on a subframe, and tenth-order LPC coefficients are generated using the Durbin recursion method. The process is repeated for each subframe.
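As an illustration of the LPC analysis just described, the following Python sketch applies a Hamming window to an analysis block and runs the Durbin recursion on its autocorrelation. The function names (`hamming`, `levinson_durbin`, `lpc`), the predictor sign convention s[n] ≈ Σ a_i s[n-i], and the early exit when the prediction error reaches zero are assumptions made for this example; they are not taken from the specification.

```python
import math


def hamming(n_samples):
    """Hamming window of n_samples points (n_samples > 1)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def levinson_durbin(r, order):
    """Durbin recursion on autocorrelation values r[0..order].

    Returns (a, err): a holds the predictor coefficients a_1..a_order
    and err is the remaining prediction error energy.
    """
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:               # silent or degenerate block (assumption)
            break
        acc = r[m] - sum(a[i] * r[m - i] for i in range(1, m))
        k = acc / err                # reflection coefficient
        prev = a[:]
        a[m] = k
        for i in range(1, m):
            a[i] = prev[i] - k * prev[m - i]
        err *= 1.0 - k * k
    return a[1:], err


def lpc(block, order=10):
    """Window a block (for example 180 samples) and return (coefficients, error)."""
    window = hamming(len(block))
    x = [s * w for s, w in zip(block, window)]
    r = [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
         for lag in range(order + 1)]
    return levinson_durbin(r, order)
```

For a 180-sample block centered on a subframe, `lpc(block)` returns ten coefficients together with the residual energy; the call would be repeated once per subframe.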
The long-term prediction analyzer 12 can be any type of long-term predictor and operates on the input frame received on line 16. The long-term analyzer 12 analyzes a plurality of subframes of the input frame to determine the pitch value of the speech within each subframe, where the pitch value is defined as the number of samples after which the speech signal approximately repeats itself. Pitch values typically vary between 20 and 146, where 20 indicates a high-pitched voice and 146 a low-pitched voice.

For example, for each pair of subframes, a pitch estimate can be determined by maximizing a normalized cross-correlation function of the subframe samples s(n), as follows:

$$C_i = \frac{\sum_{k=0}^{119} s(k)\, s(k-i)}{\sum_{k=0}^{119} s(k-i)\, s(k-i)}, \qquad 20 \le i \le 146 \tag{1}$$

For this example, the long-term analyzer 12 selects the index i that maximizes the cross-correlation C_i as the pitch value of the two subframes.

Once the long-term analyzer 12 has determined the pitch value, the pitch value is used to determine the long-term prediction information for the subframe, which is provided on output line 18.

The target vector generator 13 receives the output signals of the long-term analyzer 12 and the short-term analyzer 10, as well as the input frame on input line 16 via a delay 19. In response to these signals, the target vector generator 13 generates a target vector from at least one subframe of the input frame. The long-term and short-term information can be used, if desired, or ignored. The delay 19 ensures that the input frame arriving at the target vector generator 13 corresponds to the outputs of the analyzers 10 and 12.
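The pitch search of equation 1 can be sketched in Python as follows. This is a minimal illustration only: the function name `estimate_pitch`, the `start` argument locating sample k = 0 inside a longer buffer, and the choice to keep the smallest lag when several lags tie are assumptions of the example, not details given in the specification.

```python
def estimate_pitch(s, start, length=120, min_lag=20, max_lag=146):
    """Return the lag i in [min_lag, max_lag] that maximizes C_i of equation 1.

    s      -- list of speech samples with at least max_lag samples of history
    start  -- index in s of sample k = 0 of the two-subframe block
    length -- number of samples in the block (two subframes of 60 samples)
    """
    best_lag, best_c = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        num = sum(s[start + k] * s[start + k - lag] for k in range(length))
        den = sum(s[start + k - lag] ** 2 for k in range(length))
        c = num / den if den > 0.0 else 0.0   # guard for silent history (assumption)
        if c > best_c:                         # strict test keeps the smallest tied lag
            best_lag, best_c = lag, c
    return best_lag
```

For a buffer that repeats every 50 samples, the sketch returns 50 rather than the equally scoring lag of 100, because ties are resolved in favor of the first lag tested.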

An output line 26 of the target vector generator 13, connected to the MP-MLQ unit 14, carries the target vector. The MP-MLQ unit 14 is also typically connected to output line 17, which carries the short-term characteristics produced by the analyzer 10. It will be appreciated that, without any loss of generality, the target vector supplied to the MP-MLQ unit 14 can be produced in any other way.

In accordance with the first preferred embodiment of the present invention, the MP-MLQ unit 14 includes an initial pulse location determiner 20, a gain range determiner 22, a gain level selector 24, a pulse sequence determiner 25, a target vector comparator 28 and an optional encoder 30. The specific operations performed by elements 20 to 30 are illustrated in Fig. 2 and are described in detail hereinbelow. The following is a general description of the operation of unit 14.

The initial pulse location determiner 20 receives the output signals of the target vector generator 13 and the short-term analyzer 10 along output lines 26 and 17, respectively. It determines the location of the first pulse in accordance with multiple pulse analysis techniques.

The gain range determiner 22 receives the first pulse output by unit 20 and determines both the amplitude of the first pulse and a range of quantized gain levels around the absolute value of the determined amplitude. The step size used to pass through the range of quantized gain levels, called MLQ_STEPS, is typically 3 gain levels. MLQ_STEPS is not determined by the MP-MLQ unit 14; it is set in advance.

The gain level selector 24 receives the gain range produced by the gain range determiner 22 and steps through the gain values of the range. Its output, on output line 32, is the current gain level for which a single-gain pulse sequence is to be determined.

The pulse sequence determiner 25 receives the target vector, on line 26, and the current gain level, on line 32, and determines from them, using multiple pulse analysis techniques as described hereinbelow, a sequence of pulses that matches the target vector. The pulse sequence is a series of positive and negative pulses, all of which have the current gain level.

The target vector comparator 28 receives the pulse sequence output by the determiner 25, on output line 34, and the target vector, on output line 26. The comparator 28 determines the quality of the match using a maximum likelihood type criterion. Because there is a range of gain levels, the comparator 28 returns control to the gain level selector 24 to select the next gain level. This return of control is indicated by arrow 36.

For each gain value, the comparator 28 determines the quality of the match and saves the match (gain index and pulse sequence) only if it yields a lower value of the criterion than the previous matches. Once the gain selector 24 has stepped through all of the gain values, the gain index and pulse sequence stored in the comparator 28 are the closest match to the target vector.
The comparator 28 then sends the stored pulse sequence and gain index along output line 38 to the optional encoder 30. It will be appreciated that, by determining a pulse sequence for each of a few gain levels, the MP-MLQ unit 14 can select the one that most closely matches the target vector. The optional encoder 30 encodes the output pulse sequence and the gain index for storage or transmission.

The specific operations of the MP-MLQ unit 14 are shown in Fig. 2. In the initialization step 40, unit 14 generates the following signals:

a) the impulse response h[n] for the frame, derived from the short-term characteristics a_i and defined as:

$$h[n] = \sum_{i=1}^{P} a_i\, h[n-i] + \delta[n], \qquad 0 \le n \le N-1 \tag{2}$$

$$h[n] = 0, \qquad n = -1, \ldots, -P$$

where P is the number of short-term characteristics and N is the number of speech samples in the subframe;

b) the autocorrelation r_hh[l] of the impulse response, for each sample position l, as follows:

$$r_{hh}[l] = \sum_{n=l}^{N-1} h[n]\, h[n-l], \qquad 0 \le l \le N-1 \tag{3}$$

and c) the cross-correlation r_th[l] between the impulse response h[n] and the target vector t[n], for each sample position l, as follows:

$$r_{th}[l] = \sum_{n=l}^{N-1} t[n]\, h[n-l], \qquad 0 \le l \le N-1 \tag{4}$$

It will be appreciated that the impulse response is a function of the short-term characteristics a_i provided along line 17 by the analyzer 10. The impulse response generated in the initialization step 40 corresponds to the aforementioned Durbin LPC analysis.

The MP-MLQ unit 14 uses a local criterion LC_k,j[l] to determine a quantitative value for each sample position l, each pulse k and each gain level j. As will be seen hereinbelow, the value of the local criterion depends on the value of k (that is, on the number of pulses already determined).
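The three signals of the initialization step 40 (equations 2 to 4) can be sketched in Python as below. The function names are illustrative, and the coefficient list `a` is assumed to hold a_1 to a_P in order; neither is prescribed by the specification.

```python
def impulse_response(a, n_samples):
    """Equation 2: h[n] = sum_i a_i * h[n-i] + delta[n], with h[n] = 0 for n < 0."""
    h = []
    for n in range(n_samples):
        acc = 1.0 if n == 0 else 0.0          # delta[n]
        for i, a_i in enumerate(a, start=1):
            if n - i >= 0:
                acc += a_i * h[n - i]
        h.append(acc)
    return h


def autocorrelation(h):
    """Equation 3: r_hh[l] = sum over n = l..N-1 of h[n] * h[n-l]."""
    n_samples = len(h)
    return [sum(h[n] * h[n - lag] for n in range(lag, n_samples))
            for lag in range(n_samples)]


def cross_correlation(t, h):
    """Equation 4: r_th[l] = sum over n = l..N-1 of t[n] * h[n-l]."""
    n_samples = len(t)
    return [sum(t[n] * h[n - lag] for n in range(lag, n_samples))
            for lag in range(n_samples)]
```

With a single coefficient a_1 = 0.5 and N = 4, the impulse response is 1, 0.5, 0.25, 0.125; a target equal to that response delayed by one sample produces its largest cross-correlation at position 1.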
In step 42, the local criterion LC_0,j[l] for the determination of the first pulse is initialized to the cross-correlation function r_th[l], as follows:

$$LC_0[l] = LC_{0,j}[l] = r_{th}[l], \qquad 0 \le l \le N-1,\quad j_{min} \le j \le j_{max} \tag{5}$$

A local maximum for the local criterion is also initialized, to some negative value, and the position index l is initialized to 0.

In steps 44 to 50, the position l of the first pulse (k = 1) is determined. To do so, the absolute value of the local criterion LC_0,j[l] is compared with the local maximum (step 44). If it is greater, position l is stored and the local maximum is set to the absolute value of the local criterion LC_0,j[l] (step 46). The position index l is then increased by 1 (step 48). The operation is repeated until all positions l have been checked. The sample position that is stored after all positions have been checked is the selected sample position l_opt.

Steps 40 to 50 are performed by the pulse location determiner 20. Step 52 is performed by the gain range determiner 22. In step 52, the maximum amplitude A_max at the position that produced the largest local criterion LC_0,j[l] is generated as follows:

$$A_{max} = A_{max,j} = \frac{\left| LC_{0,j}[l_{opt}] \right|}{r_{hh}[0]}, \qquad j_{min} \le j \le j_{max} \tag{6}$$

where l_opt is the position of the first pulse. The maximum value A_max is then quantized using a predetermined set of gain levels. For example, if the expected amplitude levels are in the range of 0.1 to 2.0 units, the gain levels can be spaced every 0.1 units. Therefore, if A_max is 0.756, it is quantized to 0.8.

Steps 54 to 58 are performed by the gain selector 24. In step 54, the gain selector 24 determines the gain index j associated with the quantized gain level, as well as a range of gain indices around the gain index j. The range of gain levels can be of any size, depending on the predetermined MLQ_STEPS value. In step 54, the gain selector 24 also sets the gain index to the minimum of the range.

For the previous example, the gain level 0.1 could have index 1 and MLQ_STEPS could be 3. The determined gain index is therefore 8 and the range runs from index 5 to index 11. Step 54 also initializes a global minimum to a very large value, for example 10^13.
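The first-pulse search, the amplitude of equation 6 and the worked gain example (0.756 quantized to 0.8, index 8, range 5 to 11) can be sketched in Python as follows. The function names, the rounding to the nearest level and the clamping of the index range to the available levels are assumptions made for illustration; the specification only gives the example values.

```python
import math


def first_pulse(r_th, r_hh):
    """Steps 42-52: return (l_opt, A_max) from the initial local criterion."""
    best_pos, local_max = 0, -1.0             # local maximum starts negative
    for pos, value in enumerate(r_th):        # LC_0[l] = r_th[l] (equation 5)
        if abs(value) > local_max:
            best_pos, local_max = pos, abs(value)
    return best_pos, local_max / r_hh[0]      # equation 6


def quantize_gain(a_max, step=0.1, n_levels=20):
    """Return the 1-based index of the gain level nearest to a_max."""
    index = int(math.floor(a_max / step + 0.5))
    return min(max(index, 1), n_levels)       # clamping is an assumption


def gain_index_range(index, mlq_steps=3, n_levels=20):
    """Return the gain indices within mlq_steps of index."""
    low = max(index - mlq_steps, 1)
    high = min(index + mlq_steps, n_levels)
    return list(range(low, high + 1))
```

Calling `quantize_gain(0.756)` gives index 8, and `gain_index_range(8)` gives the indices 5 through 11, matching the example in the text.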

In the present invention, for each gain index, the first pulse is at the location determined by the pulse location determiner 20 (in steps 44 to 50). The remaining pulses can be anywhere else within the subframe and can have positive or negative gain values.

In step 56, the gain selector 24 stores the position of the first pulse and its amplitude. In step 58, the local criterion LC_k,j[l] is initialized, for the present pulse index k and gain index j, typically in accordance with equation 5.

The pulse sequence determiner 25 performs steps 60 to 74. In step 60, the determiner 25 resets the local maximum to a negative value, as before, and sets the position index l to 0. In step 62, the determiner 25 updates the local criterion with the previous pulse, as follows:

$$LC_{k,j}[l] = LC_{k-1,j}[l] - A_{k-1,j}\; r_{hh}\!\left[\,l - l_{opt,k-1,j}\,\right] \tag{7}$$

where j is the gain index, k is the pulse index and l is the position index.

In the loop of steps 64 to 70, the pulse sequence determiner 25 determines the location of a pulse in a manner similar to that of steps 44 to 50, which will therefore not be described further herein. In step 72, the determiner 25 stores the selected pulse and, in step 74, it updates the pulse index k. Steps 62 to 74 are repeated for each pulse of the sequence, and the result is the pulse sequence output by the pulse sequence determiner 25. It is noted that step 62 updates the local criterion for each pulse that has been found.

Figs. 3A and 3B illustrate two different examples of pulse sequences output by the pulse sequence determiner 25. The sequence of Fig. 3A has a gain index of 7 and the sequence of Fig. 3B has a gain index of 8. Both sequences have the same first pulse position 10, but the remaining pulses are at other positions. It is noted that the pulses can be positive or negative.

In step 76, the target vector comparator 28 determines the value of a global criterion GC_j for each gain level j.
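The search for one single-gain pulse sequence (steps 56 to 74, using the update of equation 7) might look like the Python sketch below. Several points are assumptions of the example rather than statements of the specification: the sign of each pulse is taken from the sign of the local criterion at the chosen position, the lag passed to r_hh is taken in absolute value on the grounds that an autocorrelation is symmetric, and positions already used are not excluded from later searches.

```python
def find_pulse_sequence(r_th, r_hh, first_pos, gain, n_pulses):
    """Return [(position, signed amplitude), ...] for one gain level."""
    n_samples = len(r_th)
    lc = list(r_th)                                   # equation 5
    sign = 1.0 if lc[first_pos] >= 0.0 else -1.0      # sign rule is an assumption
    pulses = [(first_pos, sign * gain)]
    for _ in range(1, n_pulses):
        prev_pos, prev_amp = pulses[-1]
        # Equation 7: remove the previous pulse's effect from the criterion.
        lc = [lc[pos] - prev_amp * r_hh[abs(pos - prev_pos)]
              for pos in range(n_samples)]
        best_pos, local_max = 0, -1.0                 # local maximum starts negative
        for pos in range(n_samples):
            if abs(lc[pos]) > local_max:
                best_pos, local_max = pos, abs(lc[pos])
        sign = 1.0 if lc[best_pos] >= 0.0 else -1.0
        pulses.append((best_pos, sign * gain))
    return pulses
```

Running the function once per gain level in the range yields the candidate sequences, all of which share the same first position, as in Figs. 3A and 3B.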
The global criterion GC_j can be any suitable criterion and is typically a criterion of the maximum likelihood type. For example, the global criterion can measure the energy of an error vector defined as the difference between the target vector and an estimated vector produced by filtering the single-gain pulse sequence through a perceptual weighting filter, in this case defined by the short-term characteristics. For such a criterion, the target vector comparator 28 includes a perceptual weighting filter. It will be appreciated that the pulse sequence does not, per se, match the target vector; rather, the pulse sequence represents a function that matches the target vector.

As given by equations 8a to 8e hereinbelow, the global criterion GC_j is composed of two elements, p_j and d_j, both of which are functions of a signal x_j[n], which is the series of pulses for gain level j filtered by the short-term impulse response h[n]. p_j is the cross-correlation between the target vector t[n] and x_j[n], and d_j is the energy of x_j[n].
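A minimal Python sketch of the global criterion of equations 8a to 8e, and of keeping the candidate with the lowest value, is given below. The function names and the representation of the candidates as a mapping from gain index to a list of (position, signed amplitude) pairs are assumptions of the example; the initial global minimum of 1e13 follows the example value mentioned for step 54.

```python
def filtered_sequence(pulses, h, n_samples):
    """x_j[n]: the pulse sequence filtered by the impulse response h[n]."""
    x = [0.0] * n_samples
    for pos, amp in pulses:
        for n in range(pos, n_samples):
            x[n] += amp * h[n - pos]
    return x


def global_criterion(t, x):
    """GC_j = -2 * p_j + d_j."""
    p = sum(t_n * x_n for t_n, x_n in zip(t, x))   # cross-correlation p_j
    d = sum(x_n * x_n for x_n in x)                # energy d_j
    return -2.0 * p + d


def select_best(t, h, candidates):
    """Return (gain index, pulses, criterion) with the lowest global criterion."""
    best = None
    global_min = 1e13                              # very large initial value
    for gain_index, pulses in candidates.items():
        gc = global_criterion(t, filtered_sequence(pulses, h, len(t)))
        if gc < global_min:
            global_min = gc
            best = (gain_index, pulses, gc)
    return best
```

Because GC_j differs from the error energy only by the energy of the target vector, which is the same for every gain level, the candidate with the lowest GC_j is also the one with the smallest error energy.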

$$GC_j = -2\,p_j + d_j \tag{8a}$$

$$p_j = \sum_{n=0}^{N-1} t[n]\, x_j[n] \tag{8b}$$

$$d_j = \sum_{n=0}^{N-1} x_j[n]\, x_j[n] \tag{8c}$$

$$x_j[n] = \sum_{i=0}^{n} v_j[i]\, h[n-i], \qquad 0 \le n \le N-1 \tag{8d}$$

$$v_j[n] = \begin{cases} A_{k,j}, & n = l_{opt,k,j},\ 0 \le k \le K-1 \\ 0, & \text{otherwise} \end{cases} \qquad 0 \le n \le N-1 \tag{8e}$$

In step 78, the global criterion GC_j for the current gain index j is compared with the current global minimum. If it is less than the current global minimum, the target vector comparator 28 stores (step 80) the gain index and its associated pulse sequence. In step 82, the gain level selector 24 updates the gain index and, in step 84, it checks whether or not pulse sequences have been determined for all gain levels. If so, the stored pulse sequence and gain index are those that most closely match the target vector in accordance with the global criterion GC_j.

In step 86, the optional encoder 30 encodes the pulse sequence and the gain index as output signals, for transmission or storage, in accordance with any coding method. If desired, the target vector can be reconstructed using x_jopt[n], where jopt is the gain index resulting from step 84.

It will be appreciated that the MP-MLQ unit 14 of the present invention provides, as output signals, at least the pulse sequence and the selected gain level.

Reference is now made to Figs. 4A, 4B, 5 and 6, which illustrate an alternative embodiment of the present invention utilizing pulse trains. A pulse train 83 is illustrated in Fig. 4A. It comprises a series of pulses 81 separated by a distance Q, which is the pitch. In the system shown in Fig. 5, a sequence of pulse trains that most closely matches a target vector is found. Fig. 4B illustrates an example of a sequence of three pulse trains 83a, 83b and 83c that might be found. Each pulse train 83 starts at a different sample position. The pulse train 83a is the first and comprises four pulses.
The pulse train 83b starts at a later position and comprises three pulses, and the pulse train 83c, which starts at a much later position, comprises only two pulses.

The system of Fig. 5 is similar to that shown in Fig. 1, the only differences being that: a) the pulse location determiner 20 and the pulse sequence determiner 25 of Fig. 1 are replaced by a pulse train determiner 88 and a pulse train sequence determiner 89; b) the target vector comparator, labeled 90, operates on pulse train sequences rather than pulse sequences; and c) the determiners 88 and 89 receive the pitch value Q along output line 18. In addition, output lines 34 and 38 are replaced by output lines 92 and 94, which carry signals representing pulse train sequences instead of pulse sequences.

The pulse train determiner 88 operates in a manner similar to the pulse location determiner 20, except that the determiner 88 uses a pulse train impulse response h_T[n] in place of the impulse response h[n]. h_T[n] is defined as:

$$h_T[n] = \sum_{k} h[n - kQ], \qquad 0 \le n \le N-1,\quad 0 \le k \le (N-1)/Q \tag{9}$$

where Q is the pitch value. As can be seen, pulse trains that start at later positions typically have fewer pulses.

The impulse response autocorrelation of equation 3 becomes:

$$r_{hh}[l] = \sum_{n=l}^{N-1} h_T[n]\, h_T[n-l], \qquad 0 \le l \le N-1 \tag{10}$$

and the cross-correlation r_th[l] between the impulse response h_T[n] and the target vector t[n], for each sample position l, becomes:

$$r_{th}[l] = \sum_{n=l}^{N-1} t[n]\, h_T[n-l], \qquad 0 \le l \le N-1 \tag{11}$$

The pulse train sequence determiner 89 operates in a manner similar to the pulse sequence determiner 25, except that the determiner 89 generates pulse train sequences. The target vector comparator 90 operates in a manner similar to the target vector comparator 28; however, the comparator 90 uses the pulse train impulse response h_T[n] instead of h[n].
Thus, equation 8d becomes:

$$x_j[n] = \sum_{i=0}^{n} v_j[i]\, h_T[n-i], \qquad 0 \le n \le N-1 \tag{12}$$

The specific operations of the pulse train multi-pulse analysis unit 86 are shown in Fig. 6. The steps are equivalent to those shown in Fig. 2; however, the equations operate on pulse trains instead of on individual pulses. Thus, equation 9 defines a pulse train impulse response h_T[n] that has pulses every Q samples. Pulse trains that start at later positions typically have fewer pulses.
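The pulse train impulse response of equation 9 can be sketched in Python as follows. The function name `pulse_train_response` is illustrative, and the sketch assumes a pitch value of at least 1 sample.

```python
def pulse_train_response(h, pitch):
    """Equation 9: h_T[n] = sum over k of h[n - k*Q], for all k with n - k*Q >= 0."""
    n_samples = len(h)
    h_t = [0.0] * n_samples
    for n in range(n_samples):
        k = 0
        while k * pitch <= n:          # keeps n - k*Q inside the subframe
            h_t[n] += h[n - k * pitch]
            k += 1
    return h_t
```

Substituting the returned h_T[n] for h[n] in the correlation and criterion computations gives equations 10 to 12. Because h_T[n - l] is cut off at the end of the subframe, a train placed at a later position l contains fewer pulses, as noted in the text.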

The remaining equations are similar, except that they operate on the impulse response h_T[n].

If desired, the gain range determined by the gain range determiner 22 may contain only one gain index. In this embodiment, the pulse train multi-pulse analysis unit 86 determines the pulse train sequence having the gain level of the first pulse train. In this embodiment, the target vector comparator 90 does not operate, and the operations of the gain level selector 24 and the pulse train sequence determiner 89 are not repeated.

It will be further appreciated that the outputs of the target vector comparators 28 and 90 can be compared. This is illustrated in Fig. 7, to which reference is now made. The output signals of the comparators 28 and 90, representing the sequences and the global criteria, are provided along lines 38 and 94 to a comparator 100. The comparator 100 compares the global criteria GC_jopt from the comparators 28 and 90 and selects the lowest one. An output signal representing the resulting sequence, of pulses or of pulse trains, is provided along output line 102.

It will be appreciated that the systems of Figs. 1, 5 and 7 can be implemented on a digital signal processing chip or in software. In one embodiment, the software was written in the C++ programming language; in another, in Assembly language.

Those skilled in the art will appreciate that the present invention is not limited to what has been particularly described herein. Rather, the scope of the present invention is defined only by the claims that follow.

Claims

Having described the foregoing invention, the following is claimed as property:
1. A voice processing system comprising: a short-term analyzer connected to an input line and an output line, wherein, in response to a voice signal on the input line, the short-term analyzer generates the short-term characteristics of the input speech signal; a target vector generator for generating a target vector from at least the input speech signal and, optionally, the short-term characteristics; and a multiple pulse analyzer connected to an output line of the target vector generator, wherein the multiple pulse analyzer generates a plurality of sequences of equal-amplitude, variable-sign, variably spaced pulses, each of the sequences having a different amplitude value, the pulses within each sequence having equal amplitudes but variable signs, the multiple pulse analyzer outputting a signal corresponding to the sequence of equal-amplitude, variable-sign, variably spaced pulses which, in accordance with a maximum likelihood criterion, most accurately represents the target vector.
2. A voice processing system incorporating a short-term analyzer for generating short-term characteristics using linear prediction coefficient analysis of an input speech signal, the system comprising: a target vector generator for generating a target vector from at least the input speech signal and, optionally, the short-term characteristics; an initial pulse location determiner for determining the location of an initial pulse in accordance with multiple pulse analysis techniques, based on the target vector and the short-term characteristics; an amplitude range determiner for determining both an amplitude of the initial pulse and a range of quantized amplitude levels grouped around an absolute value of the amplitude; an amplitude level selector for stepping through the range of quantized amplitude levels in accordance with a predetermined step size, the amplitude level selector outputting a selected quantized amplitude at each step; a pulse sequence determiner for generating, based on the selected quantized amplitude, a sequence of equal-amplitude, variable-sign, variably spaced pulses corresponding to the target vector; and a target vector comparator for determining an error vector that corresponds to the quality of the match between the sequence of equal-amplitude, variable-sign, variably spaced pulses and the target vector, for determining the error vector for each of the selected amplitudes, and for outputting the sequence of equal-amplitude, variable-sign, variably spaced pulses corresponding to a minimum error vector.
3. The system according to claim 2, wherein the initial pulse of each of the sequences of equal-amplitude, variable-sign, variably spaced pulses is located at the same sample position.
4. The system according to claim 2, wherein the target vector comparator includes a global criterion determiner, the global criterion determiner including a perceptual weighting filter for filtering the sequence of equal-amplitude, variable-sign, variably spaced pulses and a determiner for determining the amount of energy in the error vector for each selected quantized amplitude, the error vector being defined as the difference between the target vector and the filter output, the perceptual weighting filter having characteristics that correspond to the short-term characteristics.
5. A voice processing system incorporating a short-term analyzer for generating short-term characteristics using linear prediction coefficient analysis of an input speech signal, the system comprising: a target vector generator for generating a target vector from at least the input speech signal and, optionally, the short-term and long-term characteristics; an initial pulse location determiner for determining the location of an initial pulse of an initial pulse train in accordance with multiple pulse analysis techniques, based on the target vector, the short-term characteristics and the pitch value; and a pulse train sequence determiner for generating a plurality of variable-sign trains of equal-amplitude, uniformly spaced pulses corresponding to the target vector, the trains having a pulse spacing corresponding to the pitch value, the pulses within each train having the same sign, and all the pulses of the trains having the same amplitude level.
6. A voice processing system comprising: a long-term analyzer connected to an input line and an output line, wherein, in response to an input speech signal on the input line, the long-term analyzer generates long-term characteristics including at least a pitch value of the input speech signal; a short-term analyzer connected to an input line and an output line, wherein, in response to an input speech signal on the input line, the short-term analyzer generates short-term characteristics of the input speech signal; a target vector generator for generating a target vector from at least the input speech signal and, optionally, the short-term and long-term characteristics; and a pulse train multiple pulse analyzer, connected to an output line of the target vector generator, for generating a plurality of sequences of variable-sign pulse trains of equal-amplitude, uniformly spaced pulses, the pulses within each train having the same sign, and each of the pulse train sequences having a different amplitude value, the pulse train multiple pulse analyzer outputting a signal corresponding to the plurality of trains of equal-amplitude, uniformly spaced pulses which, in accordance with a maximum likelihood criterion, most faithfully represents the target vector.
7. The system according to claim 6, wherein the pulses within each pulse train are separated from one another by the pitch value.
8. The system according to claim 6, wherein the initial pulse of the initial pulse train of each pulse train sequence is located at the same sample position.
9. A voice processing system incorporating a short-term analyzer to generate short-term characteristics by linear prediction coefficient analysis of an input speech signal, and incorporating a long-term analyzer to determine long-term characteristics and a pitch value from the input speech signal, the system comprising: a target vector generator to generate a target vector from at least the input speech signal and, optionally, the short- and long-term characteristics; an initial pulse location determiner to determine the location of an initial pulse of an initial pulse train in accordance with multiple pulse analysis techniques, based on the target vector, the short-term characteristics and the pitch value; an amplitude range determiner for determining both an amplitude of the initial pulse train and a range of quantized amplitude levels grouped around an absolute value of the amplitude; an amplitude level selector to step through the range of quantized amplitude levels in accordance with a predetermined step size, the amplitude level selector emitting a selected quantized amplitude at each step; a pulse train sequence determiner for generating, for each selected quantized amplitude, a plurality of variable-sign pulse trains of equal-amplitude, uniformly spaced pulses corresponding to the target vector, the pulses of the trains having a pulse spacing that corresponds to the pitch value, the pulses within each train having the same sign, and the pulses of each pulse train having the same amplitude, that amplitude corresponding to the selected quantized amplitude; and a target vector comparator for determining, for each selected quantized amplitude, an error vector that corresponds to the quality of the match between the plurality of variable-sign pulse trains of equal-amplitude, uniformly spaced pulses and the target vector, the target vector comparator emitting the pulse train sequence of equal-amplitude, equal-sign, uniformly spaced pulses that corresponds to a minimum error vector.
10. The system according to claim 9, wherein the target vector comparator includes a global criterion determiner, the global criterion determiner including a perceptual weighting filter for filtering the plurality of variable-sign pulse trains of equal-amplitude, uniformly spaced pulses, and a determiner to determine, for each selected quantized amplitude, the amount of energy in the error vector, the error vector being defined as the difference between the target vector and the filter output, the perceptual weighting filter having characteristics that correspond to the short-term characteristics.
11. The system according to claim 10, further comprising: a multiple pulse analyzer connected to an output line of the target vector generator, wherein the multiple pulse analyzer generates a plurality of sequences of equal-amplitude, variable-sign, variably spaced pulses, each of the sequences having a different amplitude value, the pulses within each sequence having equal amplitudes but variable signs, the multiple pulse analyzer outputting a signal corresponding to the sequence of equal-amplitude, variable-sign, variably spaced pulses which, in accordance with a maximum likelihood criterion, most faithfully represents the target vector; and a comparator that receives information from both the pulse train multiple pulse analyzer and the multiple pulse analyzer to select the output that most faithfully represents the target vector.
12. A method of voice processing comprising the steps of: determining the short-term characteristics of an input speech signal; generating a target vector from at least the input speech signal and, optionally, the short- and long-term characteristics; determining the location of an initial pulse in accordance with multiple pulse analysis techniques, based on the target vector and the short-term characteristics; determining both an amplitude of the initial pulse and a range of quantized amplitude levels grouped around an absolute value of the amplitude; stepping through the range of quantized amplitude levels in accordance with a predetermined step size, a selected quantized amplitude being emitted at each step; generating, based on the selected quantized amplitude, a sequence of equal-amplitude, variable-sign, variably spaced pulses corresponding to the target vector; comparing each sequence of equal-amplitude, variable-sign, variably spaced pulses with the target vector; and selecting the sequence of equal-amplitude, variable-sign, variably spaced pulses that, in accordance with a maximum likelihood criterion, most faithfully represents the target vector.
13. The method according to claim 12, wherein the initial pulse of each sequence of equal-amplitude, variable-sign, variably spaced pulses is located at the same sample position.
14. The method according to claim 12, wherein the comparing step includes the steps of: filtering the sequence of equal-amplitude, variable-sign, variably spaced pulses through a perceptual weighting filter whose characteristics correspond to the short-term characteristics; and determining, for each quantized amplitude level, the vector defined as the difference between the target vector and the output of the perceptual weighting filter.
15. A method of voice processing comprising the steps of: determining the short-term characteristics of an input speech signal; determining the long-term characteristics of the input speech signal, including at least one pitch value of the input speech signal; generating a target vector from at least the input speech signal and, optionally, the short- and long-term characteristics; determining the location of an initial pulse of an initial pulse train in accordance with multiple pulse analysis techniques, based on the target vector, the short-term characteristics and the pitch value; and generating a plurality of variable-sign trains of equal-amplitude, uniformly spaced pulses corresponding to the target vector, the pulses of the trains having a pulse spacing corresponding to the pitch value, the pulses of the trains having the same amplitude level, and the pulses within each train having the same sign.
16. A method of voice processing comprising the steps of: determining the short-term characteristics of an input speech signal; determining the long-term characteristics of the input speech signal, including at least one pitch value of the input speech signal; generating a target vector from at least the input speech signal and, optionally, the short- and long-term characteristics; determining the location of an initial pulse of an initial pulse train in accordance with multiple pulse analysis techniques, based on the target vector, the short-term characteristics and the pitch value; determining both an amplitude of the initial pulse train and a range of quantized amplitude levels grouped around an absolute value of the amplitude; stepping through the range of quantized amplitude levels in accordance with a predetermined step size, a selected quantized amplitude being emitted at each step; generating, for each selected quantized amplitude, a plurality of variable-sign pulse trains of equal-amplitude, uniformly spaced pulses corresponding to the target vector, the pulses of the pulse trains having a pulse spacing corresponding to the pitch value, the pulses within each pulse train having the same amplitude, that amplitude corresponding to the selected quantized amplitude, and the pulses within each train having the same sign; comparing the plurality of variable-sign pulse trains of equal-amplitude, uniformly spaced pulses with the target vector; and selecting the plurality of variable-sign pulse trains of equal-amplitude, uniformly spaced pulses which, in accordance with a maximum likelihood criterion, most faithfully represents the target vector.
17. The method according to claim 16, wherein the initial pulse of each pulse train sequence is located at the same sample position.

ABSTRACT OF THE INVENTION

The present invention relates generally to voice processing systems, and in particular to multiple pulse analysis systems.
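The amplitude search recited in claims 9 and 16 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading under stated assumptions: the short-term filter is represented by a hypothetical impulse response `h`, the perceptual weighting filter is omitted, the quantized amplitude levels and the first train's start position are passed in rather than derived, each further train is chosen greedily, and all function names are illustrative.

```python
def convolve(excitation, h):
    """Filter the excitation through impulse response h, truncated to the frame."""
    n = len(excitation)
    out = [0.0] * n
    for i, x in enumerate(excitation):
        if x != 0.0:
            for j, hj in enumerate(h):
                if i + j >= n:
                    break
                out[i + j] += x * hj
    return out


def pulse_train(n, start, pitch, value):
    """Pulses of one signed value at start, start + pitch, ... within n samples."""
    train = [0.0] * n
    for pos in range(start, n, pitch):
        train[pos] = value
    return train


def energy(v):
    return sum(x * x for x in v)


def best_train(residual, h, pitch, amplitude, start=None):
    """Choose the start (unless fixed) and sign that minimise the residual energy."""
    n = len(residual)
    # Assumption: a train may start anywhere within the first pitch period.
    starts = [start] if start is not None else range(min(pitch, n))
    best = None
    for s in starts:
        for sign in (1.0, -1.0):
            contrib = convolve(pulse_train(n, s, pitch, sign * amplitude), h)
            err = energy([r - c for r, c in zip(residual, contrib)])
            if best is None or err < best[0]:
                best = (err, s, sign, contrib)
    return best


def search(target, h, pitch, first_start, levels, n_trains):
    """Step through the amplitude levels; keep the train sequence with least error."""
    winner = None
    for amplitude in levels:
        residual = list(target)
        trains = []
        for k in range(n_trains):
            # The first train of every sequence starts at the same sample position.
            err, s, sign, contrib = best_train(
                residual, h, pitch, amplitude, first_start if k == 0 else None)
            residual = [r - c for r, c in zip(residual, contrib)]
            trains.append((s, sign))
        # err now equals the energy of the final residual for this amplitude.
        if winner is None or err < winner[0]:
            winner = (err, amplitude, trains)
    return winner
```

Calling `search(target, h, pitch, first_start, levels, n_trains)` returns a tuple of the minimum error energy, the selected amplitude, and the list of `(start, sign)` pairs describing the chosen trains. The squared-error criterion stands in here for the maximum likelihood criterion of the claims.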