US3800093A - Method of designing orthogonal filters - Google Patents

Method of designing orthogonal filters

Info

Publication number
US3800093A
Authority
US
United States
Prior art keywords
transfer function
filter
section
poles
orthogonal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US00191003A
Inventor
A Wolf
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US00191003A priority Critical patent/US3800093A/en
Priority to US00313239A priority patent/US3833767A/en
Application granted granted Critical
Publication of US3800093A publication Critical patent/US3800093A/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0212 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03H IMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H17/00 Networks using digital techniques
    • H03H17/02 Frequency selective networks

Definitions

  • An orthogonal filter is a filter whose output is a set of orthogonal functions related by their coefficients to the input signal to said filter. If the input signal is random white noise, then the filter's output is a set of random orthogonal functions. If the coefficients for this set of random functions are properly selected, any voice signal may be reproduced by temporally averaging the resulting set of orthogonal functions.
  • phonemes are the sound features which are common to all speakers of a given speech form and which are exactly reproduced in repetition.
  • phonemes are known soundwise, and any system employing phonemes records the basic phoneme sounds on magnetic tape or some other recording means.
  • a computer program is then written to connect the proper phonemes to produce words that convey speech information in the form of recurrent patterns. In such a system one can type information into, say a computer, and have it speak back to the operator.
  • Vocoder Voice Coder
  • the Vocoder dates back to the 1920s.
  • the standard Vocoder is a spectrum/channel vocoder. It consists of an analyzer which produces a signal proportional to the short term amplitude spectrum of the fundamental frequency of the speech input, and the synthesizer consists of devices that reconstruct speech by means of electrical signals appearing at the analyzer output. In both the analyzer and the synthesizer, signals are generated that are proportional to both the voiced and unvoiced sounds and the pitch of the sounds.
  • the present speech system capitalizes on the fact that speech is a stochastic process. Speech is stochastic because long samples of speech convey information which is probabilistic in time.
  • the present system employs a gaussian white noise source; the output of which is passed through an orthogonal filter to produce a set of random orthogonal functions which when multiplied by a speech signal and averaged produces coefficients which can be used to synthesize speech information at a later time. This method is somewhat analogous to the use of generalized Fourier coefficients to define and reproduce a periodic function.
  • the present invention relates to a general method for designing orthogonal filters.
  • an orthogonal filter design is disclosed which may be used in a speech synthesizing and codifying system.
  • This invention makes it possible to transform a sample of speech of length $T$ into a set of speech coefficients designated by $a_1, a_2, \dots, a_n$, each of which depends on time according to the sample length $T$ of the speech sample, which is explained further below.
  • These coefficients $a_1, a_2, \dots, a_n$ can be thought of as a set of generalized Fourier coefficients defined on a set of orthogonal noise sample functions obtained from a white noise source. If the waveform of the speech sample of length $T$ is denoted by $x(t)$, then by making use of the fact that $x(t)$ is a sample function from a stochastic process, it can be decomposed into an infinite series of orthogonal sample functions taken from a white noise source.
  • Each orthogonal sample function is weighted by a coefficient the value of which depends on the speech sample under consideration.
  • the set of coefficients $\{a_n : n = 1, 2, \dots\}$ thus contains the necessary information from which the original speech sample can be reconstructed by weighting the orthogonal sample functions $\{w_n(t) : n = 1, 2, \dots\}$ (see FIG. 5) with the appropriate corresponding coefficients $a_n$, $n = 1, 2, \dots$, in which $T$ is a strip of time running to infinity and $t$ is a given instant of time.
  • the invention makes possible the maximum compression of speech by using the coefficients $\{a_n : n = 1, 2, \dots\}$ to convey information. Using these coefficients, it is now possible for man to communicate with computers and to have them in turn communicate with man by means of speech. This invention opens up the possibility of new, unforeseen innovations in machines and systems in which there is a speech communication interface with man. One such example, in addition to the possibility of talking to a computer, is the possibility of talking to a typewriter.
  • FIG. 1 is a block diagram of the system for obtaining the speech extraction coefficients:
  • the speech signal 2, denoted by the sample function $x(t)$ of sample length $T$, is multiplied (instant by instant) by each orthogonal noise signal, denoted respectively by the sample functions $v_1(t), v_2(t), v_3(t), \dots, v_n(t)$, which are derived as the set of outputs of an orthogonal filter 12, described below, when the white noise signal 10, denoted by the sample function $\{g(t)\}$, of spectral density $N$ watts per cycle, is applied to the orthogonal filter's input.
  • the sample functions that result from the multiplication of the set of orthogonal noise sample functions $v_1(t), \dots, v_n(t)$ with the sample function $x(t)$ in the multipliers 4, 6, and 8 are the speech extraction samples of length $T$ seconds, denoted by $\{v_n(t)\,x(t) : n = 1, 2, \dots\}$.
  • This set of product sample functions $v_1(t)\,x(t), v_2(t)\,x(t), v_3(t)\,x(t), \dots, v_n(t)\,x(t)$ is averaged temporally in the temporal averagers 14, 16, and 18 to give rise to the corresponding set of coefficients.
  • $a_3 = \frac{1}{T}\int_0^T v_3(t)\,x(t)\,dt$ (3)
  • the integral over $(0, T)$ of each of the product sample functions $v_n x$, divided by $T$, is the temporal average of those sample functions.
  • the resulting speech extraction coefficients $a_1, a_2, \dots, a_n$ are dependent only on the sample length $T$ and are the coefficients that represent the essential information needed to reconstruct the speech signal $x(t)$.
  • a rule of thumb for the value of $T$, the averaging time, is that it be about ten times the sample length $T_0$ of the speech sample.
  • the orthogonal filter is a linear filter with one pair of input terminals and many pairs of output terminals.
  • the filter which is described below in detail, may be roughly likened to an ideal prism on which white light is incident.
  • the incident light may be thought of as the input to the prism.
  • the output of an ideal prism is essentially the complementary colors in response to the incident white light upon it.
  • the complementary colors are of course pairwise orthogonal.
  • the orthogonal functions are random within a band of frequencies whereas in the prism case the complementary functions are in principle roughly single frequency sinusoids with random amplitudes and random phases. The latter is due to the randomness of the white light.
  • Linear orthogonal filters can be constructed as a chain of linear filter sections in which the poles of any section of the filter, in the complex plane, are essentially cancelled by the zeros of the next immediate section.
  • FIG. 2 shows this general scheme for constructing an orthogonal filter.
  • the process for realizing the actual construction of an orthogonal filter is given by the following formula:
  • Let $H_1(s)$ be the transfer function of the first filter section, $H_1(s) = B(s)/C_1(s)$. The numerator $B(s)$ is a polynomial in $s$ with constants $b_0, b_1, \dots$, and the denominator $C_1(s)$ is selected to satisfy the equation $C_1(s) = c_0 + c_1 s + \dots + c_m s^m$ (8), where the constants $b_0, b_1, \dots$ and the constants $c_0, c_1, \dots, c_m$ are chosen such that $H_1(s)$ is physically realizable.
  • the transfer function $H_2(s)$ is designed such that its numerator $C_1(-s)$ is exactly like the denominator of the first section, $H_1(s)$, with $s$ replaced by $(-s)$.
  • the zeros of the transfer function through the second section occupy the same positions as the poles of $H_1(s)$ except that they are reflected about the $j\omega$-axis.
  • the zeros of one transfer function in effect cancel the poles of the next section, giving rise to the orthogonality of the pair.
  • the n-th transfer function of the orthogonal filter from the input of the filter is given by the formula $H_n(s) = K_n B(s)\,\dfrac{\prod_{k=1}^{n-1} C_k(-s)}{\prod_{k=1}^{n} C_k(s)}$ (10) for $n \ge 2$.
  • the symbol $\prod$ denotes the product of all factors in the expansion of Equation 10 from 1 to $n$.
  • a filter is developed using this process of construction of orthogonal filters having simple poles and zeros along the axis of the reals. This gives rise to a simple resistance-capacitance (RC) network with differential amplifiers.
  • the RC orthogonal filter is shown in FIG. 3.
  • the first section of the filter includes an input resistor 34, feedback resistor 36 designated as R in Equation 13, feedback capacitor 38, designated as C in Equation 13, and operational amplifier 40.
  • the subsequent section of the filter includes input resistors 42 and 44, designated as r and r in Equation 15 respectively; resistors 48, 50, 54, and 56; capacitors 46 and 52; and differential amplifier 58.
  • Resistors 48 and 50 are designated as R, in Equation 17, resistors 54 and 56 are designated as R and R respectively in Equation 16 and capacitors 46 and 52 are designated as C, in Equation 17.
  • the transfer function of the first section is given by Equation 11, with a corresponding expression for $n \ge 2$, where for design purposes the first pole is placed at $s_1$ along the real axis.
  • the poles $s_1, s_2, \dots, s_n$ can be placed arbitrarily along the real axis. But for the speech synthesizer this procedure will fail to work. It is important to place the poles $s_1, s_2, \dots, s_n$ in such a way as to make the reconstructed signal converge to the desired speech signal.
  • the first pole is selected approximately equal to the bandwidth of the speech signal. Since speech signals can cover a bandwidth of the order of 200 Hz to 2,500 Hz, depending on the speaker, $s_1$ can be selected to be a frequency within this range.
  • the remaining poles are chosen according to Equation 18, a formula of the form $s_n = s_1/n^{p}$ for $n \ge 1$ with a fixed exponent $p$. If the remaining poles are placed according to the corresponding formula of Equation 19, then the reconstructed signal will converge absolutely almost everywhere. In general, other pole placements may be needed for other types of convergence. There is a practical difficulty, however, which causes a resolution problem in placing the poles at distinct positions along the real axis.
  • the orthogonal filter is made by cascading $H_1(s)$ and $H_2(s)$ together and then adding to this cascade as many sections as needed according to the requirements of the application, in which subsequent sections are identical to $H_2(s)$ in terms of the circuit used.
  • the hardware is obtained by standard analog computer techniques or by factoring the transfer functions and then determining the equivalent partial fraction expansion using amplifiers and differential amplifiers to take account of the negative signs that may result from the partial fraction constants.
  • the process described above is the one used to obtain the hardware configuration of the orthogonal filter shown in FIG. 3.
  • SIGNAL MULTIPLIERS: The signal multipliers shown in FIG. 1 for multiplying the orthogonal noise components of the white noise source with the speech signal are standard devices. In the frequency range of interest, i.e., bandwidths up to 20 kHz, the quarter Gauss square method was used. Other types can easily be used instead.
  • the quarter Gauss square multiplier is one that makes use of the identity $(A+B)^2 - (A-B)^2 = 4AB$.
  • TEMPORAL AVERAGER: the temporal averager is a simple RC low-pass filter with a very long time constant compared to the highest frequency component in the speech signal. If f denotes this frequency, then FIG. 4 gives an example of the configuration of the temporal averager.
  • the temporal averager consists of input resistor 60, feedback capacitor 62 and amplifier 64.
  • the Gaussian white noise source is a standard noise generator of the diode type.
  • Once the coefficients $a_1, a_2, \dots, a_n$ are obtained from the Speech Extraction Code Generator, shown in FIG. 1, it is possible to reconstruct from these coefficients the original speech sample, $x(t)$.
  • Equation 27 is the partial sum of a generalized random orthogonal series which converges as $n \to \infty$ to the speech sample $x(t)$ in some probabilistic sense. The error $\epsilon$ can be calculated from an equation in which, for a given speech sample, the term depending only on $x$ is fixed; $\epsilon$ is then smallest when the remaining term is largest.
  • FIG. 5 is a block diagram of the speech reconstruction system. Using the coding coefficients $a_1, a_2, \dots, a_n$ and a Gaussian white noise source of spectral power density $1/N$ watts per cycle, the speech sample function $x(t)$ is reconstructed according to the description given above.
  • the white noise signal $k(t)$, of spectral density $1/N$ watts per cycle, is applied to an orthogonal filter which has a transfer function $H_n(s)$ identical to that of the orthogonal filter of the coding generator shown in FIG. 1.
  • the response of this orthogonal filter is the set of orthogonal noise sample functions $w_1(t), w_2(t), w_3(t), \dots, w_n(t)$. Each of these is multiplied respectively by the coefficients $a_1, a_2, \dots, a_n$ derived from the coding generator (FIG. 1).
  • the multiplier is a signal multiplier the same as described previously above.
  • the speech extraction coefficients can be stored in a memory such as a tape as voltages of appropriate value.
  • the novelty of the system consists in the utilization of orthogonal filters, white noise and temporal averaging devices connected in the unique arrangement shown in FIG. 1 that gives rise to the speech extraction coefficients $a_1, a_2, \dots, a_n$.
  • the speech extraction parameters are very narrow band signals. These signals are nearly constant for T T,,. This also means that the system can be used to greatly compress speech.
  • a method for making an orthogonal filter having sections 1 through n comprising the steps of:
  • each successive section following section 2 such that the zeros of their respective transfer functions occupy the positions of the poles of the preceding transfer functions reflected about the $j\omega$ axis and with the zeros of the successive transfer function cancelling the poles of the preceding transfer function, thereby giving rise to orthogonality;
  • a method as defined in claim 1 wherein the orthogonal filter is to be used in a speech synthesizer including the steps of:
  • each successive filter section following section 2 such that the zeros of their respective transfer functions occupy the positions of the poles of the preceding transfer function reflected about the $j\omega$ axis and with the zeros of the transfer function of the successive filter section cancelling the poles of the transfer function of the preceding section, thereby giving rise to orthogonality;
  • k denotes all the filter sections in the series from the first filter to the n-th filter inclusive.
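
The quarter-square identity quoted for the signal multipliers in the definitions above can be illustrated with a short sketch. This is only a numerical illustration of the identity, not the analog hardware; the function names `quarter_square_product` and `multiply_signals` are hypothetical and do not appear in the patent.

```python
def quarter_square_product(a, b):
    """Return a*b using only a sum, a difference, two squarings and a scale.

    Relies on the identity (a + b)^2 - (a - b)^2 = 4ab.
    """
    return ((a + b) ** 2 - (a - b) ** 2) / 4.0


def multiply_signals(noise, speech):
    """Multiply two equally sampled signals instant by instant.

    `noise` and `speech` are assumed to be lists of samples taken at the
    same instants (a discrete stand-in for the continuous signals).
    """
    if len(noise) != len(speech):
        raise ValueError("signals must have the same number of samples")
    return [quarter_square_product(v, x) for v, x in zip(noise, speech)]


if __name__ == "__main__":
    print(multiply_signals([1.0, -2.0], [0.5, 4.0]))
```

The point of the identity is that a multiplier can be built from squaring elements and summing amplifiers alone, which is why the text treats it as a standard device.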

Abstract

A method of designing an orthogonal filter for use in a speech synthesizing system wherein the filter is constructed from a series of linear filter sections. The poles in the complex plane of any section of the filter are cancelled by the zeros of the next section.

Description

United States Patent 3,800,093
Wolf, Mar. 26, 1974
METHOD OF DESIGNING ORTHOGONAL FILTERS
Inventor: Alfred A. Wolf
Filed: Oct. 20, 1971
Appl. No.: 191,003
U.S. Cl. / Field of Search (partly legible): 179/15; 333/20, 70 R
Primary Examiner: Kathleen H. Claffy
Assistant Examiner: David L. Stewart
Attorney, Agent, or Firm: R. S. Sciascia; Q. E. Hodges
4 Claims, 5 Drawing Figures
[FIG. 1: speech signal x(t) (sample length T), multipliers, temporal averagers, white noise source g(t) (spectral density N), orthogonal filter]
[FIG. 3: first section and subsequent sections of the RC orthogonal filter]
[FIG. 4: temporal averager, input and output]
[FIG. 5: Gaussian white noise source (spectral density 1/N), orthogonal filter, multipliers, extraction coefficients, reconstructed speech signal]
METHOD OF DESIGNING ORTHOGONAL FILTERS
This application is a continuation-in-part of an application filed on June 23, 1971 bearing the Ser. No. 155,988, now U.S. Pat. No. 3,746,791, issued July 17,
The invention described herein may be manufactured and used by or for the Government of the United States of America for governmental purposes without the payment of any royalties thereon or therefor.
FIELD OF INVENTION
An orthogonal filter is a filter whose output is a set of orthogonal functions related by their coefficients to the input signal to said filter. If the input signal is random white noise, then the filter's output is a set of random orthogonal functions. If the coefficients for this set of random functions are properly selected, any voice signal may be reproduced by temporally averaging the resulting set of orthogonal functions.
With the advent of systems in which there is a man/machine interface, such as computer systems or control systems, it is desirable for a man to be able to communicate with the machine as easily as possible. The ideal situation would be to have the man speak to the ma-
DESCRIPTION OF THE PRIOR ART
In the prior art there is no known general method of designing an orthogonal filter. The U.S. patents to Norbert Wiener, Nos. 2,024,900 and 2,128,257, describe the design of a specific orthogonal filter which was empirically developed for his situation. This orthogonal filter was the only known orthogonal filter from the date of the Wiener patents to the present time. The present method was developed while designing an orthogonal filter for use in a speech synthesizing system, and the method is most clearly described in combination with the speech system.
One of the most popular man/machine interface schemes in which the man merely talks to a machine employs the fact that in each language there are a few basic sounds that make up the words in that language. These basic sounds are called phonemes. In more precise language, phonemes are the sound features which are common to all speakers of a given speech form and which are exactly reproduced in repetition. In any language there are a definite and small number of phonemes. In the English language there are 46 phonemes. These phonemes are known soundwise, and any system employing phonemes records the basic phoneme sounds on magnetic tape or some other recording means. A computer program is then written to connect the proper phonemes to produce words that convey speech information in the form of recurrent patterns. In such a system one can type information into, say, a computer, and have it speak back to the operator.
To date such systems have the disadvantage that, as the vocabulary of the system increases the computer programs become more complicated. Thus the speech vocabulary of such systems is usually very limited. Another disadvantage of the system is that the words spoken by the computer are generally of poor fidelity and difficult to understand. Also, such a system is not flexible since the information that can be conveyed by the computer depends on the extent of the vocabulary of the computer, that is, the complexity of the computer program.
Another type of speech synthesizer system is known as the Vocoder (Voice Coder). The Vocoder dates back to the 1920s. The standard Vocoder is a spectrum/channel vocoder. It consists of an analyzer which produces a signal proportional to the short term amplitude spectrum of the fundamental frequency of the speech input, and the synthesizer consists of devices that reconstruct speech by means of electrical signals appearing at the analyzer output. In both the analyzer and the synthesizer, signals are generated that are proportional to both the voiced and unvoiced sounds and the pitch of the sounds. There are, of course, other speech analyzer/synthesizer systems which will not be dealt with here.
Suffice it to say that in each of these systems the method of speech used by the human is imitated in one way or another or the linguistic properties of speech are capitalized upon.
It is an object of the present disclosure to provide a general method of designing orthogonal filters.
It is another object of the present disclosure to provide an orthogonal filter for use in a speech synthesizing system.
These and other objects of the present invention are set forth in the following disclosure.
SUMMARY OF THE INVENTION
The present speech system capitalizes on the fact that speech is a stochastic process. Speech is stochastic because long samples of speech convey information which is probabilistic in time. The present system employs a Gaussian white noise source, the output of which is passed through an orthogonal filter to produce a set of random orthogonal functions which, when multiplied by a speech signal and averaged, produce coefficients which can be used to synthesize speech information at a later time. This method is somewhat analogous to the use of generalized Fourier coefficients to define and reproduce a periodic function.
DESCRIPTION OF THE DRAWINGS
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The present invention relates to a general method for designing orthogonal filters. In particular, an orthogonal filter design is disclosed which may be used in a speech synthesizing and codifying system.
It should be noted that the following mathematical terminology is standard in the art and can be found in the textbook published by the Addison-Wesley Publishing Company, entitled "Principles of Feedback Control," written by Charles H. Wilts, as well as other textbooks.
This invention makes it possible to transform a sample of speech of length $T$ into a set of speech coefficients designated by $a_1, a_2, \dots, a_n$, each of which depends on time according to the sample length $T$ of the speech sample, which is explained further below. These coefficients $a_1, a_2, \dots, a_n$ can be thought of as a set of generalized Fourier coefficients defined on a set of orthogonal noise sample functions obtained from a white noise source. If the waveform of the speech sample of length $T$ is denoted by $x(t)$, then by making use of the fact that $x(t)$ is a sample function from a stochastic process, it can be decomposed into an infinite series of orthogonal sample functions taken from a white noise source. Each orthogonal sample function is weighted by a coefficient the value of which depends on the speech sample under consideration. The set of coefficients $\{a_n : n = 1, 2, \dots\}$ thus contains the necessary information from which the original speech sample can be reconstructed by weighting the orthogonal sample functions $\{w_n(t) : n = 1, 2, \dots\}$ (see FIG. 5) with the appropriate corresponding coefficients $a_n$, $n = 1, 2, \dots$, in which $T$ is a strip of time running to infinity and $t$ is a given instant of time. The invention makes possible the maximum compression of speech by using the coefficients $\{a_n : n = 1, 2, \dots\}$ to convey information. Using these coefficients, it is now possible for man to communicate with computers and to have them in turn communicate with man by means of speech. This invention opens up the possibility of new, unforeseen innovations in machines and systems in which there is a speech communication interface with man. One such example, in addition to the possibility of talking to a computer, is the possibility of talking to a typewriter.
DESCRIPTION OF THE COEFFICIENT OR CODING GENERATOR
FIG. 1 is a block diagram of the system for obtaining the speech extraction coefficients. The speech signal 2, denoted by the sample function $x(t)$ of sample length $T$, is multiplied (instant by instant) by each orthogonal noise signal, denoted respectively by the sample functions $v_1(t), v_2(t), v_3(t), \dots, v_n(t)$, which are derived as the set of outputs of an orthogonal filter 12, described below, when the white noise signal 10, denoted by the sample function $\{g(t)\}$, of spectral density $N$ watts per cycle, is applied to the orthogonal filter's input. The sample functions that result from the multiplication of the set of orthogonal noise sample functions $v_1(t), \dots, v_n(t)$ with the sample function $x(t)$ in the multipliers 4, 6, and 8 are the speech extraction samples of length $T$ seconds, denoted by $\{v_n(t)\,x(t) : n = 1, 2, \dots\}$. This set of product sample functions $v_1(t)\,x(t), v_2(t)\,x(t), v_3(t)\,x(t), \dots, v_n(t)\,x(t)$ is averaged temporally in the temporal averagers 14, 16, and 18 to give rise to the corresponding set of coefficients $a_1, a_2, a_3, \dots, a_n$, where mathematically these averages are formally given by the equations:
$a_3 = \frac{1}{T}\int_0^T v_3(t)\,x(t)\,dt$ (3)
where $T$ is the averaging time and is at least equal to the sample length $T_0$ of the speech sample. The integral over $(0, T)$ of each of the product sample functions $v_n x$, divided by $T$, is the temporal average of those sample functions. The resulting speech extraction coefficients $a_1, a_2, \dots, a_n$ are dependent only on the sample length $T$ and are the coefficients that represent the essential information needed to reconstruct the speech signal $x(t)$. A rule of thumb for the value of $T$, the averaging time, is that it be about ten times the sample length $T_0$ of the speech sample.
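
The temporal average that defines each coefficient can be sketched for sampled signals. This is a minimal illustration that replaces the integral over (0, T) with a Riemann sum; the function names and the uniform sample spacing `dt` are assumptions made for the example, not part of the patent.

```python
def temporal_average(v, x, dt):
    """Approximate (1/T) * integral over (0, T) of v(t) * x(t) dt.

    `v` and `x` are lists of samples taken every `dt` seconds, so the
    averaging time is T = len(v) * dt (an assumption of this sketch).
    """
    if len(v) != len(x) or not v:
        raise ValueError("v and x must be non-empty and equally long")
    total_time = len(v) * dt
    return sum(vi * xi for vi, xi in zip(v, x)) * dt / total_time


def extraction_coefficients(noise_functions, x, dt):
    """Return one coefficient a_n per orthogonal noise function v_n."""
    return [temporal_average(v, x, dt) for v in noise_functions]


if __name__ == "__main__":
    # Two toy "noise" functions and a constant stand-in for the speech sample.
    v_functions = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, 1.0, 1.0]]
    speech = [3.0, 3.0, 3.0, 3.0]
    print(extraction_coefficients(v_functions, speech, dt=0.25))
```

In the analog system the same operation is carried out by a multiplier followed by the RC temporal averager, so the sum above only stands in for that hardware.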
GENERATION OF THE ORTHOGONAL NOISE FUNCTIONS
In the generation of the speech code, $a_1, a_2, a_3, \dots, a_n$, it is necessary to generate a corresponding set of random sample functions $v_1(t), v_2(t), v_3(t), \dots, v_n(t)$ that are pairwise orthogonal. By pairwise orthogonality we roughly mean that the information contained in each of the sample functions $v_1(t), v_2(t), \dots, v_n(t)$ is unique. To put this another way, no overlap in information exists between any two sample functions $v_1(t), v_2(t), \dots, v_n(t)$. For descriptive purposes, the generation of the orthogonal random sample functions is achieved by means of an orthogonal filter when white noise of power spectral density $N$ is applied to the input of the orthogonal filter.
The orthogonal filter is a linear filter with one pair of input terminals and many pairs of output terminals. The filter, which is described below in detail, may be roughly likened to an ideal prism on which white light is incident. The incident light may be thought of as the input to the prism. The output of an ideal prism is essentially the complementary colors in response to the incident white light upon it. The complementary colors are of course pairwise orthogonal. In our case, however, the orthogonal functions are random within a band of frequencies whereas in the prism case the complementary functions are in principle roughly single frequency sinusoids with random amplitudes and random phases. The latter is due to the randomness of the white light.
CONSTRUCTION OF THE ORTHOGONAL FILTERS
Linear orthogonal filters can be constructed as a chain of linear filter sections in which the poles of any section of the filter, in the complex plane, are essentially cancelled by the zeros of the next immediate section. FIG. 2 shows this general scheme for constructing an orthogonal filter. The complex function $H_1(s)$ is the transfer function of the first section of the filter as a function of the complex frequency variable $s = \sigma + j\omega$, where $\omega = 2\pi f$ and $f$ is the frequency in cycles per second, and similarly for the transfer function to the n-th output terminal pair of the orthogonal filter. The process for realizing the actual construction of an orthogonal filter is given by the following formula:
Let $H_1(s)$ be the transfer function of the first filter section, $H_1(s) = B(s)/C_1(s)$. The numerator $B(s)$ is a polynomial in $s$ with constants $b_0, b_1, \dots$, and the denominator $C_1(s)$ is selected to satisfy the equation
$C_1(s) = c_0 + c_1 s + \dots + c_m s^m$ (8)
where the constants $b_0, b_1, \dots$ and the constants $c_0, c_1, \dots, c_m$ are chosen such that $H_1(s)$ is physically realizable. In a like way the transfer function $H_2(s)$ is designed such that its numerator $C_1(-s)$ is exactly like the denominator of the first section, $H_1(s)$, with $s$ replaced by $(-s)$. This makes the zeros of the transfer function through the second section occupy the same positions as the poles of $H_1(s)$ except that they are reflected about the $j\omega$-axis. Hence, the zeros of one transfer function in effect cancel the poles of the next section, giving rise to the orthogonality of the pair. In general, the n-th transfer function of the orthogonal filter from the input of the filter is given by the formula
$H_n(s) = K_n B(s)\,\dfrac{\prod_{k=1}^{n-1} C_k(-s)}{\prod_{k=1}^{n} C_k(s)}$ (10)
for $n \ge 2$, where the symbol $\prod$ denotes the product of all factors in the expansion of Equation 10 from 1 to $n$.
For this speech device, a filter is developed using this process of construction of orthogonal filters having simple poles and zeros along the axis of the reals. This gives rise to a simple resistance-capacitance (RC) network with differential amplifiers. The RC orthogonal filter is shown in FIG. 3. The first section of the filter includes an input resistor 34, feedback resistor 36 designated as R in Equation 13, feedback capacitor 38, designated as C in Equation 13, and operational amplifier 40. The subsequent section of the filter includes input resistors 42 and 44, designated as r and r in Equation 15 respectively; resistors 48, 50, 54, and 56; capacitors 46 and 52; and differential amplifier 58. Resistors 48 and 50 are designated as R, in Equation 17, resistors 54 and 56 are designated as R and R respectively in Equation 16, and capacitors 46 and 52 are designated as C, in Equation 17.
For these filters the transfer function of the first section is given by Equation 11, with a corresponding expression for $n \ge 2$, where for design purposes the first pole is placed at $s_1$ along the real axis and the second and subsequent poles are placed according to the formulas below. In the ordinary orthogonal filter, the poles $s_1, s_2, \dots, s_n$ can be placed arbitrarily along the real axis. But for the speech synthesizer this procedure will fail to work. It is important to place the poles $s_1, s_2, \dots, s_n$ in such a way as to make the reconstructed signal converge to the desired speech signal. In the process of constructing an orthogonal filter for the speech synthesizer the first pole is selected approximately equal to the bandwidth of the speech signal. Since speech signals can cover a bandwidth of the order of 200 Hz to 2,500 Hz, depending on the speaker, $s_1$ can be selected to be a frequency within this range. To make the reconstructed signal converge to the original speech signal in the mean squared sense, the remaining poles are chosen according to Equation 18, a formula of the form $s_n = s_1/n^{p}$ for $n \ge 1$ with a fixed exponent $p$. If the remaining poles are placed according to the corresponding formula of Equation 19, then the reconstructed signal will converge absolutely almost everywhere. In general, other pole placements may be needed for other types of convergence. There is a practical difficulty, however, which causes a resolution problem in placing the poles at distinct positions along the real axis.
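
The pole placement and the resulting orthogonality can be checked numerically with a short sketch. It rests on several assumptions that are not taken from the patent: each section is first order with $C_k(s) = s + s_k$, the numerator $B(s)$ is constant, the gain is normalized as $K_n = \sqrt{2 s_n}$, and the exponent `p` of the placement rule is left as a free parameter. All function names are hypothetical.

```python
import math


def pole_positions(s1, count, p):
    """Return s_n = s1 / n**p for n = 1..count (p is an assumed free parameter)."""
    if s1 <= 0 or p <= 0 or count < 1:
        raise ValueError("s1 and p must be positive and count at least 1")
    return [s1 / n ** p for n in range(1, count + 1)]


def section_residues(poles, n):
    """Partial-fraction constants of the n-th output (1-indexed).

    Assumed transfer function from the filter input:
        H_n(s) = K_n * prod_{k<n} (s_k - s) / prod_{k<=n} (s + s_k),
    with K_n = sqrt(2 * s_n), so that the impulse response is
        h_n(t) = sum_j r_j * exp(-s_j * t).
    """
    gain = math.sqrt(2.0 * poles[n - 1])
    constants = []
    for j in range(n):
        numerator = 1.0
        for k in range(n - 1):
            numerator *= poles[k] + poles[j]
        denominator = 1.0
        for k in range(n):
            if k != j:
                denominator *= poles[k] - poles[j]
        constants.append(gain * numerator / denominator)
    return constants


def inner_product(poles, m, n):
    """Integral over (0, infinity) of h_m(t) * h_n(t), evaluated in closed form."""
    rm = section_residues(poles, m)
    rn = section_residues(poles, n)
    return sum(rm[i] * rn[j] / (poles[i] + poles[j])
               for i in range(m) for j in range(n))


if __name__ == "__main__":
    poles = pole_positions(2500.0, 3, 2.0)
    for m in range(1, 4):
        print([round(inner_product(poles, m, n), 9) for n in range(1, 4)])
```

Under these assumptions the printed matrix of inner products should be close to the identity, which is the sense in which the reflected zeros give rise to orthogonality. Poles that lie very close together make the partial-fraction constants large, which mirrors the resolution problem the text mentions.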
SUMMARY OF THE GENERAL PROCESS FOR PRODUCING ORTHOGONAL FILTERS
1. Select the nature of the poles of the orthogonal filter, i.e., whether they are simple or complex, from the application for which the filters are to be used.
[...]

$$\int_{-\infty}^{\infty} \frac{|\log H_n(\omega)|}{1 + \omega^2} \, d\omega < \infty$$

is satisfied, where $H_n(\omega) = |H_n(j\omega)|$ is the amplitude characteristic of the transfer function $H_n(s)$.
4. From each transfer function, i.e., $H_1(s)$ and $H_2(s)$, etc., given by [...] and [...] (23), where [...] etc., the circuit is synthesized according to standard procedure, noting that the placement of the poles must follow the requirements of convergence if, for example, a speech signal is to be reconstructed.
5. Once $H_1(s)$ and $H_2(s)$ have been designed, the orthogonal filter is made by cascading $H_1(s)$ and $H_2(s)$ together and then adding to this cascade as many sections as needed according to the requirements of the application, in which subsequent sections are identical to $H_2(s)$ in terms of the circuit used.
The individual sections following the second section are designed according to the formula

$$H_{no}(s) = \frac{K_n}{K_{n-1}} \frac{C_{n-1}(-s)}{C_n(s)}$$

It is therefore clear that in terms of the circuit all the $H_{no}(s)$ for $n \ge 2$ are the same with the exception of the circuit parameters, which depend only on n, the number of the section. Hence, the additional requirements on the poles depend on the convergence of the reconstruction process or on some other physical requirement. The resulting chain is now an orthogonal filter.
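6. The design of the transfer functions $H_1(s)$, $H_2(s)$, ..., $H_n(s)$ depends on the placement of the poles $s_1, s_2, \ldots, s_n$. These in turn define the constants [...]
b. [...] $S_n = S_1/n$ [exponent of $n$ illegible]
c. For convergence almost everywhere, $S_n = S_1/n$ [exponent of $n$ illegible]
d. For other types of convergence, $S_n = S_1/n^p$; $p$ [...] 4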
7. The kind of convergence or the method of pole placement depends on the application intended, and once that is decided, the resolution of pole placement is determined and the resulting error in convergence can be estimated.
8. Once the transfer function is designed according to the process given in steps 1 through 7 above, the hardware is obtained by standard analog computer techniques or by factoring the transfer functions and then determining the equivalent partial fraction expansion, using amplifiers and differential amplifiers to take account of the negative signs that may result from the partial fraction constants. The process described above is the one used to obtain the hardware configuration of the orthogonal filter shown in FIG. 3.
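The cascade built in steps 4 and 5 can be sketched in the frequency domain. This is an illustrative example only: it assumes real first-order sections with $C_k(s) = s + s_k$, $B(s) = 1$, and all gain constants set to 1, and the function names are hypothetical. Each section after the first contributes a factor $(s_{n-1} - s)/(s + s_n)$, so every reflected pole-zero pair has unit magnitude on the $j\omega$-axis and only the last pole shapes the amplitude.

```python
def first_section_response(omega, s_1, gain=1.0):
    """Response of gain / (s + s_1) at s = j*omega (assumed first-section form)."""
    s = 1j * omega
    return gain / (s + s_1)

def section_response(omega, s_prev, s_n, gain=1.0):
    """Response of gain * (s_prev - s) / (s + s_n) at s = j*omega (assumed form for n >= 2)."""
    s = 1j * omega
    return gain * (s_prev - s) / (s + s_n)

def cascade_response(omega, poles):
    """Response from the filter input to the output of the last section."""
    h = first_section_response(omega, poles[0])
    for s_prev, s_n in zip(poles, poles[1:]):
        h *= section_response(omega, s_prev, s_n)
    return h
```

With these assumptions, `abs(cascade_response(omega, poles))` equals `1 / abs(1j * omega + poles[-1])`, because each earlier pole is paired with its mirror-image zero.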
SIGNAL MULTIPLIERS The signal multipliers shown in FIG. 1 for multiplying the orthogonal noise components of the white noise source with the speech signal are standard devices. In the frequency range of interest, i.e., bandwidths up to 20 kHz, the quarter Gauss square method was used. Other types can easily be used instead.
The quarter Gauss square multiplier is one that makes use of the identity $(A+B)^2 - (A-B)^2 = 4AB$.
TEMPORAL AVERAGER The temporal averager is a simple RC low pass filter with a very long time constant compared to the period of the highest frequency component in the speech signal. If $f$ denotes this frequency, then $RC \gg 1/f$. FIG. 4 gives an example of the configuration of the temporal averager. The temporal averager consists of input resistor 60, feedback capacitor 62 and amplifier 64.
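The multiplier identity and the averaging step can be sketched together. This is a discrete-time illustration under stated assumptions, not a model of the analog hardware: the RC low-pass filter is approximated by a backward-Euler update with sample spacing `dt`, and both function names are hypothetical.

```python
def quarter_square_multiply(a, b):
    """Product a*b obtained from squares only: ((a+b)^2 - (a-b)^2) / 4."""
    return ((a + b) ** 2 - (a - b) ** 2) / 4.0

def rc_average(samples, dt, time_constant):
    """First-order RC low-pass filter, discretized with a backward-Euler step.

    Returns the filter output after the last sample, starting from zero.
    """
    alpha = dt / (time_constant + dt)
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
    return y

# Example: average the product of two constant signals (illustrative values only).
products = [quarter_square_multiply(3.0, -2.5)] * 5000
average = rc_average(products, dt=0.001, time_constant=0.1)
```

A longer `time_constant` smooths more strongly but takes longer to settle, which mirrors the requirement that the averager's time constant be long compared with the period of the highest speech frequency.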
GAUSSIAN WHITE NOISE SOURCE The Gaussian white noise source is a standard noise generator of the diode type.
SYSTEM FOR THE RECONSTRUCTION OF THE SPEECH SIGNAL FROM SPEECH EXTRACTION COEFFICIENTS AND WHITE NOISE Once the speech extraction coefficients $a_1, a_2, a_3, \ldots, a_n$ are obtained from the Speech Extraction Code Generator, shown in FIG. 1, it is possible to reconstruct from these coefficients the original speech sample, $x(t)$. The reconstruction of $x(t)$ is carried out by multiplying each coefficient $\{a_k : k = 1, 2, \ldots, n\}$ by the corresponding member of a set of orthogonal noise functions, which we shall denote by the symbols $\{w_k(t) : k = 1, 2, \ldots, n\}$ and $0 \le t$ [...], and then summing the resulting set of products.
Thus, we form the set of products $a_1 w_1(t), a_2 w_2(t), a_3 w_3(t), \ldots, a_n w_n(t)$ and then sum to obtain the reconstructed speech signal $x(t)$ as the summation

$$S_n(t) = \sum_{k=1}^{n} a_k w_k(t) \qquad (27)$$

It is evident by analogy that the coefficients $a_1, a_2, \ldots, a_n$ correspond to the coefficients of a Fourier series and the orthogonal noise functions $w_1(t), w_2(t), w_3(t), \ldots, w_n(t)$ correspond to the orthogonal set of trigonometric functions. Equation (27) is the partial sum of a generalized random orthogonal series which converges as $n \to \infty$ to the speech sample $x(t)$ in some probabilistic [...] can be calculated from the equation [...] For a speech sample, [...] will be fixed. Then $\epsilon$ is smallest when [...] is largest.
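The partial sum of Equation (27) and the behavior of the error can be illustrated with a small deterministic stand-in. The sketch below assumes a discrete, exactly orthonormal set of basis vectors in place of the orthogonal noise functions, and it uses the standard identity for orthonormal expansions that the residual energy equals the signal energy minus the sum of squared coefficients. The patent's own error equation is not legible, so this identity is an assumption, and the function names are hypothetical.

```python
def project(signal, basis):
    """Coefficients a_k = sum_t x[t] * w_k[t] for an orthonormal basis."""
    return [sum(x * w_t for x, w_t in zip(signal, w)) for w in basis]

def reconstruct(coefficients, basis):
    """Partial sum S_n[t] = sum_k a_k * w_k[t]."""
    length = len(basis[0])
    return [sum(a * w[t] for a, w in zip(coefficients, basis))
            for t in range(length)]

def residual_energy(signal, coefficients):
    """Signal energy minus the sum of squared coefficients (orthonormal basis assumed)."""
    return sum(x * x for x in signal) - sum(a * a for a in coefficients)

# Three orthonormal vectors of length 4 (illustrative values only).
basis = [[0.5, 0.5, 0.5, 0.5],
         [0.5, 0.5, -0.5, -0.5],
         [0.5, -0.5, 0.5, -0.5]]
signal = [1.0, 2.0, 3.0, 5.0]
coefficients = project(signal, basis)
estimate = reconstruct(coefficients, basis)
```

For a fixed signal the energy term is fixed, so in this sketch the residual shrinks as more coefficients are included, matching the statement that the error is smallest when the coefficient sum is largest.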
THE PHYSICAL RECONSTRUCTION SYSTEM FIG. 5 is a block diagram of the speech reconstruction system. Using the coding coefficients $a_1, a_2, \ldots, a_n$ and a Gaussian white noise source of spectral power density of $1/N$ watts per cycle, the speech sample function $x(t)$ is reconstructed according to the description given above.
In FIG. 5, the white noise signal, $k(t)$, of spectral density $1/N$ watts per cycle is applied to an orthogonal filter which has a transfer function, $H_n(s)$, identical to the orthogonal filter of the coding generator shown in FIG. 1. The response of this orthogonal filter is the set of orthogonal noise sample functions $w_1(t), w_2(t), w_3(t), \ldots, w_n(t)$. Each of these is multiplied respectively by the coefficients $a_1, a_2, \ldots, a_n$ derived from the coding generator (FIG. 1). The products thus formed in the multipliers are summed in the adder to give

$$x(t) \approx \sum_{k=1}^{n} a_k w_k(t) \qquad (29)$$

It will be noted that the elements of the reconstruction system are quite similar to the coefficient code generator system. The orthogonal filter is exactly the same design as in the code generator case. The white noise source of the reconstruction system differs from that of the coding generator system in that the spectral density of one is the reciprocal of the other.
The multiplier is the same kind of signal multiplier described above. The speech extraction coefficients can be stored in a memory such as a tape as voltages of appropriate value.
THE ADDER The summing of the products $a_1 w_1(t), a_2 w_2(t), \ldots, a_n w_n(t)$ is accomplished in a conventional adder.
RELATIONS BETWEEN SPECTRAL DENSITIES Since the white noise sources of the generation and reconstruction systems have reciprocal spectral density functions, the sample functions are related according to [...]
SUMMARY In the method presented here the speech analysis and synthesis technique capitalizes on the fact that speech is a random signal that can be decomposed into a generalized orthogonal series something like the Fourier series. This means that a speech signal can be represented by a set of coefficients which depend only on the nature of the speech information and on the length of the speech sample. This speech synthesizing system is an electronic system for accomplishing the generation of these speech coefficients or speech extraction parameters and for utilizing them to reconstruct the speech into a spoken signal.
The novelty of the system consists in the utilization of orthogonal filters, white noise and temporal averaging devices connected in the unique arrangement shown in FIG. 1 that gives rise to the speech extraction coefficients $a_1, a_2, \ldots, a_n$. It should be noted that the speech extraction parameters are very narrow band signals. These signals are nearly constant for [...]. This also means that the system can be used to greatly compress speech.
Obviously many modifications and variations of the present invention are possible in the light of the above teachings. It is therefore to be understood that within the scope of the appended claims the invention may be practiced otherwise than as specifically described.
What is claimed is:
1. A method for making an orthogonal filter having sections 1 through n comprising the steps of:
selecting a first transfer function as $H_1(s) = K_1 (B(s)/C_1(s))$ wherein the numerator $B(s)$ is selected to satisfy the relationship:
$B(s) = b_0 + b_1 s + \cdots + b_k s^k$
and the denominator $C_1$ is selected to satisfy the following relationship:
$C_1(s) = c_{10} + c_{11} s + \cdots + c_{1m} s^m$ wherein the constants $b_0, b_1, \ldots, b_k$ and the constants $c_{10}, c_{11}, \ldots, c_{1m}$ are chosen such that $H_1(s)$ is capable of being manufactured from realizable components;
constructing a first filter section having as its transfer function said first transfer function; placing said first filter section first serially in a group of n filter sections; selecting a second transfer function as $H_{2o}(s) = (K_2/K_1)(C_1(-s)/C_2(s))$, selecting the numerator of the second equation $C_1(-s)$ and the denominator of the first equation $C_1(s)$ of the transfer function so that the zeros of the second transfer function occupy the positions of the poles of the first transfer function and are reflected about the $j\omega$-axis, and with the effect that poles of the first transfer function cancel the zeros of the second transfer function thereby producing orthogonality; constructing a second filter section having as its transfer function said second transfer function; placing said second filter section second serially after said first filter section in a group of n filter sections;
arranging each successive section following section 2 such that the zeros of their respective transfer functions occupy the positions of the poles of the preceding transfer functions reflected about the $j\omega$ axis and with the zeros of the successive transfer function cancelling the poles of the preceding transfer function thereby giving rise to orthogonality; and,
arranging the n-th successive filter section of the orthogonal filter such that the transfer function of each filter in a series including n filters is described by the relationship:
$$H_n(s) = K_n B(s) \frac{\prod_{k=1}^{n-1} C_k(-s)}{\prod_{k=1}^{n} C_k(s)}$$
wherein k signifies all the filters in the series from the first filter to the n-th filter, inclusively.
2. A method as defined in claim 1 wherein the orthogonal filter is to be used in a speech synthesizer including the steps of:
selecting $s_1$ to be within the frequency range of 200 Hertz to 2,500 Hertz, and
selecting the remaining poles in accordance with the formula $s_n = (s_1/n^3)$ for $n \ge 1$.
3. An orthogonal filter having sections 1 through n comprising:
a first section characterized by the transfer function $H_1(s) = K_1 (B(s)/C_1(s))$ wherein the numerator $B(s)$ is selected to satisfy the relationship: $B(s) = b_0 + b_1 s + \cdots + b_k s^k$ and the denominator $C_1$ is selected to satisfy the following relationship: $C_1(s) = c_{10} + c_{11} s + \cdots + c_{1m} s^m$ wherein the constants $b_0, b_1, \ldots, b_k$ and the constants $c_{10}, c_{11}, \ldots, c_{1m}$ are chosen such that $H_1(s)$ is capable of being manufactured from realizable components;
a second section characterized by the transfer function $H_{2o}(s) = (K_2/K_1)(C_1(-s)/C_2(s))$ wherein the numerator of the second equation $C_1(-s)$ and the denominator of the first equation $C_1(s)$ of the transfer function are selected such that the zeros of the second transfer function occupy the positions of the poles of the first transfer function and are reflected about the $j\omega$-axis, and with the effect that poles of the first transfer function cancel the zeros of the second transfer function thereby producing orthogonality;
arranging each successive filter section following section 2 such that the zeros of their respective transfer function occupy the positions of the poles of the preceding transfer function reflected about the $j\omega$ axis and with the zeros of the transfer function of the successive filter section cancelling the poles of the transfer function of the preceding section thereby giving rise to orthogonality; and
an n-th section characterized by the transfer function
$$H_n(s) = K_n B(s) \frac{\prod_{k=1}^{n-1} C_k(-s)}{\prod_{k=1}^{n} C_k(s)}$$
wherein k denotes all the filter sections in the series from the first filter to the n-th filter inclusive.
4. An orthogonal filter as defined in claim 3 for use in a speech synthesizing system which further includes: selecting $s_1$ from within the range of 200 Hertz to 2,500 Hertz, and selecting the remaining poles in accordance with the formula $s_n = (s_1/n^3)$ for $n \ge 1$.

Claims (4)

1. A method for making an orthogonal filter having sections 1 through n comprising the steps of: selecting a first transfer function as $H_1(s) = K_1 (B(s)/C_1(s))$ wherein the numerator $B(s)$ is selected to satisfy the relationship: $B(s) = b_0 + b_1 s + \cdots + b_k s^k$ and the denominator $C_1$ is selected to satisfy the following relationship: $C_1(s) = c_{10} + c_{11} s + \cdots + c_{1m} s^m$ wherein the constants $b_0, b_1, \ldots, b_k$ and the constants $c_{10}, c_{11}, \ldots, c_{1m}$ are chosen such that $H_1(s)$ is capable of being manufactured from realizable components; constructing a first filter section having as its transfer function said first transfer function; placing said first filter section first serially in a group of n filter sections; selecting a second transfer function as $H_{2o}(s) = (K_2/K_1)(C_1(-s)/C_2(s))$, selecting the numerator of the second equation $C_1(-s)$ and the denominator of the first equation $C_1(s)$ of the transfer function so that the zeros of the second transfer function occupy the positions of the poles of the first transfer function and are reflected about the $j\omega$-axis, and with the effect that poles of the first transfer function cancel the zeros of the second transfer function thereby producing orthogonality; constructing a second filter section having as its transfer function said second transfer function; placing said second filter section second serially after said first filter section in a group of n filter sections; arranging each successive section following section 2 such that the zeros of their respective transfer functions occupy the positions of the poles of the preceding transfer functions reflected about the $j\omega$ axis and with the zeros of the successive transfer function cancelling the poles of the preceding transfer function thereby giving rise to orthogonality; and, arranging the n-th successive filter section of the orthogonal filter such that the transfer function of each filter in a series including n filters is described by the relationship: $H_n(s) = K_n B(s) \prod_{k=1}^{n-1} C_k(-s) / \prod_{k=1}^{n} C_k(s)$ wherein k signifies all the filters in the series from the first filter to the n-th filter, inclusively.
2. A method as defined in claim 1 wherein the orthogonal filter is to be used in a speech synthesizer including the steps of: selecting $s_1$ to be within the frequency range of 200 Hertz to 2,500 Hertz, and selecting the remaining poles in accordance with the formula $s_n = (s_1/n^3)$ for $n \ge 1$.
3. An orthogonal filter having sections 1 through n comprising: a first section characterized by the transfer function $H_1(s) = K_1 (B(s)/C_1(s))$ wherein the numerator $B(s)$ is selected to satisfy the relationship: $B(s) = b_0 + b_1 s + \cdots + b_k s^k$ and the denominator $C_1$ is selected to satisfy the following relationship: $C_1(s) = c_{10} + c_{11} s + \cdots + c_{1m} s^m$ wherein the constants $b_0, b_1, \ldots, b_k$ and the constants $c_{10}, c_{11}, \ldots, c_{1m}$ are chosen such that $H_1(s)$ is capable of being manufactured from realizable components; a second section characterized by the transfer function $H_{2o}(s) = (K_2/K_1)(C_1(-s)/C_2(s))$ wherein the numerator of the second equation $C_1(-s)$ and the denominator of the first equation $C_1(s)$ of the transfer function are selected such that the zeros of the second transfer function occupy the positions of the poles of the first transfer function and are reflected about the $j\omega$-axis, and with the effect that poles of the first transfer function cancel the zeros of the second transfer function thereby producing orthogonality; arranging each successive filter section following section 2 such that the zeros of their respective transfer function occupy the positions of the poles of the preceding transfer function reflected about the $j\omega$ axis and with the zeros of the transfer function of the successive filter section cancelling the poles of the transfer function of the preceding section thereby giving rise to orthogonality; and an n-th section characterized by the transfer function $H_n(s) = K_n B(s) \prod_{k=1}^{n-1} C_k(-s) / \prod_{k=1}^{n} C_k(s)$ wherein k denotes all the filter sections in the series from the first filter to the n-th filter inclusive.
4. An orthogonal filter as defined in claim 3 for use in a speech synthesizing system which further includes: selecting $s_1$ from within the range of 200 Hertz to 2,500 Hertz, and selecting the remaining poles in accordance with the formula $s_n = (s_1/n^3)$ for $n \ge 1$.
US00191003A 1971-06-23 1971-10-20 Method of designing orthogonal filters Expired - Lifetime US3800093A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US00191003A US3800093A (en) 1971-10-20 1971-10-20 Method of designing orthogonal filters
US00313239A US3833767A (en) 1971-06-23 1972-12-08 Speech compression system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US00191003A US3800093A (en) 1971-10-20 1971-10-20 Method of designing orthogonal filters

Publications (1)

Publication Number Publication Date
US3800093A true US3800093A (en) 1974-03-26

Family

ID=22703718

Family Applications (1)

Application Number Title Priority Date Filing Date
US00191003A Expired - Lifetime US3800093A (en) 1971-06-23 1971-10-20 Method of designing orthogonal filters

Country Status (1)

Country Link
US (1) US3800093A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4188667A (en) * 1976-02-23 1980-02-12 Beex Aloysius A ARMA filter and method for designing the same
US4125898A (en) * 1977-01-05 1978-11-14 The Singer Company Digitally shaped noise generating system
US4545065A (en) * 1982-04-28 1985-10-01 Xsi General Partnership Extrema coding signal processing method and apparatus
US20210319800A1 (en) * 2019-01-31 2021-10-14 Mitsubishi Electric Corporation Frequency band expansion device, frequency band expansion method, and storage medium storing frequency band expansion program
US11763828B2 (en) * 2019-01-31 2023-09-19 Mitsubishi Electric Corporation Frequency band expansion device, frequency band expansion method, and storage medium storing frequency band expansion program
