US3830977A - Speech-systhesiser - Google Patents
- Publication number
- US3830977A (application US00231558A)
- Authority
- US
- United States
- Prior art keywords
- signal
- frequency
- auxiliary
- amplitude
- generators
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
Definitions
- … Claffy. Assistant Examiner: Jon Bradford Leaheey. Attorney, Agent, or Firm: Cushman, Darby & Cushman. [57] ABSTRACT
- A synthesiser which, for each sampling period, reconstitutes a language element by means of three sinusoidal components obtained with the help of variable-frequency generators and variable attenuators. Those components are simultaneously subjected to predetermined rephasing operations carried out at an auxiliary frequency, which is identified with the pitch frequency (the frequency of vibration of the voice) at the time of emission of vowels or voiced consonants.
- This auxiliary frequency is delivered by a further variable-frequency generator.
- The signal representing the sum of these components is amplitude-modulated by a modulating signal at the auxiliary frequency.
- The present invention relates to an improvement in speech synthesisers that are supplied, with a recurrence periodicity T corresponding to a sampling periodicity T, with digital information conveying what will be referred to hereinafter as the main information. The main information makes it possible to reconstitute a language element approximately by adding up, for each sampling period, a number p (equal to or less than a fixed number n) of sinusoidal signals whose frequencies and amplitudes form the main information, and which will be referred to here as main components.
- A main component is a portion, in the course of time, of a formant, a formant being defined as a succession in time of spectral components whose frequencies are identical or vary little, and which corresponds to an absolute or relative maximum of energy in the speech spectrum.
- The formant portions which are transmitted with each speech sampling operation are determined in accordance with criteria set out in the aforementioned Patent (the precise way in which the main information is selected being, however, outside the scope of the present invention, which only concerns its use in a synthesiser).
- The object of the present invention is the utilisation, in synthesisers of the above-mentioned kind, of auxiliary information relating to the person speaking and making it possible, to a certain extent, to identify him.
- This auxiliary information is the pitch information. During the emission of vowels and voiced consonants, it is constituted by a frequency which is the frequency of vibration of the vocal cords of the person speaking; this will be referred to for short as the pitch frequency, and it generally ranges between 80 and 350 c/s.
- The spectral components relating to the vowels and voiced consonants are harmonics of this pitch frequency, unlike those of the unvoiced consonants.
- The simplest of these devices is of the peak detector type:
- The speech signal is applied to a bandpass filter or, safer still, to two bandpass filters, each normally producing a signal which is amplitude-modulated at the pitch frequency at the time of emission of vowels or voiced consonants. Each of these filters is followed by an amplitude detector, itself followed by a peak detector, the frequency of the peaks corresponding to the pitch frequency when the latter appears in the signal; otherwise, the measuring device produces an output frequency which fluctuates very widely.
- The frequency produced by the measuring device, referred to here as the auxiliary frequency, is permanently transmitted through an auxiliary channel, irrespective of whether it is in fact the pitch frequency or not. A special signal, whose production requires complex equipment, is additionally transmitted to indicate whether the auxiliary frequency is the pitch frequency. If it is, the speech synthesis is carried out by means of a generator producing harmonics of this frequency, which is obviously only reconstituted to within the limits dictated by the quantizing operations; if it is not, the auxiliary frequency is not utilised and synthesis is effected with the help of a noise generator.
- The present invention makes it possible to exploit the pitch information by carrying out predetermined, simultaneous rephasing of all the main components, these rephasing operations being effected at the pitch frequency; they render the sum signal of the main components periodic at this frequency.
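The effect of the rephasing described above can be illustrated with a minimal numerical sketch. It assumes ideal sinusoids, a reset phase of 270° for every component, and hypothetical values for the component frequencies, amplitudes and pitch; the function name `rephased_sum` is illustrative and does not come from the patent.

```python
import math

def rephased_sum(t, freqs, amps, pitch, reset_phase=-math.pi / 2):
    """Sum of sinusoids whose phases are all reset every 1/pitch seconds.

    reset_phase = -pi/2 corresponds to the 270 degree phase (assumed here).
    """
    period = 1.0 / pitch
    tau = t % period  # time elapsed since the last rephasing instant
    return sum(a * math.sin(2 * math.pi * f * tau + reset_phase)
               for f, a in zip(freqs, amps))

# Hypothetical values, chosen only for illustration.
freqs = [730.0, 1090.0, 2440.0]   # main-component frequencies (c/s)
amps = [1.0, 0.5, 0.25]           # main-component amplitudes
pitch = 120.0                     # auxiliary (pitch) frequency (c/s)

t = 0.0031
print(rephased_sum(t, freqs, amps, pitch))
print(rephased_sum(t + 1.0 / pitch, freqs, amps, pitch))
```

Although none of the three frequencies is a multiple of the pitch, the two printed values agree to within rounding, because every component restarts from the same phase at each pitch period.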
- A speech synthesiser designed to carry out speech synthesis on the basis of periodic information comprising main information relating to a language element, constituted by the frequencies and amplitudes of p sinusoidal components hereinafter referred to as main components (p being a variable number which is at most equal to a fixed number n greater than 1), and auxiliary information constituted by a frequency, hereinafter referred to as the auxiliary frequency, which, during the emission of vowels and voiced consonants, is the frequency of vibration of the vocal cords of the speaker, referred to as the pitch frequency
- said synthesiser comprising: n variable-frequency generators and n amplitude-control devices respectively associated with said n generators, said n generators and said amplitude-control devices being controlled by said main information in order to reconstitute said main components; an adder, having an output, for delivering the sum of said reconstituted main components; a device, hereinafter referred to as the rephasing device, for rephas
- FIG. 1 is the diagram of a preferred embodiment of the synthesiser in accordance with the invention, and FIG. 2 illustrates an element of the diagram of FIG. 1.
- The information source 1 is the source designed to supply the synthesiser.
- This source may, for example, be the final stage of a receiver device which produces the information required by the synthesiser, in the form of parallel binary signals. Each of these signals is maintained at the outputs of the source 1 for a period T equal to the analysis period.
- The source 1 thus has seven multiple outputs (each multiple output having a plurality of terminals, each of which corresponds to a binary digit), namely the outputs 11, 12 and 13 respectively delivering the values of the frequencies F1, F2 and F3 of the main components, the outputs 21, 22 and 23 delivering the values of the corresponding amplitudes A1, A2 and A3 and, finally, the output 10 delivering the value of the auxiliary frequency f.
- The outputs 11, 12 and 13 are respectively connected to three code converter circuits 31, 32 and 33 which respectively produce, from the digital data representing F1, F2 and F3 (these data are generally constituted by the identification number of an analysis filter), the numbers F/(2qF1), F/(2qF2) and F/(2qF3), where F is a frequency very much higher than the acoustic frequency band (300 to 3,500 c/s for example) analysed at emission, and where q is a fixed whole number in the order of …, for example.
- Three variable dividers 41, 42, 43 of the counter type receive the pulses from a clock 2 at frequency F.
- These dividers are respectively provided with multiple control inputs respectively connected to the outputs of the code converters 31, 32 and 33, and their respective outputs are connected to the frequency control inputs 61, 62 and 63 of three signal generators 51, 52 and 53, which they respectively supply with pulses of frequency 2qF1, 2qF2 and 2qF3.
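The divider arithmetic can be sketched as follows. This is only an illustration: the clock frequency and the value of q are hypothetical (the text gives neither), the function names are invented for the example, and the rounding to the nearest integer is an assumption about how a counter-type divider would be programmed.

```python
def divider_ratio(clock_hz, q, target_hz):
    """Integer division ratio N such that clock_hz / N is close to 2*q*target_hz."""
    return max(1, round(clock_hz / (2 * q * target_hz)))

def synthesised_frequency(clock_hz, q, ratio):
    """Frequency of the sinusoid when the generator receives clock_hz / ratio
    shift pulses per second and needs 2*q pulses per cycle."""
    return clock_hz / (ratio * 2 * q)

CLOCK_HZ = 1_000_000  # hypothetical clock frequency F
Q = 8                 # hypothetical q (2*q shift pulses per sinusoid cycle)

for target in (500.0, 1500.0, 2500.0):  # hypothetical component frequencies (c/s)
    n = divider_ratio(CLOCK_HZ, Q, target)
    print(target, n, synthesised_frequency(CLOCK_HZ, Q, n))
```

Because the ratio is an integer, the synthesised frequency only approximates the requested one; the higher the clock frequency relative to the acoustic band, the finer the achievable frequency steps.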
- Each of the three generators (FIG.
- For a periodic shift pulse train, the generator produces a step signal comprising a certain number of cycles, which simply has to be fed into a low-pass filter in order to convert it to a sinusoidal signal, the direct component normally being eliminated thereafter, through an amplification process for example.
- The three generators will respectively produce output voltages whose envelopes (apart from a direct component) are portions of sinusoidal waveforms having respective frequencies F1, F2 and F3. It is likewise easy to appreciate that the load of the register at a given instant determines the phase of the sinusoidal signal at that same instant.
- The three generators furthermore comprise inputs 71, 72 and 73 making it possible to reset to zero all the stages of the respective registers, this register condition corresponding to the phase 270° in the sinusoidal signal.
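A generator of the shift register and resistance network type can be modelled in a few lines. The sketch below assumes a twisted-ring register of q stages (the last stage is fed back inverted, so the register fills with ones and then empties again) and resistor weights chosen so that the summed output follows a raised cosine. Neither the feedback arrangement nor the weights are given in this text, so both are assumptions, as are all function names.

```python
import math

def resistor_weights(q):
    """Stage weights chosen so the summed output follows (1 - cos) / 2."""
    level = [(1 - math.cos(math.pi * k / q)) / 2 for k in range(q + 1)]
    return [level[k + 1] - level[k] for k in range(q)]

def shift(register):
    """One shift pulse: the inverted last stage is fed back into the first stage."""
    return [1 - register[-1]] + register[:-1]

def step_signal(q, n_pulses):
    """Output level before each shift pulse, starting from the all-zero (reset) state."""
    weights = resistor_weights(q)
    register = [0] * q
    levels = []
    for _ in range(n_pulses):
        levels.append(sum(w * s for w, s in zip(weights, register)))
        register = shift(register)
    return levels

levels = step_signal(q=8, n_pulses=16)  # one full cycle of 2*q steps
print([round(v, 3) for v in levels])
```

In this model the all-zero state gives the minimum of the step signal, which matches the statement that resetting every stage to zero corresponds to the 270° phase of the sinusoid.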
- The three generators respectively supply three variable attenuators 81, 82 and 83, the control inputs of which are respectively connected to the outputs 21, 22 and 23 of the source 1 through three digital-analogue converters 91, 92, 93.
- The output signals from the attenuators 81, 82 and 83 are added in an adder 55.
- The signals appearing at the output of the adder 55 have lost their direct components because of the presence of a capacitor arranged at some suitable point.
- The output of the adder 55 supplies the carrier input of an amplitude modulator 65 whose output is coupled, through a low-pass filter 75, to the input of an electro-acoustic transducer 85 which represents the output element of the synthesiser.
- The auxiliary frequency, in this preferred embodiment of the synthesiser, is utilised not only for the aforesaid rephasing operations, but also to carry out amplitude modulation (in the modulator 65) of the output signal from the adder 55, in such a manner that the instantaneous value of the modulating signal of frequency f passes through a minimum at the time of the rephasing operations.
- The overall effect thus obtained is highly satisfactory.
- Each main component is rephased to the same fixed value, corresponding to a zero in all the stages of each register.
- The output 10 of the source 1, in order to produce a signal of frequency f, supplies a circuit of the same kind as that which, as far as the frequency is concerned, is employed for the step signals of frequencies F1, F2 and F3.
- This circuit contains a code converter circuit 30 followed by a variable divider 40 and a signal generator 50.
- The variable divider 40 is supplied with pulses by a fixed divider 90, itself supplied from the clock 2.
- The output signal of frequency f from the generator 50 is applied to the modulating input of the modulator 65.
- The modulating signal is the step signal produced by the generator 50, not yet rid of its direct component, so that it passes through a minimum value of zero when the register of the generator 50 contains nothing but zeros; moreover, the modulation depth is adjusted so that the modulated signal becomes zero at the same time as the modulating signal.
- The output signal from the modulator 65 is then supplied to the low-pass filter 75, which smooths it, removing the discontinuities which are due to the steps in both the modulated and the modulating signals.
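The interplay between rephasing and amplitude modulation can be sketched numerically. The example assumes a smooth raised-cosine modulating signal in place of the step signal of the generator 50, ideal sinusoidal components reset to the 270° phase, and hypothetical frequencies and amplitudes; the function names are illustrative only.

```python
import math

def carrier(t, freqs, amps, pitch):
    """Sum of main components, all reset to the 270 degree phase every 1/pitch."""
    tau = t % (1.0 / pitch)
    return sum(a * math.sin(2 * math.pi * f * tau - math.pi / 2)
               for f, a in zip(freqs, amps))

def modulator(t, pitch):
    """Modulating signal with a minimum of zero at each rephasing instant."""
    return (1 - math.cos(2 * math.pi * pitch * t)) / 2

def output(t, freqs, amps, pitch):
    """Modulated signal: it vanishes whenever the modulating signal vanishes."""
    return modulator(t, pitch) * carrier(t, freqs, amps, pitch)

freqs = [730.0, 1090.0, 2440.0]  # hypothetical values (c/s)
amps = [1.0, 0.5, 0.25]          # hypothetical values
pitch = 120.0                    # hypothetical auxiliary frequency (c/s)

t_reset = 1.0 / pitch
eps = 1e-7
print(carrier(t_reset - eps, freqs, amps, pitch), carrier(t_reset + eps, freqs, amps, pitch))
print(output(t_reset - eps, freqs, amps, pitch), output(t_reset + eps, freqs, amps, pitch))
```

The carrier generally jumps at the rephasing instant, whereas the modulated output is close to zero on both sides of it, so the jump is masked rather than passed on to the transducer.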
- The zero transits in the output signal from the generator 50 are detected by means of a decoding circuit which is reduced here to a simple NOR-gate 35 with two inputs respectively connected to the two end stages of the shift register of the generator 50. It can readily be confirmed that this register can only simultaneously carry a zero in each of these two stages when all its stages carry the zero condition.
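The claim about the two end stages can be checked by enumeration. The sketch below again assumes a twisted-ring register (inverted last stage fed back to the first stage), which is consistent with a register that fills with ones and then returns to zeros but is not spelled out in this text; the helper names are illustrative.

```python
def shift(register):
    """One shift pulse: the inverted last stage is fed back into the first stage."""
    return [1 - register[-1]] + register[:-1]

def nor(a, b):
    """Two-input NOR gate on binary values."""
    return int(not (a or b))

def reset_pulses(q):
    """NOR of the two end stages for each of the 2*q states of one cycle."""
    register = [0] * q
    pulses = []
    for _ in range(2 * q):
        pulses.append(nor(register[0], register[-1]))
        register = shift(register)
    return pulses

print(reset_pulses(8))
```

Under this assumption the gate output is 1 only for the all-zero state, once per cycle, which is the instant at which the main-component registers are to be reset.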
- The output signal from the gate 35 is applied to the inputs 71, 72 and 73 of the generators 51, 52 and 53 and thus effects the rephasing operations at the desired instants.
- Another feature which improves the reconstitution of the vowels and voiced consonants consists in using as the modulating signal a periodic signal of periodicity θ = 1/f, a cycle of which is formed by two sinusoidal half-cycles (apart from the direct component which gives them a minimum value of zero) of respective durations θ/4 for the rise from zero to the maximum and 3θ/4 for the descent from the maximum to zero.
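The asymmetric modulating waveform can be written down directly. This sketch assumes ideal (unstepped) half-cycles normalised to a maximum of 1; the function name and the test frequency are hypothetical.

```python
import math

def modulating_signal(t, pitch):
    """Periodic signal of period theta = 1/pitch: a sinusoidal half-cycle rising
    from 0 to 1 in theta/4, then one falling from 1 to 0 in 3*theta/4."""
    theta = 1.0 / pitch
    tau = t % theta
    rise = theta / 4
    if tau < rise:
        return (1 - math.cos(math.pi * tau / rise)) / 2
    return (1 + math.cos(math.pi * (tau - rise) / (theta - rise))) / 2

pitch = 100.0  # hypothetical auxiliary frequency (c/s)
theta = 1.0 / pitch
for fraction in (0.0, 0.125, 0.25, 0.625, 0.99):
    print(fraction, round(modulating_signal(fraction * theta, pitch), 4))
```

The signal is zero at the start of each period, reaches its maximum a quarter of the way through, and decays over the remaining three quarters.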
- A network of 2q resistors is used, each of which is selectively connected to the source of the generator for a corresponding position of the 1 digit, this making it possible to give any desired shape to a cycle of the step signal.
- A given phase of the envelope of the output signal is indicated by the position, in a given stage, of the single 1 digit circulating through the register, and the output of this stage can directly control the predetermined rephasing operations.
- Another solution consists in retaining the generator 50 of FIG. 1 and supplying it with shift pulses whose frequency is three times higher for the rise portions of the signal (register changes from "0" throughout to "1" throughout) than it is for the descent portions (return to the "0" throughout condition).
- A speech synthesiser comprising means for receiving, for each one of successive sampling periods, first digital signals representative of the amplitude of n formants in a speech wave, n being an integer greater than 1, second digital signals representative of the frequencies of said n formants, and an auxiliary digital signal representative of an auxiliary frequency which, for the time intervals corresponding in said speech wave to the emission of vowels and voiced consonants, is the pitch frequency, said synthesiser comprising: n variable frequency signal generators having respective outputs, said n signal generators being signal generators of the shift register and resistance network type, each one of said n signal generators comprising a shift register having stages and a shift pulse input, said shift register of each one of said n signal generators having a further input for resetting its stages to predetermined states; n amplitude-control devices respectively coupled to said outputs of said n signal generators, and having respective amplitude control inputs and respective outputs; means for deriving from said first signals n amplitude control signals and respectively applying them to said
- said further control circuit comprises a further signal generator, having an output, for deriving from said auxiliary signal a modulation signal at said auxiliary frequency, said further signal generator being of the shift register and resistance network type and comprising a shift register having stages and a shift pulse input, means for deriving from said auxiliary signal a further series of pulses having a recurrence frequency proportional to said auxiliary frequency and applying said further series to said shift pulse input of said further signal generator, and decoding means coupled to said stages of said further signal generator for generating said phase control pulses at the instants when the value of said modulation signal passes through a minimum; said speech synthesiser further including an amplitude modulator … said adder and a modulation input coupled to said output of said further signal generator for receiving said modulation signal.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrophonic Musical Instruments (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR7110824A FR2130952A5 | 1971-03-26 | 1971-03-26 | |
Publications (1)
Publication Number | Publication Date |
---|---|
US3830977A (en) | 1974-08-20 |
Family
ID=9074226
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US00231558A Expired - Lifetime US3830977A (en) | 1971-03-26 | 1972-03-03 | Speech-systhesiser |
Country Status (10)
Country | Link |
---|---|
US (1) | US3830977A |
AU (1) | AU463038B2 |
BE (1) | BE781116A |
DE (1) | DE2214521A1 |
FR (1) | FR2130952A5 |
GB (1) | GB1364775A |
IT (1) | IT952370B |
NL (1) | NL7203873A |
SE (1) | SE375178B |
ZA (1) | ZA721392B |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4051331A (en) * | 1976-03-29 | 1977-09-27 | Brigham Young University | Speech coding hearing aid system utilizing formant frequency transformation |
US4075424A (en) * | 1975-12-19 | 1978-02-21 | International Computers Limited | Speech synthesizing apparatus |
US4566117A (en) * | 1982-10-04 | 1986-01-21 | Motorola, Inc. | Speech synthesis system |
US5140639A (en) * | 1990-08-13 | 1992-08-18 | First Byte | Speech generation using variable frequency oscillators |
RU2296377C2 (ru) * | 2005-06-14 | 2007-03-27 | Михаил Николаевич Гусев | Способ анализа и синтеза речи |
US20160104490A1 (en) * | 2013-06-21 | 2016-04-14 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Method and apparataus for obtaining spectrum coefficients for a replacement frame of an audio signal, audio decoder, audio receiver, and system for transmitting audio signals |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3268660A (en) * | 1963-02-12 | 1966-08-23 | Bell Telephone Labor Inc | Synthesis of artificial speech |
US3394228A (en) * | 1965-06-03 | 1968-07-23 | Bell Telephone Labor Inc | Apparatus for spectral scaling of speech |
US3491205A (en) * | 1966-09-29 | 1970-01-20 | Philco Ford Corp | Plural formant speech synthesizer |
US3499991A (en) * | 1967-08-01 | 1970-03-10 | Philco Ford Corp | Voice-excited vocoder |
US3532821A (en) * | 1967-11-29 | 1970-10-06 | Hitachi Ltd | Speech synthesizer |
- 1971
  - 1971-03-26 FR FR7110824A patent/FR2130952A5/fr not_active Expired
- 1972
  - 1972-03-01 ZA ZA721392A patent/ZA721392B/xx unknown
  - 1972-03-03 US US00231558A patent/US3830977A/en not_active Expired - Lifetime
  - 1972-03-23 NL NL7203873A patent/NL7203873A/xx not_active Application Discontinuation
  - 1972-03-23 BE BE781116A patent/BE781116A/xx unknown
  - 1972-03-24 AU AU40375/72A patent/AU463038B2/en not_active Expired
  - 1972-03-24 DE DE19722214521 patent/DE2214521A1/de active Pending
  - 1972-03-24 SE SE7203892A patent/SE375178B/xx unknown
  - 1972-03-24 IT IT49208/72A patent/IT952370B/it active
  - 1972-03-24 GB GB1399472A patent/GB1364775A/en not_active Expired
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4075424A (en) * | 1975-12-19 | 1978-02-21 | International Computers Limited | Speech synthesizing apparatus |
US4092495A (en) * | 1975-12-19 | 1978-05-30 | International Computers Limited | Speech synthesizing apparatus |
US4051331A (en) * | 1976-03-29 | 1977-09-27 | Brigham Young University | Speech coding hearing aid system utilizing formant frequency transformation |
US4566117A (en) * | 1982-10-04 | 1986-01-21 | Motorola, Inc. | Speech synthesis system |
US5140639A (en) * | 1990-08-13 | 1992-08-18 | First Byte | Speech generation using variable frequency oscillators |
RU2296377C2 (ru) * | 2005-06-14 | 2007-03-27 | Михаил Николаевич Гусев | Способ анализа и синтеза речи |
US20160104490A1 (en) * | 2013-06-21 | 2016-04-14 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Method and apparataus for obtaining spectrum coefficients for a replacement frame of an audio signal, audio decoder, audio receiver, and system for transmitting audio signals |
US9916834B2 (en) * | 2013-06-21 | 2018-03-13 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Method and apparatus for obtaining spectrum coefficients for a replacement frame of an audio signal, audio decoder, audio receiver, and system for transmitting audio signals |
US10475455B2 (en) | 2013-06-21 | 2019-11-12 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and apparatus for obtaining spectrum coefficients for a replacement frame of an audio signal, audio decoder, audio receiver, and system for transmitting audio signals |
US11282529B2 (en) | 2013-06-21 | 2022-03-22 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and apparatus for obtaining spectrum coefficients for a replacement frame of an audio signal, audio decoder, audio receiver, and system for transmitting audio signals |
Also Published As
Publication number | Publication date |
---|---|
BE781116A (fr) | 1972-07-17 |
ZA721392B (en) | 1972-11-29 |
SE375178B | 1975-04-07 |
AU463038B2 (en) | 1975-06-26 |
NL7203873A | 1972-09-28 |
IT952370B (it) | 1973-07-20 |
FR2130952A5 | 1972-11-10 |
AU4037572A (en) | 1973-09-27 |
DE2214521A1 (de) | 1972-10-05 |
GB1364775A (en) | 1974-08-29 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US3772681A (en) | Frequency synthesiser | |
US4003003A (en) | Multichannel digital synthesizer and modulator | |
US3995116A (en) | Emphasis controlled speech synthesizer | |
JPH07101840B2 (ja) | ディジタル雑音信号発生回路 | |
US3566035A (en) | Real time cepstrum analyzer | |
US4071903A (en) | Autocorrelation function factor generating method and circuitry therefor | |
US3830977A (en) | Speech-systhesiser | |
US4680479A (en) | Method of and apparatus for providing pulse trains whose frequency is variable in small increments and whose period, at each frequency, is substantially constant from pulse to pulse | |
US3431362A (en) | Voice-excited,bandwidth reduction system employing pitch frequency pulses generated by unencoded baseband signal | |
EP0391524B1 (en) | Phase accumulation dual tone multiple frequency generator | |
US4374304A (en) | Spectrum division/multiplication communication arrangement for speech signals | |
US3069507A (en) | Autocorrelation vocoder | |
US3697699A (en) | Digital speech signal synthesizer | |
US3078345A (en) | Speech compression systems | |
US4124898A (en) | Programmable clock | |
US3703609A (en) | Noise signal generator for a digital speech synthesizer | |
RU2030092C1 (ru) | Цифровой синтезатор частот | |
US3448216A (en) | Vocoder system | |
US4084472A (en) | Electronic musical instrument with tone generation by recursive calculation | |
US3083266A (en) | Vocoder apparatus | |
US3330910A (en) | Formant analysis and speech reconstruction | |
US3491205A (en) | Plural formant speech synthesizer | |
US2522539A (en) | Frequency control for synthesizing systems | |
JPS595913Y2 (ja) | 電子楽器の音源波形形成装置 | |
US4758971A (en) | Digital signal generator |