US4489437A - Speech synthesizer - Google Patents

Speech synthesizer

Info

Publication number
US4489437A
Authority
US
United States
Prior art keywords
interpolation
parcor coefficient
speech
period
circuit means
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US06/343,198
Other languages
English (en)
Inventor
Takuro Fukuichi
Yasuo Kusumoto
Sumio Fujita
Shuji Kawamura
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seiko Instruments Inc
Original Assignee
Seiko Instruments Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Seiko Instruments Inc
Assigned to SEIKO INSTRUMENTS & ELECTRONICS LTD. (assignment of assignors interest). Assignors: FUJITA, SUMIO; FUKUICHI, TAKURO; KAWAMURA, SHUJI; KUSUMOTO, YASUO

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04Q: SELECTING
    • H04Q11/00: Selecting arrangements for multiplex systems
    • H04Q11/04: Selecting arrangements for multiplex systems for time-division multiplexing
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques

Definitions

  • This invention relates to a speech synthesizer that uses speech analysis and synthesis based on linear predictive coding techniques, as represented by the PARCOR (partial autocorrelation) technique.
  • A recent speech synthesizer uses a fixed frame length, namely a fixed update frequency of the speech parameters at synthesis time, and is intended to reduce the information required at synthesis time so that the speech synthesizer can be realized in a one-chip LSI.
  • An interpolation process that varies the speech parameters within the frame is executed to smooth the change of the speech parameters over time. To obtain good sound quality, the frame period must be short so that the speech parameters change smoothly over time.
  • The conventional speech synthesizer does not achieve sufficiently good synthesized speech.
  • A variable frame length technique is one approach for a speech synthesizer using the PARCOR technique.
  • One pitch period of a voiced sound, which includes at least one pitch period, is the fundamental time unit, and the frame interval is changed in accordance with the change in pitch length.
  • The variable frame length technique may yield better synthesized speech, but it has the problem of increasing the information required for speech synthesis.
  • a circuit for determining an interpolation period corresponding to the determined frame interval and for generating an interpolation timing signal every interpolation period
  • a circuit for executing an interpolation of the speech parameters in order by using the interpolation value in synchronism with the interpolation timing signal.
  • The FIGURE is a block diagram showing an embodiment of the speech synthesizer according to this invention.
  • The circuitry other than the speaker 1, the driving circuit 2, etc. is constructed within an LSI (large-scale integrated circuit).
  • The analyzed speech data stored in the speech ROM are provided to the speech synthesizer in order, following a predetermined procedure, under the control of a microprocessor not shown in the FIGURE.
  • The speech data corresponding to one frame, described below, are provided to the bus line 4 from the microprocessor side in response to the data request signal REQ generated by the counter 3 in the speech synthesizer, as described hereinafter.
  • The PARCOR coefficient K_i is the parameter that determines the transfer characteristic of the digital filter 5, and the amplitude data AMP, the pitch data PITCH and the repeat times REPEAT are the data that determine the amplitude, the period and the number of pulses of the pulse signal serving as the speech source signal input to the digital filter 5.
  • The speech synthesizer operates with a completely variable frame length when the synthesized pitch frequency is equal to the frame frequency.
  • For voiced sound having repetitive waveforms, the speech synthesis does not allot one pitch period to one frame but one pitch period × the repeat times REPEAT to one frame.
  • The speech synthesizer of this invention greatly decreases the information required for speech synthesis.
  • The random noise (a pulse signal of random polarity) from the noise generator 6 is digitally encoded and input as the speech source signal to the digital filter 5.
  • The amplitude of the noise is determined by the amplitude data AMP, and the duration of noise application is determined by the pitch data PITCH and the repeat times REPEAT.
  • The frame (analysis window) at the time of analysis is constant, so the pitch data PITCH is constant.
  • The frame frequency at synthesis is determined substantially by the repeat times REPEAT.
  • The classification signal N of the interpolation period in the speech data table is the signal that determines the interpolation period Δt of the PARCOR coefficient K_i; Δt equals the frame interval T_f divided by N, where N is determined by T_f.
  • The interpolation of the PARCOR coefficient K_i is executed (N-1) times, once every period Δt, within the frame interval T_f.
  • In this embodiment the repeat times REPEAT is a power of 2, such as 1, 2, 4, 8, . . ., and the classification signal N of the interpolation period is likewise a power of 2, such as 4, 8, 16, 32.
  • The relationship between T_f and N is given by the following table:
  • The interpolation of AMP is executed so that the amplitude changes smoothly from the amplitude of the present frame to that of the next frame.
  • The amplitude data AMP is not constant during T_f when the repeat times is not 1, namely when the repetitive process is executed.
  • The circuit operation will now be described.
  • the stored data K_i in the memory 8a is transferred to the memory 8b
  • the stored data AMP in the memory 9a is transferred to the memory 9b
  • the stored data PITCH in the memory 10a is transferred to the memory 10b
  • the stored data REPEAT in the memory 11a is transferred to the memory 11b
  • the stored data N in the memory 12a is transferred to the memory 12b
  • the stored data V/UV in the memory 13a is transferred to the memory 13b.
  • the PARCOR coefficient K_i of the subsequent frame is stored in the memory 8a
  • its AMP is stored in the memory 9a
  • its PITCH is stored in the memory 10a
  • its REPEAT is stored in the memory 11a
  • its signal N is stored in the memory 12a
  • its signal V/UV is stored in the memory 13a.
  • The speech data DATA 1 of the first frame are stored in the memories 8b-13b and the speech data DATA 2 of the subsequent frame are stored in the memories 8a-13a.
  • The interpolation process is executed to smooth the change of K_i and AMP referring to the PARCOR coefficient K_i and the amplitude data AMP DATA 1.
  • the PARCOR coefficient and the amplitude data in DATA 1 are K_i1 and AMP_1, respectively
  • the PARCOR coefficient and the amplitude data in DATA 2 are K_i2 and AMP_2.
  • The pitch data PITCH stored in the memory 10b is preset into the shift circuit 14, which serves as the multiplier.
  • The repeat times REPEAT is applied to the shift circuit 14 and serves as the shift signal that shifts the content of the shift circuit 14.
  • This data T_f is preset into the shift circuit 15, which serves as the divider.
  • The classification signal N of the interpolation period stored in the memory 12b is applied to the shift circuit 15 and serves as the shift signal that shifts down the content of the shift circuit 15.
  • This interpolation period Δt is preset into the presettable down-counter 16.
  • After synthesis is initiated, the counter 16 counts the clock signal CK in the down direction (the frequency of the clock signal CK is equal to the sampling frequency at synthesis, for example 10 kHz) and produces a count-up signal C_1 every Δt.
  • This signal C_1 is applied as the interpolation timing signal to the interpolator 17 of the PARCOR coefficient.
  • The interpolation value used for the addition and subtraction is calculated from K_i1 in the memory 8b and K_i2 in the interpolation memory 8a and is stored in the interpolation value memory 18.
  • The interpolation value ΔK_i is represented by ΔK_i = (K_i2 - K_i1)/N. To calculate ΔK_i, K_i1 of the memory 8b is taken into the interpolator 17 and K_i2 of the memory 8a is taken into the interpolator 17 through the change-over gate 19.
  • The classification signal N of the interpolation period stored in the memory 12b is applied as the shift-down signal to the shift circuit of the interpolator 17, so that (K_i2 - K_i1)/N is calculated as mentioned above.
  • The value ΔK_i is stored in the interpolation value memory 18.
  • A similar pre-process is executed in the interpolator 20 for the amplitude data AMP.
  • The interpolation period of AMP is PITCH and the number of interpolations is (REPEAT-1).
  • The interpolation value ΔAMP is represented by the following equation: ΔAMP = (AMP_2 - AMP_1)/REPEAT
  • AMP_1 stored in the memory 9b is taken into the interpolator 20 and AMP_2 stored in the memory 9a is taken into the interpolator 20 through the change-over gate 21.
  • REPEAT stored in the memory 11b is applied as the shift-down signal to the shift circuit, so that ΔAMP of the above equation is calculated and stored in the interpolation value memory 22.
  • PITCH stored in the memory 10b and REPEAT stored in the memory 11b are preset into the presettable down-counters 23 and 3, respectively.
  • The counter 23 counts the above-mentioned clock CK in the down direction and produces the count-up signal C_2 every PITCH time.
  • The counter 3 counts the count-up signal C_2 in the down direction, and the count-up signal C_3 of the counter 3 is output as the data request signal REQ.
  • The count-up signal C_2 of the counter 23 is applied as the interpolation timing signal to the interpolator 20 of AMP.
  • The preset signal PS of the counter 23, which is produced after this count-up signal, is applied as the open signal to the gate 24 for sending the voiced sound source signal.
  • K_i1 stored in the memory 8b is updated to (K_i1 + ΔK_i).
  • The PARCOR coefficient provided to the digital filter 5 and the content of the memory 8b change as K_i1 + ΔK_i → K_i1 + 2ΔK_i → K_i1 + 3ΔK_i . . . at every interpolation timing signal C_1.
  • The interpolation value ΔAMP stored in the interpolation value memory 22 is taken into the interpolator 20 through the gate 21, so that ΔAMP is added to AMP_1 stored temporarily in the interpolator 20.
  • The operation result (AMP_1 + ΔAMP) is produced by the interpolator 20, and the data stored temporarily in the interpolator 20 changes from AMP_1 to (AMP_1 + ΔAMP).
  • The amplitude data derived from the interpolator 20 changes as AMP_1 → AMP_1 + ΔAMP → AMP_1 + 2ΔAMP → AMP_1 + 3ΔAMP . . . at every interpolation timing signal C_2.
  • The voiced/unvoiced discriminating signal V/UV stored in the memory 13b is applied as the change-over signal to the change-over gate 25.
  • The change-over gate 25 is switched to the (V) side.
  • The amplitude data derived from the AMP interpolator 20 is applied as the sound source signal to the digital filter 5 through the gates 24 and 25.
  • The change-over gate 25 is switched to the (UV) side.
  • Under the output signal from the noise generator 6, the amplitude code control circuit 26 produces digitally coded random noise whose polarity changes at random and whose amplitude is controlled by the amplitude data from the AMP interpolator 20.
  • The random noise is applied as the sound source signal to the digital filter 5 through the gate 25.
  • The speech waveform is synthesized digitally from the sound source signal and the PARCOR coefficient; the digital output of the filter 5 becomes the speech waveform through the rounding circuit 28, the D/A converter 29 and the driver 2, and an acoustic output is produced from the speaker 1.
  • The data request signal REQ is produced by the counter 3.
  • The speech data DATA 2 of the second frame stored in the memories 8a-13a are transferred to the memories 8b-13b, and the speech data DATA 3 of the third frame provided to the bus line 4 are stored in the memories 8a-13a.
  • The synthesis process for the speech data DATA 2 of the second frame is executed with the interpolation referring to and using K_i3 and AMP_3 in the speech data DATA 3 of the third frame, as mentioned above.
  • The classification signal N corresponding to T_f of every frame is provided in the speech data beforehand.
  • Instead, the synthesis circuit may be provided with a circuit portion for determining N and Δt based on the output T_f of the shift circuit 14.
  • This invention is a speech synthesizer using speech synthesis technology based on the linear predictive coding technique and a variable frame length system, in which one pitch period of the analyzed sound is the fundamental time unit and the repeat times is the number of repetitions of the waveform; it comprises a circuit portion for calculating the frame length from the pitch data and the repeat times, a circuit portion for calculating the interpolation value per interpolation, and a circuit for interpolating the synthesis parameters in order using the interpolation timing signal and the interpolation value.
  • The information required for synthesis may be greatly reduced by the repetitive process, the interpolation may be executed suitably for the frame length regardless of the length of the frame, and the quality of the synthesized sound is good.
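
The pre-process described above (frame interval from PITCH and REPEAT, interpolation period from N, and the per-step interpolation values) can be sketched in software. This is only an illustrative sketch, not the patented circuit: the function and field names are hypothetical, the parameters are assumed to be fixed-point integers, and the amplitude step is assumed to be (AMP_2 - AMP_1)/REPEAT, which the text implies but does not spell out. Because REPEAT and N are powers of 2, the multiplication and divisions reduce to shifts, as in the shift circuits 14 and 15.

```python
from dataclasses import dataclass
from typing import List


def log2_exact(n: int) -> int:
    """Return k with 2**k == n; reject values that are not powers of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


@dataclass
class PreProcess:
    t_f: int          # frame interval T_f, in samples
    dt: int           # interpolation period (delta t), in samples
    dk: List[int]     # per-step increment for each PARCOR coefficient
    damp: int         # per-step increment for the amplitude


def preprocess(pitch: int, repeat: int, n: int,
               k1: List[int], k2: List[int],
               amp1: int, amp2: int) -> PreProcess:
    """Hypothetical software model of the pre-process before a frame is synthesized."""
    rep_shift = log2_exact(repeat)
    n_shift = log2_exact(n)
    t_f = pitch << rep_shift            # T_f = PITCH * REPEAT (shift up, cf. circuit 14)
    dt = t_f >> n_shift                 # delta t = T_f / N (shift down, cf. circuit 15)
    # Python's >> on negative integers is an arithmetic (flooring) shift.
    dk = [(b - a) >> n_shift for a, b in zip(k1, k2)]   # (K_i2 - K_i1) / N
    damp = (amp2 - amp1) >> rep_shift   # assumed: (AMP_2 - AMP_1) / REPEAT
    return PreProcess(t_f, dt, dk, damp)


result = preprocess(pitch=80, repeat=4, n=16,
                    k1=[1024, -512], k2=[2048, -256],
                    amp1=100, amp2=60)
print(result)
```

With a 10 kHz clock, the example values correspond to an 8 ms pitch period repeated four times, giving a 32 ms frame interpolated every 2 ms.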

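The timing relationship between the three down-counters can also be modelled as a short simulation. The sketch below is a hypothetical model and only assumes what the description states: counter 16 emits C_1 every Δt clock ticks, counter 23 emits C_2 every PITCH ticks, and counter 3 counts C_2 pulses and emits REQ after REPEAT of them. The function name and event labels are illustrative.

```python
from typing import List, Tuple


def frame_events(pitch: int, repeat: int, n: int) -> List[Tuple[int, str]]:
    """Return (clock tick, signal name) pairs for one frame of synthesis."""
    t_f = pitch * repeat
    dt = t_f // n
    if dt == 0:
        raise ValueError("interpolation period must be at least one clock tick")

    counter16, counter23, counter3 = dt, pitch, repeat   # preset values
    events: List[Tuple[int, str]] = []
    t = 0
    while counter3 > 0:
        t += 1                      # one tick of the clock CK
        counter16 -= 1
        counter23 -= 1
        if counter16 == 0:          # interpolation timing for the PARCOR coefficients
            events.append((t, "C1"))
            counter16 = dt
        if counter23 == 0:          # one pitch period elapsed: amplitude interpolation timing
            events.append((t, "C2"))
            counter23 = pitch
            counter3 -= 1
            if counter3 == 0:       # REPEAT pitch periods elapsed: request the next frame
                events.append((t, "REQ"))
    return events


for tick, name in frame_events(pitch=8, repeat=2, n=4):
    print(tick, name)
```

In this model the final C_1 and C_2 pulses coincide with REQ at the frame boundary, where the next frame's data are loaded, so only N-1 coefficient interpolations and REPEAT-1 amplitude interpolations fall strictly inside the frame, matching the counts given in the description.
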
Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
US06/343,198 1981-01-29 1982-01-27 Speech synthesizer Expired - Fee Related US4489437A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP56-11871 1981-01-29
JP56011871A JPS57125999A (en) 1981-01-29 1981-01-29 Voice synthesizer

Publications (1)

Publication Number Publication Date
US4489437A (en) 1984-12-18

Family

ID=11789779

Family Applications (1)

Application Number Title Priority Date Filing Date
US06/343,198 Expired - Fee Related US4489437A (en) 1981-01-29 1982-01-27 Speech synthesizer

Country Status (3)

Country Link
US (1) US4489437A
JP (1) JPS57125999A
GB (1) GB2093668B

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4718087A (en) * 1984-05-11 1988-01-05 Texas Instruments Incorporated Method and system for encoding digital speech information
US5029214A (en) * 1986-08-11 1991-07-02 Hollander James F Electronic speech control apparatus and methods
US5095508A (en) * 1984-01-27 1992-03-10 Ricoh Company, Ltd. Identification of voice pattern
US5111505A (en) * 1988-07-21 1992-05-05 Sharp Kabushiki Kaisha System and method for reducing distortion in voice synthesis through improved interpolation
WO1994007237A1 (en) * 1992-09-21 1994-03-31 Aware, Inc. Audio compression system employing multi-rate signal analysis
KR20020006164A (ko) * 2000-07-11 2002-01-19 송문섭 음성 신호 부호화시 격자방법을 이용한 고정소수점선형예측부호화 계수 추출 방법

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1181859A (en) * 1982-07-12 1985-01-29 Forrest S. Mozer Variable rate speech synthesizer
GB2130852B (en) * 1982-11-19 1986-03-12 Gen Electric Co Plc Speech signal reproducing systems
JPS59188700A (ja) * 1983-04-08 1984-10-26 日本電信電話株式会社 音声合成装置
JPH0776873B2 (ja) * 1986-04-15 1995-08-16 ヤマハ株式会社 楽音信号発生装置
CN106157968B (zh) * 2011-06-30 2019-11-29 三星电子株式会社 用于产生带宽扩展信号的设备和方法

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4270027A (en) * 1979-11-28 1981-05-26 International Telephone And Telegraph Corporation Telephone subscriber line unit with sigma-delta digital to analog converter
US4335275A (en) * 1978-04-28 1982-06-15 Texas Instruments Incorporated Synchronous method and apparatus for speech synthesis circuit

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cole, et al., "A Real Time Floating Point . . . Vocoder", IEEE Conf. Record, Acoustics, Speech . . . , 1977, pp. 429, 430. *

Also Published As

Publication number Publication date
JPS645720B2 1989-01-31
GB2093668B (en) 1984-10-24
JPS57125999A (en) 1982-08-05
GB2093668A (en) 1982-09-02

Similar Documents

Publication Publication Date Title
JP2782147B2 (ja) 波形編集型音声合成装置
US4489437A (en) Speech synthesizer
US5890118A (en) Interpolating between representative frame waveforms of a prediction error signal for speech synthesis
US6125344A (en) Pitch modification method by glottal closure interval extrapolation
US5290965A (en) Asynchronous waveform generating device for use in an electronic musical instrument
US4653099A (en) SP sound synthesizer
US7418388B2 (en) Voice synthesizing method using independent sampling frequencies and apparatus therefor
US4989250A (en) Speech synthesizing apparatus and method
JPS62229200A (ja) ピツチ検出器
GB2097636A (en) Speech synthesizer
JPH0318197B2
JPH06131000A (ja) 基本周期符号化装置
JPS6053999A (ja) 音声合成器
JPS5952840B2 (ja) 音声合成器の補間装置
JP2937322B2 (ja) 音声合成装置
JP2560277B2 (ja) 音声合成方式
JPS5950997B2 (ja) 音声パラメ−タ情報抽出方式
JPH01189700A (ja) 音声合成装置
JPH05167457A (ja) 音声符号化装置
KR100310930B1 (ko) 음성합성장치및그방법
JPH0675598A (ja) 音声符号化方法及び音声合成方法
JPH10187180A (ja) 楽音発生装置
JPH01283600A (ja) 残差駆動型音声合成装置
JPH01224800A (ja) 残差駆動型音声合成装置
JPH06186975A (ja) 音源装置

Legal Events

Date Code Title Description
AS Assignment

Owner name: SEIKO INSTRUMENTS & ELECTRONICS LTD., 31-1, KAMEID

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: FUKUICHI, TAKURO; KUSUMOTO, YASUO; FUJITA, SUMIO; AND OTHERS; REEL/FRAME: 004280/0076

Effective date: 19840525

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
FP Lapsed due to failure to pay maintenance fee

Effective date: 19921220

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362