US4489437A - Speech synthesizer - Google Patents
- Publication number
- US4489437A (application number US06/343,198)
- Authority
- US
- United States
- Prior art keywords
- interpolation
- parcor coefficient
- speech
- period
- circuit means
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04Q—SELECTING
- H04Q11/00—Selecting arrangements for multiplex systems
- H04Q11/04—Selecting arrangements for multiplex systems for time-division multiplexing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
Definitions
- This invention relates to a speech synthesizer based on speech analysis and synthesis by linear predictive coding, specifically the PARCOR (partial autocorrelation) technique.
- A recent speech synthesizer uses a fixed frame length, that is, a fixed update frequency of the speech parameters at synthesis time, and is designed to reduce the information required at synthesis time so that the synthesizer can be realized as a one-chip LSI.
- An interpolation process, which changes the speech parameters within the frame, is executed to smooth the change of the speech parameters over time. To obtain good sound quality, the frame period must be short so that the speech parameters change smoothly over time.
- The conventional speech synthesizer does not produce synthesized speech of sufficient quality.
- One approach for a PARCOR speech synthesizer is a variable frame length technique.
- In it, one pitch period of voiced sound is the fundamental time unit, each frame includes at least one pitch period, and the frame interval changes with the pitch length.
- A variable frame length may give better synthesized speech, but it has the problem of increasing the information required for speech synthesis.
- A circuit for determining an interpolation period corresponding to the determined frame interval and for generating an interpolation timing signal every interpolation period.
- A circuit for interpolating the speech parameters in order, using the interpolation value, in synchronism with the interpolation timing signal.
- The FIGURE is a block diagram showing an embodiment of the speech synthesizer according to this invention.
- The FIGURE shows a block diagram of a speech synthesizer according to this invention.
- Except for the speaker 1, the driving circuit 2, and the like, the circuit is constructed within an LSI (large-scale integrated circuit).
- The analyzed speech data stored in the speech ROM are provided to the speech synthesizer in order, according to a predetermined procedure, under the control of a microprocessor not shown in the FIGURE.
- The speech data for one frame, described below, are provided to the bus line 4 from the microprocessor side in response to the data request signal REQ generated by the counter 3 in the speech synthesizer, as described hereinafter.
- The PARCOR coefficient K_i is the parameter that determines the transmission characteristic of the digital filter 5. The amplitude data AMP, the pitch data PITCH, and the repeat times REPEAT determine the amplitude, the period, and the number of pulses of the pulse signal that serves as the speech source signal input to the digital filter 5.
- The speech synthesizer operates with a completely variable frame length when the synthesized pitch frequency equals the frame frequency.
- For voiced sound with repetitive waveforms, the synthesis allots to one frame not one pitch period but one pitch period × the repeat times REPEAT.
- The speech synthesizer of this invention greatly decreases the information required for speech synthesis.
- The random noise (a pulse signal of random polarity) from the noise generator 6 is digitally encoded and input to the digital filter 5 as the speech source signal.
- The amplitude of the noise is determined by the amplitude data AMP, and the duration of noise application is determined by the pitch data PITCH and the repeat times REPEAT.
- The frame (analysis window) at analysis time is constant, so the pitch data PITCH is constant.
- The frame frequency at synthesis is determined substantially by the repeat times REPEAT.
- The interpolation-period classification signal N in the speech data table determines the interpolation period Δt of the PARCOR coefficient K_i: Δt equals the frame interval T_f divided by N, and N is itself chosen according to T_f.
- Within the frame interval T_f, the PARCOR coefficient K_i is interpolated (N-1) times, once every Δt.
- In this embodiment the repeat times REPEAT is a power of 2, such as 1, 2, 4, 8, ..., and the classification signal N of the interpolation period is a power of 2, such as 4, 8, 16, 32.
- The relationship between T_f and N is determined as in the following table:
- AMP is interpolated so that it changes smoothly from the amplitude of the present frame to the amplitude of the next frame.
- When the repeat times is not 1, that is, when the repetitive process is executed, the amplitude data AMP is not constant during T_f.
- The circuit operation will now be described.
- The data K_i stored in the memory 8a is transferred to the memory 8b.
- The stored data AMP is transferred to the memory 9b.
- The data PITCH stored in the memory 10a is transferred to the memory 10b.
- The data REPEAT stored in the memory 11a is transferred to the memory 11b.
- The stored data N is transferred to the memory 12b.
- The stored data V/UV is transferred to the memory 13b.
- The PARCOR coefficient K_i of the subsequent frame is stored in the memory 8a.
- Its AMP is stored in the memory 9a.
- Its PITCH is stored in the memory 10a.
- Its REPEAT is stored in the memory 11a.
- Its signal N is stored in the memory 12a.
- Its signal V/UV is stored in the memory 13a.
- The speech data DATA 1 of the first frame is stored in the memories 8b-13b, and the speech data DATA 2 of the subsequent frame is stored in the memories 8a-13a.
- The interpolation process is executed to smooth the change of K_i and AMP, referring to the PARCOR coefficient K_i and the amplitude data AMP of DATA 1.
- The PARCOR coefficient and the amplitude data in DATA 1 are K_i1 and AMP_1, respectively.
- The PARCOR coefficient and the amplitude data in DATA 2 are K_i2 and AMP_2.
- The pitch data PITCH stored in the memory 10b is preset into the shift circuit 14, which serves as a multiplier.
- The repeat times REPEAT is applied to the shift circuit 14 as the shift signal that shifts its content.
- The resulting data T_f, the output of the shift circuit 14, is preset into the shift circuit 15, which serves as a divider.
- The interpolation-period classification signal N stored in the memory 12b is applied to the shift circuit 15 as the shift signal that shifts its content down.
- The resulting interpolation period Δt is preset into the presettable down-counter 16.
- After synthesis is initiated, the counter 16 counts the clock signal CK in the down direction and produces a count-up signal C_1 every Δt. The frequency of CK equals the sampling frequency at synthesis time, for example 10 kHz.
- The signal C_1 is applied as the interpolation timing signal to the PARCOR coefficient interpolator 17.
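The two shift circuits can be sketched in software as follows. This is only an illustration, assuming PITCH is expressed in clock samples and that REPEAT and N are powers of two as stated for this embodiment; the function names and the example values are hypothetical and do not come from the patent.

```python
def log2_exact(value: int) -> int:
    """Return log2(value); value must be a positive power of two."""
    if value <= 0 or value & (value - 1):
        raise ValueError(f"{value} is not a positive power of two")
    return value.bit_length() - 1


def frame_timing(pitch: int, repeat: int, n: int) -> tuple[int, int]:
    """Return (T_f, delta_t) in clock samples.

    T_f     = PITCH * REPEAT, done as a left shift (role of shift circuit 14)
    delta_t = T_f / N, done as a right shift (role of shift circuit 15)
    """
    t_f = pitch << log2_exact(repeat)
    delta_t = t_f >> log2_exact(n)
    return t_f, delta_t


# Hypothetical frame: a pitch period of 80 samples (8 ms at a 10 kHz clock),
# repeated 4 times, with N = 16 chosen as the interpolation classification.
t_f, delta_t = frame_timing(pitch=80, repeat=4, n=16)
print(t_f, delta_t)  # 320 20
```

Because both factors are powers of two, neither a multiplier nor a divider is needed, only shifts, which is presumably why the embodiment restricts REPEAT and N in this way.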
- The interpolation value to be added or subtracted is computed from K_i1 in the memory 8b and K_i2 in the memory 8a, and is stored in the interpolation value memory 18.
- The interpolation value ΔK_i is given by $\Delta K_i = (K_{i2} - K_{i1})/N$. To compute ΔK_i, K_i1 of the memory 8b is taken into the interpolator 17, and K_i2 of the memory 8a is taken into the interpolator 17 through the change-over gate 19.
- The classification signal N of the interpolation period stored in the memory 12b is applied as the shift-down signal to the shift circuit of the interpolator 17, so that (K_i2 - K_i1)/N is computed as mentioned above.
- The value ΔK_i is stored in the interpolation value memory 18.
- A similar pre-process is executed in the interpolator 20 for the amplitude data AMP.
- The interpolation period of AMP is PITCH, and the number of interpolations is (REPEAT-1).
- The interpolation value ΔAMP is given by the following equation: $\Delta \mathrm{AMP} = (\mathrm{AMP}_2 - \mathrm{AMP}_1)/\mathrm{REPEAT}$
- AMP_1 stored in the memory 9b is taken into the interpolator 20, and AMP_2 stored in the memory 9a is taken into the interpolator 20 through the change-over gate 21.
- REPEAT stored in the memory 11b is applied as the shift-down signal to the shift circuit, so that ΔAMP of the above equation is computed and stored in the interpolation value memory 22.
- PITCH stored in the memory 10b and REPEAT stored in the memory 11b are preset into the presettable down-counters 23 and 3, respectively.
- The counter 23 counts the above-mentioned clock CK in the down direction and produces the count-up signal C_2 once every PITCH period.
- The counter 3 counts the count-up signal C_2 in the down direction, and its count-up signal C_3 is output as the data request signal REQ.
- The count-up signal C_2 of the counter 23 is applied as the interpolation timing signal to the AMP interpolator 20.
- The preset signal PS of the counter 23, which is produced after this count-up signal, is applied as the open signal to the gate 24, which passes the voiced sound source signal.
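The timing produced by the three down-counters can be illustrated with a small simulation. It is a sketch under stated assumptions: each counter is assumed to reload its preset value when it reaches zero, times are counted in CK pulses from the moment the counters are preset, and the function name and example values are hypothetical.

```python
def simulate_frame(pitch: int, repeat: int, delta_t: int) -> dict[str, list[int]]:
    """Clock the three down-counters for one frame and return firing times.

    An entry t means the signal fires on the t-th CK pulse after preset.
    """
    counter16, counter23, counter3 = delta_t, pitch, repeat
    events: dict[str, list[int]] = {"C1": [], "C2": [], "REQ": []}
    tick = 0
    while not events["REQ"]:
        tick += 1
        counter16 -= 1
        if counter16 == 0:        # counter 16: one C1 every delta_t clocks
            events["C1"].append(tick)
            counter16 = delta_t   # assumed reload of the preset value
        counter23 -= 1
        if counter23 == 0:        # counter 23: one C2 every PITCH clocks
            events["C2"].append(tick)
            counter23 = pitch     # assumed reload of the preset value
            counter3 -= 1         # counter 3 counts the C2 pulses
            if counter3 == 0:     # after REPEAT pitch periods: C3, i.e. REQ
                events["REQ"].append(tick)
    return events


events = simulate_frame(pitch=80, repeat=4, delta_t=20)
print(events["C2"])   # [80, 160, 240, 320]
print(events["REQ"])  # [320]
```

In this model the last C1 and the last C2 coincide with REQ at the frame boundary. How the circuit treats that coincidence is not modelled here.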
- K_i1 stored in the memory 8b is updated to (K_i1 + ΔK_i).
- The PARCOR coefficient provided to the digital filter 5, which is the content of the memory 8b, changes as K_i1 + ΔK_i → K_i1 + 2ΔK_i → K_i1 + 3ΔK_i ... at every interpolation timing signal C_1.
- The interpolation value ΔAMP stored in the interpolation value memory 22 is taken into the interpolator 20 through the gate 21 and added to AMP_1, which is held temporarily in the interpolator 20.
- The result (AMP_1 + ΔAMP) is output from the interpolator 20, and the data held temporarily in the interpolator 20 changes from AMP_1 to (AMP_1 + ΔAMP).
- The amplitude data from the interpolator 20 changes as AMP_1 → AMP_1 + ΔAMP → AMP_1 + 2ΔAMP → AMP_1 + 3ΔAMP ... at every interpolation timing signal C_2.
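The stepwise interpolation of both parameters follows the same pattern, which the sketch below reproduces. It assumes the parameters are integer codes, that the divisor (N for the PARCOR coefficient, REPEAT for AMP) is a power of two, and that the division is an arithmetic right shift; the function name and the numeric values are hypothetical.

```python
def interpolate(current: int, target: int, divisor: int) -> list[int]:
    """Return the values a parameter takes during one frame.

    The increment is (target - current) / divisor, computed with a right
    shift, and it is applied (divisor - 1) times, once per timing signal.
    """
    if divisor <= 0 or divisor & (divisor - 1):
        raise ValueError("divisor must be a positive power of two")
    delta = (target - current) >> (divisor.bit_length() - 1)
    values = [current]
    for _ in range(divisor - 1):
        values.append(values[-1] + delta)
    return values


# Hypothetical PARCOR coefficient code: K_i1 = 1000, K_i2 = 1320, N = 16.
k_values = interpolate(1000, 1320, 16)
print(k_values[:4], k_values[-1])  # [1000, 1020, 1040, 1060] 1300

# Hypothetical amplitude: AMP_1 = 40, AMP_2 = 8, REPEAT = 4.
print(interpolate(40, 8, 4))  # [40, 32, 24, 16]
```

After the last step the parameter sits one increment short of the next frame's value, so loading the next frame completes the ramp without a jump, provided the difference divides evenly.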
- The voiced/unvoiced discriminating signal V/UV stored in the memory 13b is applied as the change-over signal to the change-over gate 25.
- For voiced sound, the change-over gate 25 is switched to the (V) side.
- The amplitude data from the AMP interpolator 20 is then applied as the sound source signal to the digital filter 5 through the gates 24 and 25.
- For unvoiced sound, the change-over gate 25 is switched to the (UV) side.
- The amplitude code control circuit 26 produces digitally coded random noise whose polarity changes at random under the output signal of the noise generator 6 and whose amplitude is controlled by the amplitude data from the AMP interpolator 20.
- This random noise is applied as the sound source signal to the digital filter 5 through the gate 25.
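The two sound sources selected by the gate 25 can be sketched as below. Several details are assumptions rather than statements from the patent: the voiced pulse is placed on the first sample of each pitch period, the noise polarity is drawn independently for every sample, and Python's `random.Random` stands in for the noise generator 6. The function name and values are hypothetical.

```python
import random


def source_signal(voiced: bool, amps: list[int], pitch: int,
                  rng: random.Random) -> list[int]:
    """Return one frame of source samples; amps holds one value per pitch period."""
    samples: list[int] = []
    for amp in amps:
        if voiced:
            # Voiced: one pulse of height AMP per pitch period, zeros elsewhere.
            samples.extend([amp] + [0] * (pitch - 1))
        else:
            # Unvoiced: constant magnitude AMP with randomly chosen polarity.
            samples.extend(amp if rng.random() < 0.5 else -amp
                           for _ in range(pitch))
    return samples


rng = random.Random(0)
print(source_signal(True, [40, 32], 4, rng))  # [40, 0, 0, 0, 32, 0, 0, 0]
noise = source_signal(False, [40, 32], 4, rng)
print(len(noise))  # 8
```

The list of amplitudes passed in would be the output of the AMP interpolator, one value per pitch period of the frame.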
- The speech waveform is synthesized digitally from the sound source signal and the PARCOR coefficient. The digital output of the filter 5 passes through the rounding circuit 28, the D/A converter 29, and the driver 2 to become the speech waveform, and an acoustic output is produced from the speaker 1.
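The text does not describe the internal structure of the digital filter 5. A common way to drive a synthesis filter directly from PARCOR coefficients is an all-pole lattice, sketched below as an assumption. The sign convention is one of several in use, the coefficients are held fixed rather than interpolated, and floating point replaces whatever fixed-point arithmetic the LSI would use.

```python
def lattice_synthesize(source: list[float], k: list[float]) -> list[float]:
    """All-pole lattice filter; k[0] is K_1 and k[-1] is K_p.

    Per sample: f_{i-1}(n) = f_i(n) - K_i * b_{i-1}(n-1)
                b_i(n)     = K_i * f_{i-1}(n) + b_{i-1}(n-1)
    with f_p(n) the source sample and the output y(n) = f_0(n) = b_0(n).
    """
    order = len(k)
    b = [0.0] * (order + 1)  # b[i] holds the stage-i backward signal, delayed
    out: list[float] = []
    for x in source:
        f = x
        for i in range(order, 0, -1):
            f = f - k[i - 1] * b[i - 1]     # uses b[i-1] from the previous sample
            b[i] = k[i - 1] * f + b[i - 1]  # new stage-i backward signal
        b[0] = f
        out.append(f)
    return out


# With a single coefficient K_1 = -0.5 the filter reduces to
# y(n) = x(n) + 0.5 * y(n-1), so an impulse decays by half each sample.
print(lattice_synthesize([1.0, 0.0, 0.0, 0.0], [-0.5]))  # [1.0, 0.5, 0.25, 0.125]
```

A lattice of this kind stays stable as long as every coefficient has magnitude below 1, a property that linear interpolation between two such coefficient sets preserves.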
- The data request signal REQ is produced by the counter 3.
- The speech data DATA 2 of the second frame, stored in the memories 8a-13a, is transferred to the memories 8b-13b, and the speech data DATA 3 of the third frame, provided on the bus line 4, is stored in the memories 8a-13a.
- The speech data DATA 2 of the second frame is then synthesized as described above, with interpolation referring to and using K_i3 and AMP_3 in the speech data DATA 3 of the third frame.
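The pairing of the "a" and "b" memories amounts to a two-stage frame buffer, which can be modelled as below. The class and field names are hypothetical, and the field types are assumptions made only to keep the sketch self-contained.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One frame of speech data as delivered on the bus line."""
    k: list[int]   # PARCOR coefficients K_i
    amp: int       # amplitude data AMP
    pitch: int     # pitch data PITCH
    repeat: int    # repeat times REPEAT
    n: int         # interpolation-period classification signal N
    voiced: bool   # V/UV discriminating signal


class FrameBuffers:
    """'current' models memories 8b-13b, 'upcoming' models memories 8a-13a."""

    def __init__(self, first: Frame, second: Frame) -> None:
        self.current = first
        self.upcoming = second

    def on_request(self, incoming: Frame) -> None:
        """Handle REQ: move a to b, then store the newly supplied frame in a."""
        self.current = self.upcoming
        self.upcoming = incoming


data1 = Frame([100, -50], 40, 80, 4, 16, True)
data2 = Frame([120, -40], 8, 80, 2, 8, True)
data3 = Frame([90, -10], 12, 64, 1, 4, False)
buffers = FrameBuffers(data1, data2)
buffers.on_request(data3)
print(buffers.current is data2, buffers.upcoming is data3)  # True True
```

Holding the following frame in the "a" memories is what lets the interpolators compute their increments from the current and the next values before the current frame is synthesized.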
- The classification signal N corresponding to T_f is provided beforehand for every frame as part of the speech data.
- Alternatively, the synthesis circuit may be provided with a circuit portion that determines N and Δt from the output T_f of the shift circuit 14.
- This invention is a speech synthesizer that uses linear predictive coding speech synthesis with a variable frame length system, in which one pitch period of the analyzed sound is the fundamental time unit and the repeat times is the number of repetitions of the waveform. It comprises a circuit portion for computing the frame length from the pitch data and the repeat times, a circuit portion for computing the interpolation value per interpolation step, and a circuit for interpolating the synthesis parameters in order from the interpolation timing signal and the interpolation value.
- The repetitive process greatly reduces the information required for synthesis, the interpolation is executed in a manner suited to the frame length regardless of how long the frame is, and the quality of the synthesized sound is good.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP56-11871 | 1981-01-29 | ||
JP56011871A JPS57125999A (en) | 1981-01-29 | 1981-01-29 | Voice synthesizer |
Publications (1)
Publication Number | Publication Date |
---|---|
US4489437A true US4489437A (en) | 1984-12-18 |
Family
ID=11789779
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US06/343,198 Expired - Fee Related US4489437A (en) | 1981-01-29 | 1982-01-27 | Speech synthesizer |
Country Status (3)
Country | Link |
---|---|
US (1) | US4489437A |
JP (1) | JPS57125999A |
GB (1) | GB2093668B |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4718087A (en) * | 1984-05-11 | 1988-01-05 | Texas Instruments Incorporated | Method and system for encoding digital speech information |
US5029214A (en) * | 1986-08-11 | 1991-07-02 | Hollander James F | Electronic speech control apparatus and methods |
US5095508A (en) * | 1984-01-27 | 1992-03-10 | Ricoh Company, Ltd. | Identification of voice pattern |
US5111505A (en) * | 1988-07-21 | 1992-05-05 | Sharp Kabushiki Kaisha | System and method for reducing distortion in voice synthesis through improved interpolation |
WO1994007237A1 (en) * | 1992-09-21 | 1994-03-31 | Aware, Inc. | Audio compression system employing multi-rate signal analysis |
KR20020006164A (ko) * | 2000-07-11 | 2002-01-19 | 송문섭 | Method for extracting fixed-point linear predictive coding coefficients using a lattice method in speech signal coding |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA1181859A (en) * | 1982-07-12 | 1985-01-29 | Forrest S. Mozer | Variable rate speech synthesizer |
GB2130852B (en) * | 1982-11-19 | 1986-03-12 | Gen Electric Co Plc | Speech signal reproducing systems |
JPS59188700A (ja) * | 1983-04-08 | 1984-10-26 | 日本電信電話株式会社 | Speech synthesis device |
JPH0776873B2 (ja) * | 1986-04-15 | 1995-08-16 | ヤマハ株式会社 | Musical tone signal generating device |
CN106157968B (zh) * | 2011-06-30 | 2019-11-29 | 三星电子株式会社 | Apparatus and method for generating a bandwidth extension signal |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4270027A (en) * | 1979-11-28 | 1981-05-26 | International Telephone And Telegraph Corporation | Telephone subscriber line unit with sigma-delta digital to analog converter |
US4335275A (en) * | 1978-04-28 | 1982-06-15 | Texas Instruments Incorporated | Synchronous method and apparatus for speech synthesis circuit |
- 1981-01-29: JP, application JP56011871A, publication JPS57125999A (ja), active, Granted
- 1982-01-26: GB, application GB8202118A, publication GB2093668B (en), not active, Expired
- 1982-01-27: US, application US06/343,198, publication US4489437A (en), not active, Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
Cole, et al., "A Real Time Floating Point . . . Vocoder", IEEE Conf. Record, Acoustics, Speech . . . , 1977, pp. 429, 430. |
Also Published As
Publication number | Publication date |
---|---|
JPS645720B2 | 1989-01-31 |
GB2093668B (en) | 1984-10-24 |
JPS57125999A (en) | 1982-08-05 |
GB2093668A (en) | 1982-09-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP2782147B2 (ja) | Waveform-editing speech synthesis device | |
US4489437A (en) | Speech synthesizer | |
US5890118A (en) | Interpolating between representative frame waveforms of a prediction error signal for speech synthesis | |
US6125344A (en) | Pitch modification method by glottal closure interval extrapolation | |
US5290965A (en) | Asynchronous waveform generating device for use in an electronic musical instrument | |
US4653099A (en) | SP sound synthesizer | |
US7418388B2 (en) | Voice synthesizing method using independent sampling frequencies and apparatus therefor | |
US4989250A (en) | Speech synthesizing apparatus and method | |
JPS62229200A (ja) | Pitch detector | |
GB2097636A (en) | Speech synthesizer | |
JPH0318197B2 | ||
JPH06131000A (ja) | Fundamental period encoding device | |
JPS6053999A (ja) | Speech synthesizer | |
JPS5952840B2 (ja) | Interpolation device for a speech synthesizer | |
JP2937322B2 (ja) | Speech synthesis device | |
JP2560277B2 (ja) | Speech synthesis system | |
JPS5950997B2 (ja) | Speech parameter information extraction system | |
JPH01189700A (ja) | Speech synthesis device | |
JPH05167457A (ja) | Speech coding device | |
KR100310930B1 (ko) | Speech synthesis apparatus and method | |
JPH0675598A (ja) | Speech coding method and speech synthesis method | |
JPH10187180A (ja) | Musical tone generating device | |
JPH01283600A (ja) | Residual-driven speech synthesis device | |
JPH01224800A (ja) | Residual-driven speech synthesis device | |
JPH06186975A (ja) | Sound source device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: SEIKO INSTRUMENTS & ELECTRONICS LTD., 31-1, KAMEID Free format text: ASSIGNMENT OF ASSIGNORS INTEREST.;ASSIGNORS:FUKUICHI, TAKURO;KUSUMOTO, YASUO;FUJITA, SUMIO;AND OTHERS;REEL/FRAME:004280/0076 Effective date: 19840525 |
| FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| FPAY | Fee payment | Year of fee payment: 4 |
| REMI | Maintenance fee reminder mailed | |
| LAPS | Lapse for failure to pay maintenance fees | |
| FP | Lapsed due to failure to pay maintenance fee | Effective date: 19921220 |
| STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |