WO1982004493A1 - Voice synthesizer - Google Patents

Voice synthesizer

Info

Publication number
WO1982004493A1
WO1982004493A1 PCT/JP1982/000233 JP8200233W WO8204493A1 WO 1982004493 A1 WO1982004493 A1 WO 1982004493A1 JP 8200233 W JP8200233 W JP 8200233W WO 8204493 A1 WO8204493 A1 WO 8204493A1
Authority
WO
WIPO (PCT)
Prior art keywords
analog
digital
control means
signal
sample
Prior art date
Application number
PCT/JP1982/000233
Other languages
English (en)
Japanese (ja)
Inventor
Electric Co Sanyo
Original Assignee
Sugiura Youji
Priority date
Filing date
Publication date
Application filed by Sugiura Youji
Priority to DE1982901856 (DE81595T1)
Priority to DE8282901856T (DE3277258D1)
Publication of WO1982004493A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers

Definitions

  • The present invention is a speech synthesizer that edits and synthesizes phoneme pieces extracted from an analog speech waveform. After the analog speech signal is converted to a digital signal, the data near the rear end of the preceding phoneme piece and the data near the front end of the succeeding phoneme piece are compared while being shifted relative to each other, and the succeeding phoneme piece is joined to the preceding one as smoothly as possible.
  • The invention concerns the quality of speech signals (words, phrases, spoken sentences) synthesized by combining phoneme pieces, that is, units of speech such as syllables or parts of syllables.
  • Fig. 1 is a block diagram showing a conventional time-axis expansion device.
  • (1) is a voice input terminal
  • (2) is an output terminal
  • (3) and (4) are analog shift registers, such as S-bit BBDs
  • (5) is a low-pass filter (LPF)
  • (6), (7), (8) and (9) are analog switches, which switch the path of the audio signal from the input terminal (1) through analog shift register (3) or (4) and the LPF (5) to the output terminal (2).
  • These analog switches are controlled by the outputs of a frequency divider circuit that divides the write clock by 2mN (described later).
  • The analog shift registers (3) and (4) are written and read alternately. The write clock and the read clock are gated by the outputs of the frequency divider circuit in AND gates, including AND gates (17) and (18), and reach the shift registers alternately through OR gates (14) and (15).
  • For example, suppose the audio signal applied to the input terminal has its time axis compressed by a factor of m (m > 1); such a compressed signal is obtained, for example, by playing a tape recorder at m times the recording speed. While the output of the frequency divider circuit is 1, this signal is written into analog shift register (4) through analog switch (8). Since the shift register has N bits, the input audio signal is taken in as a sample string, after which writing is complete.
  • When each block is read out, its time axis is extended m times, so the compressed speech applied to the voice input terminal (1) appears at the output terminal (2) with its time axis restored to the original.
  • The connection timing of the phoneme pieces output alternately from the analog shift registers (3) and (4) is determined automatically by the output of the frequency divider circuit, which divides the write clock by 2mN. Therefore, as shown in Fig. 2, the waveform becomes discontinuous at the junction between phoneme pieces and the pitch period fluctuates there. Such waveform and pitch mismatches at the junctions of phoneme pieces degrade sound quality and intelligibility.
  • The output of the analog-to-digital conversion means is stored in the digital storage means.
  • With the speech synthesizer of the present invention, the arithmetic circuit adjusts the time axis so that a smooth connection point is obtained. Unlike the conventional device, it therefore yields synthesized speech without waveform discontinuity at the junctions or fluctuation of the pitch period.
  • Fig. 1 is a block diagram of the conventional speech synthesizer.
  • Fig. 2 is a waveform diagram of the conventional device.
  • Fig. 3 is a block diagram of the speech synthesizer of the present invention.
  • Figs. 4 and 5 are circuit diagrams showing the configuration of parts of Fig. 3.
  • Fig. 6 is a time chart explaining the outputs of the gates (115) and (117) of the device of Fig. 3.
  • Fig. 7 is a time chart explaining the operation of the arithmetic circuit (105) of the device of Fig. 3.
  • Fig. 8 is a waveform diagram of the sample sequences (Xp) and (Yp).
  • The present invention recognizes the waveform pattern of each phoneme piece and joins the phoneme pieces so that their waveforms connect naturally, which makes it possible to obtain high-quality synthesized speech.
  • The phoneme pieces may be segments cut out of natural speech one pitch period at a time, or pieces synthesized by another speech synthesizer.
  • The present invention joins short phoneme pieces while avoiding waveform mismatch and pitch-frequency fluctuation at the junctions. That is, the time axis of each short phoneme piece is corrected slightly so that the waveforms are similar at least at the facing ends of adjacent phoneme pieces, which allows the junction to be connected smoothly.
  • The present invention evaluates the waveform similarity at the junction of the phoneme pieces to be connected in terms of signal level and, based on this, corrects the time axis of the phoneme piece appropriately.
  • (101) is the audio signal input terminal.
  • (102) is the audio signal output terminal.
  • (103) is the analog-to-digital conversion circuit (hereinafter referred to as the A/D converter) that converts the audio signal into digital data.
  • (104) is a 2-byte storage element
  • When the control input terminal of the storage element (104) is at logic level "0", the digital value applied to the data input terminals is stored at the address given to the address input terminals.
  • When the control input terminal is at logic level "1", the contents of the address given to the address input terminals are output to the data output terminals.
  • (106) and (108) are clock generation circuits. The output of the clock generation circuit (106) is supplied to the clock input terminal of the read counter (107) and advances the output of the read counter (107).
  • The read counter (107) is a counter whose initial value is set by the output of the arithmetic circuit (105). The method of setting this initial value is described here.
  • The arithmetic circuit (105) applies a pulse to the clear input terminal (CL) of the read counter (107) and clears the output of the read counter (107).
  • The arithmetic circuit (105) then supplies, from its SCC terminal through the OR gate (120), as many pulses as the value to be set, which sets the initial value of the read counter (107). Note that the period at which the initial value is set is the interval over which the output (fR) of the clock generation circuit (106) is counted.
  • The output of the read counter (107) continues from the value that was set in the previous period, and this value is renewed at each setting.
  • The initial setting of the read counter (107) by the arithmetic circuit (105) is performed using the output (fH) of the circuit (125), as shown in Fig. 5.
  • (fH) has a sufficiently short period compared with (fR), and is connected to one input terminal of the AND gate (122) and to an input terminal of the arithmetic circuit (105). To set the initial value of the read counter (107), the arithmetic circuit (105) applies logic level "0" to the input of the AND gate (121), and the setting pulses are passed by the AND gate (122).
  • the output of the NAN 3 gate (117) (2! C) is 77 (the start of the figure a)] 9
  • the start is delayed, A :: Gate ( ⁇ 5) O
  • (111) is a latch circuit: when its control input terminal is at logic level "0" it passes its input through, and when the level is "1" it latches and holds the current data.
  • (112) is a digital-to-analog (D/A) conversion circuit
  • The output of the A/D converter (103) is stored in the RAM (104). Since the frequency divider circuit (109) steps at the period of the write clock (fw), the sampled audio signal is stored at consecutive addresses of the RAM (104), wrapping around to address 0 after the highest address.
  • The audio signal sampled according to the write clock (fw) and stored as digital values in the RAM (104) is read out according to the read clock (fR), passed to the D/A converter (112), and regenerated as an analog audio signal.
  • The ratio of the write clock (fw) to the read clock (fR) is the ratio by which the time axis is converted. The read counter (107) is stepped at the period of the read clock (fR), and therefore the address from which the contents of the RAM (104) are read is also stepped at the period of (fR).
  • The latch circuit (111) is provided so that data from a wrong address is not passed on when the RAM (104) is read. That is, the RAM (104) is always being read except while a write is in progress.
  • As in the device of Fig. 1, the read position of the phoneme piece to be connected needs to be corrected in time; in this device the correction is made by the arithmetic circuit (105) under program control.
  • Fig. 7 shows the operation of the arithmetic circuit (105).
  • Each processing cycle is the period in which N read clocks (fR) are counted, and the time axis (t) is drawn in units of the write clock (fw).
  • In [processing cycle 1], the last M samples are stored according to the write clock (fw). In [processing cycle 2], the first (M + r) samples are taken in, and from these and the M samples mentioned above the point (K) with the highest degree of similarity is calculated. The calculation of (K) is described later.
  • Fig. 8 (a) and (b) show, respectively, the M samples at the rear end of the preceding phoneme piece written in [processing cycle 1] of Fig. 7 and the samples at the front end of the succeeding phoneme piece written in [processing cycle 2].
  • Let the sample sequence at the rear end of the preceding phoneme piece be (Xp) (p = 1, 2, ..., M) and the sample sequence at the front end of the succeeding phoneme piece be (Yp) (p = 1, 2, ..., M + r).
  • (Xp) and (Yp) are obtained by sampling the output of the A/D converter (103) with the write clock (fw).
  • The similarity can be evaluated by calculating the squared error (ek) between (Xp) and (Yp+k) for each shift k.
  • Equation (2) covers the comparison of two waveforms that differ in amplitude and level: the waveforms are normalized by their standard deviation, and the error is then calculated as the sum of squared differences from the average level.
  • In the present invention, however, the waveforms handled are close together in time, so their amplitudes and levels are similar. In this case, Eq. (2) can be replaced by the simple difference between the two waveforms.
  • Equation (3) can be further replaced by the following equation, Eq. (4), which is the sum of the absolute values of the differences between corresponding sample values. The connection timing is determined by finding the k that minimizes it.
  • In the present invention, to minimize the calculation time, Eq. (4) may in turn be replaced by
  • $e_k = \sum_{p=1}^{M} (X_p \oplus Y_{p+k})$ (5)
  • Here (Xp) and (Yp+k) are only the most significant bit of the A/D converter output; alternatively, the polarity of the input signal about its zero crossings may be used. In this case (Xp) and (Yp+k) each take the value [1] or [0].
  • The symbol ⊕ denotes exclusive OR, so (Xp ⊕ Yp+k) is [0] when (Xp) and (Yp+k) are both [1] or both [0], and [1] otherwise.
  • The audio signal applied to the input terminal (101) is converted to digital data by the A/D converter (103).
  • The sample sequences (Xp) and (Yp) are obtained by sampling with the write clock (fw), which is the output of the clock generation circuit (108). The timing for capturing the sequences (Xp) and (Yp) is indicated to the arithmetic circuit (105) by the value of the output of the frequency divider circuit (109).
  • The read clock (fR) output by the clock generation circuit (106) is counted, and when N of these have been counted, the processing cycle that sets the initial value of the read counter (107) is entered. The initial value of the read counter is the value (k) obtained by the calculation on (Xp) and (Yp), to which the value indicated by the frequency divider circuit (109) when (Yp) was captured is added.
  • The arithmetic circuit (105) performs the similarity calculation.
  • The sample sequence is taken from the analog input applied to the input terminal (101).
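The similarity search described in the definitions above can be illustrated in software. What follows is only a minimal sketch in Python, not the circuit of the patent: the function names (`sign_bits`, `abs_error`, `xor_error`, `best_shift`, `tone`), the shift range k = 0..r, zero-based indexing, and the sine test signal are all assumptions made for the example. It compares the M tail samples of a preceding piece with the (M + r) head samples of a succeeding piece, once with a sum of absolute differences and once with an exclusive OR of one-bit (sign) data, and returns the shift k with the smallest error.

```python
import math


def sign_bits(samples):
    """Keep one bit per sample: 1 if non-negative, else 0.

    Stand-in (an assumption) for using only the most significant bit of the A/D output.
    """
    return [1 if s >= 0 else 0 for s in samples]


def abs_error(x, y, k):
    """Sum of |x[p] - y[p + k]| over the M samples of x."""
    return sum(abs(xv - y[p + k]) for p, xv in enumerate(x))


def xor_error(x_bits, y_bits, k):
    """Number of positions where x_bits[p] and y_bits[p + k] differ."""
    return sum(xb ^ y_bits[p + k] for p, xb in enumerate(x_bits))


def best_shift(x, y, error):
    """Return the shift k in 0..r that minimizes error(x, y, k), with r = len(y) - len(x)."""
    r = len(y) - len(x)
    if r < 0:
        raise ValueError("y must be at least as long as x")
    return min(range(r + 1), key=lambda k: error(x, y, k))


def tone(n, period=20):
    """Hypothetical test signal: a sine sampled between its zero crossings."""
    return math.sin(2 * math.pi * (n + 0.5) / period)


M, r = 20, 19
x = [tone(p + 5) for p in range(M)]   # tail of the preceding piece (assumed data)
y = [tone(p) for p in range(M + r)]   # head of the succeeding piece (assumed data)

print(best_shift(x, y, abs_error))
print(best_shift(sign_bits(x), sign_bits(y), xor_error))
```

For this test signal both searches select the same shift, k = 5, which is the offset built into `x`. The one-bit version needs only an exclusive OR and a count per sample, which is the kind of saving in calculation time that the text attributes to Eq. (5).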

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

A voice synthesizer for editing and synthesizing sound element segments extracted from an analog speech waveform. It converts an analog speech signal into a digital signal, shifts the data near the rear end of the preceding sound element segment and the data near the front end of the following sound element segment relative to each other by means of an arithmetic control unit that calculates the degree of similarity, and reads the data of the following segment out of memory with a timing such that the following segment is joined to the preceding segment as continuously as possible. Consequently, the abrupt change in waveform produced at the junction between the preceding and following segments, that is, the high-frequency noise caused by the waveform discontinuity, the deterioration of the signal-to-noise ratio of the synthesized sound, and the deterioration of articulation, can be practically eliminated, and a synthesized sound can be obtained that has neither a discontinuous waveform nor a variation in sound frequency at the junction.
PCT/JP1982/000233 1981-06-18 1982-06-18 Synthetiseur vocal WO1982004493A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
DE1982901856 DE81595T1 (de) 1981-06-18 1982-06-18 Sprachsynthesizer.
DE8282901856T DE3277258D1 (en) 1981-06-18 1982-06-18 Voice synthesizer

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP81/94802810618 1981-06-18
JP56094802A JPS602680B2 (ja) 1981-06-18 1981-06-18 音声合成装置

Publications (1)

Publication Number Publication Date
WO1982004493A1 (fr) 1982-12-23

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP1982/000233 WO1982004493A1 (fr) 1981-06-18 1982-06-18 Synthetiseur vocal

Country Status (5)

Country Link
US (1) US4658369A (fr)
EP (1) EP0081595B1 (fr)
JP (1) JPS602680B2 (fr)
DE (1) DE3277258D1 (fr)
WO (1) WO1982004493A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0114123A1 (fr) * 1983-01-18 1984-07-25 Matsushita Electric Industrial Co., Ltd. Dispositif pour la production d'ondes

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4802224A (en) * 1985-09-26 1989-01-31 Nippon Telegraph And Telephone Corporation Reference speech pattern generating method
JPH0727397B2 (ja) * 1988-07-21 1995-03-29 シャープ株式会社 音声合成装置
JPH05827Y2 (fr) * 1989-01-27 1993-01-11
US5408583A (en) * 1991-07-26 1995-04-18 Casio Computer Co., Ltd. Sound outputting devices using digital displacement data for a PWM sound signal
US5355430A (en) * 1991-08-12 1994-10-11 Mechatronics Holding Ag Method for encoding and decoding a human speech signal by using a set of parameters
US5802250A (en) * 1994-11-15 1998-09-01 United Microelectronics Corporation Method to eliminate noise in repeated sound start during digital sound recording
JP3053576B2 (ja) 1996-08-07 2000-06-19 オリンパス光学工業株式会社 コードイメージデータ出力装置及び出力方法
US10262646B2 (en) 2017-01-09 2019-04-16 Media Overkill, LLC Multi-source switched sequence oscillator waveform compositing system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS4881008A (fr) * 1973-01-13 1973-10-30
JPS5062709A (fr) * 1973-10-05 1975-05-28
JPS5597000A (en) * 1979-01-19 1980-07-23 Sanyo Electric Co Sound synthesizer

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US31172A (en) * 1861-01-22 Improvement in plows
US3104284A (en) * 1961-12-29 1963-09-17 Ibm Time duration modification of audio waveforms
US3369077A (en) * 1964-06-09 1968-02-13 Ibm Pitch modification of audio waveforms
US3588353A (en) * 1968-02-26 1971-06-28 Rca Corp Speech synthesizer utilizing timewise truncation of adjacent phonemes to provide smooth formant transition
US3575555A (en) * 1968-02-26 1971-04-20 Rca Corp Speech synthesizer providing smooth transistion between adjacent phonemes
FR2364520A2 (fr) * 1976-09-09 1978-04-07 Anvar Procede et dispositif de division de frequences audibles supprimant les distorsions de raccordement du signal de sortie
US4210781A (en) * 1977-12-16 1980-07-01 Sanyo Electric Co., Ltd. Sound synthesizing apparatus
US4369336A (en) * 1979-11-26 1983-01-18 Eventide Clockworks, Inc. Method and apparatus for producing two complementary pitch signals without glitch
US4464784A (en) * 1981-04-30 1984-08-07 Eventide Clockworks, Inc. Pitch changer with glitch minimizer

Also Published As

Publication number Publication date
EP0081595A1 (fr) 1983-06-22
JPS602680B2 (ja) 1985-01-23
US4658369A (en) 1987-04-14
DE3277258D1 (en) 1987-10-15
EP0081595A4 (fr) 1983-10-04
EP0081595B1 (fr) 1987-09-09
JPS57208598A (en) 1982-12-21

Similar Documents

Publication Publication Date Title
US8185386B2 (en) Method and apparatus for performing packet loss or frame erasure concealment
US7881925B2 (en) Method and apparatus for performing packet loss or frame erasure concealment
CA2335006C (fr) Procede et appareil destines a effectuer un masquage de pertes de paquets ou d'effacement de trame (fec)
US5153913A (en) Generating speech from digitally stored coarticulated speech segments
JPS5919358B2 (ja) 音声内容伝送方式
US20070055498A1 (en) Method and apparatus for performing packet loss or frame erasure concealment
WO1982004493A1 (fr) Synthetiseur vocal
US6961697B1 (en) Method and apparatus for performing packet loss or frame erasure concealment
JP3829134B2 (ja) 生成装置、再生装置、生成方法、再生方法、および、プログラム
JP2847699B2 (ja) 音声合成装置
JPH0642158B2 (ja) 音声合成装置
JPS6295595A (ja) 音声応答方式
JP2577372B2 (ja) 音声合成装置および方法
JPH035599B2 (fr)
JP2990693B2 (ja) 音声合成装置
JP2547612B2 (ja) 文章作成システム
JPS61252598A (ja) 音声単語編集方式
JP2861005B2 (ja) 音声蓄積再生装置
JP2992995B2 (ja) 音声合成装置
JPS6042959B2 (ja) アナログ信号合成装置
JP2990691B2 (ja) 音声合成装置
JPS63210900A (ja) 音声合成装置
JPH0358518B2 (fr)
JPS635400A (ja) 音声コ−ド変換器
JPS6265098A (ja) 音楽用ボコ−ダ

Legal Events

Date Code Title Description
AK Designated states

Designated state(s): US

AL Designated countries for regional patents

Designated state(s): CH DE FR GB NL

WWE Wipo information: entry into national phase

Ref document number: 1982901856

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1982901856

Country of ref document: EP

WWG Wipo information: grant in national office

Ref document number: 1982901856

Country of ref document: EP