CN2699429Y

CN2699429Y - Sound synthesizing apparatus and storage media with stored program

Info

Publication number: CN2699429Y
Application number: CNU2004200079068U
Authority: CN
Inventors: 吉冈靖雄
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2003-03-06
Filing date: 2004-03-08
Publication date: 2005-05-11
Anticipated expiration: 2014-03-08
Also published as: KR100620484B1; CN100565667C; TWI254276B; JP3888315B2; TW200425056A; JP2004271750A; CN1527279A; KR20040079330A; HK1068185A1

Abstract

The utility model relates to a sound synthesizing device and a storage medium storing a program. A sound synthesis unit has a plurality of original signal output units 41. Each original signal output unit 41 has a signal generator 41a, which generates a signal Sa whose level changes periodically, and a multiplier 41b, which multiplies the signal Sa by an amplitude value. A control device 10 indicates to each signal generator 41a the phase p of the signal Sa at the start of a pitch period. A phase p of 0 is indicated to the signal generators 41a of the odd-numbered original signal output units 41, and a phase p of π is indicated to the signal generators 41a of the even-numbered original signal output units 41. Natural sound can thereby be synthesized with a simple configuration.

Description

Sound synthesizing device and medium storing a program

Technical field

The utility model relates to a sound synthesizing device that generates a signal representing a desired sound by adding together a plurality of signals, and to a medium storing a program that causes a computer to function as this sound synthesizing device.

Background technology

Techniques for synthesizing a desired sound, such as a human voice or the sound of a musical instrument, have long been proposed. One widely used method is sound synthesis by CSM (Composite Sinusoidal Modeling), hereinafter the "CSM sound synthesis method" (see Patent Document 1). In this method, the instantaneous value yt of the signal representing the desired sound (hereinafter the "synthetic sound signal") is expressed, as in formula (1) below, as the sum of a plurality of sine waves that differ from one another in angular frequency and amplitude.

$$ y_t = a_1 \sin \omega_1 t + a_2 \sin \omega_2 t + \dots + a_i \sin \omega_i t + \dots + a_n \sin \omega_n t \qquad (1) $$

In formula (1), "ai" and "ωi" (where i satisfies 1≤i≤n) are the amplitude and the angular frequency of the i-th sine wave, respectively. "t" is an integer representing time, and "n" is the number of sine waves to be added (the so-called CSM order).
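Formula (1) can be evaluated directly. The following is a minimal sketch in Python, not part of the original disclosure; the function name `csm_sample` and the numerical values of the amplitudes and angular frequencies are hypothetical and chosen only for illustration.

```python
import math

def csm_sample(t, amplitudes, angular_freqs):
    """Evaluate formula (1) at integer time t: the sum of a_i * sin(w_i * t)."""
    return sum(a * math.sin(w * t) for a, w in zip(amplitudes, angular_freqs))

# Hypothetical values for n = 4 sine waves (angular frequency in radians per sample).
amplitudes = [1.0, 0.5, 0.25, 0.125]
angular_freqs = [0.05, 0.11, 0.19, 0.31]

# First eight samples of the summed signal.
samples = [csm_sample(t, amplitudes, angular_freqs) for t in range(8)]
```

Because every term is a sine with phase 0, the sum is 0 at t = 0 and every term rises together immediately afterwards.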

Fig. 7 is a block diagram showing the configuration of a device that synthesizes sound by this CSM sound synthesis method. As shown in the figure, the sound synthesizing device has a sound synthesis unit 70 and a control device 80. The sound synthesis unit 70 has: n (here n=4) sine wave output units 71 (71-1, 71-2, 71-3 and 71-4), each of which generates and outputs a sine wave; an adder 72 that adds together and outputs the output signals of the sine wave output units 71; and an envelope processing unit 73 that multiplies the signal output from the adder 72 by a specific envelope signal and outputs the result as the synthetic sound signal S. Each sine wave output unit 71 includes a sine wave generator 71a that generates a sine wave of the indicated angular frequency and a multiplier 71b that multiplies this sine wave by the indicated amplitude. The control device 80 shown in Fig. 7 indicates to the sine wave generator 71a and the multiplier 71b of each sine wave output unit 71 the angular frequency and the amplitude corresponding to the content of the sound to be synthesized, and indicates to the envelope processing unit 73 a time length T corresponding to the pitch of the synthetic sound (hereinafter the "pitch period").

In this configuration, as shown in Fig. 8(a) to (d), the four sine wave generators 71a all begin generating their sine waves at the start of each pitch period T. That is, the phase of every sine wave at the start of the pitch period is 0 for all the sine wave generators 71a. The envelope processing unit 73 then multiplies the sum (Fig. 8(e)) of the signals obtained by multiplying these sine waves by their amplitudes (hereinafter "original signals") by an envelope signal (Fig. 8(f)) whose time length equals the pitch period T, yielding a synthetic sound signal S with the waveform shown in Fig. 8(g). Fig. 9(a) and (b) show the waveform and the spectrogram of this synthetic sound signal S, respectively. In Fig. 9(a), the horizontal axis represents time and the vertical axis represents signal intensity. In Fig. 9(b), the horizontal axis represents time, the vertical axis represents frequency, and the shading represents the distribution of signal intensity.

Patent Document 1: Japanese Unexamined Patent Application Publication No. H11-231875 (in particular page 3 and Fig. 1)

However, as shown in Fig. 9(a), the signal intensity of the synthetic sound signal S obtained with the configuration of Fig. 7 is biased to one side (here the "+" side) of the level "0" reference, and, as shown in Fig. 9(b), a conspicuous high-frequency noise component appears near the start of each pitch period T. Sound produced from such a signal is noisy and harsh to the ear. Various kinds of filtering could be applied to the synthetic sound signal S to suppress the bias in signal intensity or the noise component. In that case, however, a filter must be provided in addition to the configuration that produces the synthetic sound signal S, which increases the processing load or complicates the configuration.

The utility model content

The utility model has been made in view of the above problems. Its purpose is to provide a sound synthesizing device that can synthesize natural sound with a simple configuration, and a medium storing a program that causes a computer to function as this sound synthesizing device.

As described above, the signal intensity of the synthetic sound signal S obtained with the configuration of Fig. 7 is biased to one side of the level "0" reference, and the signal contains noise near the start of each pitch period. The applicant considers that one cause of these properties is that all the original signals begin to increase simultaneously just after the start of the pitch period.

That is, because all the original signals begin to increase simultaneously at the start of the pitch period, the waveform of the synthetic sound signal changes abruptly and discontinuously at the boundary between pitch periods. More specifically, as shown in Fig. 9(c), which enlarges Fig. 9(a) along the time axis, the waveform of the synthetic sound signal is discontinuous at the start of the pitch period. As a result of this discontinuity, the synthetic sound signal contains noise at the start of each pitch period.

In addition, the envelope signal usually has a waveform that is at its maximum just after the start of the pitch period, when all the original signals increase simultaneously, and then decays. Therefore, in the synthetic sound signal obtained by multiplying the sum of all the original signals by the envelope signal, the integral of the part on the "+" side of the level "0" reference is larger than the integral of the part on the "-" side, and this imbalance biases the signal intensity of the synthetic sound signal to one side.
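The imbalance of the integrals can be checked for a simple case. The derivation below is only an illustrative sketch: it assumes a linearly decaying envelope e(t) = 1 - t/T, a continuous time variable and a single sine wave of phase 0, none of which is specified in the original text.

```latex
\begin{aligned}
I(\omega) &= \int_0^T \Bigl(1 - \frac{t}{T}\Bigr)\sin(\omega t)\,dt \\
          &= \frac{1 - \cos\omega T}{\omega}
             - \frac{1}{T}\Bigl(-\frac{T\cos\omega T}{\omega} + \frac{\sin\omega T}{\omega^{2}}\Bigr) \\
          &= \frac{1}{\omega}\Bigl(1 - \frac{\sin\omega T}{\omega T}\Bigr) > 0
          \qquad (\omega > 0,\ T > 0).
\end{aligned}
```

Since sin x < x for every x > 0, the net area I(ω) is positive for each sine wave under this assumed envelope, so a sum of such terms with positive amplitudes is biased toward the "+" side. Starting the same sine wave at phase π changes the sign of the integrand and gives -I(ω).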

The utility model is based on the above findings and is characterized by comprising: a plurality of output means, each of which outputs, for every pitch period (a time length corresponding to the pitch of the synthetic sound to be generated), an original signal whose level changes periodically; synthetic sound generating means that adds together the original signals output from the plurality of output means to generate a synthetic sound signal representing the synthetic sound; and instruction means that indicates to each of the plurality of output means the original signal it should output, the instruction means instructing the output means such that the level of the original signal output from some of the output means increases just after the start of the pitch period, while the level of the original signal output from the other output means decreases just after the start of the pitch period.

In this configuration, the instruction means indicates the state of the original signal to each output means such that the direction in which the level changes just after the start of the pitch period differs between one or more of the output means and the rest. That is, the situation in which all the original signals begin to increase simultaneously just after the start of the pitch period is avoided. According to the utility model, therefore, the bias in signal intensity and the noise at each pitch period can be suppressed, and natural sound obtained, without filtering the synthetic sound signal.

In the utility model, the following configurations, for example, can make the direction of the level change just after the start of the pitch period differ between the original signals output from some of the output means and the original signals output from the other output means. In a first configuration, the instruction means indicates the phase of the original signal to each output means such that the phase at the start of the pitch period differs between the original signals output from some of the output means and those output from the other output means. In a second configuration, each output means multiplies a signal whose level changes periodically by an amplitude and outputs the resulting signal as the original signal, and the instruction means indicates amplitudes of opposite sign to some of the output means and to the other output means as the amplitudes to be used in this multiplication. In a third configuration, each output means applies an envelope signal, which has a time length equal to the pitch period and represents the change over time of the amplitude of the original signal, to the signal obtained by the amplitude multiplication and outputs the resulting signal as the original signal, and the instruction means indicates envelope signals whose amplitudes are of opposite sign to some of the output means and to the other output means.

Where the plurality of output means output original signals of mutually different angular frequencies, it is desirable that the one or more output means ranked odd when counted in ascending order of the angular frequency of their original signals be the above "some of the output means", and that the one or more output means ranked even be the above "other output means".

Each output means may also have signal generating means capable of generating any of several signals of different waveforms, and may output the original signal based on the signal generated by this signal generating means; the instruction means then indicates to each output means which of the several signals its signal generating means should generate. With this configuration, because the original signal is output based on any one of several signals, the sound quality (timbre) of the resulting synthetic sound can be varied. If, in this configuration, the instruction means indicates to each output means the signal selected by the user from among the several signals as the signal its signal generating means should generate, a synthetic sound to the user's liking can be obtained.

The utility model can also be specified as a program that causes a computer to function as the sound synthesizing device of the utility model. This program may be supplied to the computer over a network, or may be supplied stored on a recording medium typified by a CD and installed on the computer.

Description of drawings

Fig. 1 is a block diagram showing the configuration of a sound synthesizing device according to an embodiment of the utility model.

Fig. 2 is a block diagram showing the configuration of the sound synthesis unit of this sound synthesizing device.

Fig. 3 is a timing chart showing the signals output from the signal generator of each output unit.

Fig. 4 is a timing chart showing the phase of the signal output from the signal generator of each output unit.

Fig. 5 shows the waveform and the spectrogram of the synthetic sound signal.

Fig. 6 is a block diagram showing the configuration of a sound synthesizing device according to a modification of this embodiment.

Fig. 7 is a block diagram showing the configuration of a conventional sound synthesizing device.

Fig. 8 is a timing chart showing the waveforms of the signals processed in the conventional sound synthesizing device.

Fig. 9 shows the waveform and the spectrogram of a conventional synthetic sound signal.

Description of reference symbols

100: sound synthesizing device, 10: control device (instruction means), 20: storage device, 30: input device, 40: sound synthesis unit, 50: speaker, 41 (41-1, 41-2, 41-3, 41-4): original signal output unit (output means), 41a: signal generator (signal generating means), 41b: multiplier, 41c, 43: envelope processing unit, 42: adder (synthetic sound generating means), Sa: periodic signal, Sb: original signal, Sd: synthetic sound signal.

Embodiment

An embodiment of the utility model is described below with reference to the drawings.

(A: Configuration of the embodiment)

Fig. 1 is a block diagram showing the configuration of the sound synthesizing device according to the embodiment of the utility model. As shown in the figure, the sound synthesizing device 100 has a control device 10, a storage device 20, an input device 30, a sound synthesis unit 40 and a speaker 50. The storage device 20, the input device 30 and the sound synthesis unit 40 are each connected to the control device 10 via a bus. The speaker 50 is connected to the sound synthesis unit 40.

The control device 10 controls the sound synthesizing device 100 as a whole. Specifically, the control device 10 has a CPU (Central Processing Unit) that controls each part and performs various computations by executing programs, a ROM (Read Only Memory) that stores the programs executed by the CPU, and a RAM (Random Access Memory) that the CPU uses as a work area.

The storage device 20 stores the programs executed by the control device 10 and the data used during their execution. Various devices, such as a hard disk device or an optical disc device, can be used as the storage device 20. The storage device 20 stores a program for performing sound synthesis in the sound synthesis unit 40 (hereinafter the "sound synthesis program"). By executing this sound synthesis program, the control device 10 functions as means for giving the sound synthesis unit 40 various instructions relating to sound synthesis.

The input device 30 has a pointing device such as a mouse, a keyboard for entering characters and symbols, and the like, and outputs signals corresponding to the user's operations to the control device 10. By operating the input device 30, the user can select as desired, for example, the pitch of the sound to be synthesized or the waveform of the signals to be used for sound synthesis.

The sound synthesis unit 40 generates a synthetic sound signal from a plurality of original signals in accordance with instructions from the control device 10, and outputs it. The speaker 50 outputs sound corresponding to the synthetic sound signal output from the sound synthesis unit 40. Although the speaker 50 is given here as an example of a device for outputting sound, earphones or headphones worn on the user's ears may be provided instead.

Fig. 2 is a block diagram showing the specific configuration of the sound synthesis unit 40. As shown in the figure, the sound synthesis unit 40 has a plurality of original signal output units 41 (41-1, 41-2, 41-3 and 41-4), an adder 42 and an envelope processing unit 43. The number of original signal output units 41 corresponds to the number of signals to be added in the CSM sound synthesis method (that is, the CSM order n). The present embodiment assumes that four original signal output units 41 are provided. In the following description, where the individual original signal output units 41 need to be distinguished, the index i is used and they are written "original signal output unit 41-i".

The original signal output units 41 generate and output original signals Sb that differ from one another in angular frequency and amplitude. Each original signal output unit 41 has a signal generator 41a and a multiplier 41b. The signal generator 41a generates and outputs a signal Sa whose level varies periodically (hereinafter the "periodic signal"). The signal generator 41a of the present embodiment can generate, as the periodic signal Sa, any one selected from several signals of different waveforms, such as a sine wave, a triangular wave and a square wave. Specifically, the signal generator 41a has a memory that stores in advance waveform data representing one cycle of each of these signals, and it generates a signal with a specific waveform by repeating the waveform represented by one set of waveform data continuously in time. The multiplier 41b multiplies the periodic signal Sa output from the signal generator 41a by an amplitude and outputs the result as the original signal Sb.
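One way to picture the signal generator 41a and the multiplier 41b is a table-lookup oscillator. The sketch below is illustrative only: the table size, the exact triangular and square wave shapes, and all function names are assumptions and are not taken from the original text.

```python
import math

TABLE_SIZE = 64  # hypothetical number of stored samples per cycle

def build_table(kind):
    """Return one cycle of the requested waveform (stands in for the stored waveform data)."""
    table = []
    for k in range(TABLE_SIZE):
        x = k / TABLE_SIZE  # position within the cycle, 0 <= x < 1
        if kind == "sine":
            table.append(math.sin(2 * math.pi * x))
        elif kind == "triangle":
            # rises 0 -> 1 in the first quarter, falls to -1, then returns to 0
            table.append(4 * x if x < 0.25 else (2 - 4 * x if x < 0.75 else 4 * x - 4))
        elif kind == "square":
            table.append(1.0 if x < 0.5 else -1.0)
        else:
            raise ValueError(f"unknown waveform: {kind}")
    return table

def periodic_signal(table, omega, phase, num_samples):
    """Signal generator 41a: read the table cyclically at phase omega * t + phase."""
    out = []
    for t in range(num_samples):
        cycle_pos = ((omega * t + phase) / (2 * math.pi)) % 1.0
        out.append(table[int(cycle_pos * TABLE_SIZE) % TABLE_SIZE])
    return out

def original_signal(table, omega, phase, amplitude, num_samples):
    """Multiplier 41b: scale the periodic signal Sa by the amplitude to obtain Sb."""
    return [amplitude * s for s in periodic_signal(table, omega, phase, num_samples)]
```

Selecting a different waveform then amounts to passing a different table, for example `original_signal(build_table("square"), 0.1, 0.0, 0.5, 200)`.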

The adder 42 adds together the original signals Sb output from the four original signal output units 41 and outputs the resulting signal Sc. The envelope processing unit 43 changes the envelope of the amplitude of the signal Sc output from the adder 42. That is, the envelope processing unit 43 has an envelope generator 43a that generates an envelope signal representing the change of amplitude over time, multiplies the signal Sc output from the adder 42 by the envelope signal so generated, and outputs the product as the synthetic sound signal Sd.
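The adder 42 and the envelope processing unit 43 reduce to a sample-by-sample sum and a sample-by-sample product. The following sketch assumes a linearly decaying envelope, which the original text does not specify; the function names are hypothetical.

```python
def add_signals(original_signals):
    """Adder 42: sum the original signals Sb sample by sample to obtain Sc."""
    return [sum(samples) for samples in zip(*original_signals)]

def linear_decay_envelope(period_len):
    """Assumed envelope shape: maximum at the start, decaying linearly toward 0."""
    return [1.0 - t / period_len for t in range(period_len)]

def apply_envelope(sc, envelope):
    """Envelope processing unit 43: multiply Sc by the envelope to obtain Sd."""
    return [s * e for s, e in zip(sc, envelope)]
```

For instance, `apply_envelope(add_signals([sb1, sb2, sb3, sb4]), linear_decay_envelope(len(sb1)))` would yield one pitch period of Sd from four lists of equal length.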

(B: Operation of the embodiment)

When the user performs a prescribed operation on the input device 30, the control device 10 reads the sound synthesis program stored in the storage device 20 into the RAM and then executes it sequentially. The sound synthesis operation performed as this sound synthesis program is executed is described below.

Before sound synthesis starts, the user can select the type of periodic signal Sa that each signal generator 41a should generate by performing a prescribed operation on the input device 30. On detecting this operation, the control device 10 outputs data representing the selected signal type (hereinafter "waveform selection data") Ds to the signal generator 41a of each original signal output unit 41. Each signal generator 41a identifies the signal it should generate from this waveform selection data Ds. The user can also select the type of periodic signal Sa output from the signal generators 41a at any time while sound synthesis is in progress. When this operation is performed, waveform selection data Ds representing the selected signal type is output from the control device 10 to each signal generator 41a, and the type of periodic signal Sa to be output from the signal generators 41a is identified.

When the user performs a prescribed operation on the input device 30 to instruct the start of sound synthesis, the control device 10 instructs each signal generator 41a to start generating its signal. At intervals of a predetermined time length (for example about 5 ms to 30 ms), the control device 10 then calculates frame parameters corresponding to the content of the sound to be synthesized and outputs them to the sound synthesis unit 40. As shown in Fig. 2, the frame parameters comprise the angular frequencies ωi (ω1, ω2, ω3 and ω4), the initial phases pi (p1, p2, p3 and p4), the amplitudes ai (a1, a2, a3 and a4) and the pitch period T. The pitch period T is a time length corresponding to the pitch of the sound to be synthesized and is supplied to the envelope processing unit 43. The angular frequency ωi and the initial phase pi are supplied to the signal generator 41a of each original signal output unit 41: the angular frequency ω1 and the initial phase p1 are indicated to the signal generator 41a of the original signal output unit 41-1, the angular frequency ω2 and the initial phase p2 to the signal generator 41a of the original signal output unit 41-2, and so on. The angular frequency ωi is the angular frequency of the periodic signal Sa that the signal generator 41a of the original signal output unit 41-i should output. In the present embodiment, the angular frequency ω1 supplied to the signal generator 41a of the original signal output unit 41-1 is the lowest, and the values of ω2, ω3 and ω4 increase in that order. The initial phase pi is the phase, at the start of the pitch period, of the periodic signal Sa that the signal generator 41a of the original signal output unit 41-i should output. The amplitude ai is the amplitude by which the multiplier 41b multiplies the signal output from the signal generator 41a, and is supplied to the multiplier 41b of each original signal output unit 41. For example, the amplitude a1 is indicated to the multiplier 41b of the original signal output unit 41-1, and the amplitude a2 to the multiplier 41b of the original signal output unit 41-2.
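The contents of one set of frame parameters and their routing to the individual units can be sketched as a small data structure. The class name `FrameParameter`, its field names and the example values are hypothetical and serve only to illustrate the description above.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FrameParameter:
    angular_freqs: List[float]   # ω1..ωn, supplied to the signal generators 41a
    initial_phases: List[float]  # p1..pn, supplied to the signal generators 41a
    amplitudes: List[float]      # a1..an, supplied to the multipliers 41b
    pitch_period: float          # T, supplied to the envelope processing unit 43

    def for_unit(self, i: int) -> Tuple[float, float, float]:
        """Return (ωi, pi, ai) for original signal output unit 41-i (i counts from 1)."""
        return (self.angular_freqs[i - 1],
                self.initial_phases[i - 1],
                self.amplitudes[i - 1])

# Hypothetical example values for four units.
frame = FrameParameter(
    angular_freqs=[0.05, 0.11, 0.19, 0.31],
    initial_phases=[0.0, math.pi, 0.0, math.pi],
    amplitudes=[1.0, 0.5, 0.25, 0.125],
    pitch_period=0.008,
)
```

Calling `frame.for_unit(2)` returns the three values destined for the original signal output unit 41-2.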

When the start of a pitch period is reached, the sound synthesis unit 40 reflects the values of the frame parameters in the synthetic sound. In Fig. 3(a), downward arrows indicate the times at which frame parameters are supplied. When the start of a pitch period T is reached, the sound synthesis unit 40 reflects in the synthetic sound the frame parameters supplied most recently before that time. That is, of the frame parameters supplied successively at the times shown in Fig. 3(a), only those supplied at the times marked with a circle ("○") affect the synthetic sound. Fig. 3(a) illustrates a case in which the interval at which frame parameters are supplied is shorter than the pitch period T. However, this interval is a predetermined time length that depends on the content of the sound to be synthesized, and it may also be longer than the pitch period T.
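The selection of the frame parameters that take effect at each pitch period start can be sketched as a lookup of the most recent supply time. The function below is an illustration under the assumption that a frame supplied exactly at a period start counts as already supplied; the name `latched_frames` and the time values are hypothetical.

```python
import bisect

def latched_frames(frame_times, period_starts):
    """For each pitch period start, return the index of the frame parameter
    supplied most recently at or before that time (None if none exists yet).
    frame_times must be sorted in ascending order."""
    result = []
    for start in period_starts:
        idx = bisect.bisect_right(frame_times, start) - 1
        result.append(idx if idx >= 0 else None)
    return result

# Frames every 5 time units, pitch periods every 8: some frames are never used.
short_interval = latched_frames([0, 5, 10, 15, 20], [0, 8, 16])

# Frames every 20 time units, pitch periods every 8: one frame serves several periods.
long_interval = latched_frames([0, 20], [0, 8, 16, 24])
```

In the first case the frames with indices 2 and 4 never influence the output, which corresponds to the unmarked arrows in Fig. 3(a).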

Meanwhile, in parallel with the output of frame parameters by the control device 10, each original signal output unit 41 generates and outputs an original signal corresponding to the frame parameters. That is, the signal generator 41a of each original signal output unit 41 outputs a periodic signal Sa that is of the type specified by the waveform selection data Ds and has the angular frequency ωi indicated as a frame parameter. This output operation is performed for every pitch period T; that is, each signal generator 41a starts outputting the periodic signal Sa at the start of the pitch period T. Each signal generator 41a sets the phase of the periodic signal Sa at the start of the pitch period to the initial phase pi indicated as a frame parameter. The periodic signal Sa output from each signal generator 41a is multiplied by the amplitude ai in the multiplier 41b and then output as the original signal Sb.

In the present embodiment, the initial phases p1 and p3 indicated to the original signal output units 41-1 and 41-3, which are the odd-numbered units when the four original signal output units 41 are counted in ascending order of the indicated angular frequency ωi (hereinafter simply the "odd-numbered original signal output units 41"), are set to "0". In contrast, the initial phases p2 and p4 indicated to the original signal output units 41-2 and 41-4, which are the even-numbered units counted in the same order (hereinafter simply the "even-numbered original signal output units 41"), are set to "π". Fig. 4 is a timing chart showing the change in phase of the periodic signals Sa of the original signal output units 41. In this figure, the horizontal axis represents time and the vertical axis represents the phase of each periodic signal Sa. As shown in Figs. 3 and 4, the phase of the periodic signal Sa of each odd-numbered original signal output unit 41 is reset to "0" each time the start of a pitch period is reached, whereas the phase of the periodic signal Sa of each even-numbered original signal output unit 41 is reset to "π" each time the start of a pitch period is reached. As a result, the level of the original signals Sb output from the odd-numbered original signal output units 41 increases just after the start of each pitch period, while the level of the original signals Sb output from the even-numbered original signal output units 41 decreases just after the start of each pitch period (see Fig. 3).
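The assignment of the initial phases by rank of angular frequency can be sketched as follows. The function name `assign_initial_phases` is hypothetical; the rule itself (0 for odd ranks, π for even ranks, counted from the lowest angular frequency) is the one described above.

```python
import math

def assign_initial_phases(angular_freqs):
    """Give phase 0 to the units ranked odd and phase pi to the units ranked even,
    counting from the lowest angular frequency. The result keeps the input order."""
    order = sorted(range(len(angular_freqs)), key=lambda i: angular_freqs[i])
    phases = [0.0] * len(angular_freqs)
    for rank, i in enumerate(order, start=1):
        phases[i] = 0.0 if rank % 2 == 1 else math.pi
    return phases

# Angular frequencies already in ascending order, as in the embodiment.
phases = assign_initial_phases([0.05, 0.11, 0.19, 0.31])

# One sample after the pitch period start, phase 0 rises and phase pi falls.
rising = math.sin(0.05 * 1 + phases[0])
falling = math.sin(0.11 * 1 + phases[1])
```

Ranking by sorted order rather than by list position means the same rule still applies if the angular frequencies are supplied in a different order.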

The original signals Sb output from the four original signal output units 41 are added together by the adder 42 and supplied to the envelope processing unit 43 as the signal Sc. The envelope processing unit 43 multiplies the signal Sc by the envelope signal generated by the envelope generator 43a and outputs the resulting signal as the synthetic sound signal Sd. Here, the envelope generator 43a generates an envelope signal whose time length equals the pitch period T indicated as a frame parameter by the control device 10. The speaker 50 outputs the synthetic sound according to the synthetic sound signal Sd output from the envelope processing unit 43.

Thus, in the present embodiment, the original signals Sb output from some of the four original signal output units 41 (the original signal output units 41-1 and 41-3) increase just after the start of the pitch period, while the other original signals (those from the original signal output units 41-2 and 41-4) decrease just after the start of the pitch period. In other words, the situation in which all the original signals increase (or decrease) simultaneously just after the start of the pitch period is avoided. As a result, the present embodiment suppresses the bias in signal intensity and the noise at the start of the pitch period shown in Fig. 9, and yields a natural synthetic sound. This point is described in detail with reference to Fig. 5.

Fig. 5(a) is a graph showing the waveform of the synthetic sound signal Sd output from the envelope processing unit 43, and Fig. 5(b) is the spectrogram of this signal Sd. In Fig. 5(a), the horizontal axis represents time and the vertical axis represents signal intensity. In Fig. 5(b), the horizontal axis represents time, the vertical axis represents frequency, and the shading represents the distribution of signal intensity (lighter areas indicate greater signal intensity). Comparing Fig. 5(a) with Fig. 9(a) makes clear that, in the present embodiment, the signal intensity is distributed evenly on the "+" side and the "-" side of the level "0" reference. This is thought to be because the bias toward the "+" side, caused by the original signals Sb from the odd-numbered original signal output units 41 increasing just after the start of the pitch period, and the bias toward the "-" side, caused by the original signals Sb from the even-numbered original signal output units 41 decreasing just after the start of the pitch period, cancel each other out. Moreover, as shown in Fig. 5(b), the synthetic sound signal Sd of the present embodiment has a smaller high-frequency component (noise) just after the start of the pitch period than the conventional synthetic sound signal S shown in Fig. 9(b). Fig. 5(c), which enlarges Fig. 5(a) along the time axis, also shows clearly that the waveform of the synthetic sound signal is continuous at the start of the pitch period. Thus, the present embodiment suppresses the bias in signal intensity and the noise at the start of the pitch period shown in Fig. 9 and yields a natural synthetic sound, without any filtering to suppress that bias or noise.
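The reduction in bias can be reproduced numerically. The sketch below synthesizes one pitch period with all initial phases at 0 (the conventional configuration of Fig. 7) and with alternating phases 0 and π (the present embodiment), then compares the sums of the samples. The amplitudes, angular frequencies, period length and the linearly decaying envelope are all hypothetical choices, so the result only illustrates the tendency and does not reproduce Figs. 5 and 9.

```python
import math

def synthesize_period(amplitudes, angular_freqs, phases, period_len):
    """One pitch period of Sd: summed sine waves times an assumed linear-decay envelope."""
    out = []
    for t in range(period_len):
        sc = sum(a * math.sin(w * t + p)
                 for a, w, p in zip(amplitudes, angular_freqs, phases))
        out.append(sc * (1.0 - t / period_len))
    return out

amplitudes = [1.0, 1.0, 1.0, 1.0]          # hypothetical
angular_freqs = [0.05, 0.11, 0.19, 0.31]   # hypothetical, ascending
T = 200                                    # hypothetical period length in samples

conventional = synthesize_period(amplitudes, angular_freqs, [0.0] * 4, T)
embodiment = synthesize_period(amplitudes, angular_freqs,
                               [0.0, math.pi, 0.0, math.pi], T)

bias_conventional = sum(conventional)  # net area over the period, all phases 0
bias_embodiment = sum(embodiment)      # net area with alternating phases
```

With these values the net area of the alternating-phase signal is smaller in magnitude than that of the all-zero-phase signal, although it does not vanish; how closely the two sides balance depends on the amplitudes and angular frequencies actually supplied.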

In addition, because any one of several signals of different waveforms is selected and output as the periodic signal Sa in the present embodiment, the sound quality (especially the timbre) of the resulting synthetic sound can be varied. And because the user can select any of these signals, a synthetic sound with the sound quality the user prefers can be obtained.

(C: Modifications)

The embodiment described above is merely an example, and various modifications can be made to it without departing from the gist of the utility model. Specific modifications include, for example, the following.

(C-1: Modification 1)

Although the above embodiment illustrates a configuration in which the initial phase pi is set to "0" or "π", the value of the initial phase pi is not limited to these. That is, it is desirable to select the value of each initial phase pi appropriately so that the bias toward the "+" side, caused by the original signals from the odd-numbered original signal output units 41 increasing just after the start of the pitch period, and the bias toward the "-" side, caused by the original signals from the even-numbered original signal output units 41 decreasing just after the start of the pitch period, cancel each other out and the signal intensity is distributed evenly on both sides of the level "0" reference.

In addition, in the above-described embodiment, though illustration the formation of phase place of each original signal of the zero hour of control pitch period, the original signal that is used to avoid all is not limited thereto in the formation of the situation that increase the zero hour of pitch period simultaneously.Can replace this formation, adopt formation as shown below.

(1) mode 1

A configuration may also be adopted in which the sign of the amplitude indicated to the odd-numbered original signal output units 41 differs from the sign of the amplitude indicated to the even-numbered original signal output units 41. For example, in the configuration shown in Fig. 2, the signs of the amplitudes a1 and a3 are set to "+", while the signs of the amplitudes a2 and a4 are set to "-". In this configuration, the original signals from the odd-numbered original signal output units 41 and those from the even-numbered original signal output units 41 are in effect shifted in phase by exactly "π" relative to each other, so the waveform of each original signal is the same as that shown in Fig. 3(b). The same effect as in the above embodiment is therefore obtained.
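The equivalence between a negative amplitude and a phase shift of π follows from the identity sin(x + π) = -sin(x). The short check below is a sketch that assumes a sinusoidal periodic signal; the sampling rate, frequency, and amplitude are hypothetical values for a single even-numbered unit.

```python
import math

def original_signal(amplitude, freq_hz, phase, sample_rate, n_samples):
    """Output of one unit: amplitude * sin(w * t + phase)."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate + phase)
            for n in range(n_samples)]

# Hypothetical values, not taken from the source.
SAMPLE_RATE = 44100
FREQ_HZ = 1500.0
AMPLITUDE = 0.8

# Mode 1: negative amplitude, initial phase 0.
negated_amplitude = original_signal(-AMPLITUDE, FREQ_HZ, 0.0, SAMPLE_RATE, 64)
# Embodiment: positive amplitude, initial phase pi.
shifted_phase = original_signal(AMPLITUDE, FREQ_HZ, math.pi, SAMPLE_RATE, 64)

# Largest sample-wise difference between the two outputs (floating-point rounding only).
max_difference = max(abs(x - y) for x, y in zip(negated_amplitude, shifted_phase))
```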

(2) mode 2

A configuration may also be adopted in which, of the 4 original signal output units 41, the signal output from the multiplier 41b of each odd-numbered original signal output unit 41 and the signal output from the multiplier 41b of each even-numbered original signal output unit 41 are multiplied by envelope signals of different signs and then output as the original signals. That is, as shown in Fig. 6, each of the 4 original signal output units 41 is provided with an envelope signal handling part 41c that multiplies the output signal of the multiplier 41b by the envelope signal indicated by the control device 10. Specifically, the envelope signal handling part 41c of each odd-numbered original signal output unit 41 uses an envelope signal that increases toward the "+" side from the start of the pitch period and then decreases to reach "0" at the end of the pitch period, while that of each even-numbered original signal output unit 41 uses an envelope signal of the opposite sign. As a result, the original signals output from the odd-numbered original signal output units 41 increase immediately after the start of the pitch period, while the original signals output from the even-numbered original signal output units 41 decrease immediately after the start of the pitch period. That is, because the situation in which all the original signals increase simultaneously immediately after the start of the pitch period is avoided, the same effect as in the above embodiment is obtained.

Thus, in the present utility model, any configuration suffices in which the level of some of the plurality of original signals increases immediately after the start of the pitch period while the level of the other original signals decreases immediately after the start of the pitch period; how the levels of the original signals are made to behave in this way does not matter.
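Mode 2 can be sketched as follows. The source does not specify the shape of the envelope signal, so this sketch assumes a half-sine envelope that is 0 at the start and end of the pitch period, and it assumes that the even-numbered units receive the same envelope with a negative sign. The function names, sampling rate, pitch period length, frequencies, and amplitudes are all hypothetical.

```python
import math

def envelope(n, period_samples, sign):
    """Hypothetical envelope: half a sine cycle, 0 at the start and end of the pitch period."""
    return sign * math.sin(math.pi * n / period_samples)

def unit_output(unit_number, amplitude, freq_hz, sample_rate, period_samples):
    """Multiplier 41b output a * sin(w * t), followed by the per-unit envelope stage 41c."""
    sign = 1.0 if unit_number % 2 == 1 else -1.0  # odd units "+", even units "-"
    return [envelope(n, period_samples, sign)
            * amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(period_samples)]

# Hypothetical parameters, not taken from the source.
SAMPLE_RATE = 44100
PERIOD_SAMPLES = 441  # a 10 ms pitch period at 44.1 kHz
FREQS_HZ = [500.0, 1500.0, 2500.0, 3500.0]
AMPS = [1.0, 0.8, 0.6, 0.4]

# One pitch period of the original signal from each of the 4 units.
outputs = [unit_output(i + 1, AMPS[i], FREQS_HZ[i], SAMPLE_RATE, PERIOD_SAMPLES)
           for i in range(4)]
```

In this sketch every generator starts at phase 0 with a positive amplitude, so the direction in which each original signal first moves is determined by the sign of its envelope alone.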

(C-2: variation 2)

Although the above embodiment and variations illustrate a configuration in which the speech synthesiser 40 has 4 original signal output units 41, the number of original signal output units 41 is not limited to this. In short, any configuration in which a plurality of original signal output units 41 are provided in the speech synthesiser 40 suffices.

In addition, the above embodiment illustrates a configuration in which the increase or decrease of the original signal immediately after the start of the pitch period differs between the odd-numbered and the even-numbered original signal output units 41, counted from the unit whose original signal has the lowest angular frequency; however, the original signal output units 41 may be divided in any manner. For example, when there are 4 original signal output units 41, they may be divided into a group consisting of the 1st and 4th original signal output units 41 and a group consisting of the 2nd and 3rd original signal output units 41, with the increase or decrease of the original signal immediately after the start of the pitch period differing between the two groups. Alternatively, they may be divided into 1 original signal output unit 41 and the remaining original signal output units 41, with the increase or decrease of the original signal immediately after the start of the pitch period differing between the two. In short, it suffices that the increase or decrease of the original signal immediately after the start of the pitch period differs between one or more of the plurality of original signal output units 41 and the other one or more original signal output units 41.
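The groupings described above can be expressed as an assignment of initial phases. The helper below is a hypothetical sketch: it assumes sinusoidal original signals, for which an initial phase of 0 makes the level rise and an initial phase of π makes it fall immediately after the start of the pitch period. Which group rises, and which single unit is set apart in the last example, are arbitrary choices made for illustration.

```python
import math

def initial_phases(n_units, decreasing_units):
    """Return the initial phase for units 1..n_units.

    Units listed in decreasing_units get phase pi (level falls right after the
    start of the pitch period); all other units get phase 0 (level rises).
    """
    decreasing = set(decreasing_units)
    if not all(1 <= u <= n_units for u in decreasing):
        raise ValueError("unit numbers must lie in 1..n_units")
    if not decreasing or len(decreasing) >= n_units:
        raise ValueError("each group must contain at least one unit")
    return [math.pi if u in decreasing else 0.0 for u in range(1, n_units + 1)]

odd_even = initial_phases(4, [2, 4])        # odd-numbered versus even-numbered units
outer_inner = initial_phases(4, [2, 3])     # units 1 and 4 versus units 2 and 3
one_vs_rest = initial_phases(4, [2, 3, 4])  # unit 1 versus the remaining units
```

The validation mirrors the only constraint stated in the text: each of the two groups must contain at least one unit.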

(C-3: variation 3)

Although the above embodiment and each variation illustrate the case where the speech synthesiser 40 is constituted by a DSP, the mechanism for outputting the original signals (original signal output units 41) and the mechanism for adding the original signals to generate the synthetic sound signal (totalizer 42) may instead be realized through the cooperation of hardware such as a CPU and a program executed by that CPU.

As described above, according to the present utility model, a natural sound can be synthesized with a simple configuration.
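A software realization along these lines might be organized as in the sketch below. This is an illustration under stated assumptions, not the program of the utility model: the class and function names are hypothetical, the periodic signal is assumed to be a sinusoid whose phase restarts at the beginning of every pitch period, the numeric parameters are invented, and the envelope processing applied after the addition is omitted.

```python
import math
from dataclasses import dataclass

@dataclass
class OriginalSignalOutputUnit:
    """Software stand-in for one unit 41: signal generator plus multiplier."""
    freq_hz: float
    amplitude: float
    initial_phase: float

    def sample(self, n, sample_rate):
        # n counts samples from the start of the current pitch period.
        sa = math.sin(2.0 * math.pi * self.freq_hz * n / sample_rate + self.initial_phase)
        return self.amplitude * sa

def add_original_signals(units, n, sample_rate):
    """Software stand-in for the totalizer 42: sum the unit outputs for sample n."""
    return sum(unit.sample(n, sample_rate) for unit in units)

def render(units, sample_rate, period_samples, n_periods):
    """Render n_periods pitch periods; the phase restarts at each period start."""
    return [add_original_signals(units, n, sample_rate)
            for _ in range(n_periods)
            for n in range(period_samples)]

# Hypothetical parameters: odd-numbered units start at phase 0, even-numbered at pi.
units = [OriginalSignalOutputUnit(500.0, 1.0, 0.0),
         OriginalSignalOutputUnit(1500.0, 0.8, math.pi),
         OriginalSignalOutputUnit(2500.0, 0.6, 0.0),
         OriginalSignalOutputUnit(3500.0, 0.4, math.pi)]
signal = render(units, 44100, 441, 3)
```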

Claims

1. A speech synthesizing device, characterized by comprising:

a plurality of output mechanisms that output, for each pitch period whose time length corresponds to the pitch of the synthetic sound to be generated, an original signal whose level changes periodically;

a synthetic sound generating mechanism that adds the original signals output from the plurality of output mechanisms to generate a synthetic sound signal representing the synthetic sound; and

an indicating mechanism that indicates, to each of the plurality of output mechanisms, the original signal that the output mechanism should output, the indication being given to each output mechanism in such a manner that the level of the original signal output from a part of the plurality of output mechanisms increases immediately after the start of the pitch period, while the level of the original signal output from the other output mechanisms decreases immediately after the start of the pitch period.

2. The speech synthesizing device as claimed in claim 1, characterized in that:

the indicating mechanism indicates the phase of the original signal to each output mechanism in such a manner that the phase at the start of the pitch period differs between the original signal output from said part of the output mechanisms and the original signal output from said other output mechanisms.

3. The speech synthesizing device as claimed in claim 1, characterized in that:

each output mechanism multiplies a signal whose level changes periodically by an amplitude and outputs the resulting signal as the original signal,

the amplitude for said part of the output mechanisms and the amplitude for said other output mechanisms have different signs.

4. The speech synthesizing device as claimed in claim 1, characterized in that:

each output mechanism reflects, in a signal whose level changes periodically, an envelope signal that has a time length equal to the pitch period and represents the change over time of the amplitude of the original signal, and outputs the resulting signal as the original signal,

the indicating mechanism indicates, to said part of the output mechanisms and to said other output mechanisms respectively, envelope signals that represent amplitudes of different signs.

5. The speech synthesizing device as claimed in any one of claims 1 to 4, characterized in that:

the plurality of output mechanisms output original signals of mutually different angular frequencies;

said part of the output mechanisms are the one or more output mechanisms that are odd-numbered when the plurality of output mechanisms are counted in ascending order of the angular frequency of the original signals they output, and said other output mechanisms are the one or more output mechanisms that are even-numbered when counted in the same order.

6. The speech synthesizing device as claimed in claim 5, characterized in that:

the phase difference between the original signals output from the odd-numbered output mechanisms and the original signals output from the even-numbered output mechanisms is π.

7. The speech synthesizing device as claimed in any one of claims 1 to 4, characterized in that:

each output mechanism has a signal generating mechanism capable of generating a plurality of signals with mutually different waveforms, and outputs the original signal based on the signal generated by that signal generating mechanism;

the indicating mechanism indicates, to each output mechanism, one of the plurality of signals as the signal to be generated by the signal generating mechanism of that output mechanism.

8. The speech synthesizing device as claimed in claim 7, characterized in that:

the indicating mechanism indicates, to each output mechanism, the signal selected by the user from among the plurality of signals as the signal to be generated by the signal generating mechanism of that output mechanism.

9. A computer-readable medium storing a program, characterized in that the program causes a computer to function as:

a plurality of output mechanisms that output, for each pitch period whose time length corresponds to the pitch of the synthetic sound to be generated, an original signal whose level changes periodically; and

an indicating mechanism that indicates, to each of the plurality of output mechanisms, the original signal that the output mechanism should output, the indication being given to each output mechanism in such a manner that the level of the original signal output from a part of the plurality of output mechanisms increases immediately after the start of the pitch period, while the level of the original signal output from the other output mechanisms decreases immediately after the start of the pitch period.