GB2077078A - System for discriminating human voice signal

Info

Publication number: GB2077078A
Application number: GB8112276A
Authority: GB
Original assignee: Bodysonic KK
Current assignee: Bodysonic KK
Priority date: 1980-04-21
Filing date: 1981-04-21
Publication date: 1981-12-09
Also published as: GB8403418D0; GB2143105A; FR2480978B1; IT8148303A0; CA1184506A; FR2480978A1; GB2143105B; DE3115801A1; DE3115801C2; GB2077078B; IT1142404B

Abstract

A system for discriminating a human voice signal from other sounds comprises a mixer 24 having two input channels and connected to a high pass filter 25 for removing subsonic sounds below 20 Hz. The output of the filter 25 is connected through a low pass filter 3 to the input of a voltage controlled variable gain amplifier 4, the output of which is connected to a body-felt vibration generator. The output of filter 25 is also connected through an equalizer 26 and a signal level compressor 27 to the inputs of a double bass sound region deriving circuit 28, a level detector 8, and a human voice fundamental wave emphasizing and deriving circuit 9. The double bass circuit 28 includes a low pass filter having a cut-off frequency of 60 Hz. The circuit 9 derives the fundamental human voice sound wave by regarding the human voice as an amplitude modulated wave with the fundamental waves generated in the vocal cords being regarded as the modulating waves and the formants imposed during passage through the vocal tube as the carrier wave. The circuit 9 includes a detector and a band pass filter. The outputs of the circuits 28, 8 and 9 are connected to the inputs of an arithmetic circuit 10, the output of which is connected to the control terminal of the amplifier 4. In use, the amplifier 4 suppresses double bass signals originating from a human voice but passes other double-bass signals.

Description

SPECIFICATION

Method and system for discriminating human voice signal

This invention relates to a method for discriminating a human voice signal from any other sound signal, and also to a system for the method.

In the living environment of human beings, various sounds exist. In recording and reproducing these sounds, high sensitivity and high fidelity are required. In general, satisfying these requirements suffices.

In such general cases, all the sounds having entered, for example, a microphone have only to be equally amplified and reproduced. In some special uses, however, it becomes necessary to classify input signals depending on their properties or types and to subject a certain type of signals to processing different from those of the other types of signals.

For example, it may be desired to derive human voices from among other noises, or to pick up the voice of an announcer separately from other sounds such as music when recording sounds for a radio or television broadcast. It may also be desired to discriminate the human voice when providing the input signal for a body-felt vibration generator, such a generator having recently been proposed and put into practical use.

The body-felt vibration generator is a device in which, in order to create ambience with sound reproducing equipment, signals having double bass sound components of, e.g., below 150 Hz, taken from the amplifier output signals applied to loudspeakers (or headphones), are converted into mechanical vibrations that are given directly to the human body as body-felt vibrations. The sounds and the body-felt vibrations of the double bass sound region are given simultaneously to an appreciator, thereby providing an intense double bass sound feeling. The device is effective principally for the appreciation of music.

It has been experimentally confirmed that, when the device is combined with sounds other than music, for example, documentary sounds and effect sounds such as those which accompany an earth tremor or the feeling of a shock (the roaring of a gun, the running sound of a street car or steam locomotive, the noise emitted when a big tree is cut down, an explosion, and the running sounds and engine sounds of an automobile, tractor etc.), ambience and dramatic effects which cannot be obtained with conventional sound reproducing equipment can be brought forth.

When such device is incorporated into the cinema, television or the like, real presence and numerous dramatic effects will be obtained. In this regard, a film drama, a television drama or the like is, in general, composed of the combination of human voices (conversation), music, and effect sounds.

Accordingly, when such sound signals are given to the appreciator in the form of the body-felt vibrations without performing any additional processing, good effects can be expected as to the sounds which accompany an earth tremor and the feeling of a shock, while body-felt vibrations are also produced for human voices (conversation). This latter effect has been experimentally revealed to cause a very unnatural feeling.

In order to produce good effects in the film drama, the television drama or the like, therefore, the advent of a device which is responsive to the musical and effect sounds but is not responsive to the human voices is desired.

The production and features of the human voice will now be considered. When the vocal cords vibrate to open and close the trachea at the natural frequency thereof, periodic pulsating pressure waves are generated at the vocal cords. It is known that the pressure waves are pulse waves of nearly saw-tooth waveform whose fundamental wave components vary in a considerably wide range, in the case of the voice of a grown-up man during ordinary conversation (hereinafter the "pulse wave" is referred to as the "saw-tooth wave"). The saw-tooth waves generated in the vocal cords are given complicated formants such as resonance and anti-resonance while propagating through the vocal tube consisting of the pharynx, oral cavity, nasal cavity etc., and they are emitted from the lips as voice. (In general, the resonance frequencies are distributed at frequencies considerably higher than the fundamental wave frequency of the saw-tooth waves generated in the vocal cords.)

These are schematically illustrated in Figs. 2(a) to 2(c). Fig. 2(a) shows the saw-tooth wave generated in the vocal cords, while Fig. 2(b) shows the voice waveform given the formants in the vocal tube. The saw-tooth waves produced in the vocal cords include abundant higher harmonics, and hence, when they are given the formants in the resonance circuit part of the vocal tube, they become as shown in Fig. 2(b).

As seen from this figure, the fundamental wave component decreases, and the formant frequency components increase considerably.

On the other hand, in general, when saw-tooth waves are used to amplitude-modulate a carrier wave, a waveform as shown in Fig. 2(c) is obtained. When Figs. 2(b) and 2(c) are compared, it will be seen that, although the processes of production of the waveforms differ, the waveforms themselves are very similar macroscopically.

Therefore, when the voice waveform in Fig. 2(b) is regarded as a kind of amplitude-modulated wave, detection thereof makes it possible to readily derive, with emphasis, the fundamental wave component generated in the vocal cords.

This is the fundamental principle of this invention.
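By way of illustration only, the principle may be sketched numerically in Python. The sketch assumes a single sinusoidal carrier standing in for a formant, a saw-tooth modulating wave, a half-wave detector and a first-order smoothing filter. The sample rate, the frequencies and all function names are hypothetical choices made for the example and are not taken from the specification.

```python
import math

FS = 16000   # sample rate in Hz (assumption)
F0 = 100     # vocal-cord fundamental in Hz (assumption)
FC = 1050    # single formant "carrier" in Hz (assumption)
N = 3200     # 0.2 s, a whole number of cycles of both F0 and FC

def sawtooth(i):
    """Rising saw-tooth in [0, 1) at F0, using integer arithmetic for exact periods."""
    return ((i * F0) % FS) / FS

def voiced_wave(n):
    """Carrier at FC amplitude-modulated by the saw-tooth (compare Fig. 2(c))."""
    return [sawtooth(i) * math.sin(2.0 * math.pi * FC * i / FS) for i in range(n)]

def detect(x, cutoff_hz=300.0):
    """Half-wave rectify, then smooth with a one-pole low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / FS)
    y, out = 0.0, []
    for s in x:
        y += alpha * (max(s, 0.0) - y)
        out.append(y)
    return out

def tone_amplitude(x, freq_hz):
    """Amplitude of the component of x at freq_hz (single-bin DFT)."""
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * i / FS) for i, s in enumerate(x))
    im = sum(s * math.sin(2.0 * math.pi * freq_hz * i / FS) for i, s in enumerate(x))
    return 2.0 * math.hypot(re, im) / len(x)

wave = voiced_wave(N)
before = tone_amplitude(wave, F0)          # fundamental is almost absent
after = tone_amplitude(detect(wave), F0)   # fundamental re-emerges after detection
print(f"F0 component before detection: {before:.4f}, after: {after:.4f}")
```

In this sketch the modulated wave carries essentially no energy at the fundamental frequency itself, whereas the detected output does, which is the behaviour the principle relies upon.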

In accordance with one aspect of this invention, there is provided a method for discriminating a human voice signal in which a voiced phone part of the human voice is regarded as an amplitude modulated wave, the fundamental wave components generated in the vocal cords being regarded as the modulating wave and the formants which are imposed on the fundamental components during passage through the vocal tube being regarded as the carrier wave, and in which the fundamental components generated in the vocal cords are derived and emphasised so as to discriminate the human voice from other sounds.

In accordance with another aspect of this invention, there is provided a system for discriminating a human voice signal comprising: - control means having an input terminal, an output terminal and a control terminal for controlling signals passing between said input and output terminals by a signal applied to said control terminal; and - a double-bass sound region deriving circuit for deriving double-bass sound components the frequency of which is lower than that of the fundamental wave components of the human voice of a signal applied, in use, to said input terminal of said control means, an input terminal of said double-bass region deriving circuit being connected to the input terminal of said control means and the output terminal of said double-bass region deriving circuit being connected to the control terminal of said control means.

In accordance with a further aspect of this invention, there is provided a system for discriminating a human voice signal wherein an input level deriving circuit which rectifies an input signal and integrates the rectified signal to provide a direct current signal proportional to the input signal level, and a human-voice fundamental wave emphasizing and deriving circuit which derives the fundamental wave components generated in the vocal cords from the input signal have their inputs connected to an input terminal of control means having said input terminal, an output terminal and a control terminal for controlling a signal which passes from said input terminal to said output terminal in accordance with a signal applied to said control terminal; an output terminal of said input level deriving circuit and an output terminal of said human-voice fundamental wave emphasizing and deriving circuit being connected to an arithmetic circuit which operates on the output of said human-voice fundamental wave emphasizing and deriving circuit and the output of said input level deriving circuit, and an output terminal of said arithmetic circuit being connected to said control terminal of said control means.

This invention will now be described in more detail by way of example with reference to the drawings in which:
Figure 1 is a block diagram of a system for discriminating human voice signals from other signals in accordance with this invention;
Figures 2(a) to 2(d) are waveform diagrams schematically showing the waveforms of a human voice and sounds of music etc.;
Figure 3 is a block diagram of another system for discriminating human voice signals from other signals in accordance with this invention;
Figure 4 is a block diagram showing a detector circuit which forms a human-voice fundamental wave emphasizing and deriving circuit in Fig. 3;
Figures 5 and 6 are block diagrams each showing a detector circuit used in practice; and
Figure 7 is a block diagram showing a further system according to the present invention.

Hereunder, this invention will be illustratively described in conjunction with a body-felt vibration generator, which is one example of application thereof. When the frequency components of human voices, music, effect sounds etc. are studied, the low region of the human voices (chiefly, male voices) extends to below 80 Hz and the high region extends to above 10 kHz in both the male and female voices, though they are also dependent upon recording circumstances and individual differences.

The effect sounds are of a very large number of sorts, and have various frequency spectra. An earth tremor or such a sound as causing the feeling of a shock is characterized by including a wide frequency distribution and abundant double-bass sound components.

Musical sounds often have a frequency distribution which is uniform and wide compared with those of the other sounds.

In view of these facts, it might be possible to distinguish human voice sounds from other sounds by passing them through an appropriate filter. However, a low-pass filter which can remove the double bass sound fundamental wave component of human voices extending to below 80 Hz needs to have its cutoff frequency set at 60 Hz or below, which poses a problem. More specifically, among frequency components below approximately 150 Hz which are effective as body-felt vibrations, the frequency component density is highest at spectra near 100 Hz, and the distribution density at and below 50-60 Hz is lower.

Therefore, signals having been passed through the aforecited low-pass filter have most of the components effective for the body-felt vibrations removed, and the effect of the body-felt vibration generator is drastically spoilt.

Since the sound which accompanies an earth tremor includes sufficient double-bass sound components at and below 60 Hz, it can give rise to the body-felt vibrations. Without the components near 100 Hz, however, the realistic effect is lessened, and flat vibrations lacking vividness are caused. This situation may be likened to the sound of a loudspeaker from which a tweeter has been detached so that it cannot give forth high-pitched sounds.

Fig. 1 shows an example of a system which has eliminated this problem. Referring to the figure, numeral 1 designates an input terminal, which receives a sound signal from an amplifier. A first low-pass filter 3 having a cutoff frequency of approximately 150 Hz and a voltage-controlled variable gain amplifier (VCA) 4 are connected between the input terminal 1 and output terminal 2.

The low-pass filter 3 serves to derive from its input signal the double-bass sound component below approximately 150 Hz convenient for the body-felt vibration and to supply the derived component to the body-felt vibration generator. The voltage-controlled variable gain amplifier 4 is an amplifier whose gain increases or decreases upon application of a voltage to a control end 4a thereof. In the present example, an amplifier whose gain increases with an applied positive voltage is employed.

Further connected to the input terminal 1 is the input side of a second low-pass filter 5 having a cutoff frequency of approximately 60 Hz. The output side of the second low-pass filter 5 is connected to the control end 4a of the voltage-controlled variable gain amplifier 4 through a rectifier circuit 6 as well as an integration circuit 7.

In this circuit, when an effect sound of the type which accompanies an earth tremor, abundantly including double bass sound components at and below 60 Hz, has entered the input terminal 1 as the input sound signal, both of the low-pass filters pass the double bass sound components of the signal. An output signal of the second low-pass filter 5 is rectified by the rectifier circuit 6, and is thereafter integrated by the integration circuit 7.

An output voltage of the integration circuit 7 is applied to the control terminal 4a of the voltage-controlled variable gain amplifier 4, and increases the gain of this amplifier. Thus, the double bass sound component signal having passed through the first low-pass filter 3 is amplified greatly and is delivered to the output terminal 2. Accordingly, when the body-felt vibration generator is connected to the output terminal 2, an appreciator can obtain the body-felt vibration. Since the output signal includes frequency components near 100 Hz, a real vibration effect can be afforded.

Now, there will be described a case where a human voice signal has entered as the input signal. Since the human voice signal scarcely includes double bass sound components at and below 60 Hz, hardly any double bass sound component passes through the second low-pass filter 5, so that the output voltage of the integration circuit 7 diminishes. Therefore, the gain of the voltage-controlled variable gain amplifier 4 decreases, the signal having passed through the low-pass filter 3 is scarcely amplified, and the output signal at the output terminal 2 becomes zero or very small. Accordingly, the human voice hardly gives rise to a body-felt vibration. Even with this circuit, the intended purpose is tentatively achieved.

However, among the musical and effect sounds desired to generate the body-felt vibrations, there are a good many that do not include components at and below 60 Hz or that include them only a little. Therefore, the system of Fig. 1, which cannot generate a body-felt vibration when the component at and below 60 Hz does not exist, is not always satisfactory.
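By way of illustration only, the control path of Fig. 1 (second low-pass filter 5, rectifier circuit 6 and integration circuit 7) may be sketched in Python as follows. The sample rate, the use of four cascaded first-order sections for the 60 Hz filter, the 5 Hz smoothing filter standing in for the integrator, and the two pure test tones are all assumptions made for the example. The specification does not give the filter order or the time constant.

```python
import math

FS = 8000  # sample rate in Hz (assumption)

def one_pole_lowpass(x, cutoff_hz):
    """First-order low-pass filter, a simple stand-in for an analog filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / FS)
    y, out = 0.0, []
    for s in x:
        y += alpha * (s - y)
        out.append(y)
    return out

def control_voltage(x, stages=4):
    """Low-pass filter 5 (60 Hz), rectifier 6, integrator 7; returns the final level."""
    for _ in range(stages):                 # cascade to steepen the roll-off (assumption)
        x = one_pole_lowpass(x, 60.0)
    rectified = [abs(s) for s in x]
    smoothed = one_pole_lowpass(rectified, 5.0)   # leaky integrator (assumption)
    return smoothed[-1]

def tone(freq_hz, seconds=1.0):
    """Unit-amplitude sine wave used as a test input."""
    return [math.sin(2.0 * math.pi * freq_hz * i / FS) for i in range(int(FS * seconds))]

rumble = control_voltage(tone(40.0))    # component typical of an earth-tremor sound
voice = control_voltage(tone(120.0))    # near the fundamental of a male voice
print(f"control voltage: 40 Hz input {rumble:.3f}, 120 Hz input {voice:.3f}")
```

With these assumed values the 40 Hz input produces a control voltage roughly an order of magnitude larger than the 120 Hz input, so an amplifier whose gain rises with the control voltage would pass the former and largely suppress the latter.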

Fig. 3 is a block diagram of a system according to this invention which solves this problem. This system is arranged to emphasize and derive the fundamental wave component generated by the vocal-cord sound source, and then to compare it with the total input signal level, thereby discriminating the human voice from other sounds.

The system is constructed of an input level deriving circuit 8, a human-voice fundamental wave emphasizing and deriving circuit 9, an arithmetic circuit 10, and the low-pass filter 3 and the voltage-controlled variable gain amplifier 4 described before.

First, the human-voice fundamental wave emphasizing and deriving circuit 9 will be described in detail. Basically, this circuit is a detector circuit. It will be explained in conjunction with a simple circuit shown in Fig. 4.

The circuit of Fig. 4 consists of a detector 11 and a band-pass filter 12 for selecting 80-300 Hz. According to the circuit, the fundamental wave component of the vocal-cord sound source can be derived.

On the other hand, the waveform of a sound other than the human voice, for example, the waveform of a musical sound, is as shown in Fig. 2(d). A signal of this sort has a wide frequency spectrum from a low region to a high region, as shown in the figure. When this signal is detected and passed through the band-pass filter of the circuit in Fig. 4, the output provided at the output terminal thereof is not very great. This is attributed to two facts: since the original waveform has a wide frequency range, the percentage of the signal component that passes through the band-pass filter 12 relative to the input signal is small when the original signal is detected and filtered; and the waveform does not have a definite shape which can be deemed an amplitude-modulated wave as shown in Fig. 2(b).

Accordingly, when the input signal and output signal of the circuit of Fig. 4 are respectively rectified, appropriately weighted, and compared and operated on, it becomes possible to distinguish the human voice from other sounds. That is, where an input signal of fixed level is applied to the circuit of Fig. 4, a signal whose output level is relatively high is regarded as the human voice, and a signal whose output level is relatively low is regarded as another sound.

Since a waveform given the formants does not have a simple sinusoidal carrier as in Fig. 2(c), the human voice waveform does not become vertically symmetric as in the amplitude-modulated wave of Fig. 2(c) but in many cases becomes vertically asymmetric as in Fig. 2(b). For this reason, when detecting the input signal with the circuit of Fig. 4, the result of the calculation differs greatly depending upon whether the envelope on the positive side or the envelope on the negative side is taken.

In the case of the waveform in Fig. 2(b), it is apparent that the distinguishability is enhanced with the envelope on the positive side because the output level at the output terminal in Fig. 4 becomes higher therewith. Although the waveform shown in Fig. 2(b) has high peaks on the positive side, actually there is the inverse case where high peaks exist on the negative side. Therefore, the detector 11 in Fig. 4 needs to derive the envelope of higher level on either the positive or negative side or to derive the envelopes on both the positive and negative sides as added.
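By way of illustration only, the requirement on the detector may be sketched in Python. The sketch assumes a simple peak-hold envelope follower and a hypothetical vertically asymmetric test wave. The sample rate, the decay factor and the function names are choices made for the example and are not taken from the specification.

```python
import math

FS = 8000  # sample rate in Hz (assumption)

def peak_envelope(x, decay=0.999):
    """Peak-hold envelope follower: jumps up to each peak, then decays slowly."""
    env, out = 0.0, []
    for s in x:
        env = max(s, env * decay)
        out.append(env)
    return out

def envelope_levels(x):
    """Mean positive-side and negative-side envelope levels of x."""
    pos = peak_envelope([max(s, 0.0) for s in x])
    neg = peak_envelope([max(-s, 0.0) for s in x])   # negative side, polarity inverted
    return sum(pos) / len(pos), sum(neg) / len(neg)

def detector_level(x, mode="higher"):
    """Either the higher of the two envelope levels or their sum."""
    pos, neg = envelope_levels(x)
    return max(pos, neg) if mode == "higher" else pos + neg

# Vertically asymmetric test wave: positive peaks of 1.5, negative peaks of -0.75.
wave = [math.sin(t) - 0.5 * math.cos(2.0 * t)
        for t in (2.0 * math.pi * 150.0 * i / FS for i in range(FS // 4))]
inverted = [-s for s in wave]

pos, neg = envelope_levels(wave)
print(f"positive side: {pos:.2f}, negative side: {neg:.2f}")
```

Taking either the higher envelope or the sum of both makes the detected level the same for a waveform and for its polarity-inverted copy, which is the condition stated above.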

In order to fulfill this condition, the detector 11 may be made the full-wave rectification type. Since, however, the full-wave rectifier circuit is a kind of frequency multiplier circuit when viewed in terms of frequency, the input frequency is doubled. Let it now be supposed that the detector 11 is the full-wave rectification type in the circuit of Fig. 4 and that a sinusoidal wave of 50 Hz has entered the input thereof. Then, a pulsating waveform having a fundamental wave component of 100 Hz (double frequency) appears at the detection output, with the result that an output comes out through the band-pass filter having the pass band of 80-300 Hz. Therefore, when the double bass sound associated with music or an earth tremor is input, its frequency is multiplied to provide a relatively high output, resulting in the drawback that the distinguishability is lowered.
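The frequency doubling may be seen from the standard Fourier series of a full-wave rectified sinusoid, given here as a worked illustration for a unit-amplitude input at the 50 Hz frequency used in the example above:

```latex
\[
|\sin(\omega_0 t)| \;=\; \frac{2}{\pi} \;-\; \frac{4}{\pi}\sum_{k=1}^{\infty}\frac{\cos(2k\omega_0 t)}{4k^{2}-1},
\qquad \omega_0 = 2\pi f_0 .
\]
\[
f_0 = 50\ \text{Hz}:\quad
\text{components at } 100,\ 200,\ 300,\ \dots\ \text{Hz with amplitudes }
\frac{4}{3\pi}\approx 0.42,\quad \frac{4}{15\pi}\approx 0.085,\quad \frac{4}{35\pi}\approx 0.036,\ \dots
\]
```

The rectified output thus contains no component at 50 Hz itself. Its lowest alternating component lies at 100 Hz, inside the 80-300 Hz pass band, with about 0.42 times the input amplitude.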

In order to eliminate this drawback, the circuit shown in Fig. 5 includes a high-pass filter (low-cut filter) 13 which removes the double bass sound region, which is not very important for the discrimination of the human voice. Even when the succeeding detector 14 of the full-wave rectification type performs the frequency multiplication, the component passing through the band-pass filter 12 is lessened.

According to the result of an experiment, it was suitable to set the cut-off frequency of the high-pass filter 13 at approximately 130 Hz.

The output of the band-pass filter 12 is rectified by the succeeding rectifier circuit (AC-to-DC converter) 15. Further, the output of the rectifier circuit 15 has the high frequency component removed by the succeeding low-pass filter 16 and becomes a DC output proportional to the output of the band-pass filter 12.

Fig. 6 shows an example of another circuit arrangement which executes the same operation. This circuit is such that the intermediate portion of the circuit of Fig. 5 has been modified. In this circuit, detectors 17 and 18 are not of the full-wave rectification type having the frequency multiplying action but are of the half-wave rectification type, so as to derive the positive side envelope by means of the detector 17 and the negative side envelope by means of the detector 18. The detected output on the negative side has its polarity inverted by an inverter 19.

The respective detected outputs are applied to band-pass filters 20 and 21 and are rectified by rectifier circuits 22 and 23. Thereafter, the rectified outputs are combined and applied to a low-pass filter 16. The high-pass filter 13 at the input portion is unnecessary in principle, but it provides for the enhancement of the distinguishability. The detectors and rectifier circuits in Figs. 4 to 6 should desirably be absolute value circuits or linear detection circuits in order to reduce operational errors due to nonlinearity.

Next, the input level deriving circuit 8 in Fig. 3 will be described. This circuit integrates or smooths the input signal as rectified and provides a DC voltage proportional to the input level.

The input sides of the input level deriving circuit 8 and the human-voice fundamental wave emphasizing and deriving circuit 9 are connected to the input terminal 1 together with the input side of the low-pass filter 3.

The output sides of the input level deriving circuit 8 and the human-voice fundamental wave emphasizing and deriving circuit 9 are respectively connected to input terminals 10a and 10b of the arithmetic circuit 10.

The output signal of the input level deriving circuit 8 and the output signal of the human-voice fundamental wave emphasizing and deriving circuit 9 are weighted to appropriate levels. In addition, an appropriate weighting in the time domain, such as selection of the time constant of the integration circuit or the cutoff frequency of the smoothing low-pass filter, is applied.

Using the appropriately weighted signals, the arithmetic circuit 10 performs the following operation. It subtracts the output of the human-voice fundamental wave emphasizing and deriving circuit 9 from the output of the input level deriving circuit 8. Then, in the case where the input signal is the human voice, the output of the human-voice fundamental wave emphasizing and deriving circuit 9 becomes greater than that of the input level deriving circuit 8 owing to the appropriate weighting, and hence, a negative signal is provided as the output of the arithmetic circuit 10. In contrast, in the case where the input signal is music or any other sound, the output of the human-voice fundamental wave emphasizing and deriving circuit 9 becomes smaller than that of the input level deriving circuit 8, and hence the output of the arithmetic circuit 10 is a positive signal.

On the other hand, the input signal passes through the low-pass filter 3 (having a cutoff frequency of approximately 150 Hz) for deriving components effective for the body-felt vibrations and is applied to the voltage-controlled variable gain amplifier 4. Therefore, when the input signal is the human voice, the output of the voltage-controlled variable gain amplifier 4 is inhibited or weakened, whereas when the input signal is any other sound signal, the gain of the voltage-controlled variable gain amplifier 4 is increased to greatly amplify the input signal.
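By way of illustration only, the operation of the arithmetic circuit 10 of Fig. 3 and its effect on the gain of the amplifier 4 may be sketched in Python. The weights, the clamped linear gain law and the example signal levels are all hypothetical. The specification states only that the signals are appropriately weighted and that the gain increases with a positive control voltage.

```python
def arithmetic_circuit(level, fundamental, w_level=1.0, w_fund=2.0):
    """Weighted subtraction: output of circuit 8 minus output of circuit 9."""
    return w_level * level - w_fund * fundamental

def vca_gain(control, slope=1.0, max_gain=1.0):
    """Gain rises with positive control voltage, clamped to [0, max_gain] (assumed law)."""
    return max(0.0, min(max_gain, slope * control))

# Hypothetical DC outputs of circuits 8 and 9 for two kinds of input.
voice_control = arithmetic_circuit(level=1.0, fundamental=0.8)   # about -0.6
music_control = arithmetic_circuit(level=1.0, fundamental=0.2)   # about +0.6

print("voice:", voice_control, "gain", vca_gain(voice_control))
print("music:", music_control, "gain", vca_gain(music_control))
```

With these assumed weights the voice-like input yields a negative control signal and the amplifier is shut off, whereas the music-like input yields a positive control signal and the amplifier passes the low-pass filtered signal.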

The voltage-controlled variable gain amplifier 4 can be replaced with a voltage-controlled variable frequency filter (VCF). It can also be replaced with a gate in the form of an analog switch, or the like. Since, however, the binary "on"-"off" switching in this case gives rise to an unnatural feeling, the gentle and continuous control based on the analog increase and decrease of gain by the voltage-controlled variable gain amplifier 4 is more favourable after all.

In the above description, the human voices have referred only to the cases of voiced phones. Since unvoiced phones (consonants produced without the vibrations of the vocal cords) have their spectra in a higher frequency band, they are removed by the low-pass filter for deriving the body-felt vibration signal (the low-pass filter 3 in Fig. 1 or Fig. 3, having a cutoff frequency of approximately 150 Hz) and cause no problem.

Even in the cases of voiced phones, voices of high fundamental wave frequencies such as children's voices, female voices and high-pitched voices given forth in singing a song cause no problem for the same reason. It has been experimentally verified that the difference in distinguishability between different languages is scarcely noticeable, which is thought to be due to the same reason.

Shown in Fig. 7 is a block diagram of a practical system for performing this invention, in which the elements of both the circuits in Figs. 1 and 3 are combined. In this case, two input terminals 1 are disposed so as to receive signals through two channels and to mix them by means of a mixer circuit 24. Connected on the output side of the mixer circuit 24 is a subsonic filter 25 for removing unnecessary subsonic waves at and below 20 Hz.

The output side of the subsonic filter 25 is connected to a voltage-controlled variable gain amplifier 4 through a low-pass filter 3 for deriving a body-felt vibration signal (having a cutoff frequency of 150 Hz) and to an equalizer circuit 26. The equalizer circuit 26 gives the double-bass region and the high-pitched region appropriate equalizing curves in advance in order to enhance the human-voice distinguishability.

Connected to the output of the equalizer circuit 26 is a level compressor circuit 27, the output of which is connected to the input of a double-bass region deriving circuit 28, an input level deriving circuit 8 and a human-voice fundamental wave emphasizing and deriving circuit 9. The output sides of these circuits 28, 8 and 9 are respectively connected to the inputs 10c, 10a and 10b of an arithmetic circuit 10.

The level compressor circuit 27 functions to compress the dynamic range of the input signal so as to prevent errors from developing in discrimination and operations (if the level is too low, the discrimination will be difficult, and if it is too high, an operational amplifier will be saturated). The double-bass sound region deriving circuit 28 is a circuit which detects sound having abundant double-bass sound components, such as the sound which accompanies an earth tremor, and which corresponds to the part enclosed with a one-dot chain line in Fig. 1. The input level deriving circuit 8 and the human-voice fundamental wave emphasizing and deriving circuit 9 correspond to the circuits of the same numerals in Fig. 3, respectively.

In this arrangement, the arithmetic circuit 10 functions to subtract the output of the human-voice fundamental wave emphasizing and deriving circuit 9 from the sum of the output of the double-bass sound region deriving circuit 28 and that of the input level deriving circuit 8. As with the system shown in Fig. 3, the output signal of the arithmetic circuit 10 controls the voltage-controlled variable gain amplifier 4. When the weighting of the double bass sound region deriving circuit 28 is made comparatively high and a dead band is included so as to prevent the circuit from responding to levels below a threshold value, good results are obtained.
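By way of illustration only, the combined operation of Fig. 7 may be sketched in Python. The sketch reads the dead band as acting on the output of the double-bass sound region deriving circuit 28, which is an interpretation rather than something the text states explicitly. The weights, the threshold and the example levels are hypothetical.

```python
def dead_band(x, threshold):
    """Zero output below the threshold; above it, only the excess is passed (assumed form)."""
    return max(0.0, x - threshold)

def control_signal(bass, level, fundamental,
                   w_bass=3.0, w_level=1.0, w_fund=2.0, bass_threshold=0.1):
    """(output of circuit 28 + output of circuit 8) - output of circuit 9, all weighted."""
    return (w_bass * dead_band(bass, bass_threshold)
            + w_level * level
            - w_fund * fundamental)

# Hypothetical DC outputs of circuits 28, 8 and 9 for three kinds of input.
voice_only = control_signal(bass=0.05, level=1.0, fundamental=0.8)     # negative
tremor_only = control_signal(bass=0.60, level=1.0, fundamental=0.3)    # positive
voice_and_tremor = control_signal(bass=0.60, level=1.0, fundamental=0.8)

print(voice_only, tremor_only, voice_and_tremor)
```

Under these assumptions a comparatively high weighting of the double-bass term lets a strong tremor-like component open the amplifier even while a voice is present, while the dead band keeps small amounts of low-frequency content in a voice from doing so.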

Claims

1. A method of discriminating a human voice signal in which a voiced phone part of the human voice is regarded as an amplitude modulated wave, the fundamental wave components generated in the vocal cords being regarded as the modulating wave and the formants which are imposed on the fundamental components during passage through the vocal tube being regarded as the carrier wave, and in which the fundamental components generated in the vocal cord are derived and emphasized so as to discriminate the human voice from other sounds.

2. A system for discriminating a human voice signal comprising: - control means having an input terminal, an output terminal and a control terminal for controlling signals passing between said input and output terminals by a signal applied to said control terminal; and - a double-bass sound region deriving circuit for deriving double-bass sound components the frequency of which is lower than that of the fundamental wave components of the human voice of a signal applied, in use, to said input terminal of said control means, an input terminal of said double-bass region deriving circuit being connected to the input terminal of said control means and the output terminal of said double-bass region deriving circuit being connected to the control terminal of said control means.

3. A system for discriminating a human voice signal as defined in claim 2, wherein said control means includes a voltage-controlled variable gain amplifier whose gain varies in accordance with a voltage applied to its control terminal.

4. A system for discriminating a human voice signal as defined in claim 2, wherein said control means includes a voltage-controlled variable frequency filter, the frequency characteristics of which are controlled in accordance with a voltage applied to its control terminal.

5. A system for discriminating a human voice signal as defined in claim 2, wherein said control means includes a gate which is constructed as an analog switch which turns "on" and "off" in a binary fashion in accordance with a voltage applied to its control terminal.

6. A system for discriminating a human voice signal as defined in any of claims 2 to 5 wherein the control means includes a first low pass filter and the double bass deriving circuit includes a second low pass filter.

7. A system for discriminating a human voice signal wherein an input level deriving circuit which rectifies an input signal and integrates the rectified signal to provide a direct current signal proportional to the input signal level, and a human-voice fundamental wave emphasizing and deriving circuit which derives the fundamental wave components generated in the vocal cords from the input signal have their inputs connected to an input terminal of control means having said input terminal, an output terminal and a control terminal for controlling a signal which passes from said input terminal to said output terminal in accordance with a signal applied to said control terminal; an output terminal of said input level deriving circuit and an output terminal of said human-voice fundamental wave emphasizing and deriving circuit being connected to an arithmetic circuit which operates on the output of said human-voice fundamental wave emphasizing and deriving circuit and the output of said input level deriving circuit; and an output terminal of said arithmetic circuit being connected to said control terminal of said control means.

8. A system for discriminating a human voice signal as defined in claim 7, wherein said control means includes a voltage-controlled variable gain amplifier whose gain varies in accordance with a voltage applied to its control terminal.

9. A system for discriminating a human voice signal as defined in claim 7, wherein said control means includes a voltage-controlled variable frequency filter the frequency characteristics of which are controlled in accordance with a voltage applied to its control terminal.

10. A system for discriminating a human voice signal as defined in claim 7, wherein said control means includes a gate which is constructed as an analog switch which turns "on" and "off" in binary fashion in accordance with a voltage applied to its control terminal.

11. A system for discriminating a human voice signal as defined in any of claims 7 to 10 in which the control means includes a filter.

12. A system for discriminating a human voice signal substantially as hereinbefore described with reference to and as shown in Fig. 1, or Figs. 3 and 4, or Figs. 3 and 5, or Figs. 3 and 6, or Fig. 7.