US3387090A - Method and apparatus for displaying speech - Google Patents
- Publication number
- US3387090A (application US395876A)
- Authority
- US
- United States
- Prior art keywords
- voltage
- sound
- speech
- amplitude
- frequencies
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
Definitions
- This invention concerns an electronic means and apparatus for correlating acoustic speech to visual patterns and more specifically to such means and apparatus for presenting such patterns to display distinguishable characteristic differences in phonemes and variations in phonemes.
- Sounds produced by the human voice are a composite of frequencies that include, in addition to a resonant fundamental frequency, a grouping of frequencies over the entire articulated range. This range is from a very low frequency of only a few cycles per second to an upper frequency in the neighborhood of 10,000 cycles per second.
- When a sound made by the human voice is analyzed spectrographically, it has been discovered that the intensity levels of the frequencies become predominant in about three ranges, or formants, none of which is in the vicinity of the fundamental.
- formants are determined by the resonant characteristics of the laryngeal tone through the vocal tract.
- Although the larynx harmonic is not exactly at a formant, the various resonant frequencies of the cavities of the throat, mouth, lips, and nasal passages do determine where the formants occur.
- the formants change in relative position and amplitude. The combination of these varying elements is detectable by the ear which is capable of discriminating these changes and interpreting such changes into aural recognition.
- Two terms of art that have special meanings in describing the origin of phonemes are voiced and unvoiced sounds.
- a voiced sound is derived mainly from the vibrating vocal folds.
- the unvoiced sounds are derived without parts of the body vibrating but through making air vibrate.
- the unvoiced t may be distinguished from the voiced d
- the unvoiced p may be distinguished from the voiced b
- the unvoiced k may be distinguished from the voiced g and so forth.
- Unvoiced sounds are further subclassified into two categories, the fricative sounds, generally made by the tongue on the upper palate near the front of the mouth, and the plosive sounds, generally made by the lips coming completely together or by the tongue on the upper palate near the back of the mouth.
- second means connected to said first means and principally responsive to the number of time axis crossings of the composite range of speech frequencies (and relatively nonresponsive to the amplitude of the speech frequencies) for producing a discrete number of spiked pulses (at virtually constant amplitude if said second means is independent of amplitude) per a unit of time, and
- third means connected to said second means to produce per the unit of time a voltage amplitude proportional to the number and amplitude of the pulses produced by said second means,
- said voltage amplitude being characteristic of the speech frequencies and capable of being recorded and observed by any convenient means
- each speech sound has its own distinct shape relatively independent of loudness with which the sound is enunciated and the frequency spectrum (such as in a male or female voice spectrum) in which the sound is enunciated.
- the illustrated embodiment reveals an electronic arrangement comprising a microphone, an amplifier with a preamplifier including a pad for emphasizing the high frequencies relative to the low frequencies, a peak clipper, a differential circuit, an integrating circuit, and a display unit.
- the microphone may be of any commercial type that is capable of passing high frequency sounds without appreciable attenuation.
- the amplifier may be any type of audio or wide band amplifier in combination with either a preemphasis network capable of amplifying the high frequency components of the signal received from the microphone or deemphasizing the low frequency spectrum relative to the high frequency components.
- the peak clipper preferably clips the signal received from the amplifier at an amplitude level just above the environmental noise conditions so as to make the remainder of the circuit substantially independent of amplitude. As will be explained, this is not true for all embodiments, but for simplification of explanation, the signal from the peak clipper essentially carries its entire intelligence in the number of times the signal crosses over from negative to positive per unit of time rather than in amplitude variations in the signal.
- the differential circuit receives the signal from the peak clipper and converts the number of time-axis crossings of the clipped signal to an equal number of spiked pulses.
- the integrator circuit having a time constant equal to the selected unit of time receives the pulses from the differentiating circuit and produces a voltage amplitude proportional to the number of pulses.
- An oscilloscope, strip recorder, etc. may be connected to the output from the integrator circuit so as to present an image that is characteristic of the sound spoken into the microphone.
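The chain described above (preemphasis, clipping, one pulse per positive-going crossing, integration) can be sketched digitally. This is only an illustration under stated assumptions, not the patented analog circuit: the function names `speech_trace` and `tone`, the first-difference preemphasis with coefficient 0.95, and the clip level are hypothetical choices; only the 0.2 second time constant comes from the text.

```python
import math


def speech_trace(samples, sample_rate, clip_level=0.01, time_constant=0.2):
    """Map audio samples to a slowly varying trace (simplified digital stand-in)."""
    # Per-sample decay factor of a leaky integrator with the given time constant.
    decay = math.exp(-1.0 / (sample_rate * time_constant))
    trace = []
    level = 0.0
    prev_in = 0.0
    prev_clipped = 0.0
    for x in samples:
        # Assumed preemphasis: a first difference boosts highs relative to lows.
        emphasized = x - 0.95 * prev_in
        prev_in = x
        # Clip at a small level so that loudness carries little information.
        clipped = max(-clip_level, min(clip_level, emphasized))
        # One unit pulse for each crossing from non-positive to positive.
        pulse = 1.0 if prev_clipped <= 0.0 < clipped else 0.0
        prev_clipped = clipped
        # Leaky integration: the level follows the recent pulse rate.
        level = level * decay + pulse
        trace.append(level)
    return trace


def tone(freq_hz, sample_rate, seconds, amplitude=1.0):
    """Generate a sine test tone (hypothetical helper for trying the sketch)."""
    count = int(sample_rate * seconds)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]
```

Feeding `tone(2000.0, 16000, 0.5)` and `tone(200.0, 16000, 0.5)` through `speech_trace` gives a higher final level for the higher tone, and halving the amplitude of either tone leaves the trace essentially unchanged, which mirrors the loudness independence the text describes.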
- the imperfect speech may be that of a child just learning the language, that of an adult with a speech impediment, that of a person with a regional or foreign accent, or that of any other person or speech simulator.
- Another means is to present the reference sound through an electronic channel, much the same as that described above for the basic display device described above.
- the channel may be simplified by eliminating many of the circuits required in the basic channel and recording on electronic recording tape or other means the reference display.
- a double-trace oscilloscope may then be used to simultaneously display to the student the reference display and the one he makes himself.
- such a device would act to discipline the student in urging him to repeat the sound the same number of times as the recorded standard and help him conform to the duration of the standard as well as the general shape.
- Another method that may be used in connection with the reference and the basic channels would be the addition of a voltage comparator. Such a comparator would produce an output voltage dependent upon the average voltage difference between the reference and the basic channels. Such difference could be either simultaneous or a total difference over the entire duration of a sound cycle. The voltage may either be metered or displayed as on an oscilloscope. The object for the student, then, would be to try to reduce the voltage to as low a value as possible.
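A rough sketch of the comparator idea, assuming both channels have already been sampled into equal-length lists of voltages. The name `average_difference` and the use of the absolute difference are assumptions made for illustration; the text only says the output depends on the average voltage difference.

```python
def average_difference(reference, attempt):
    """Return the mean absolute difference between two equal-length traces."""
    if not reference or len(reference) != len(attempt):
        raise ValueError("traces must be non-empty and of equal length")
    total = sum(abs(r - a) for r, a in zip(reference, attempt))
    return total / len(reference)
```

A student would aim to drive this value toward zero; identical traces give exactly zero.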
- a refinement of the above arrangement may be the addition of a limit trigger circuit that would turn on an out-of-limits device when the voltage difference from the comparator exceeded a predetermined value, or condition. If the value is not exceeded then the predetermined condition may be tightened up a bit by an appropriate logic circuit.
- FIG. 1 is a frequency spectrum chart of the frequencies commonly present in a spoken phoneme.
- FIG. 2 is a frequency spectrum chart of the phoneme shown in FIG. 1 after preemphasis in accordance with this invention.
- FIG. 3 is a block diagram of one system embodiment of this invention.
- FIG. 4 is a schematic diagram of one embodiment of this invention.
- FIG. 5 is a partial schematic diagram of a portion of one embodiment of this invention.
- FIG. 6 is a waveform diagram of sample traces produced in accordance with this invention.
- FIG. 7 is a pictorial interconnection diagram in accordance with one system embodiment of this invention.
- FIG. 8 is a block diagram of various alternate arrangements of system components in accordance with this invention.
- FIG. 1 is a typical representation of a spectrographic analysis of frequency components plotted to compare their relative amplitudes in a speech sound.
- the fundamental frequency 2 of the originating vibrating folds for the sound is very small with respect to the many other component frequencies in the diagram.
- the fundamental frequency is in the to 500 c.p.s. range.
- the frequencies in the range near this first peak value are defined as occurring at the first formant 4.
- the maximum frequency is not necessarily either the harmonic frequency of the large cavity of the mouth or throat or an even multiple of the fundamental, although it is strongly influenced by both.
- the amplitude falls off to a relatively small value and then increases to a second peak, smaller than the first in amplitude.
- the frequencies in the range near this second peak are classed together and identified as the second formant 6.
- a third group of frequencies is defined as the third formant, the peak amplitude value of this group being typically smaller in amplitude than the second formant peak value.
- the sound to be examined on the display of the embodiment of this invention shown in FIG. 3 may be spoken into any detection and converting means such as a microphone 12, which may be any conventional electroacoustic transducer that responds to sound waves and which delivers essentially equivalent electric waves over the wide band of frequencies necessary for examination and discrimination.
- any microphone used have a certain
- the analysis diagrammed in FIG. 1 is of a representative signal produced from the microphone 12.
- Microphone 12 may be connected to any conventional wideband amplifier 14 matched with proper matching impedance characteristics.
- One such amplifying means is the Preferred Circuit No. PSC 19 found in NAVWEPS 16-1-519-2, printed Apr. 1, 1962.
- Included within the amplifier may be any type of common preemphasis network means capable of emphasizing the high frequencies with respect to the low frequencies.
- Such a network may conveniently take the form of a tunable high pass filter comprised of passive components that attenuates the lows more than the highs, although more sophisticated networks containing active components are in common usage.
- a filtering system with selectable filter pads or with a variable component so that the degree of preemphasis may be easily controlled is desirable in some applications, such as adjusting the waveshape during initial setup conditions.
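To make the effect of a passive high pass preemphasis network concrete, the sketch below evaluates the standard magnitude response of a first-order RC high-pass filter. The function name `highpass_gain_db` and the example cutoff of 5000 cycles per second are illustrative assumptions; the text does not specify the network's order or component values.

```python
import math


def highpass_gain_db(freq_hz, cutoff_hz):
    """Gain in decibels of an ideal first-order RC high-pass filter."""
    ratio = freq_hz / cutoff_hz
    # |H(f)| = (f/fc) / sqrt(1 + (f/fc)^2) for a single RC section.
    magnitude = ratio / math.sqrt(1.0 + ratio * ratio)
    return 20.0 * math.log10(magnitude)
```

With a cutoff of 5000, a 500 cycle component is attenuated far more than a 5000 cycle component, which is the "attenuates the lows more than the highs" behavior described above. Making the cutoff tunable corresponds to the adjustable degree of preemphasis.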
- The output from amplifier 14 is represented by waveform 16 in FIG. 3.
- waveform 16 may best be examined with reference to FIG. 2 where a spectrographic analysis similar to FIG. 1 is shown.
- the normal speech sounds have commonly a first formant 4, a second formant 6 and a third formant 8 in descending amplitudes.
- the first formant 4 is relatively deemphasized, either by actual deemphasis, less emphasis, or by absence of emphasis so as to provide essentially flat amplification over the range of frequencies of the first formant, and appears now as first formant 18.
- the second formant 20, by contrast, is amplified and is essentially the same amplitude as (or possibly even a little higher than) the amplitude of first formant 18.
- third formant 22 is amplified to be as high as or higher than second formant 20.
- the envelope of the unvoiced frequency sounds is amplified greatly so that the peak portion of this envelope, also, is as high as, or possibly a little higher than, the peaks of the various formants.
- a satisfactory amount of preemphasis has been a circuit that preemphasizes decibels at 5 kilocycles and 20 decibels at 20 kilocycles.
- this circuit means translates the number of signal crossings per unit of time into a voltage amplitude, so that the more numerous the crossings the larger the voltage.
- FIG. 4 depicts successively, an impedance matching circuit 26, a peak clipper 28, a differentiator 30, and an integrator 32.
- the impedance matching circuit 26 is shown as a matching transformer 34, although other commonly known types of circuits may be used with equal success. Since the output from the amplifier may be only a few ohms, a transformer is used to provide the necessary isolation for the remainder of the circuit. The output, or secondary, winding of the transformer normally has an impedance on the order of a few hundred ohms.
- the secondary of transformer 34 may be connected to an amplitude limiting or clipping means such as a peak clipper 28, a convenient form of which may be a pair of diodes 36 and 38 as shown in FIG. 4.
- the diodes are connected so that the cathode of diode 36 is connected to the more positive side of the transformer and diode 38 is connected so that its cathode is connected to the more negative side of the transformer.
- a ground 40 may be connected to the bottom line.
- If an internal resistance of diode 38 is assumed, then when the voltage level on positive cycles of the voltage exceeds a threshold level the diode will conduct and appear to be the internal resistance of the diode, thereby establishing a maximum equivalent to the voltage drop across this resistance, as explained starting on page 583 of Electronic Fundamentals and Applications by John D. Ryder, copyright 1950 by Prentice-Hall, Inc. Because the threshold level of the diode may be assumed to be very small, on the order of a fraction of a volt, the clipping level is very small, ideally just above the environmental noise level of the room or background.
- the signal from the peak clipper 28 may then be applied to a differentiating means such as a differentiator 30.
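An idealized version of the two-diode clipper can be written as a symmetric hard limiter. This sketch ignores the diode's internal resistance and treats the threshold as a fixed number; the name `clip` and the default threshold of 0.5 are assumptions chosen only for illustration.

```python
def clip(samples, threshold=0.5):
    """Limit every sample to the range [-threshold, +threshold]."""
    return [max(-threshold, min(threshold, s)) for s in samples]
```

Because everything above the threshold is flattened, a loud and a quiet version of the same waveform look nearly alike after clipping, leaving the timing of the axis crossings as the main information carried forward.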
- One such circuit may simply be a combination of a capacitor 42 and resistor 44 connected so that the capacitor is in series with the positive line from the peak clipper and the resistor is shunted between output side of the capacitor and ground. The theory of this differentiator is given starting on page 569 of the Ryder reference cited above.
- the output from differentiator 30 may be considered to be responsive to phase reversals and may be represented by a series of pulses at the voltage level of the clipped voltage height, there being a pulse for each positive-going signal, or change of signal from negative-going to positive-going. Since the clipping level is at a very small value, as explained above, the number of pulses is equivalent to the number of time axis crossings of the signal in the positive direction.
- the duration of each pulse is established by the time constant of capacitor 42 and resistor 44, and not by the duration of the applied signal. Therefore, each pulse is uniform in duration as well as amplitude.
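The differentiator's role, one uniform pulse per positive-going crossing, can be imitated on sampled data as follows. The function name `positive_crossing_pulses` and the one-sample pulse width are assumptions; in the circuit the pulse width is set by capacitor 42 and resistor 44.

```python
def positive_crossing_pulses(clipped, pulse_height=1.0):
    """Emit one fixed-height pulse whenever the signal goes from <= 0 to > 0."""
    pulses = []
    previous = 0.0  # assumed starting level, so a positive first sample counts
    for value in clipped:
        crossed = previous <= 0.0 < value
        pulses.append(pulse_height if crossed else 0.0)
        previous = value
    return pulses
```

Summing the returned list over a window gives the number of positive-going crossings in that window, independent of how large the input samples were.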
- the output from the differentiator 30 may then be applied to an integrating means such as integrator 32, a convenient form of which may be merely a combination of a diode 46, a capacitor 48 and a resistor 50.
- Diode 46 conducts each time a positive pulse is received from the differentiator and applies the voltage to the capacitor 48 of the integrator 32, which in turn, bleeds off across resistor 50. If a number of pulses pass through diode 46, before the voltage can bleed off across resistor 50, then the voltage level will be an effective measurement of the number of pulses, or an effective voltage summing means. That is, the voltage level will be proportional to the number of pulses received.
- a satisfactory time constant for the operation of the integrator has been found to be 0.2 second.
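The capacitor charging through diode 46 and bleeding off across resistor 50 behaves like a leaky integrator. A minimal discrete-time sketch, assuming unit charge per pulse and exponential decay with the 0.2 second time constant mentioned above (the function name `leaky_integrate` is hypothetical):

```python
import math


def leaky_integrate(pulses, sample_rate, time_constant=0.2):
    """Accumulate pulses while letting the stored level decay exponentially."""
    # Fraction of the level that survives one sample period.
    decay = math.exp(-1.0 / (sample_rate * time_constant))
    level = 0.0
    output = []
    for pulse in pulses:
        level = level * decay + pulse
        output.append(level)
    return output
```

A single pulse decays to about 37 percent of its initial value after one time constant, so a faster stream of pulses holds the level higher than a slower one.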
- This voltage level may be represented on a display unit, a convenient one of such units being an oscilloscope 52, although a strip recorder or similar device may be equally acceptable.
- Each voice sound is represented with a trace having unique and identifiable amplitude characteristics. Since an oscilloscope has a high input resistance, and since a relatively long time constant including a high resistance value is required for the integrator, resistor 50 may be merely the input resistance of the oscilloscope.
- the diodes used in the peak clipper inherently overcome this difficulty. This is because the voltage drop across a conducting diode is somewhat a function of current, so that instead of the procurement of a truly flat output when the diode conducts during the clipping operation, the output voltage is slightly humped or rounded. The wider the waveform (as for the low frequency signals) the higher the peak of the hump is allowed to go and hence the higher the peak will be on the subsequent and corresponding differentiator pulse. This means that the time axis crossings response is not linear, but tends to be logarithmic and allows the same scale on the display unit to be used for meaningful results of high and low frequencies.
- Another alternate circuit that may be used in place of the peak clipper and differentiator shown in FIG. 4 is a tunnel diode. Characteristic of a tunnel diode is that it produces a spiked output when a certain applied voltage level is exceeded, which output is similar to the output from differentiator 30.
- the integrator circuit in addition to the one shown in FIG. 4, may also conveniently take the form of an RC combination or an LR combination, as described on page 571 of the Ryder text reference above and both of which are common in the art.
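The statement that the drop across a conducting diode depends on current can be illustrated with the textbook Shockley diode equation, which gives a logarithmic relation between current and voltage. This model is brought in here as an assumption for illustration; the source does not name it, and the saturation current and thermal voltage below are typical placeholder values, not figures from the patent.

```python
import math


def diode_drop(current_a, saturation_a=1e-12, thermal_v=0.02585, ideality=1.0):
    """Forward voltage from the Shockley equation solved for V."""
    # I = Is * (exp(V / (n * Vt)) - 1)  =>  V = n * Vt * ln(1 + I / Is)
    return ideality * thermal_v * math.log(1.0 + current_a / saturation_a)
```

Under this model each tenfold increase in current raises the drop by only about 60 millivolts, which is one way to see why the clipped output is rounded rather than perfectly flat.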
- a preferred structure of the RC combination that has been used successfully is the Daniels type integrator. This arrangement shows two sections of resistors and capacitors, as shown in FIG. 5 in which resistor 58 is three times the value of resistor 54 and capacitor 60 is one-third the value of capacitor 56.
- This circuit arrangement has extremely good drop-off voltage characteristics so that it can operate with a relatively short time constant and be responsive to high frequencies.
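A rough numerical sketch of a two-section RC arrangement with the stated ratios (second resistor three times the first, second capacitor one-third of the first). It assumes a ladder layout with an unloaded output and uses forward-Euler integration of a unit step input; the function name, component values, and step size are all hypothetical and are not taken from FIG. 5.

```python
def two_section_rc_step(r1=1.0e5, c1=1.0e-6, steps=20000, dt=1.0e-4):
    """Simulate the output of a two-section RC ladder driven by a 1 V step."""
    r2 = 3.0 * r1      # second resistor: three times the first
    c2 = c1 / 3.0      # second capacitor: one-third of the first
    v1 = 0.0           # voltage on the first capacitor
    v2 = 0.0           # voltage on the second capacitor (the output)
    output = []
    for _ in range(steps):
        i1 = (1.0 - v1) / r1   # current from the 1 V source into node 1
        i2 = (v1 - v2) / r2    # current from node 1 into node 2
        v1 += dt * (i1 - i2) / c1
        v2 += dt * i2 / c2
        output.append(v2)
    return output
```

With these ratios both sections have the same RC product, and the simulated output rises smoothly toward the input level. Changing `r1` or `c1` rescales the overall time constant without altering the ratio between the sections.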
- each sound is represented by a voltage that has a distinct shape relatively independent of loudness (although an extremely loud signal will have an effect on the maximum height of the hump of the voltage waveform from the diodes in the clipper circuit) of the sound.
- This shape depending on the components used, may or may not be linear and may or may not be logarithmic in response. However, it will be dependent on time axis crossings of the generating sound signal.
- Another characteristic of the shape is that the overall frequency of the sound has little effect on the resulting waveshape. This means that a deep male voice enunciating an expression produces a trace that will be similar to a woman or a child with a high voice enunciating the same expression. The only difference will be the height of the trace (function of voltage) and the slight amount of variation caused by the non-linear response. But since this response is a relative response, then the overall appearances of the traces are highly similar in form or shape.
- Resulting waveforms for a few example expressions are shown in FIG. 6.
- the relatively unvoiced expression "ka" shown in waveform 62 has a much steeper onset than the relatively voiced expression "ga" shown in waveform 64. This is to be expected since there are a greater number of high frequencies in the unvoiced sounds than in the voiced sounds.
- Waveforms 70 and 72 show the comparison between the indication of the pronunciation of the phonemes "bee" and "bi" as in "bit." Waveform 70 shows that there is a somewhat slow build up of frequency at the beginning and even slower, or more gradual, drop off of frequency at the end. A careful pronunciation of this phoneme confirms the truth of the trace.
- Waveform 72 by comparison starts and stays at virtually the same frequency throughout its trace. This, too, is confirmed by a careful pronunciation of the expression, which reveals that the vowel sound is initiated toward the front of the mouth and not so much toward the rear as in "bee." Finally, waveforms 74 and 76 show the traces of the expressions "cha" and "jaw," respectively. Both sounds start approximately at the same frequency, but the "cha" trace 74 maintains the vowel sound at the same frequency for a much longer duration than does the "jaw" trace 76.
- One convenient use of the invention may be the displaying of the speech sounds from a student with those of a reference used as a standard of excellence. The student just learning the language, or trying to correct or perfect his speech affected by some defect or accent, may merely attempt to match the display with his own spoken sounds.
- One technique for accomplishing such a comparison may be through the use of tracing the standard on a piece of tracing paper and overlaying the face of a display oscilloscope therewith.
- the trace may be traced by any convenient means, such as photographically.
- In FIG. 7, a microphone 77, a preamplifier 78, an amplifier 80, a character determining circuit 82, and one channel of a double-trace oscilloscope 86 form the basic channel, and reference channel 84 and the other channel of the double-trace oscilloscope 86 form the complete display reference channel.
- the boxes of the basic channel are shown arranged in the manner shown to obtain maximum use of standard component packaging.
- the microphone 77, the preamplifier 80, and the double-trace oscilloscope 86 are standard units and are commercially available as complete entities. Notice that the preamplifier must have means for emphasizing the high frequencies and preferably should also have means for de-emphasizing the low frequencies. Also, the emphasis should preferably be adjustable in various frequency ranges. Such preamplifiers are standard components for hi-fidelity equipment.
- the character determining circuit may be any convenient circuit that produces a voltage responsive to time-axis crossings of the frequency, as explained above, for displaying on one of the two display channels of the oscilloscope.
- the reference channel may be a duplicate of the basic channel starting with a tape recording of the reference sound.
- the channel can be simplified by merely taping the voltage that results from the action of such a channel of components so that it may be displayed on the second trace channel of the oscilloscope.
- the reference sound may be repeated several times so that the student can watch the reference trace while speaking the sound himself and attempting to conform as closely as possible therewith.
- FIG. 8 shows a reference channel 88 and a basic channel 90 similar to the ones for FIG. 7. But, instead of an oscilloscope connected to the respective outputs from these two channels, a voltage comparator 92 is shown.
- In FIG. 8, when the two traces are identical, there will be no output. When there is a difference, there will be an output. In addition, the more the difference, the greater the output.
- the output from voltage comparator 92 may be applied directly to an oscilloscope 94 or it may be applied to an integrator 96, and then to oscilloscope 94 or voltage meter (not shown).
- the advantage of using an integrator is that a long time constant may be used so that the total amount of voltage difference may be shown for a whole phoneme, rather than the relatively instantaneous voltage difference that would otherwise be shown.
- limit trigger circuit 98 such as a Schmitt trigger
- an impulse triggers an out-of-limits indicating device 104, which may conveniently be a light. If the limit is not reached then there is no output pulse on line 106. But, there is an output directly from the comparator on line 108 each time a word is spoken so that NOR gate 100 is conditioned to allow the variable level device 102 to change.
- This variable level device 102 may be any convenient device, such as a counter and logic arrangement that selects progressively larger (or smaller) impedances for changing the control voltage for the limit trigger circuit 98. What is established, in any event, is a tightening, or resetting, of the limits of circuit 98 so that a subsequent voltage difference from the comparator at the same level as the time before, which failed to trigger the out-of-limits device 104, this time triggers device 104. This makes it harder and harder for the student to mispronounce a sound without device 104 indicating, having the effect of forcing him into closer match of the reference.
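The tightening behavior can be sketched as a small state machine: each attempt that stays within the limit causes the limit to shrink. The class name `LimitTrigger`, the multiplicative `tighten_factor`, and the numeric values are hypothetical; the text describes a counter and logic arrangement selecting impedances rather than any particular rule.

```python
class LimitTrigger:
    """Out-of-limits check whose threshold tightens after each passing attempt."""

    def __init__(self, limit, tighten_factor=0.9):
        self.limit = limit
        self.tighten_factor = tighten_factor

    def check(self, difference):
        """Return True if the indicator should light for this attempt."""
        if difference > self.limit:
            return True
        # Attempt was within limits: make the next attempt harder to pass.
        self.limit *= self.tighten_factor
        return False
```

With `LimitTrigger(1.0, 0.5)`, a first difference of 0.8 passes and halves the limit, so a second difference of 0.8 lights the indicator, matching the behavior described for device 104.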
- said voltage having a unique and identifiable variable amplitude characteristic as a function of time for every uttered speech sound
- the means for producing a representation of said voltage including means for producing a visual display of such voltage as a function of time.
- detecting means for converting sounds in the range of speech frequencies to electrical energy, amplifying means connected to said detecting means such that the frequency components of the electrical energy above an intermediate frequency are emphasized relative to the frequency components of the electrical energy below the intermediate frequency,
- a speech display device comprising detecting means for converting the range of speech frequencies including up to approximately 8000 cycles per second to electrical energy
- amplifying means connected to said detecting means such that the frequency components of the electrical energy above an intermediate frequency, which is below approximately 5000 cycles per second, are emphasized relative to the frequency components of the electrical energy below such intermediate frequency,
- differentiating means connected to said limiting means for producing a series of pulses, the number of said pulses being proportional to the number of time axis crossings,
- integrating means connected to said differentiating means for producing a voltage amplitude dependent upon the number of pulses per unit of time and the amplitude of said pulses
- said limiting means is connected to said amplifying means via a positive and a negative line and said limiting means comprises a pair of diodes, one of said diodes connected with its cathode to the positive line and its anode to the negative line and the other of said diodes connected with its cathode to the negative line and its anode to the positive line.
- said limiting means is approximately logarithmically responsive to the frequencies of the energy signal from said amplifier means such that the lower frequencies are not so limited as the higher frequencies, and
- a speech display device comprising detecting means for converting the range of speech frequencies including up to about 8000 Hz. to electrical energy
- amplifying means connected to said detecting means such that the frequency components of the electrical energy above about 5000 Hz. are emphasized relative to the frequency components of the electrical energy below about 5000 Hz.
- each produced pulse being of about equal duration and amplitude dimension with every other produced pulse
- integrating means connected to said pulse-producing means for producing a voltage amplitude dependent upon the number of pulses per unit of time and the amplitude of said pulses
- a speech display device comprising detecting means for converting the range of speech frequencies including up to about 8000 Hz. to electrical energy
- amplifying means connected to said detecting means such that the frequency components of the electrical energy above about 5000 Hz. are emphasized relative to the frequency components of the electrical energy below about 5000 Hz.
- each produced pulse being of approximately equal duration and amplitude dimension with every other produced pulse
- integrating means connected to said pulse-producing means for producing a voltage amplitude dependent upon the number of pulses per unit of time and the amplitude of said pulses, said voltage having a unique and identifiable variable amplitude characteristic for every uttered speech sound,
- a speech display device comprising detecting means for converting the range of speech frequencies including up to about 8000 Hz. to electrical energy, amplifying means connected to said detecting means such that the frequency components of the electrical energy above about 5000 Hz.
- each produced pulse being of approximately equal duration and amplitude dimension with every other produced pulse
- integrating means connected to said pulse-producing means for producing a voltage amplitude dependent upon the number of pulses per unit of time and the amplitude of said pulses, said voltage having a unique and identifiable variable amplitude characteristic for every uttered speech sound
- said integrating means comprises a first resistance connected to the output of said pulse-producing means
- a second resistance with a first end connected to the connection between said first resistance and said first capacitance and with its second end being connected to the input of said display means, and having a resistance value approximately three times that of said first resistance
- a system for providing a comparison of a displayed speech sound with a recognized reference comprising as a function of time for every uttered speech sound,
- a speech display device for comparing a displayed speech sound with a recognized reference, comprising first electronic means for presenting a recorded reference voltage being characteristic of the recognized reference sound;
- second electronic means for presenting a voltage being characteristic of the speech sound to be compared including detecting means for converting the range of frequencies of said speech sound including up to approximately 800 cycles per second to electrical energy,
- amplifying means connected to said detecting means such that the frequency components of the electrical energy above an intermediate frequency, which is below approximately 5000 cycles per second, are emphasized relative to the frequency components of the electrical energy below said intermediate frequency, means connected to said amplifier means responsive to time axis crossings of the energy signal from said amplifier means so as to produce an output comprising a series of spiked pulses at substantially constant amplitude, the number of said pulses being proportional to the number of time axis crossings,
- a display device having dual display channels, one
- a device for comparing a speech sound of unknown quality with a recognized reference sound comprising a first electronic means for presenting a recorded reference voltage being characteristic of the recognized reference sound, second electronic means for presenting a voltage being characteristic of the speech sound to be compared,
- comparator means connected to said first electronic means and second electronic means for producing a voltage which is a measure of the voltage difference between said first and second means
- said indicating means connected to said comparator means triggered when the voltage from said comparator means exceeds a preset amplitude value, said indicating means comprising a limit circuit connected to said comparator means that produces an output signal when a preset voltage amplitude value from said comparator means is exceeded, and
- warning means connected to said limit circuit and activated by said output signal.
- a device for comparing a speech sound of unknown quality with a recognized reference sound comprising a first electronic means for presenting a recorded reference voltage being characteristic of the recognized reference sound, second electronic means for presenting a voltage being characteristic of the speech sound to be compared,
- comparator means connected to said first electronic means and second electronic means for producing a voltage which is a measure of the voltage difference between said first and second means
- a speech display device for comparing a speech sound of unknown quality with a recognized reference sound, comprising first electronic means for presenting a recorded reference voltage being characteristic of the recognized reference sound,
- second electronic means for presenting a voltage being characteristic of the speech sound to be compared, said voltage produced by the second electronic means being related to the phase reversals per unit time of an electrical signal corresponding to such speech sound with higher frequencies preemphasized,
- comparator means connected to said first electronic means and said second electronic means for producing a voltage which is a measure of the voltage difference between the outputs of said first and second means
- a display device connected to said comparator means for displaying the voltage level as an indication of quality of the speech sound as related to the reference sound.
- a device for comparing a speech sound of unknown quality with a recognized reference sound comprising a first electronic means for presenting a recorded reference voltage being characteristic of the recognized reference sound, including detecting means for converting the range of frequencies in said reference sound including up to approximately 8000 cycles per second to electrical energy,
- amplifying means connected to said detecting means such that the frequency components of the electrical energy above an intermediate frequency, which is below approximately 5000 cycles per second, are emphasized relative to the frequency components of the electrical energy below said intermediate frequency,
- voltage summing means connected to the last said means for producing a voltage the amplitude of which corresponds to the number of said spiked pulses per unit of time;
- a second electronic means for presenting a recorded reference voltage being characteristic of the speech sound to be compared including detecting means for converting the range of speech frequencies including up to approximately 8000 cycles per second to electrical energy
- amplifying means connected to said detecting means such that the frequency components of the electrical energy above an intermediate frequency, which is below approximately 5000 cycles per second, are emphasized relative to the frequency components of the electrical energy below said intermediate frequency,
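The signal chain recited in the claim fragments above (high-frequency emphasis, one constant-amplitude pulse per time-axis crossing, integration into a display voltage, and comparison against a reference with a preset limit) can be illustrated with a short digital sketch. This is only an illustration under stated assumptions, not the patented analog circuit: the first-difference emphasis filter, the leaky integrator standing in for the resistance-capacitance network, the 20 kHz sample rate, and every function name and coefficient below are hypothetical choices made for the example.

```python
import math


def preemphasize(samples, coeff=0.95):
    """First-difference filter y[n] = x[n] - coeff * x[n-1].

    Assumed stand-in for the amplifying means: it boosts high
    frequencies relative to low ones.
    """
    out, prev = [], 0.0
    for x in samples:
        out.append(x - coeff * prev)
        prev = x
    return out


def crossing_pulses(samples, pulse_height=1.0):
    """Emit one fixed-height pulse at every sign change (time-axis crossing)."""
    pulses = [0.0] * len(samples)
    for n in range(1, len(samples)):
        if (samples[n - 1] < 0.0) != (samples[n] < 0.0):
            pulses[n] = pulse_height
    return pulses


def integrate(pulses, alpha=0.01):
    """Leaky integrator v[n] = v[n-1] + alpha * (p[n] - v[n-1]).

    Assumed discrete analogue of an RC network: the output tracks the
    pulse rate multiplied by the pulse height.
    """
    v, out = 0.0, []
    for p in pulses:
        v += alpha * (p - v)
        out.append(v)
    return out


def display_voltage(samples):
    """Full chain: emphasis -> crossing pulses -> integration."""
    return integrate(crossing_pulses(preemphasize(samples)))


def exceeds_limit(test_v, ref_v, limit):
    """Comparator plus limit circuit: True if any difference exceeds the limit."""
    return any(abs(a - b) > limit for a, b in zip(test_v, ref_v))


def tone(freq_hz, rate_hz=20000, seconds=0.1):
    """Test sine wave; the half-sample offset avoids samples exactly at zero."""
    n = int(rate_hz * seconds)
    return [math.sin(2 * math.pi * freq_hz * (k + 0.5) / rate_hz) for k in range(n)]


low = display_voltage(tone(500))    # few crossings per unit time
high = display_voltage(tone(4000))  # many crossings per unit time
print(high[-1] > low[-1])             # higher crossing rate, higher voltage
print(exceeds_limit(high, low, 0.1))  # warning condition against a reference
```

In this sketch the integrated voltage rises with the number of crossings per unit time, so a sound richer in high-frequency energy yields a larger display voltage than a low-pitched one, and the limit check plays the role of the warning trigger.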
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Description
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US395876A US3387090A (en) | 1964-09-11 | 1964-09-11 | Method and apparatus for displaying speech |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US395876A US3387090A (en) | 1964-09-11 | 1964-09-11 | Method and apparatus for displaying speech |
Publications (1)
Publication Number | Publication Date |
---|---|
US3387090A (en) | 1968-06-04 |
Family
ID=23564916
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US395876A Expired - Lifetime US3387090A (en) | 1964-09-11 | 1964-09-11 | Method and apparatus for displaying speech |
Country Status (1)
Country | Link |
---|---|
US (1) | US3387090A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3499986A (en) * | 1966-09-28 | 1970-03-10 | Philco Ford Corp | Speech synthesizer |
US3546584A (en) * | 1966-11-30 | 1970-12-08 | Standard Telephones Cables Ltd | Apparatus for analyzing a complex waveform containing pitch synchronous information |
US3855418A (en) * | 1972-12-01 | 1974-12-17 | F Fuller | Method and apparatus for phonation analysis leading to valid truth/lie decisions by vibratto component assessment |
US3991304A (en) * | 1975-05-19 | 1976-11-09 | Hillsman Dean | Respiratory biofeedback and performance evaluation system |
US4020567A (en) * | 1973-01-11 | 1977-05-03 | Webster Ronald L | Method and stuttering therapy apparatus |
US4492917A (en) * | 1981-09-03 | 1985-01-08 | Victor Company Of Japan, Ltd. | Display device for displaying audio signal levels and characters |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3202761A (en) * | 1960-10-14 | 1965-08-24 | Bulova Res And Dev Lab Inc | Waveform identification system |
US3294918A (en) * | 1962-05-18 | 1966-12-27 | Polaroid Corp | Electronic conversions of speech |
US3316353A (en) * | 1963-08-05 | 1967-04-25 | Voice Systems Inc | Lisp meter |
Legal Events
Code | Title | Owner | Free format text | Assignor(s) | Reel/Frame | Effective date |
---|---|---|---|---|---|---|
AS | Assignment | TORONTO-DOMINION BANK, THE | SECURITY INTEREST | TRACOR, INC., (SEE RECORD FOR REMAINING GRANTORS) | 004829/0701 | 1987-12-16 |
AS | Assignment | BANK OF AMERICA NATIONAL TRUST AND SAVING ASSOCIAT | SECURITY INTEREST | TORONTO DOMINION BANK, THE | 005284/0163 | 1988-08-01 |
AS | Assignment | TORONTO-DOMINION BANK, THE | SECURITY INTEREST | TRACOR, INC.; LITTLEFUSE, INC.; TRACOR AEROSPACE, INC.; AND OTHERS | 005234/0127 | 1988-08-01 |
AS | Assignment | BANK OF AMERICA NATIONAL TRUST AND SAVINGS ASSOCIA | SECURITY INTEREST | TORONTO-DOMINION BANK; TRACOR, INC. | 005224/0276 | 1988-08-01 |
AS | Assignment | BANK OF AMERICA NATIONAL TRUST AND SAVINGS ASSOCIA | SECURITY INTEREST | TRACOR, INC. | 005217/0247 | 1988-08-01 |
AS | Assignment | BANK OF AMERICA NATIONAL TRUST AND SAVINGS ASSOCIA | SECURITY INTEREST | TRACOR INC. | 005217/0224 | 1988-08-01 |
AS | Assignment | BANK OF AMERICA AS AGENT | SECURITY INTEREST | TORONTO-DOMINION BANK, THE | 005197/0122 | 1988-08-01 |
AS | Assignment | TRACOR, INC. | RELEASED BY SECURED PARTY | BANK OF AMERICA NATIONAL TRUST AND SAVINGS ASSOCIATION AS COLLATERAL AGENT | 005957/0542 | 1991-12-27 |
AS | Assignment | TRACOR, INC. | RELEASED BY SECURED PARTY | BANK OF AMERICA NATIONAL TRUST AND SAVINGS ASSOCIATION AS COLLATERAL AGENT | 005957/0562 | 1991-12-20 |