US3770891A - Voice identification system with normalization for both the stored and the input voice signals - Google Patents
Voice identification system with normalization for both the stored and the input voice signals
- Publication number
- US3770891A
- Authority
- US
- United States
- Prior art keywords
- signals
- channel
- signal
- channels
- pitch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
Definitions
- the detected signals derived from the group of resonances are constantly regrouped to a reference numerical region along the outputs of a plurality of numerically arranged channels in order that, the signals derived from the varying pitch frequencies are shifted to the output of the first channel (at high speed), while the signals derived from other frequencies are shifted to channel outputs by the same factors of multiplication from the first channel that the original frequencies differ from the pitch frequencies. The mutually related amplitude ratios of significant signals of the regrouped signals are then matched with prerecorded voiceprint signals of the individual to be identified.
- the main object is to provide a highly reliable electronic arrangement capable of automatically selecting the correct voiceprint signals from a large number of sequentially fed prerecorded signals at high speed.
- a corollary object is to provide related arrangements which may be used for the purpose of command operation by a specific individual in the midst of a number of speakers.
- a bank of channels arranged in a predetermined numerical order, each one of which is prearranged with a plurality of signal-admitting inputs, and a plurality of signal-switching inputs, respectively.
- the incoming group of signals are applied to the corresponding ones of the plurality of signal-admitting inputs of each one of the plurality of channels, so that any one of the applied signals can be admitted to the output of any one of the channels by the operation of a respective signal-switching input.
- a plurality of prearranged combinations of signal-regrouping control signals are then applied (at high speed) sequentially to the plurality of signal-switching inputs until the detected signals derived from the lowest (pitch) frequency appears at the first channel output, representing the required reference signal regrouping.
- This signal regrouping is held in steady state, while at the same time switching sequence of signal regrouping combinations starts from a reference beginning for continual pitch frequency hunting until the pitch frequency changes to establish a new signal regrouping at the channel outputs.
- FIGS. 1 and 2 are block diagrams of the voiceprint identifying arrangements, according to the invention
- FIG. 3 is a graphical diagram showing how different groups of signals are regrouped at the channel outputs, according to the invention
- FIG. 4 shows the special center frequency sub-divisions of the pass-band filters, as used in the present invention
- FIG. 5 is a numerical chart showing how the detected signal outputs of the pass-band filters are switched linearly to the outputs of the numerically arranged channels, in accordance with the invention
- FIG. 6 is the switching arrangement used herein
- FIG. 7 is an amplitude equalization arrangement in accordance with the invention.
- the information contained in complex waves depends on greater detail than the simple harmonic relations shown, and it requires further sub-divisions between the harmonic intervals.
- the center frequencies of the sub-dividing pass-band filters are arranged as in FIG. 4, wherein the sub-divisions are similar to the standard musical scale.
- the sub-divisions are arranged in a series of digitals at like intervals in harmonic successions corresponding to the digitals in preceding intervals.
- the numerical ratios of the second through the thirteenth numerals with respect to the first numeral are the same as the numerical ratios of the 14th through the 25th numerals with respect to the 13th numeral, as an exemplary demonstration.
- Such a numerical arrangement simplifies the actual channel switching arrangement, because it requires linearly sequenced numerical transfer without cross coupling of any of the pass-band outputs. This is shown in greater clarity by the numerical chart in FIG. 5.
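
The spacing described in the preceding items can be checked numerically. The sketch below is only an illustration: it assumes twelve equal logarithmic steps per octave (suggested by the comparison with the standard musical scale and by the statement about the 2nd through 13th and 14th through 25th numerals), and it anchors filter 43 at 440 Hz, a value legible in the FIG. 4 chart. The function and parameter names are hypothetical and do not come from the patent.

```python
def center_frequency_hz(filter_number, ref_number=43, ref_hz=440.0, steps_per_octave=12):
    """Center frequency of a numbered pass-band filter on an assumed equal-tempered grid."""
    return ref_hz * 2.0 ** ((filter_number - ref_number) / steps_per_octave)


def ratio(upper_filter, lower_filter):
    """Frequency ratio between two numbered filters."""
    return center_frequency_hz(upper_filter) / center_frequency_hz(lower_filter)


if __name__ == "__main__":
    # Print a few filter numbers with their rounded center frequencies.
    for number in (1, 13, 25, 43, 100):
        print(number, round(center_frequency_hz(number)))
    # The ratio of numeral 2 to numeral 1 equals that of numeral 14 to numeral 13.
    print(ratio(2, 1), ratio(14, 13))
```

On such a grid, moving up by one filter number multiplies the frequency by the same factor everywhere, which is why a plain linear shift of filter numbers is enough to regroup the signals without cross coupling.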
- SIGNAL REGROUPING NUMERICAL CHART. The signal regrouping arrangement is shown in greater clarity by the numerical chart in FIG. 5, wherein the top row of the numerals represents the channels, and the rows of numerals below represent the sequence of the left-hand numerals of the sub-band frequencies of FIG. 4. For example, in the first row (FIG. 5), all of the detected outputs of the pass-band filters (starting from filter number-1) are applied to the inputs of the channels starting from channel number-1. In the second row, the detected filter outputs starting from the number-2 filter are applied to all of the channels starting from channel number-1, and so on.
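
The chart logic described above amounts to a shifted index table. The following is a minimal sketch under that reading, not the patent's circuit: row r sends filter r to channel 1, filter r+1 to channel 2, and so on. The names `chart_row` and `apply_row` are hypothetical, and feeding 0.0 to channels that would need a filter beyond the end of the bank is an assumption.

```python
def chart_row(row, num_channels):
    """Filter numbers (1-based) applied to channels 1..num_channels for one chart row."""
    return [row + channel - 1 for channel in range(1, num_channels + 1)]


def apply_row(detected, row, num_channels):
    """Route detected filter outputs to channel outputs for the selected row.

    detected[0] is the detected output of filter 1. Channels that would need
    a filter past the end of the bank receive 0.0 (an assumption).
    """
    outputs = []
    for filter_number in chart_row(row, num_channels):
        if filter_number <= len(detected):
            outputs.append(detected[filter_number - 1])
        else:
            outputs.append(0.0)
    return outputs


if __name__ == "__main__":
    print(chart_row(1, 4))
    print(chart_row(2, 4))
    print(apply_row([0.0, 0.0, 0.9, 0.0, 0.4], row=3, num_channels=4))
```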
- The arrangement of analog channel switches is shown in FIG. 6.
- the detected signals from the pass-band filters are applied to the drain electrodes of transistors Q1 through Q4, respectively.
- the source electrodes of these transistors are connected to ground in series with the output resistors R1, R2, R3, and Rn, respectively, which represent the channel outputs.
- any one of the detected signals from the filters may be coupled to the resistor R1 of channel-1. This is done by the additional transistors Q5, Q8, Q10, the source electrodes of which are all connected in parallel with the source electrode of Q1.
- the detected signal from f is admitted to the output resistor R1.
- the second distributor pulse is applied to the gate-electrode of Q5 (also to those connected in parallel) the detected signal from f is admitted to the output resistor R1 of channel-1, and so on.
- the output signals from the set-reset flip-flops in the R-S blocks 17-19 in FIG. 1 are applied to the parallel connected gate-electrodes of the transistors in FIG. 6, the detected signals from the pass-band filters in FIG. 1 are regrouped across the channel output resistors shown, in accordance with the numerical chart of FIG. 5.
- SIGNAL CONVERSION SYSTEM. I have described novel signal conversion systems in my related patents, for example, U.S. Pat. Nos. 3,622,706 and 3,659,051, and reference may be made to these patents.
- the voice sound wave in block 1 is applied to the passband filters in blocks 2-4 in parallel, and their outputs are first detected in blocks 5-7, respectively, and further applied to the signal-admitting inputs of the switching channels in block 24.
- the outputs of filters 2-4 are also applied to the sense-amplifiers in blocks 8-10, respectively, which are provided with set-reset flip-flop outputs, so that they produce "1" level output signals when their inputs receive useful signals from the filter outputs above a threshold level.
- the outputs of these sense-amplifiers are applied to the first inputs of gates 11-13, respectively, and the second inputs of these gates are excited by the sequential "1" level pulses of the pulse distributor in block 25, so that only those gates which have received simultaneous signals at their first inputs from the respective sense-amplifiers operate by the distribution pulses.
- assuming that the distributor starts applying sequential pulses at "1" level to the input of gate 11, and that this gate has also received a "1" level signal from the sense-amplifier 8, it applies an operating pulse to the one-shot 14, which in turn operates the set-reset flip-flop in block 17 into set state by way of a-c coupling (a-c and d-c couplings are available in integrated circuits).
- the set state output of flip-flop 17 is a-c coupled to the one-shot in block 20, which in turn operates and applies a "0" level pulse to the multi-input gate in block 23 for operation.
- the output "1" level pulse of gate 23 is inverted into a "0" level pulse, and applied in parallel through d-c coupling to the reset inputs of the set-reset flip-flops in blocks 17-19 for reset operating states.
- the flip-flop in block 17 remains in set operating state by reason that the d-c coupled pulse at its set input has a longer pulse period than the d-c coupled pulse at its reset input, as illustrated by pulse waveforms under the one-shots 14-16 and 20-22.
- the set output of flip-flop 17 is amplified in block 26 (this amplification might be necessary if the different types of integrated circuit devices available commercially, as utilized herein, are not found to be compatible one with another), and a selected signal-regrouping combination in the matrix of block 27 is applied to the channel switches in block 24 for the required signal regrouping at their outputs.
- For continuous signal-regrouping operation, the distributor must be reset for a new start of signal regrouping. At the start of this new pulse distribution, however, the pass-band filter outputs may have changed, and accordingly, the flip-flop outputs of the sense-amplifiers 8-10 must also be reset simultaneously with the resetting of the distributor. This is done by mixing the output pulses of one-shots 14-16 in the multi-input gate 28, the output of which is inverted in block 29 and applied (through d-c coupling) to the distributor 25 and sense-amplifiers 8-10 for reset operation, after which period the distributor starts hunting the output signals of the sense-amplifiers.
- the output of filter 2 is zero
- the output of sense-amplifier 8 is also at 0 level
- the first distribution pulse does not operate the gate 11.
- the gate 12 operates at the arrival of the second distribution pulse, and according to the previously described mode of operation, the flip-flop 17 is driven into reset operation, and the flip-flop 18 is driven into set operation for a new signal regrouping at the channel outputs of block 24.
- the pulse frequency of the pulse generator in block 30 is not critical, and may be adjusted to any useful frequency as desired.
- the gate 31 may be used to gate out the output pulses of the generator during resetting of the distributor.
- the output signals of channel-1 through channel-n are differentiated with the signals which are prerecorded in the a, a, a, a; b, b, b, b; c, c, c, c blocks, in the operational-amplifiers 36-39.
- These signal-differentiating circuits at the channel outputs are similar, and therefore, description of the circuitry in reference to one of the channels is typical to the rest of the channels.
- the output is zero, or at least below a threshold level.
- the output of block 36 is above zero, and its polarity depends on which of the two input signals is larger than the other. Because of this undetermined output polarity, the output of amplifier 36 is passed through a push-pull output amplifier in block 40, and applied to a sense-amplifier in block 44, which is provided with a one-shot output. Thus, when the two input signals of block 36 are of equal amplitudes, the output of sense-amplifier 44 is zero, and when these two input signals are of unequal amplitudes, the sense-amplifier 44 produces a one-shot pulse output.
- the output of sense-amplifier 44 is applied to one of the inputs of gate 49 in series with the normally idle switching transistor Q11.
- the two matching signals are of equal amplitudes, it represents the correct information sought and it is false when the two signals differ in amplitudes.
- when the transistor Q11 is switched-ON, and the output of sense-amplifier 44 is zero, the information fed to the gate 49 is correct, whereas, when the transistor Q11 is switched-ON, and the sense-amplifier 44 produces an output pulse, the information fed to the gate 49 is false.
- the c blocks (having matching signals present) apply ON signals to the gate-electrodes of their respective switching transistors Q11, Q12, and Q14, for admitting the respective source-terminal outputs to the inputs of multi-input gate 49, as shown.
- the a blocks of these channels also apply the matching signals (analog signals) to the first inputs of AND gates 50, 51 and 53.
- the pulse generator in block 54 applies a pulse to all of the second inputs of the AND gates 50-53, and to all of the switches (not shown in order to avoid crowding of the drawing) between the amplitude equalizers in blocks 32-35 and the operational-amplifiers 36-39, respectively, for operation.
- the signal matching of the arriving signals is correct in the channel outputs (channel-1, channel-2, channel-n)
- the switched-ON outputs of Q11, Q12 and Q14, and also the pulse from the generator 54 will be at 1 level to the inputs of gate 49, and therefore, the output of this gate will be at 0 level, as a signal representing the correct identification of the individual in question.
- the output of this gate will not be at 0 level, indicating false signal. But a decision on matching of only the a signals may not be sufficient, and therefore, the b signals may also be tried for matching, and getting an average prior to a final decision of correctness.
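
The matching decision described in the preceding items can be sketched in software. This is a simplified illustration, not the patent's circuit: a difference amplifier followed by a sense-amplifier is modeled as an absolute difference compared with a threshold, the c blocks are modeled as per-channel enable flags, and gate 49 is modeled as requiring every enabled channel to match. The function names and the threshold value are assumptions.

```python
def channel_differs(live, stored, threshold=0.05):
    """True when the magnitude of the difference signal exceeds the threshold."""
    return abs(live - stored) > threshold


def identify(live_channels, stored_channels, enabled, threshold=0.05):
    """Return True only if every enabled channel matches its stored signal."""
    for live, stored, is_enabled in zip(live_channels, stored_channels, enabled):
        if is_enabled and channel_differs(live, stored, threshold):
            return False
    return True


if __name__ == "__main__":
    live = [1.0, 0.5, 0.2]
    # The third channel is disabled, so its mismatch is ignored.
    print(identify(live, [1.0, 0.52, 0.9], [True, True, False]))
    # With every channel enabled, the mismatch on channel 2 rejects the match.
    print(identify(live, [1.0, 0.7, 0.2], [True, True, True]))
```

The optional second trial against the b signals and the averaging mentioned above are not modeled here.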
- the main object of the matching system described herein is a modification (signal regrouping) of the signals prior to matching, it is obvious that the prerecorded signal must be modified in the same manner, as described by way of the arrangement in FIG. 1.
- signal regrouping there are other variables that must be modified, for example, the amplitude variation that occurs during each segment of the information to be analyzed. In the case that the arriving signals are first regrouped and recorded for matching with the prerecorded signals, the amplitudes of both recorded signals may be adjusted for the proper matching process.
- the original voice signal varies in amplitude uncontrollably from time to time, even when tried to duplicate a previous articulation, especially when the sequence of amplitude modulations in a whole spoken word are to be considered for matching. For this reason, it is preferable to include an automatic amplitude control of the channel output signals.
- the automatic amplitude control is achieved by the equalization amplifiers in blocks 32-35, but the equalization is controlled only by the output signal of channel-1, as reference amplitude for matching.
- the amplitude equalization is established by the peak signal in channel-1, as a feedback control to a reference level. This may be explained in more clarity by a simple equalization arrangement of FIG. 7, as in the following. AMPLITUDE EQUALIZER. In FIG. 7, the channel output is applied to the resistor R4 in series with the variable-resistance transistor Q15, so that the transfer from channel-1 to R4 may be voltage divided by change of resistance of the transistor Q15.
- the gate electrode of Q15 is normally forward biased by B to the lowest reference resistance between drain and source electrodes of Q15, and this forward bias is variably reduced by an opposing voltage stored in the capacitor C through a feed-back switching path.
- the information to be analyzed occurs between the peaks of the pitch wave in the voice sound, and this pitch wave is represented by the output wave of the first channel.
- Pitch selection has been used previously for various purposes, and different circuit arrangements already exist, which may be utilized for deriving pitch pulses either directly from the original voice sound wave, or from the channel-1 output of the arrangement shown in FIG. 1.
- the control voltage of capacitor C is changed during the short pitch pulses by operation of the normally OFF switches in blocks 56 and 57 for negative feed-back.
- This feed-back is controlled by the zener diode D1, which becomes conductive in reverse direction only when the voltage across RL is above a reference voltage. Accordingly, when a pitch pulse operates the switches in blocks 56 and 57, and assuming that the amplified voltage across load resistor RL is above the zener regulation voltage, the capacitor C is charged in polarity as to increase the internal resistance of Q15 to a point where the voltage across C is equal to the zener regulation voltage.
- the channel output signal, as transferred to R4, has been regulated to a reference level, and this reference level has also been established in the rest of the channel outputs by the parallel connection of the capacitor C to each one of the amplitude equalizers that are used at the other channel outputs. In controlling the signal amplitude across R4, and assuming that the voltage across C is in the forward direction during a pitch pulse period, the capacitor C is discharged until the reference voltage regulation has been established.
- the use of the switch in block 57 may be preferable when the amplifier 55 is chosen of a capacitive coupled amplifier, but actually the choice is irrelevant herein.
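
The net effect of the equalizers described above can be illustrated with an idealized calculation. This sketch replaces the capacitor and zener feedback loop with a single common gain computed from the channel-1 peak, which is a simplification and an assumption, not the FIG. 7 circuit. The function name and the reference level are hypothetical.

```python
def equalize(channel_peaks, reference_level=1.0):
    """Scale all channel outputs by one common gain set from channel 1.

    channel_peaks[0] is the channel-1 peak. The same gain is applied to every
    channel, so the amplitude ratios between channels are preserved.
    """
    reference_peak = abs(channel_peaks[0])
    if reference_peak == 0.0:
        # No pitch signal to regulate against; leave the outputs unchanged.
        return list(channel_peaks)
    gain = reference_level / reference_peak
    return [peak * gain for peak in channel_peaks]


if __name__ == "__main__":
    # A loud and a soft utterance with the same channel ratios.
    print(equalize([2.0, 1.0, 0.5]))
    print(equalize([0.5, 0.25, 0.125]))
```

Because the control comes from channel 1 alone, two utterances that differ only in loudness produce the same equalized channel outputs in this idealized model.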
- AND gates 50-53 are actually analog switches. But the meaning of the AND gate is broad, and some analog switches are now appearing in the name of AND gates with some modified terminology attached to it. Accordingly, the term AND gate used herein does not necessarily mean having two on-and-off input signals, but one of the signals may vary in magnitude, and the other used as an on-and-off switching signal. Gates of this type are used between the amplitude equalizers 32-35 and the operational amplifiers 36-39, which are not shown in the drawing, but their requirement is referenced in writing by the pulses of the pulse-generator 54 applied to them during the signal matching performance.
- When the system described herein is used for the purpose of selecting a particular individual from a group of speakers for some required performance, the system may be incorporated in conjunction with my phonetic sound recognition systems, for example, as described in my U.S. Pat. No. 3,622,706, so that during phonetic sound recognition an output switch is operated only when simultaneously voiceprint recognition occurs.
- a voice identification system wherein the voice signals of a speaking individual to be identified are compared with prerecorded voice signals of the same individual, but wherein each group of resonances in the sound spectrum representing a segment of said voices vary within the pitch intervals of the speaking and prerecorded voices respective of each other, and in conjunction with variations in resonance-amplitude from pitch to pitch periods, the system for normalizing said variations during said comparison, comprising first and second identical normalizing systems for the said speaking and prerecorded voice signals, the first system comprising a plurality of band-pass filters for subdividing the resonances of the speaking voice sound waves; a plurality of detecting means for deriving detected signals from said filters; means for deriving a control signal (outputs of R-S blocks 17-19) from said detected signals which represent a segment of the speaker's speech during a pitch time interval, and said control signal having a time period equal to said time interval; a matrix comprising a plurality of signal regrouping combinations, each combination having been prearranged for
- center frequency responses of said plurality of band-pass filters in said sound spectrum are arranged in a series of frequencies within harmonic intervals, each frequency of the series of frequencies in an interval corresponding to the harmonic of a frequency in the preceding interval; and said matrix comprises a plurality of channels arranged in linearly sequenced numerical order, each channel output representing the output of at least one or more AND gates, the number of AND gates in each channel being equal to the number of the numerical location of that channel in said numerical order; first parallel connections of the numerically corresponding first outputs of said gates in the plurality of channels; coupling means from said detected signals to said first parallel connected inputs, whereby each signal in a group may be coupled to any one of the first inputs of a number of channels for channel output selection during said control signal interval, second parallel connections of the second inputs in said AND gates in an arrangement that the second inputs of the AND gates in said channels starting from the channel located in the lowest numerical position in said numerical order are connected to successively advancing ones from the corresponding second inputs
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Stereophonic System (AREA)
Abstract
In an articulated phonetic sound wherein the information bearing group of resonances vary within different frequency regions in the sound spectrum, the detected signals derived from the group of resonances are constantly regrouped to a reference numerical region along the outputs of a plurality of numerically arranged channels in order that, the signals derived from the varying pitch frequencies are shifted to the output of the first channel (at high speed), while the signals derived from other frequencies are shifted to channel outputs by the same factors of multiplication from the first channel that the original frequencies differ from the pitch frequencies. The mutually related amplitude ratios of significant signals of the regrouped signals are then matched with prerecorded voiceprint signals of the individual to be identified.
Description
United States Patent [19]
Kalfaian
Nov. 6, 1973
VOICE IDENTIFICATION SYSTEM WITH NORMALIZATION FOR BOTH THE STORED AND THE INPUT VOICE SIGNALS
[76] Inventor: Meguer V. Kalfaian, 962 Hyperion Ave., Los Angeles, Calif.
[22] Filed: Apr. 28, 1972
[21] Appl. No.: 248,740
[52] U.S. Cl.: 179/1 SB
[51] Int. Cl.: G10l 1/00
[58] Field of Search: 179/1 SA, 1 SB, 15.55 R, 179/1 VS; 340/148, 146.3 C; 35/35 C
[56] References Cited
UNITED STATES PATENTS
3,466,394 9/1969 French 179/1 SB
3,622,706 11/1971 Kalfaian 179/1 SA
3,431,359 3/1969 Kalfaian 179/1 SA
3,509,280 4/1970 Jones 179/1 SB
3,525,811 8/1970 Trice 179/1 SB
3,546,584 12/1970 Scarr 179/1 SB
3,439,122 4/1969 Coker 179/1 SA
FOREIGN PATENTS OR APPLICATIONS
43/23,648 10/1968 Japan 179/1 SB
Primary Examiner: William C. Cooper
Assistant Examiner: Jon Bradford Leaheey
Attorney: Meguer V. Kalfaian
[57] ABSTRACT
In an articulated phonetic sound wherein the information bearing group of resonances vary within different frequency regions in the sound spectrum, the detected signals derived from the group of resonances are constantly regrouped to a reference numerical region along the outputs of a plurality of numerically arranged channels in order that, the signals derived from the varying pitch frequencies are shifted to the output of the first channel (at high speed), while the signals derived from other frequencies are shifted to channel outputs by the same factors of multiplication from the first channel that the original frequencies differ from the pitch frequencies. The mutually related amplitude ratios of significant signals of the regrouped signals are then matched with prerecorded voiceprint signals of the individual to be identified.
3 Claims, 7 Drawing Figures
[Drawing sheet caption: Arrangement for automatically matching voiceprint signals with that of prerecorded voiceprints. Patented Nov. 6, 1973; 3,770,891.]
[FIG. 4 (drawing sheet): Numerical arrangement of the pass-band filters and their center frequencies in the sound spectrum. The chart numbers the filters 1 through 100; legible entries include filter 1 at 39 Hz, filter 43 at 440 Hz, and filter 100 at 11,840 Hz.]
VOICE IDENTIFICATION SYSTEM WITH NORMALIZATION FOR BOTH THE STORED AND THE INPUT VOICE SIGNALS
This invention relates to voiceprint analysis, and more particularly to an automatic identification system by matching the voiceprint signals of an individual with that of a large storage of prerecorded voiceprint signals, for example, nationally stored voiceprint signals, in the same order of prerecorded fingerprint patterns. The main object is to provide a highly reliable electronic arrangement capable of automatically selecting the correct voiceprint signals from a large number of sequentially fed prerecorded signals at high speed.
A corollary object is to provide related arrangements which may be used for the purpose of command operation by a specific individual in the midst of a number of speakers.
In all voice recognition systems the most difficult problem involved is the enormous variations that occur in the speaking voice. These variations, however, are provided by Nature, so that an almost infinite number of distinguishable informations can be derived therefrom. But it is not a simple process to locate a desired information in these unendingly varying complexities, and therefore, in a practical ultimate of realization, it becomes an absolute must that these variations are first normalized, so that the desired information can be picked out by a standard matching process with that of some predetermined parameters. In such a practice, the normalization must be highly precise, and highly stable without requiring any error-adjustments whatsoever. This is achieved in the ultimate accuracy by the switching arrangement of FIG. 1, wherein the group of signals contained in the arriving complex signal are regrouped in a reference arrangement at the outputs of a bank of channels in block 24, whereby the elemental signals composing the complex signal are rendered distinguishable for selection in any desired fashion. The regrouped signals at the channel outputs are then automatically matched with pre-stored signals in an arrangement of FIG. 2, for final identification of a particular voice desired.
BRIEF OF THE SIGNAL REGROUPING USED
There is used a bank of channels arranged in a predetermined numerical order, each one of which is prearranged with a plurality of signal-admitting inputs, and a plurality of signal-switching inputs, respectively. The incoming group of signals are applied to the corresponding ones of the plurality of signal-admitting inputs of each one of the plurality of channels, so that any one of the applied signals can be admitted to the output of any one of the channels by the operation of a respective signal-switching input. A plurality of prearranged combinations of signal-regrouping control signals are then applied (at high speed) sequentially to the plurality of signal-switching inputs until the detected signal derived from the lowest (pitch) frequency appears at the first channel output, representing the required reference signal regrouping. This signal regrouping is held in steady state, while at the same time the switching sequence of signal regrouping combinations starts from a reference beginning for continual pitch frequency hunting until the pitch frequency changes to establish a new signal regrouping at the channel outputs.
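
The hunt-and-hold behavior summarized above can be sketched as a short loop. This is an illustrative model under stated assumptions, not the patent's hardware: each frame is a list of detected filter outputs (index 0 is filter 1), a fixed threshold stands in for the sense-amplifiers, and the previous regrouping is held when no filter exceeds the threshold. The names `hunt` and `track` and the threshold value are hypothetical.

```python
def hunt(detected, threshold=0.1):
    """Step through switching combinations 1, 2, 3, ... and return the first one
    that brings an above-threshold detected signal to channel 1 (None if none)."""
    for combination, level in enumerate(detected, start=1):
        if level > threshold:
            return combination
    return None


def track(frames, threshold=0.1):
    """Hold the current regrouping until a new hunt establishes a different one."""
    held = None
    history = []
    for detected in frames:
        found = hunt(detected, threshold)
        if found is not None:
            held = found
        history.append(held)
    return history


if __name__ == "__main__":
    frames = [
        [0.0, 0.8, 0.0, 0.3],  # pitch falls in filter 2
        [0.0, 0.7, 0.0, 0.2],  # unchanged, regrouping is held
        [0.0, 0.0, 0.9, 0.0],  # pitch moves to filter 3
        [0.0, 0.0, 0.0, 0.0],  # silence, last regrouping is held (assumption)
    ]
    print(track(frames))
```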
SPECTRUM NORMALIZATION
In reference to variations of resonances in the sound spectrum of a speaking voice, a group of resonances representing a specific phonetic sound will vary in different frequency regions of the sound spectrum. In all previously proposed systems for frequency normalization, the system must provide a highly complicated follow-up system to determine the mutually related frequency ratios between the arriving group of resonances. For example, assuming that the various resonances in the sound wave are harmonically related to the lowest (pitch) frequency, the values may be shown as $F_1 + f_0$, $F_2 + 2f_0$, $F_3 + 3f_0$, ..., $F_n + nf_0$, where $(f_0)$ is the varying frequency component, and $(F_1 + f_0)$ is the fixed reference pitch frequency component. Obviously, the artificial generation of the $(F)$ components necessary for obtaining the shown values would involve critical and undesired control systems. But one method of signal conversion may be substituted by another without changing the specific information to be analyzed. Thus, instead of changing the varying frequencies in the sound wave into fixed frequencies, we may first derive detected signals from the pass-band filters (for subdividing the sound into sub-bands), and regroup them in an arrangement of numerical order, such as 1, 2, 3, ..., n, which by simulation may assume the values $1 = F_1 + f_0$, $2 = F_2 + 2f_0$, $3 = F_3 + 3f_0$, and $n = F_n + nf_0$, where (1) represents the fixed reference numeral (fixed reference pitch frequency). By such numerical conversion (digital conversion, as far as frequency components are concerned, without changing the amplitude components), we may now deal with on-and-off conditions which can be established in the highest order of control and stability with the present day digital techniques. Accordingly, the novel switching system for such numerical conversion, as used in the present invention will now be described by way of the accompanying illustrations, wherein:
FIGS. 1 and 2 are block diagrams of the voiceprint identifying arrangements, according to the invention; FIG. 3 is a graphical diagram showing how different groups of signals are regrouped at the channel outputs, according to the invention; FIG. 4 shows the special center frequency sub-divisions of the pass-band filters, as used in the present invention; FIG. 5 is a numerical chart showing how the detected signal outputs of the pass-band filters are switched linearly to the outputs of the numerically arranged channels, in accordance with the invention; FIG. 6 is the switching arrangement used herein; and FIG. 7 is an amplitude equalization arrangement in accordance with the invention.
GRAPHICAL REPRESENTATION OF SIGNAL REGROUPING AT CHANNEL OUTPUTS
Referring to the graphical illustration of FIG. 3, assume a bank of channels, as represented by the blocks 1 through n at D, each of which has a plurality of inputs, as represented by the blocks at A, B and C, drawn under each channel. In this arrangement, assume that the second and third shaded blocks (at A) are at the second and fourth multiples of the first shaded block, respectively. Similarly, when the group of signals is distributed along the shaded inputs of the channels, as at B, it is also seen that the second and third shaded blocks are at the second and fourth multiples of the first shaded block, respectively. In a still further exemplary arrangement, when the input group of signals is distributed along the shaded inputs, as at C, it is further seen that the second and third shaded blocks are at the second and fourth multiples of the first shaded block, respectively. By such examples, accordingly, we may regroup any one of these three groups of input signals to a reference numerical region at the outputs of the numerically arranged channels, as at D, and we may derive the numerical ratios in a simpler mode than would be required by hunting the numerical locations in the blocks at A, B or C. At this point, however, the problem remains as to how to determine what combination of switching is required for each of the groups of signals at A, B or C, in order to obtain the reference signal regrouping at D. This is done simply by a prearranged matrix of a plurality of switching combinations, which are applied to the bank of channels sequentially until the signal derived from the lowest (pitch) frequency in the original voice is shifted to the first channel output for the required switching at D.
SPECIAL CENTER FREQUENCY PASS-BAND FILTERS
In reference to the illustration of FIG. 3, the information contained in complex waves, such as voiceprints, depends on greater detail than the simple harmonic relations shown, and it requires further sub-divisions between the harmonic intervals. Thus, in order to obtain high accuracy of signal regrouping without causing any cross switching of the input signals to the channels, the center frequencies of the sub-dividing pass-band filters are arranged as in FIG. 4, wherein the sub-divisions are similar to the standard musical scale. In this arrangement, it will be noted that the sub-divisions are arranged in a series of numerals at like intervals in harmonic successions corresponding to the numerals in preceding intervals. Thus, the numerical ratios of the second through the 13th numerals with respect to the first numeral are the same as the numerical ratios of the 14th through the 25th numerals with respect to the 13th numeral, as an exemplary demonstration. Such a numerical arrangement simplifies the actual channel switching arrangement, because it requires only linearly sequenced numerical transfer, without cross coupling of any of the pass-band outputs. This is shown in greater clarity by the numerical chart in FIG. 5.
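The ratio property described for FIG. 4 can be checked numerically. The following is a minimal sketch, assuming twelve equal-ratio sub-divisions per octave (the text says only that the sub-divisions are "similar to the standard musical scale"). The 100 Hz base frequency, the filter count of 25, and the function names are hypothetical and chosen for illustration only.

```python
# Sketch: center frequencies spaced like an equal-tempered musical scale.
# Assumptions (not from the patent): 12 equal-ratio steps per octave,
# a 100 Hz first filter, and 25 filters in total.
STEPS_PER_OCTAVE = 12
BASE_HZ = 100.0
NUM_FILTERS = 25


def center_frequency(number):
    """Center frequency in Hz of filter `number` (1-based)."""
    return BASE_HZ * 2.0 ** ((number - 1) / STEPS_PER_OCTAVE)


def ratios_to(reference, first, last):
    """Ratios of filters first..last (inclusive) to the reference filter."""
    ref = center_frequency(reference)
    return [center_frequency(n) / ref for n in range(first, last + 1)]


# Ratios of filters 2-13 to filter 1, and of filters 14-25 to filter 13.
lower = ratios_to(1, 2, 13)
upper = ratios_to(13, 14, 25)

# With equal-ratio spacing the two lists agree term by term, so a change
# of pitch moves the whole group by a fixed number of filter positions.
same_ratios = all(abs(a - b) < 1e-9 for a, b in zip(lower, upper))
```

Under this assumed spacing, a harmonic one octave above the pitch always sits 12 filter positions above it, which is why a plain linear shift of filter numbers is enough to regroup the signals.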
SIGNAL REGROUPING NUMERICAL CHART
The signal regrouping arrangement is shown in greater clarity by the numerical chart in FIG. 5, wherein the top row of numerals represents the channels, and the rows of numerals below represent the sequence of the numerals at the left hand of the sub-band frequencies of FIG. 4. For example, in the first row (FIG. 5) all of the detected outputs of the pass-band filters (starting from filter number-1) are applied to the inputs of the channels starting from channel number-1. In the second row, the detected filter outputs starting from the number-2 filter are applied to all of the channels starting from channel number-1, and so on. As stated in the foregoing, such simplicity of switching sequence is inherently accurate as long as the harmonic sequence at like intervals is observed in sub-dividing the sound spectrum; the number of sub-band divisions and the center frequencies of the pass-band filters may therefore be arranged other than as shown in FIG. 4 without affecting the required accuracy.
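The row-by-row rule of the chart can be written out as a short sketch. This is an illustration under assumptions, not the patent's circuit: the function names are hypothetical, and the treatment of channels that run past the last filter (marked `None` in the chart, filled with 0.0 in the signals) is a choice made here, since the text does not describe that edge.

```python
def regrouping_chart(num_filters, num_channels):
    """Build the chart: each row lists the filter numbers fed to
    channels 1..num_channels. The first row starts at filter 1, the
    second row at filter 2, and so on. A channel with no filter left
    to feed it is marked None (an assumption of this sketch).
    """
    chart = []
    for start in range(1, num_filters + 1):
        row = []
        for channel in range(1, num_channels + 1):
            filt = start + channel - 1
            row.append(filt if filt <= num_filters else None)
        chart.append(row)
    return chart


def regroup(detected, shift):
    """Apply one row of the chart to a list of detected filter outputs:
    channel j (0-based) receives filter j + shift, or 0.0 past the end.
    """
    n = len(detected)
    return [detected[j + shift] if j + shift < n else 0.0 for j in range(n)]


chart = regrouping_chart(num_filters=5, num_channels=4)
# chart[0] is [1, 2, 3, 4]; chart[1] is [2, 3, 4, 5]

shifted = regroup([0.0, 0.0, 0.7, 0.0, 0.3], shift=2)
# shifted is [0.7, 0.0, 0.3, 0.0, 0.0]
```

Each row differs from the previous one only by a one-position offset, so no filter output ever has to cross over another on its way to a channel.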
ANALOG CHANNEL SWITCHES
The arrangement of analog channel switches is shown in FIG. 6. The detected signals from the pass-band filters are applied to the drain electrodes of transistors Q1 through Q4, respectively. The source electrodes of these transistors are connected to ground in series with the output resistors R1, R2, R3, and Rn, respectively, which represent the channel outputs. In accordance with the numerical chart of FIG. 5, any one of the detected signals from f1 through fn (FIG. 1) may be coupled to the resistor R1 of channel-1. This is done by the additional transistors Q5, Q8, Q10, the source electrodes of which are all connected in parallel with the source electrode of Q1. Thus, when the first distributor pulse is applied to the gate electrode of Q1, the detected signal from f1 is admitted to the output resistor R1. When the second distributor pulse is applied to the gate electrode of Q5 (and also to those connected in parallel with it), the detected signal from f2 is admitted to the output resistor R1 of channel-1, and so on. Thus, as the output signals from the set-reset flip-flops in the R-S blocks 17-19 in FIG. 1 are applied to the parallel connected gate electrodes of the transistors in FIG. 6, the detected signals from the pass-band filters in FIG. 1 are regrouped across the channel output resistors shown, in accordance with the numerical chart of FIG. 5.
Having described the details of parts for the specific signal-conversion system utilized herein, the actual system for voiceprint identification will now be described by the block diagrams of FIGS. 1 and 2, as in the following:
SIGNAL CONVERSION SYSTEM
I have described novel signal conversion systems in my related patents, for example, U.S. Pat. Nos. 3,622,706 and 3,659,051, and reference may be made to these patents. For the specific purpose, however, and referring to FIG. 1, the voice sound wave in block 1 is applied to the pass-band filters in blocks 2-4 in parallel, and their outputs are first detected in blocks 5-7, respectively, and further applied to the signal-admitting inputs of the switching channels in block 24. The outputs of filters 2-4 are also applied to the sense-amplifiers in blocks 8-10, respectively, which are provided with set-reset flip-flop outputs, so that they produce "1" level output signals when their inputs receive useful signals from the filter outputs above a threshold level. The outputs of these sense-amplifiers are applied to the first inputs of gates 11-13, respectively, and the second inputs of these gates are excited by the sequential "1" level pulses of the pulse distributor in block 25, so that only those gates which have received simultaneous signals at their first inputs from the respective sense-amplifiers operate by the distribution pulses. Thus, assuming that the distributor starts applying sequential pulses at "1" level to the input of gate 11, and that this gate has also received a "1" level signal from the sense-amplifier 8, it applies an operating pulse to the one-shot 14, which in turn operates the set-reset flip-flop in block 17 into set state by way of a-c coupling (a-c and d-c couplings are available in integrated circuits). The set state output of flip-flop 17 is a-c coupled to the one-shot in block 20, which in turn operates and applies a "0" level pulse to the multi-input gate in block 23 for operation. The output "1" level pulse of gate 23 is inverted into a "0" level pulse, and applied in parallel through d-c coupling to the reset inputs of the set-reset flip-flops in blocks 17-19 for reset operating states.
At this point, however, while the flip-flops 17-19 are driven into reset operating states, the flip-flop in block 17 remains in set operating state by reason that the d-c coupled pulse at its set input has a longer pulse period than the d-c coupled pulse at its reset input, as illustrated by the pulse waveforms under the one-shots 14-16 and 20-22. Thus, the set output of flip-flop 17 is amplified in block 26 (this amplification might be necessary if the different types of integrated circuit devices available commercially, as utilized herein, are not found to be compatible one with another), and a selected signal-regrouping combination in the matrix of block 27 is applied to the channel switches in block 24 for the required signal regrouping at their outputs.
For continuous signal-regrouping operation, the distributor must be reset for a new start of signal regrouping. At the start of this new pulse distribution, however, the pass-band filter outputs may have changed, and accordingly, the flip-flop outputs of the sense-amplifiers 8-10 must also be reset simultaneously with the resetting of the distributor. This is done by mixing the output pulses of one-shots 14-16 in the multi-input gate 28, the output of which is inverted in block 29 and applied (through d-c coupling) to the distributor 25 and sense-amplifiers 8-10 for reset operation, after which period the distributor starts hunting the output signals of the sense-amplifiers. In this new start, assume that the output of filter 2 is zero; the output of sense-amplifier 8 is then also at "0" level, and accordingly, the first distribution pulse does not operate the gate 11. If, however, the output of sense-amplifier 9 is at "1" level, the gate 12 operates at the arrival of the second distribution pulse, and according to the previously described mode of operation, the flip-flop 17 is driven into reset operation, and the flip-flop 18 is driven into set operation for a new signal regrouping at the channel outputs of block 24. The pulse frequency of the pulse generator in block 30 is not critical, and may be adjusted to any useful frequency as desired. The gate 31 may be used to gate out the output pulses of generator 30 during resetting of the distributor.
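The hunting sequence described above (sense-amplifiers, distributor pulses, and selection of a regrouping combination) can be modeled functionally. The sketch below is a simplification under stated assumptions: it replaces the one-shots, flip-flops and gates with a simple scan, the threshold value and function names are hypothetical, and it feeds the same list to both the sense-amplifiers and the channel switches, whereas the hardware taps the filter outputs before and after detection.

```python
def hunt_pitch(filter_levels, threshold):
    """Return the 0-based index of the lowest filter whose output exceeds
    the threshold, or None if no filter is active.

    This mimics the distributor stepping through the sense-amplifier
    outputs and stopping at the first gate that sees a "1" level.
    """
    sense = [level > threshold for level in filter_levels]  # sense-amplifiers
    for step, active in enumerate(sense):                   # distributor pulses
        if active:
            return step
    return None


def regroup_frame(detected, filter_levels, threshold=0.1):
    """Shift the detected signals (a list) so that the lowest active
    (pitch) component lands on channel 1.

    Returns (shift, channels), or (None, None) when nothing exceeds the
    threshold. The returned shift plays the role of the flip-flop
    outputs 17-19, which record where the signals originally sat.
    """
    shift = hunt_pitch(filter_levels, threshold)
    if shift is None:
        return None, None
    channels = detected[shift:] + [0.0] * shift
    return shift, channels


# Filter 1 is silent, filter 2 carries the pitch component.
frame = [0.0, 0.8, 0.0, 0.5]
shift, channels = regroup_frame(frame, frame)
# shift is 1 and channels is [0.8, 0.0, 0.5, 0.0]
```

Calling `regroup_frame` once per pitch period corresponds to resetting the distributor and hunting again, so a change of pitch simply yields a different shift on the next call.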
MATCHING OF VOICEPRINTS WITH THAT OF PRERECORDED VOICEPRINT SIGNALS
As has been stated in the foregoing, all information bearing signals that are prone to variations must nevertheless contain the component signals that carry the specific information throughout any given variation. Thus, a specific group of component signals within a larger group of component signals must represent the specific information. By the signal-regrouping arrangement described in the foregoing, it is only necessary to select a significant group of signals from the regrouped signals at the channel outputs of block 24 in FIG. 1, and match it with a prerecorded group of specific signals for identification of the information sought. This may be accomplished by the matching arrangement of FIG. 2.
In FIG. 2, the output signals of channel-1 through channel-n are differenced against the signals which are prerecorded in the a, a, a, a; b, b, b, b; c, c, c, c blocks, in the operational amplifiers 36-39. These signal-differencing circuits at the channel outputs are similar, and therefore, description of the circuitry in reference to one of the channels is typical of the rest of the channels. Thus, assuming that the two input signals of the operational amplifier in block 36 are of equal amplitudes, the output is zero, or at least below a threshold level. On the other hand, when these two input signals are of unequal amplitudes, the output of block 36 is above zero, and its polarity depends on which of the two input signals is larger than the other. Because of this undetermined output polarity, the output of amplifier 36 is passed through a push-pull output amplifier in block 40, and applied to a sense-amplifier in block 44, which is provided with a one-shot output. Thus, when the two input signals of block 36 are of equal amplitudes, the output of sense-amplifier 44 is zero, and when these two input signals are of unequal amplitudes, the sense-amplifier 44 produces a one-shot pulse output. The output of sense-amplifier 44 is applied to one of the inputs of gate 49 in series with the normally idle switching transistor Q11. In a signal matching practice, we may assume that when the two matching signals are of equal amplitudes, it represents the correct information sought, and it is false when the two signals differ in amplitude. As an exemplary condition, accordingly, when the transistor Q11 is switched ON, and the output of sense-amplifier 44 is zero, the information fed to the gate 49 is correct, whereas, when the transistor Q11 is switched ON, and the sense-amplifier 44 produces an output pulse, the information fed to the gate 49 is false. By combining the functional operations of different matching channels, accordingly, we may obtain the following:
Assume that the a, a, a, a signals of the prerecorded signals are to be matched with the arriving signals from the channel outputs, and that the prerecorded signals are significant only in channel-1, channel-2, and channel-n, the blank block of channel-3 indicating that there will be no matching of a signal. The a blocks (having matching signals present) apply ON signals to the gate electrodes of their respective switching transistors Q11, Q12, and Q14, for admitting the respective source-terminal outputs to the inputs of multi-input gate 49, as shown. The a blocks of these channels also apply the matching signals (analog signals) to the first inputs of AND gates 50, 51 and 53. Now, all operating conditions being ready, the pulse generator in block 54 applies a pulse to all of the second inputs of the AND gates 50-53, and to all of the switches (not shown, in order to avoid crowding of the drawing) between the amplitude equalizers in blocks 32-35 and the operational amplifiers 36-39, respectively, for operation. Thus, assuming that the signal matching of the arriving signals is correct in the channel outputs (channel-1, channel-2, channel-n), the switched-ON outputs of Q11, Q12 and Q14, and also the pulse from the generator 54, will be at "1" level at the inputs of gate 49, and therefore, the output of this gate will be at "0" level, as a signal representing the correct identification of the individual in question. On the other hand, if one of the input signals to the gate 49 is other than "1" level, the output of this gate will not be at "0" level, indicating a false signal. But a decision on matching of only the a signals may not be sufficient, and therefore, the b signals may also be tried for matching, and an average obtained prior to a final decision of correctness.
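The matching decision for one prerecorded set can be sketched in software. This is a functional illustration under assumptions, not the gate-level circuit of FIG. 2: the threshold value and function names are hypothetical, the absolute value stands in for the push-pull stage, and the result is returned as a plain boolean rather than as the "0" level output of gate 49.

```python
def channel_mismatch(arriving, stored, threshold):
    """True when the two amplitudes differ by more than the threshold.

    The absolute value stands in for the push-pull stage, which removes
    the dependence on which of the two inputs is larger.
    """
    return abs(arriving - stored) > threshold


def matches_template(channels, template, threshold=0.05):
    """Compare regrouped, equalized channel outputs with one prerecorded set.

    `template` holds one amplitude per channel, or None for a blank
    block (a channel that takes no part in the match).
    """
    for arriving, stored in zip(channels, template):
        if stored is None:
            continue  # blank block: channel is not significant
        if channel_mismatch(arriving, stored, threshold):
            return False
    return True


# Prerecorded "a" set: significant in channels 1, 2 and 4; channel 3 blank.
template_a = [1.0, 0.4, None, 0.2]

accepted = matches_template([1.0, 0.42, 0.9, 0.21], template_a)
rejected = matches_template([1.0, 0.70, 0.9, 0.21], template_a)
# accepted is True, rejected is False
```

To follow the suggestion of trying the b signals as well, the same function could be called with a second template and the outcomes combined before a final decision; how to combine them is left open in the text.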
In some cases it may be desired that, instead of ON and OFF decisions made in reliance on the output of gate 49, a fine gradient of combined signals be further analyzed, and these signals are readily obtainable at the outputs of operational amplifiers 36-39. Similarly, since the channel-output signals are regrouped, it may be necessary to know in what frequency range of the sound spectrum the original voice sound originated. This information is also readily provided by the outputs of flip-flops 17-19 in FIG. 1, as indicated by the arrowed output terminals.
Since the main object of the matching system described herein is a modification (signal regrouping) of the signals prior to matching, it is obvious that the prerecorded signal must be modified in the same manner, as described by way of the arrangement in FIG. 1. Besides signal regrouping, however, there are other variables that must be modified, for example, the amplitude variation that occurs during each segment of the information to be analyzed. In the case that the arriving signals are first regrouped and recorded for matching with the prerecorded signals, the amplitudes of both recorded signals may be adjusted for the proper matching process. But in this case also, the original voice signal varies in amplitude uncontrollably from time to time, even when the speaker tries to duplicate a previous articulation, especially when the sequence of amplitude modulations in a whole spoken word is to be considered for matching. For this reason, it is preferable to include an automatic amplitude control of the channel output signals.
The automatic amplitude control is achieved by the equalization amplifiers in blocks 32-35, but the equalization is controlled only by the output signal of channel-1, as reference amplitude for matching. For example, as shown in the drawing of FIG. 2, the amplitude equalization is established by the peak signal in channel-1, as a feedback control to a reference level. This may be explained in more clarity by the simple equalization arrangement of FIG. 7, as in the following:
AMPLITUDE EQUALIZER
In FIG. 7, the channel output is applied to the resistor R4 in series with the variable-resistance transistor Q15, so that the transfer from channel-1 to R4 may be voltage divided by change of resistance of the transistor Q15. The gate electrode of Q15 is normally forward biased by B to the lowest reference resistance between the drain and source electrodes of Q15, and this forward bias is variably reduced by an opposing voltage stored in the capacitor C through a feed-back switching path. The information to be analyzed occurs between the peaks of the pitch wave in the voice sound, and this pitch wave is represented by the output wave of the first channel. Pitch selection has been used previously for various purposes, and different circuit arrangements already exist which may be utilized for deriving pitch pulses either directly from the original voice sound wave, or from the channel-1 output of the arrangement shown in FIG. 1. Thus, the control voltage of capacitor C is changed during the short pitch pulses by operation of the normally OFF switches in blocks 56 and 57 for negative feed-back. This feed-back is controlled by the zener diode D1, which becomes conductive in the reverse direction only when the voltage across RL is above a reference voltage.
Accordingly, when a pitch pulse operates the switches in blocks 56 and 57, and assuming that the amplified voltage across load resistor RL is above the zener regulation voltage, the capacitor C is charged in such polarity as to increase the internal resistance of Q15 to a point where the voltage across C is equal to the zener regulation voltage. When the switches 56 and 57 return to their OFF states, the channel output signal, as transferred to R4, has been regulated to a reference level, and this reference level has also been established in the rest of the channel outputs by the parallel connection of the capacitor C to each one of the amplitude equalizers that are used at the other channel outputs. In controlling the signal amplitude across R4, and assuming that the voltage across C is in the forward direction during a pitch pulse period, the capacitor C is discharged until the reference voltage regulation has been established. The use of the switch in block 57 may be preferable when the amplifier 55 is a capacitively coupled amplifier, but actually the choice is irrelevant herein.
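The net effect of the equalizer, with channel-1 setting one common correction for every channel, can be expressed as a short functional sketch. It does not model the feedback loop, the capacitor or the zener diode; the function name and the reference level of 1.0 are hypothetical. Capping the gain at 1.0 is an assumption made here, on the basis that the described divider acts when the signal is above the reference voltage.

```python
def equalize(channels, reference_level=1.0):
    """Scale all channels by the gain that brings channel 1 to the reference.

    Assumes channels[0] holds the channel-1 peak for the current pitch
    period. The same gain is applied to every channel, so the amplitude
    ratios between channels are preserved. The gain is capped at 1.0,
    which is an assumption of this sketch (attenuation only).
    """
    peak = abs(channels[0])
    if peak == 0.0:
        return list(channels)  # nothing to normalize against
    gain = min(1.0, reference_level / peak)
    return [value * gain for value in channels]


loud = equalize([2.0, 1.0, 0.5])
# loud is [1.0, 0.5, 0.25]: channel 1 is brought down to the reference
# and the other channels keep their proportions to it.

quiet = equalize([0.5, 0.25])
# quiet is [0.5, 0.25]: already below the reference, so left unchanged.
```

Because one gain is shared by all channels, two utterances of the same sound at different loudness give the same channel pattern after equalization, at least while channel 1 is at or above the reference.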
In reference to commercially available AND gates, they are generally understood to be on-and-off devices operated by two input signals at "1" levels, whereas the AND gates 50-53 are actually analog switches. But the meaning of the AND gate is broad, and some analog switches are now appearing in the name of AND gates with some modified terminology attached. Accordingly, the term AND gate used herein does not necessarily mean a gate having two on-and-off input signals; one of the signals may vary in magnitude, and the other may be used as an on-and-off switching signal. Gates of this type are used between the amplitude equalizers 32-35 and the operational amplifiers 36-39, which are not shown in the drawing, but their requirement is referenced in writing by the pulses of the pulse generator 54 applied to them during the signal matching performance.
When the system described herein is used for the purpose of selecting a particular individual from a group of speakers for some required performance, the system may be incorporated in conjunction with my phonetic sound recognition systems, for example, as described in my U.S. Pat. No. 3,622,706, so that during phonetic sound recognition an output switch is operated only when voiceprint recognition occurs simultaneously. In view of the broad usefulness of the system disclosed herein, accordingly, it becomes obvious that the specific arrangements described herein are exemplary, and various modifications, adaptations and substitutions of parts may be made without departing from the true spirit and scope of the invention.
What I claim, is:
1. In a voice identification system wherein the voice signals of a speaking individual to be identified are compared with prerecorded voice signals of the same individual, but wherein each group of resonances in the sound spectrum representing a segment of said voices vary within the pitch intervals of the speaking and prerecorded voices respective of each other, and in conjunction with variations in resonance-amplitude from pitch to pitch periods, the system for normalizing said variations during said comparison, comprising first and second identical normalizing systems for the said speaking and prerecorded voice signals, the first system comprising a plurality of band-pass filters for sub-dividing the resonances of the speaking voice sound waves; a plurality of detecting means for deriving detected signals from said filters; means for deriving a control signal (outputs of R-S blocks 17-19) from said detected signals which represent a segment of the speaker's speech during a pitch time interval, and said control signal having a time period equal to said time interval; a matrix comprising a plurality of signal regrouping combinations, each combination having been prearranged for regrouping said detected signals in a pitch time interval to a reference numerical region in a numerically arranged plurality of channels, in an order that, said control signal in the group of said detected signals is made to appear at a reference channel in said reference numerical region, and each one of said combinations having been prearranged for operation by said control signal, means for deriving pitch pulses from said speaking voice; a plurality of signal-amplitude equalizers connected to said plurality of channels, a plurality of parallel coupling means connected from said reference channel to said amplitude equalizers; coupling means from said pitch pulses to the amplitude equalizer of said reference channel for equalizing said control signal to a reference amplitude during the applied pulse period, whereby the signal amplitudes of the rest of the channels are controlled proportionally by said reference channel, and thereby establishing spectral and amplitude normalized detected signals at said plurality of channels, representing said first system; and means for comparing said signal normalized channel signals of said first and second systems and deriving matching signals, which may further be combined with said control signals of said first and second systems for deriving a final decision signal that may be used to recognize the individual.
2. The system as set forth in claim 1, wherein said normalized signals from said plurality of channels of said first and second systems are applied to first and second inputs of a plurality of operational amplifiers, for obtaining said comparison signals; a normally inoperative mixer means for mixing said comparison signals; and a pulse generator of a predetermined frequency for operating said mixer means, so that discrete signals during said pulses may be obtained for interpretation and recognition of said individual.
3. The system as set forth in claim 1, wherein the center frequency responses of said plurality of band-pass filters in said sound spectrum are arranged in a series of frequencies within harmonic intervals, each frequency of the series of frequencies in an interval corresponding to the harmonic of a frequency in the preceding interval; and said matrix comprises a plurality of channels arranged in linearly sequenced numerical order, each channel output representing the output of at least one or more AND gates, the number of AND gates in each channel being equal to the number of the numerical location of that channel in said numerical order; first parallel connections of the numerically corresponding first outputs of said gates in the plurality of channels; coupling means from said detected signals to said first parallel connected inputs, whereby each signal in a group may be coupled to any one of the first inputs of a number of channels for channel output selection during said control signal interval, second parallel connections of the second inputs in said AND gates in an arrangement that the second inputs of the AND gates in said channels starting from the channel located in the lowest numerical position in said numerical order are connected to successively advancing ones from the corresponding second inputs of the gates in the succeeding channels; and coupling means from said control signal to said second parallel connections for establishing said signal regrouping.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US24874072A | 1972-04-28 | 1972-04-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
US3770891A (en) | 1973-11-06 |
Family
ID=22940470
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US00248740A Expired - Lifetime US3770891A (en) | 1972-04-28 | 1972-04-28 | Voice identification system with normalization for both the stored and the input voice signals |
Country Status (1)
Country | Link |
---|---|
US (1) | US3770891A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3431359A (en) * | 1965-09-17 | 1969-03-04 | Meguer V Kalfaian | Amplitude equalizer of speech sound waves with high fidelity |
US3466394A (en) * | 1966-05-02 | 1969-09-09 | Ibm | Voice verification system |
US3439122A (en) * | 1966-06-15 | 1969-04-15 | Bell Telephone Labor Inc | Speech analysis system |
US3546584A (en) * | 1966-11-30 | 1970-12-08 | Standard Telephones Cables Ltd | Apparatus for analyzing a complex waveform containing pitch synchronous information |
US3509280A (en) * | 1968-11-01 | 1970-04-28 | Itt | Adaptive speech pattern recognition system |
US3525811A (en) * | 1968-12-26 | 1970-08-25 | Fred C Trice | Remote control voting system |
US3622706A (en) * | 1969-04-29 | 1971-11-23 | Meguer Kalfaian | Phonetic sound recognition apparatus for all voices |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4069393A (en) * | 1972-09-21 | 1978-01-17 | Threshold Technology, Inc. | Word recognition apparatus and method |
EP0059650A2 (en) * | 1981-03-04 | 1982-09-08 | Nec Corporation | Speech processing system |
EP0059650A3 (en) * | 1981-03-04 | 1983-11-16 | Nec Corporation | Speech processing system |
US4363102A (en) * | 1981-03-27 | 1982-12-07 | Bell Telephone Laboratories, Incorporated | Speaker identification system using word recognition templates |
US4926488A (en) * | 1987-07-09 | 1990-05-15 | International Business Machines Corporation | Normalization of speech by adaptive labelling |
US6505154B1 (en) * | 1999-02-13 | 2003-01-07 | Primasoft Gmbh | Method and device for comparing acoustic input signals fed into an input device with acoustic reference signals stored in a memory |
US20110112838A1 (en) * | 2009-11-10 | 2011-05-12 | Research In Motion Limited | System and method for low overhead voice authentication |
US20110112830A1 (en) * | 2009-11-10 | 2011-05-12 | Research In Motion Limited | System and method for low overhead voice authentication |
US8321209B2 (en) * | 2009-11-10 | 2012-11-27 | Research In Motion Limited | System and method for low overhead frequency domain voice authentication |
US8326625B2 (en) * | 2009-11-10 | 2012-12-04 | Research In Motion Limited | System and method for low overhead time domain voice authentication |
US8510104B2 (en) * | 2009-11-10 | 2013-08-13 | Research In Motion Limited | System and method for low overhead frequency domain voice authentication |
US20140095161A1 (en) * | 2012-09-28 | 2014-04-03 | At&T Intellectual Property I, L.P. | System and method for channel equalization using characteristics of an unknown signal |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Kapka et al. | Sound source detection, localization and classification using consecutive ensemble of CRNN models | |
CN1653519B (en) | Method for robust voice recognition by analyzing redundant features of source signal | |
Taniguchi et al. | An auxiliary-function approach to online independent vector analysis for real-time blind source separation | |
US3770891A (en) | Voice identification system with normalization for both the stored and the input voice signals | |
US3509280A (en) | Adaptive speech pattern recognition system | |
FR2274101A1 (en) | VOICE RECOGNITION PROCESS AND DEVICE IMPLEMENTING THIS PROCESS | |
Saruwatari et al. | Blind source separation for speech based on fast-convergence algorithm with ICA and beamforming. | |
GB831741A (en) | Method and apparatus for analysing the spatial distribution of a variable quantity or function | |
Venkataramani et al. | Neural network alternatives to convolutive audio models for source separation | |
US3037077A (en) | Speech-to-digital converter | |
EP3182339A1 (en) | Reservoir computing device | |
Imoto | Acoustic scene classification using multichannel observation with partially missing channels | |
GB2182795A (en) | Speech analysis | |
Bohlender et al. | Improved deep speaker localization and tracking: Revised training paradigm and controlled latency | |
US3919481A (en) | Phonetic sound recognizer | |
US3619509A (en) | Broad slope determining network | |
US3870817A (en) | Phonetic sound recognizer for all voices | |
Briegleb et al. | Localizing spatial information in neural spatiospectral filters | |
US3067288A (en) | Phonetic typewriter of speech | |
Venkataramani et al. | End-to-end non-negative autoencoders for sound source separation | |
US3659051A (en) | Complex wave analyzing system | |
Haghighatshoar et al. | Low-power SNN-based audio source localisation using a Hilbert Transform spike encoding scheme | |
US3678201A (en) | Bandwidth compression system in phonetic sound spectrum | |
Kim et al. | Activity-Informed Industrial Audio Anomaly Detection Via Source Separation | |
US2892892A (en) | Vocoder absorption modulation system |