WO2000077772A2 - Speech and voice signal preprocessing


Info

Publication number
WO2000077772A2
WO2000077772A2 (PCT/GB2000/002332)
Authority
WO
WIPO (PCT)
Prior art keywords
signal
voice
signals
filtered
linearized
Prior art date
Application number
PCT/GB2000/002332
Other languages
French (fr)
Other versions
WO2000077772A3 (en
Inventor
Ronald Chalmers
Mark Christopher Simpson
Steven Leslie Pae
Original Assignee
Cyber Technology (Iom) Limited
Priority date
Filing date
Publication date
Application filed by Cyber Technology (Iom) Limited filed Critical Cyber Technology (Iom) Limited
Priority to GB0200735A priority Critical patent/GB2367938A/en
Priority to AU55471/00A priority patent/AU5547100A/en
Publication of WO2000077772A2 publication Critical patent/WO2000077772A2/en
Publication of WO2000077772A3 publication Critical patent/WO2000077772A3/en

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/02 - Feature extraction for speech recognition; Selection of recognition unit


Abstract

In a system or method of voice or speech recognition, a voice waveform signal modeled as the product of a power component and an informational component is divided into higher and lower frequency signals, corresponding to the information signal and the power signal. The signals are amplified separately and then combined. By applying higher amplification to the information signal, a more detailed sample of the initial waveform can be provided to voice recognition or word recognition apparatus.

Description

SPEECH AND VOICE SIGNAL PROCESSING
FIELD OF THE INVENTION
This invention relates to a system and method for processing signals derived from human speech, and relates especially to applications in voice and speech recognition systems.
BACKGROUND OF THE INVENTION
Differences between individuals in anatomy, language, and background, amongst other factors, mean that people sound different from one another. The human brain possesses an extraordinary power of voice recognition which enables one to recognize a known speaker, e.g. over the telephone, without explicit identification by any other means. On the other hand, electronic voice recognition systems are becoming of increasing importance in a number of areas. In numerous security applications it is useful to be able to recognize a voice and to distinguish it from other voices. Such applications include, for example, a security access system that determines whether speech from a person requesting access to a secure area is that of an authorized person or a would-be intruder.
Electronic speech recognition is a developing technique, and one which, because of the advent of cheap and substantial computer processing power, is gaining widespread currency. Computer programs are already available which can be trained to convert spoken words into text on screen, vastly facilitating the production of written text. Other software features permit commands to be entered to programs by speech which would otherwise require the use of input devices such as a keyboard or a pointing device such as a mouse. In another area, computer-based information systems, for example for air flight schedules and fares, can be made entirely automatic if they are able to recognize questions asked by a telephone inquirer and then, using a voice synthesis system, produce the appropriate answer.
In known techniques of both voice and speech recognition, a voice waveform, i.e. an electrical signal derived from human speech, is sampled, and a matrix of power levels is generated therefrom either by time or by frequency. The resultant pattern is stored and then compared with an existing sample. For example, in voice recognition, the comparison enables a decision as to whether the same speaker provided both samples, to enable speaker verification. The percentage of points in the stored and new samples which must match before the samples are considered to originate from the same person can be varied so as to provide a varying level of certainty, and therefore of security, depending on the circumstances.
In speech recognition, the stored and new samples are compared to determine if different speakers are speaking the same word. Generally, the percentage of points in the pattern which must match is set lower than for voice recognition. Other versions of speech recognition software require speaker-specific training where samples of a particular speaker are initially accumulated. New samples of the same speaker are compared to these stored samples in order to determine the precise word spoken.
The usual approach taken in the prior art is to process a sample of the speech signal in the time domain. This sampling produces a 2-dimensional matrix of the speech signal which is then made subject to further treatment. Such treatment includes preliminary acoustic spectral analysis based on linear predictive coding, Mel frequency cepstral coefficients, cochlea modelling and others. One disadvantage of this type of approach concerns a fact known to the person skilled in the art, that most of the information contained in a speech signal resides in the higher end of the frequency spectrum. By not discriminating between regions of the frequency spectrum of the speech signal, over-emphasis is placed on the less important portion, with concomitant under-emphasis on the more important high frequency component. What results is a less accurate output than one which weights the spectral regions appropriately.
Existing patents on speech recognition include the following: US patent 5,864,804; US patent 6,067,513; US patent 5,839,099; US patent 5,313,531; US patent 5,960,395; and Japanese patent 5143098A2.
SUMMARY OF THE INVENTION
This invention includes a method for processing a voice signal comprising filtering the components of the voice signal modeled as the product of a power component and an informational component to derive an amplified signal for voice or speech recognition.
In a variation of the invention, the step of filtering the components comprises linearizing the components of the voice signal; duplicating the linearized signal to produce a first linearized signal and a second linearized signal; passing the first linearized signals through a low pass filter producing a first filtered signal and passing the second linearized signals through a high pass filter producing a second filtered signal; and amplifying the filtered signals.
In a further variation of the invention, the filtered signals are amplified differentially. Optionally, the first filtered signal is amplified at a lower gain level than the second filtered signal.
In a variation of the invention the low pass filter and the high pass filter include approximately non-overlapping passbands whereby the higher and the lower spectral components of the linearized signal are separated.
In another variation of the invention, the step of linearizing the components of the voice signal comprises determining the logarithm of the voice signal; and the method further comprises combining the filtered signals and determining the antilogarithm of the combined signals.
The invention also includes the variation where the step of linearizing the components of the voice signal comprises determining the logarithm of the voice signal; and further comprises determining the antilogarithm of the filtered signals.
According to another aspect of the invention, the method further comprises applying an additional voice or speech recognition method to the processed signal. The further voice or speech recognition optionally comprises processing with a hidden Markov model.
In another embodiment of the invention, the method for generating a plurality of processed signals from a voice signal for input to voice or speech recognition further comprises duplicating the voice signal, and analyzing the duplicated voice signal with frequency response analysis. The frequency response analysis optionally involves processing based on a hidden Markov model. In a variation of the invention, the method for voice or speech recognition further comprises applying a voice or speech recognition process to the plurality of signals generated.
Another embodiment of the invention relates to a method for identifying an individual making a fraudulent application to gain unauthorized entry into a voice-activated secured entry system in combination with a database comprising vectors of personal information and voice information of all authorized applicants, which comprises determining whether the application is a fraudulent or a non-fraudulent attempt to gain entry to the system, and recording the applicant's application voice information in the database if the application is determined to be fraudulent. In a variation, the invention further comprises cross-checking the voice information of an application determined to be non-fraudulent with voice information recorded in the database of previous fraudulent applications. Optionally, the cross-checking of the non-fraudulent application voice information against fraudulent application voice information occurs subsequent to the application process.
One embodiment of the invention relates to a method for voice or speech recognition comprising linearizing the components of a voice signal by determining the logarithm of the voice signal modeled as the product of a power component and an informational component; duplicating the linearized signal producing a first linearized signal and a second linearized signal; passing the first linearized signal through a low pass filter producing a first filtered signal and passing the second linearized signal through a high pass filter producing a second filtered signal; amplifying both filtered signals, the first filtered signal amplified at a higher gain than the second filtered signal; combining the filtered signals; determining the antilogarithm of the combined signals; and applying voice or speech recognition to the determined signal.
In a variation of the invention, the passbands of the high and low pass filters subdivide the frequency spectrum at a breakpoint frequency to be preferably between 200 and 400 Hertz.
A further variation involves the breakpoint frequency set preferably at approximately 300 Hertz.
According to another aspect of the invention, the passbands of the high and low pass filters sub-divide the frequency spectrum at a breakpoint frequency determined by ascertaining the context of the application and then generating the breakpoint frequency as a function of the context. The context comprises at least one of the personal characteristics of the speaker selected from the group comprising the gender, age, and language of the speaker.
A variation of the invention involves processing of the voice signal by a computer.
Another embodiment of the invention relates to a system for processing a voice signal comprising a device for filtering the components of the voice signal modeled as the product of a power component and an informational component, and a further device for deriving an amplified signal for voice or speech recognition.
The invention also includes the variation where the system for filtering the components comprises a device for linearizing the components of the voice signal; a device for duplicating the linearized signal to produce a first linearized signal and a second linearized signal; a device for passing the first linearized signals through a low pass filter producing a first filtered signal and a device for passing the second linearized signals through a high pass filter producing a second filtered signal; and a device for amplifying the filtered signals.
In another variation, the filtered signals are amplified differentially.
Another variation of the above involves amplifying the first filtered signal at a lower gain level than the second filtered signal.
Another embodiment of the invention further comprises means for combining the filtered signals.
In another variation of the invention, the low pass filter and the high pass filter have approximately non-overlapping passbands whereby the higher and the lower spectral components of the linearized signal are separated.
According to another aspect of the invention the means for linearizing the components of the voice signal comprises means for determining the logarithm of the voice signal, and further comprises means for combining the filtered signals, and means for determining the antilogarithm of the combined signals.
In another variation, the means for linearizing the components of the voice signal comprises means for determining the logarithm of the voice signal, and further comprises means for determining the antilogarithm of the filtered signals. A further variation further comprises means for applying voice or speech recognition to the processed signal. The means for further voice or speech recognition optionally comprises means for processing with a hidden Markov model.
Another aspect of the invention involves generating a plurality of processed signals from a voice signal for input to means for voice or speech recognition, and comprises means for duplicating the voice signal and means for analyzing the duplicated voice signal with frequency response analysis. The frequency response analysis optionally comprises processing with a hidden Markov model.
Another embodiment of the invention comprises means for applying a voice or speech recognition process to the plurality of signals.
Another embodiment of the invention relates to a system to identify an individual making a fraudulent application to gain unauthorized entry into a voice-activated secured entry system in combination with a database comprising vectors of personal information and voice information of all authorized applicants, which comprises means for determining whether the application is a fraudulent or a non-fraudulent attempt to gain entry to the system, and means for recording the applicant's application voice information in the database if the application is determined to be fraudulent.
Another variation comprises means for cross-checking the voice information of an application determined to be non-fraudulent with voice information recorded in the database of previous fraudulent applications. Optionally, the means for cross-checking the non-fraudulent application voice information against fraudulent application voice information performs the cross-checking as a separate process from that carried out by the means for determining whether an application is fraudulent or non-fraudulent.
The invention also includes the embodiment where a system for voice or speech recognition comprises the following: input means for obtaining a voice signal; means for linearizing the components of the voice signal by determining the logarithm of the voice signal modeled as the product of a power component and an informational component; means for duplicating the linearized signal producing a first linearized signal and a second linearized signal; means for passing the first linearized signals through a low pass filter having a lowpass passband producing a first filtered signal and means for passing the second linearized signals through a high pass filter having a highpass passband producing a second filtered signal; means for amplifying both filtered signals, the first filtered signal amplified at a higher gain than the second filtered signal; means for combining the filtered signals; means for determining the antilogarithm of the combined signals; and means for applying voice or speech recognition to the determined signal.
The invention includes the variation where the passbands of the high and low pass filters sub-divide the frequency spectrum. Another variation involves the breakpoint frequency lying preferably in the range of the frequency spectrum between 200 to 400 Hertz. In another further variation, the breakpoint frequency is preferably 300 Hertz.
According to another aspect of the invention, the passbands of the high and low pass filters sub-divide the frequency spectrum at a breakpoint frequency determined by means for ascertaining the context of the application, and means for generating the breakpoint frequency as a function of the context. The context optionally comprises at least one of the personal characteristics of the speaker selected from the group comprising the gender, age, and language of the speaker.
Another embodiment of the invention relates to a system for providing a signal for voice or speech recognition comprising microphone means to generate a multi-frequency electrical signal from the voice signal; a logarithmic amplifier to receive the electrical signal; a high pass filter and a low pass filter connected to the logarithmic amplifier; first and second amplifiers electrically connected one to each filter; and an exponential amplifier connected to the first and second amplifiers.
In another variation, the system further comprises means for speech or voice recognition connected to the output of the exponential amplifier.
Another aspect of the invention involves a computer being used as the means for speech or voice recognition.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will be described by way of example and with reference to the drawing in which: FIG. 1 is a block diagram of the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
It is an object of the invention to provide a system and method to improve the effectiveness of both voice and speech recognition systems by pre-treating the voice signal at the initial stage of processing an audio signal.
According to one embodiment of the invention, a method of voice or speech recognition comprises generating a voice waveform signal, dividing the signal into a higher frequency signal and a lower frequency signal, amplifying the higher and lower frequency signals separately, combining the amplified signals, and applying a voice or speech recognition method to the combined amplified signals.
Preferably the signals are divided at a frequency such that the higher frequency signal relates to the information content of the voice waveform and the lower frequency signal relates to the power content of the voice waveform.
The approach used in the present invention thus provides a novel method of processing a speech sample. The approach is to break down the speech into two components and then subject them to separate treatment. The lower frequency spectrum generally relates to the power or volume of the spoken word while the higher frequency components can be considered to contain the bulk of the information content of the sample, including the inflection of the spoken word. After separation of these components, they can be processed differently to maximize the efficiency of the speech or voice recognition system. This method enables much more information about the speech sample to be generated.
Compared with the prior art approach, the method of the present invention may be thought of as generating a three-dimensional picture of the voice sample, rather than a two-dimensional one. By utilizing more processor power on the higher frequency elements of the signal, the system can be much more precise and vastly faster than conventional systems. In terms of the number of points selected for comparison purposes, the prior art two-dimensional approach normally employed may look at, for example, four thousand points, whereas the three-dimensional approach (using the same sampling frequency) can be set to identify and operate on many times that number of samples. Because the samples in the approach used in the method of the present invention can be selected in a biased fashion, due to the separation process, the downstream processing for voice or speech recognition can make use of many more points of reference, for example ten or more times as many, leading to greater accuracy and thus a much diminished risk of false voice recognition, or much higher accuracy of reflection of the words actually spoken in the case of speech or word recognition applications.
Using the approach in accordance with the present invention, the standard technique of normalizing the sample power using automatic gain control circuitry can be dispensed with. By processing the lower frequency component separately, the power level of the total sample is more accurately known and hence easier to control or manipulate. The above approach is far superior to simply amplifying or attenuating the entire signal, as that produces no additional information. This means that the signal can be adjusted before processing by the rest of the voice recognition processors in a more refined way that gives rise to a higher acceptance rate of the voice.
A voice signal V can be modeled as a combination of a volume or power signal P and an information signal I where
V = P I
By taking the logarithm of the signals one arrives at the simple equation:
Log V = Log P + Log I
thus providing an approach in which the signals are linearly separated but related by this simple combination. After such a separation, each signal can be processed in accordance with requirements, in contrast to prior art arrangements in which the entire voice signal V is amplified and processed.
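By way of illustration only, the multiplicative model above can be checked numerically in a few lines of Python; the sampling rate, component waveforms and library choice are assumptions made for this sketch and are not taken from the patent.

```python
# Minimal numerical sketch of the model V = P * I, using synthetic stand-ins
# for the power envelope P and the information signal I (illustrative values).
import numpy as np

fs = 8000                       # assumed sampling rate in Hz
t = np.arange(fs) / fs          # one second of samples

# Slowly varying "power/volume" component and a faster "information" component,
# both kept strictly positive so the logarithm is defined.
P = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)
I = 1.0 + 0.3 * np.sin(2 * np.pi * 1200 * t)

V = P * I                       # the modeled voice signal

# Taking logarithms turns the product into a sum, so the two components
# become linearly separable by frequency-selective filtering.
assert np.allclose(np.log(V), np.log(P) + np.log(I))
```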
The invention will now be described by way of a preferred embodiment with reference to the accompanying drawing which illustrates a system for capturing and processing human speech. To a person skilled in the art, it is obvious that the components of this system, with the exception of the microphone, may be implemented either by analog circuitry or digital means after analog to digital conversion. For convenience of description, analog terminology is used presently.
In Figure 1, a human being 10 is shown adjacent a microphone 12 connected to a linear amplifier 14 which is connected through a logarithmic amplifier 16 to a high pass filter 18 and a low pass filter 20. The filters 18, 20 are connected respectively to first and second amplifiers 22, 24 which are both connected through an exponential amplifier 26 to a voice or speech recognition apparatus 28.
When the human being 10 speaks into the microphone, the output of the linear amplifier 14 is the signal V = P I. The output of the logarithmic amplifier 16 is a signal log V = log P + log I which is connected to both of the filters 18, 20. This in effect linearizes the two components of the voice signal. The output of the high pass filter 18 is the signal log I which is amplified by amplifier 22 to form a signal log I'. The output of the low pass filter 20 is the signal log P and the amplified signal from amplifier 24 is the signal log P'. This effects the separation of the components.
The frequency at which the signals are separated, the breakpoint, and therefore the dividing line between the passbands of the high pass and low pass filters 18, 20, is preferably around the 300 Hertz threshold. The breakpoint is to some extent arbitrary, as there is no exact frequency at which the signal changes from a power signal to an information signal, but 300 Hertz is generally recognized in speech analysis as providing a reasonable boundary between the power/volume component and the information component of human speech. The breakpoint frequency may also be determined adaptively depending upon the personal characteristics of the speaker and the language spoken.
Since the filters 18, 20 will, as is clear to a person skilled in the art, not have infinite attenuation outside their passbands, there is some merging of the power and information signals near the breakpoint frequency, which is acceptable under the circumstances.
Typically the amplification applied by the first amplifier 22 to the information signal is higher than the amplification applied by amplifier 24 to the power signal. The signals log I' and log P' are combined in the exponential amplifier 26 to give a combined amplified signal V = P' I', which is passed to the voice or speech recognition apparatus 28 for processing. Such a signal now includes enhanced detail in the information signal, thus allowing a more accurate matrix sample of the initial waveform. The lower amplification applied to the power signal will reduce the effect of differences in microphone sensitivity, loudness of speech, etc. The final voice or speech recognition may be based on, for example, a hidden Markov model or an adaptation of dynamic time warping, fuzzy logic, neural networks, template matching, expert systems, or a combination of these approaches for pattern recognition (See L. Rabiner and B. Juang, Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, NJ, 1993; and C.H. Lee, F.K. Soong and K.K. Paliwal (Eds.), Automatic Speech and Speaker Recognition: Advanced Topics, Kluwer, Boston, 1996).
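The complete chain of Figure 1 can be sketched digitally as follows; this is a minimal illustration assuming a SciPy-based implementation, and the filter order and gain values are assumptions chosen only to show the principle (the information component amplified more than the power component), not figures from the patent.

```python
# Sketch of the chain: logarithm -> low/high pass split at a 300 Hz breakpoint
# -> differential gain -> recombination -> antilogarithm.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def preprocess(voice, fs, breakpoint_hz=300.0, power_gain=1.0, info_gain=2.0):
    """Return a signal in which the information component is emphasized."""
    eps = 1e-8
    # Work on the magnitude with a small floor so the logarithm is defined;
    # a practical system would treat sign/phase separately.
    log_v = np.log(np.abs(voice) + eps)                 # log V = log P + log I

    sos_lp = butter(4, breakpoint_hz, btype="low", fs=fs, output="sos")
    sos_hp = butter(4, breakpoint_hz, btype="high", fs=fs, output="sos")

    log_p = sosfiltfilt(sos_lp, log_v)                  # power/volume component
    log_i = sosfiltfilt(sos_hp, log_v)                  # information component

    # Amplify the information component more strongly than the power component,
    # then recombine and take the antilogarithm.
    return np.exp(power_gain * log_p + info_gain * log_i)

# Example usage with a synthetic one-second signal.
fs = 8000
t = np.arange(fs) / fs
voice = (1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)) * np.sin(2 * np.pi * 440 * t)
processed = preprocess(voice, fs)
```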
Thus the voice or speech recognition process carried out in apparatus 28 can have a higher level of accuracy than has previously been possible. As is well known, the output 29 of apparatus 28 will be a "yes" or "no" signal for voice recognition, or an output corresponding to the one or more words in question in speech recognition systems, for example when the apparatus 28 is a personal computer running a direct dictation-to-screen program.
A further advantage according to the invention is that it is now easy for a person skilled in the art to introduce automatic gain control using the lower frequency component in another embodiment. This ensures that the signal is subsequently presented for further processing at optimum levels and reduces variations due to the speaker moving away from the microphone, etc.
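A minimal sketch of such gain control, assuming the same SciPy-based setup as in the earlier example, is given below; the target level and the safeguards against very small values are illustrative assumptions.

```python
# Use the low-frequency (power) component as the control signal for automatic
# gain control, scaling the input towards a target level.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def agc_from_power(voice, fs, breakpoint_hz=300.0, target_rms=0.1):
    """Scale the signal so its slowly varying power component sits near a target level."""
    sos_lp = butter(4, breakpoint_hz, btype="low", fs=fs, output="sos")
    power_envelope = np.sqrt(np.maximum(sosfiltfilt(sos_lp, voice ** 2), 0.0))
    gain = target_rms / np.maximum(power_envelope, 1e-6)
    return voice * gain
```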
According to another preferred embodiment, the system includes an adaptive component in order to determine the optimal breakpoint (cut-off) frequency for delimiting power and information (i.e. inflection) based on the characteristics of the speaker. There are many variations of voice types: some voices have a lot of bass resonance and a lot of inflection in that bass part, while other voices have resonance in the higher frequencies. A significant part of the resonance tends to depend on whether the input is from a male or a female, an adult or a child (but not necessarily so). The adaptive system allows for that differentiation. The voice differentiation could be implemented as a separate and prior component, or integrated as part of a feedback design shifting the break point depending upon the input voice signal. Alternatively, voice differentiation is first processed before it moves to the breakpoint where the power and information components are separated. In either case a higher level of voice and password acceptance results.
The object of another preferred embodiment is to transcend the limitation of a single language, for example, English. Languages vary enormously in sound, inflection and pronunciation; these determine the different resonance levels and frequency responses. The database and the break point may vary with language; although largely language independent, the system maintains different processing options that can support different languages. First, the spectral components are split based on the cut-off frequency, which break point is dependent upon the type of voice. The voice can be further categorized so that the character of continued processing depends upon the nature and origin of the input language. Therefore a language dependent system results. The software recognizes the language and type of speaker and makes the processing adaptive to these features. That is to say, a French-speaking person will be classified in a certain group; then if a female, in a subgroup thereof, further if a child: the processing branches off again and again prior to undergoing actual spectral separation. Subsequently, when another person uses the system, say a Chinese adult male, then the system adapts to this entirely new language and speaker type for voice comparison. The natural language, the gender, and the age of the speaker are all factors where context is permitted to influence the choice of an optimal breakpoint frequency.
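The context-dependent choice of breakpoint can be sketched as a simple mapping; the particular categories and frequency offsets below are hypothetical illustrations, chosen only to show how language, gender and age could shift the breakpoint within the 200 to 400 Hertz range, and are not values given in the description.

```python
# Hypothetical mapping from speaker context to a breakpoint frequency.
def breakpoint_for_context(language: str, gender: str, age_group: str) -> float:
    base = 300.0                                   # default breakpoint in Hz
    if gender == "female":
        base += 50.0                               # typically higher-pitched voices
    if age_group == "child":
        base += 100.0
    if language in ("fr", "zh"):                   # illustrative language adjustment
        base += 25.0
    return min(max(base, 200.0), 400.0)            # keep within 200-400 Hz

# Example: a French-speaking adult female would be assigned 375 Hz here.
print(breakpoint_for_context("fr", "female", "adult"))
```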
A further preferred embodiment implements dual-level security with a parallel frequency response analysis system. This parallel system is run so that the voice/speech system effectively has two levels of security. The additional test could be based on a hidden Markov model or some other similar model for the process. This allows the system to focus on its main application, namely overall voice recognition and its security. A check of the results from this standard speech recognition system and from the present invention gives rise to two masks that collectively achieve a greater level of verification. Therefore, the results of the standard system are taken and compared with this invention's results, achieving better comparison, analysis and outcome.
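One way to picture the two-level check, assuming each path exposes a scoring function, is the sketch below; the recognizer callables and the acceptance threshold are assumptions for illustration only.

```python
# Accept an applicant only when both the preprocessing-based path and the
# parallel frequency-response path agree.
from typing import Callable

def dual_level_verify(sample,
                      preprocess_recognizer: Callable,
                      parallel_recognizer: Callable,
                      threshold: float = 0.8) -> bool:
    score_a = preprocess_recognizer(sample)    # score from this invention's path
    score_b = parallel_recognizer(sample)      # score from the parallel standard path
    return score_a >= threshold and score_b >= threshold
```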
Still another preferred embodiment implements functionality to facilitate investigation of fraudulent applications. One component of the embodiment maintains a database of data and information about each user, including the person's voice and voice inflection. If anyone attempts to make a fraudulent application, the person's input voice information is also recorded. This fraudulent application is cross-referenced or cross-checked against the main database and voice information, whether at the time of application or subsequently, for example in a batched process on an interval basis. A match of a fraudulent applicant's voice information to their real application voice information would provide accurate and appropriate details for further security investigation.
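The bookkeeping for such an investigation can be sketched as follows; the data structure, similarity function and threshold are hypothetical placeholders, since the description does not prescribe a particular representation of the voice information.

```python
# Record voice vectors from applications judged fraudulent and cross-check
# later applications against them, either immediately or in a batch.
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class FraudRegister:
    fraudulent_vectors: List[Tuple[str, list]] = field(default_factory=list)

    def record_fraudulent(self, applicant_id: str, voice_vector: list) -> None:
        self.fraudulent_vectors.append((applicant_id, voice_vector))

    def cross_check(self, voice_vector: list,
                    similarity: Callable[[list, list], float],
                    threshold: float = 0.9) -> List[str]:
        """Return ids of earlier fraudulent attempts whose voice matches this one."""
        return [fid for fid, vec in self.fraudulent_vectors
                if similarity(voice_vector, vec) >= threshold]
```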
Another preferred embodiment, as indicated via the dotted connections in Figure 1, has outputs of the first and second amplifiers 22, 24 connected to separate exponential amplifiers 30, 32, and the power signal and information signal are then processed separately using different methodology suitable for the type of signal.
One preferred embodiment implements this invention by analog means using electronic circuitry, including logarithmic and exponential converters and operational amplifiers to realize Sallen and Key high and low pass filters. Another preferred embodiment implements the invention digitally. In the case of the latter, an analog to digital converter digitizes the voice input from a microphone. A computer or dedicated hardware can then process the digitized voice signal. Filtering of the signal can be effected by standard digital filtering methods (See A.V. Oppenheim and R.W. Schafer, with J.R. Buck, Discrete-Time Signal Processing, Second Edition, Prentice-Hall, Inc., Upper Saddle River, NJ, 1999).
It will be appreciated that the above description relates to the preferred embodiments by way of example only. Many variations on the apparatus for delivering the invention will be obvious to those knowledgeable in the field, and such obvious variations are within the scope of the invention as described and claimed, whether or not expressly described.
All patents and publications referred to in this paper are incorporated by reference in their entirety.

Claims

What is claimed is:
1. A method for processing a voice signal comprising filtering the components of the voice signal modeled as the product of a power component and an informational component to derive an amplified signal for voice or speech recognition.
2. The method of claim 1, wherein the step of filtering the components comprises:
• linearizing the components of the voice signal;
• duplicating the linearized signal to produce a first linearized signal and a second linearized signal;
• passing the first linearized signal through a lowpass filter having a lowpass passband to produce a first filtered signal and passing the second linearized signal through a highpass filter having a highpass passband to produce a second filtered signal; and
• amplifying the filtered signals.
3. The method of claim 2, wherein the filtered signals are amplified differentially.
4. The method of any of claims 2 or 3, wherein the first filtered signal is amplified at a lower gain level than the second filtered signal.
5. The method of any of claims 2 to 4, further comprising combining the filtered signals.
6. The method of any of claims 2 to 5, wherein the low pass filter and the high pass filter include approximately non-overlapping passbands whereby the higher and the lower spectral components of the linearized signal are separated.
7. The method of any of claims 2 to 6, wherein:
• the step of linearizing the components of the voice signal comprises determining the logarithm of the voice signal; and
• the method further comprises:
• combining the filtered signals; and
• determining the antilogarithm of the combined signals.
8. The method of any of claims 2 to 6, wherein:
• the step of linearizing the components of the voice signal comprises determining the logarithm of the voice signal; and
• the method further comprises determining the antilogarithm of the filtered signals.
9. A method for voice or speech recognition comprising the method of any of claims 1 to 8, and further comprising applying voice or speech recognition to the processed signal and determining whether the voice signal is that of a recognized person, in the case of voice recognition, or of one or more recognized words, in the case of speech recognition.
10. The method of claim 9, wherein the voice or speech recognition comprises processing with a hidden Markov model.
11. A method for generating a plurality of processed signals from a voice signal for input to voice or speech recognition, comprising the method of any of claims 1 to 8, and further comprising:
• duplicating the voice signal; and
• analyzing the duplicated voice signal with frequency response analysis.
12. The method of claim 11, wherein the frequency response analysis comprises processing with a hidden Markov model.
13. A method for voice or speech recognition comprising the method of any of claims 11 or 12, and further comprising applying a voice or speech recognition process to the plurality of signals.
14. A method for identifying an individual making a fraudulent application to gain unauthorized entry into a voice-activated secured entry system, the system being in combination with a database comprising vectors of personal information and voice information of all authorized applicants, the method comprising the method of any of claims 9, 10 or 13, and further comprising:
• determining whether the application is a fraudulent or a non-fraudulent attempt to gain entry to the system; and
• recording the applicant's application voice information in the database if the application is determined to be fraudulent.
15. The method of claim 14, further comprising cross-checking the voice information of an application determined to be non-fraudulent with voice information recorded in the database of previous fraudulent applications.
16. The method of claim 15, wherein the cross-checking of the non-fraudulent application voice information against fraudulent application voice information occurs subsequent to the application process.
17. A method for voice or speech recognition comprising:
• linearizing the components of a voice signal by determining the logarithm of the voice signal modeled as the product of a power component and an informational component;
• duplicating the linearized signal to produce a first linearized signal and a second linearized signal;
• passing the first linearized signal through a low pass filter having a lowpass passband to produce a first filtered signal and passing the second linearized signal through a high pass filter having a highpass passband to produce a second filtered signal;
• amplifying both filtered signals, the first filtered signal amplified at a higher gain than the second filtered signal;
• combining the filtered signals;
• determining the antilogarithm of the combined signals; and
• applying voice or speech recognition to the determined signal.
18. The method of any of claims 2 to 17, wherein the passbands of the high and low pass filters sub-divide the frequency spectrum at a breakpoint frequency between 200 and 400 Hertz.
19. The method of claim 18, in which the breakpoint frequency is approximately 300 Hertz.
20. The method of any of claims 2 to 17, wherein the passbands of the high and low pass filters sub-divide the frequency spectrum at a breakpoint frequency determined by the following steps:
• determining the context of the application; and
• generating the breakpoint frequency as a function of the context.
21. The method of claim 20, wherein the context comprises at least one of the personal characteristics of the speaker selected from the group comprising the gender, age, and language of the speaker.
22. The method of any of claims 1 to 21, wherein the voice signal is processed by a computer.
23. A system for processing a voice signal comprising means for filtering the components of the voice signal modeled as the product of a power component and an informational component and means for deriving an amplified signal for voice or speech recognition.
24. The system of claim 23, wherein the means for filtering the components comprises:
• means for linearizing the components of the voice signal;
• means for duplicating the linearized signal to produce a first linearized signal and a second linearized signal;
• means for passing the first linearized signal through a low pass filter to produce a first filtered signal and means for passing the second linearized signal through a high pass filter to produce a second filtered signal; and
• means for amplifying the filtered signals.
25. The system of claim 24, wherein the filtered signals are amplified differentially.
26. The system of any of claims 24 or 25, wherein the first filtered signal is amplified at a lower gain level than the second filtered signal.
27. The system of any of claims 24 to 26, further comprising means for combining the filtered signals.
28. The system of any of claims 24 to 27, wherein the low pass filter and the high pass filter have approximately non-overlapping passbands whereby the higher and the lower spectral components of the linearized signal are separated.
29. The system of any of claims 24 to 28, wherein:
• the means for linearizing the components of the voice signal comprises means for determining the logarithm of the voice signal; and
• the system further comprises:
• means for combining the filtered signals; and
• means for determining the antilogarithm of the combined signals.
30. The system of any of claims 24 to 28, wherein:
• the means for linearizing the components of the voice signal comprises means for determining the logarithm of the voice signal; and
• the system further comprises means for determining the antilogarithm of the filtered signals.
31. A system for voice or speech recognition comprising the system of any of claims 23 to 30, and further comprising means for applying voice or speech recognition to the processed signal.
32. The system of claim 31, wherein the means for voice or speech recognition comprises means for processing with a hidden Markov model.
33. A system for generating a plurality of processed signals from a voice signal for input to means for voice or speech recognition, comprising the system of any of claims 23 to 31, and further comprising:
• means for duplicating the voice signal; and
• means for analyzing the duplicated voice signal with frequency response analysis.
34. The system of claim 33, wherein the frequency response analysis comprises processing with a hidden Markov model.
35. A system for voice or speech recognition comprising the system of any of claims 33 or 34, and further comprising means for applying a voice or speech recognition process to the plurality of signals.
36. A system for identifying an individual making a fraudulent application to gain unauthorized entry into a voice-activated secured entry system, in combination with a database comprising vectors of personal information and voice information of all authorized applicants, comprising the system of any of claims 31 to 33 and 35, and further comprising:
• means for determining whether the application is a fraudulent or a non-fraudulent attempt to gain entry to the system; and
• means for recording the applicant's application voice information in the database if the application is determined to be fraudulent.
37. The system of claim 36, further comprising means for cross-checking the voice information of an application determined to be non-fraudulent with voice information recorded in the database of previous fraudulent applications.
38. The system of claim 37, wherein the means for cross-checking the non-fraudulent application voice information against fraudulent application voice information performs the cross-checking as a separate process from that carried out by the means for determining whether an application is fraudulent or non-fraudulent.
39. A system for voice or speech recognition comprising:
• input means for obtaining a voice signal;
• means for linearizing the voice signal by determining the logarithm of the voice signal modeled as the product of a power component and an informational component;
• means for duplicating the linearized signal producing a first linearized signal and a second linearized signal;
• means for passing the first linearized signal through a low pass filter having a lowpass passband to produce a first filtered signal and means for passing the second linearized signal through a high pass filter having a highpass passband to produce a second filtered signal;
• means for amplifying both filtered signals, the first filtered signal amplified at a higher gain than the second filtered signal;
• means for combining the filtered signals;
• means for determining the antilogarithm of the combined signals; and
• means for applying voice or speech recognition to the determined signal.
40. The system of any of claims 24 to 39, wherein the passbands of the high and low pass filters sub-divide the frequency spectrum at a breakpoint frequency.
41. The system of any of claims 24 to 40, wherein the breakpoint frequency lies between 200 and 400 Hertz.
42. The system of claim 40 or 41, wherein the breakpoint frequency is approximately 300 Hertz.
43. The system of any of claims 24 to 40, wherein the passbands of the high and low pass filters sub-divide the frequency spectrum at a breakpoint frequency determined by the following:
• means for ascertaining the context of the application; and
• means for generating the breakpoint frequency as a function of the context.
44. The system of claim 43, wherein the context comprises at least one of the personal characteristics of the speaker selected from the group comprising the gender, age, and language of the speaker.
45. A system for providing a signal for voice or speech recognition, the system comprising:
• microphone means to generate a multi-frequency electrical signal from the voice signal;
• a logarithmic amplifier to receive the electrical signal;
• a high pass filter and a low pass filter connected to the logarithmic amplifier;
• a first amplifier electrically connected to the high pass filter and a second amplifier electrically connected to the low pass filter; and
• an exponential amplifier connected to the first and second amplifiers.
46. The system of claim 45, further comprising means for speech or voice recognition connected to the output of the exponential amplifier.
47. The system of claim 46, in which the means for speech or voice recognition comprises a computer.
PCT/GB2000/002332 1999-06-14 2000-06-14 Speech and voice signal preprocessing WO2000077772A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB0200735A GB2367938A (en) 1999-06-14 2000-06-14 Speech and voice signal processing
AU55471/00A AU5547100A (en) 1999-06-14 2000-06-14 Speech and voice signal processing

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9913773.9 1999-06-14
GBGB9913773.9A GB9913773D0 (en) 1999-06-14 1999-06-14 Speech signal processing

Publications (2)

Publication Number Publication Date
WO2000077772A2 true WO2000077772A2 (en) 2000-12-21
WO2000077772A3 WO2000077772A3 (en) 2002-10-10

Family

ID=10855289

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2000/002332 WO2000077772A2 (en) 1999-06-14 2000-06-14 Speech and voice signal preprocessing

Country Status (3)

Country Link
AU (1) AU5547100A (en)
GB (2) GB9913773D0 (en)
WO (1) WO2000077772A2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9571652B1 (en) 2005-04-21 2017-02-14 Verint Americas Inc. Enhanced diarization systems, media and methods of use
US8639757B1 (en) 2011-08-12 2014-01-28 Sprint Communications Company L.P. User localization using friend location information

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4827516A (en) * 1985-10-16 1989-05-02 Toppan Printing Co., Ltd. Method of analyzing input speech and speech analysis apparatus therefor
EP0625775A1 (en) * 1993-05-18 1994-11-23 International Business Machines Corporation Speech recognition system with improved rejection of words and sounds not contained in the system vocabulary
US5495522A (en) * 1993-02-01 1996-02-27 Multilink, Inc. Method and apparatus for audio teleconferencing a plurality of phone channels
WO1998043237A1 (en) * 1997-03-25 1998-10-01 The Secretary Of State For Defence Recognition system
US5878392A (en) * 1991-04-12 1999-03-02 U.S. Philips Corporation Speech recognition using recursive time-domain high-pass filtering of spectral feature vectors

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
OPPENHEIM A V, SCHAFER R W, STOCKHAM T G: "Nonlinear Filtering of Multiplied and Convolved Signals" PROCEEDINGS OF THE IEEE, no. 56, August 1968 (1968-08), pages 1264-1291, XP000946572 ISSN: 0165-1684 *
PINOLI J: "A general comparative study of the multiplicative homomorphic, log-ratio and logarithmic image processing approaches" SIGNAL PROCESSING. EUROPEAN JOURNAL DEVOTED TO THE METHODS AND APPLICATIONS OF SIGNAL PROCESSING,NL,ELSEVIER SCIENCE PUBLISHERS B.V. AMSTERDAM, vol. 58, no. 1, 1 April 1997 (1997-04-01), pages 11-45, XP004082677 ISSN: 0165-1684 *

Cited By (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8793131B2 (en) 2005-04-21 2014-07-29 Verint Americas Inc. Systems, methods, and media for determining fraud patterns and creating fraud behavioral models
US8903859B2 (en) 2005-04-21 2014-12-02 Verint Americas Inc. Systems, methods, and media for generating hierarchical fused risk scores
US8924285B2 (en) 2005-04-21 2014-12-30 Verint Americas Inc. Building whitelists comprising voiceprints not associated with fraud and screening calls using a combination of a whitelist and blacklist
US8930261B2 (en) 2005-04-21 2015-01-06 Verint Americas Inc. Method and system for generating a fraud risk score using telephony channel based audio and non-audio data
US9113001B2 (en) 2005-04-21 2015-08-18 Verint Americas Inc. Systems, methods, and media for disambiguating call data to determine fraud
EA019949B1 (en) * 2009-09-24 2014-07-30 Общество с ограниченной ответственностью "Центр речевых технологий" Method for identifying a speaker based on random speech phonograms using formant equalization
US9047866B2 (en) 2009-09-24 2015-06-02 Speech Technology Center Limited System and method for identification of a speaker by phonograms of spontaneous oral speech and by using formant equalization using one vowel phoneme type
WO2011046474A3 (en) * 2009-09-24 2011-06-16 Общество С Ограниченной Ответственностью "Цeнтp Речевых Технологий" Method for identifying a speaker based on random speech phonograms using formant equalization
US9875739B2 (en) 2012-09-07 2018-01-23 Verint Systems Ltd. Speaker separation in diarization
US11227603B2 (en) 2012-11-21 2022-01-18 Verint Systems Ltd. System and method of video capture and search optimization for creating an acoustic voiceprint
US10720164B2 (en) 2012-11-21 2020-07-21 Verint Systems Ltd. System and method of diarization and labeling of audio data
US11776547B2 (en) 2012-11-21 2023-10-03 Verint Systems Inc. System and method of video capture and search optimization for creating an acoustic voiceprint
US11380333B2 (en) 2012-11-21 2022-07-05 Verint Systems Inc. System and method of diarization and labeling of audio data
US11367450B2 (en) 2012-11-21 2022-06-21 Verint Systems Inc. System and method of diarization and labeling of audio data
US11322154B2 (en) 2012-11-21 2022-05-03 Verint Systems Inc. Diarization using linguistic labeling
US10134400B2 (en) 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using acoustic labeling
US10134401B2 (en) 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using linguistic labeling
US10950242B2 (en) 2012-11-21 2021-03-16 Verint Systems Ltd. System and method of diarization and labeling of audio data
US10438592B2 (en) 2012-11-21 2019-10-08 Verint Systems Ltd. Diarization using speech segment labeling
US10446156B2 (en) 2012-11-21 2019-10-15 Verint Systems Ltd. Diarization using textual and audio speaker labeling
US10522152B2 (en) 2012-11-21 2019-12-31 Verint Systems Ltd. Diarization using linguistic labeling
US10522153B2 (en) 2012-11-21 2019-12-31 Verint Systems Ltd. Diarization using linguistic labeling
US10650826B2 (en) 2012-11-21 2020-05-12 Verint Systems Ltd. Diarization using acoustic labeling
US10950241B2 (en) 2012-11-21 2021-03-16 Verint Systems Ltd. Diarization using linguistic labeling with segmented and clustered diarized textual transcripts
US10692501B2 (en) 2012-11-21 2020-06-23 Verint Systems Ltd. Diarization using acoustic labeling to create an acoustic voiceprint
US10692500B2 (en) 2012-11-21 2020-06-23 Verint Systems Ltd. Diarization using linguistic labeling to create and apply a linguistic model
US10902856B2 (en) 2012-11-21 2021-01-26 Verint Systems Ltd. System and method of diarization and labeling of audio data
US10109280B2 (en) 2013-07-17 2018-10-23 Verint Systems Ltd. Blind diarization of recorded calls with arbitrary number of speakers
US9881617B2 (en) 2013-07-17 2018-01-30 Verint Systems Ltd. Blind diarization of recorded calls with arbitrary number of speakers
US11670325B2 (en) 2013-08-01 2023-06-06 Verint Systems Ltd. Voice activity detection using a soft decision mechanism
US10665253B2 (en) 2013-08-01 2020-05-26 Verint Systems Ltd. Voice activity detection using a soft decision mechanism
US9984706B2 (en) 2013-08-01 2018-05-29 Verint Systems Ltd. Voice activity detection using a soft decision mechanism
US10366693B2 (en) 2015-01-26 2019-07-30 Verint Systems Ltd. Acoustic signature building for a speaker from multiple sessions
US10726848B2 (en) 2015-01-26 2020-07-28 Verint Systems Ltd. Word-level blind diarization of recorded calls with arbitrary number of speakers
US9875743B2 (en) 2015-01-26 2018-01-23 Verint Systems Ltd. Acoustic signature building for a speaker from multiple sessions
US11636860B2 (en) 2015-01-26 2023-04-25 Verint Systems Ltd. Word-level blind diarization of recorded calls with arbitrary number of speakers
US9875742B2 (en) 2015-01-26 2018-01-23 Verint Systems Ltd. Word-level blind diarization of recorded calls with arbitrary number of speakers
CN106683686A (en) * 2016-11-18 2017-05-17 祝洋 Examinee gender statistical equipment and statistical method of same
US11538128B2 (en) 2018-05-14 2022-12-27 Verint Americas Inc. User interface for fraud alert management
US11240372B2 (en) 2018-10-25 2022-02-01 Verint Americas Inc. System architecture for fraud detection
US10887452B2 (en) 2018-10-25 2021-01-05 Verint Americas Inc. System architecture for fraud detection
US11115521B2 (en) 2019-06-20 2021-09-07 Verint Americas Inc. Systems and methods for authentication and fraud detection
US11652917B2 (en) 2019-06-20 2023-05-16 Verint Americas Inc. Systems and methods for authentication and fraud detection
US11868453B2 (en) 2019-11-07 2024-01-09 Verint Americas Inc. Systems and methods for customer authentication based on audio-of-interest

Also Published As

Publication number Publication date
GB2367938A (en) 2002-04-17
GB9913773D0 (en) 1999-08-11
WO2000077772A3 (en) 2002-10-10
GB0200735D0 (en) 2002-02-27
AU5547100A (en) 2001-01-02

Similar Documents

Publication Publication Date Title
Rosenberg Automatic speaker verification: A review
Bimbot et al. A tutorial on text-independent speaker verification
US6463415B2 (en) 69voice authentication system and method for regulating border crossing
WO2000077772A2 (en) Speech and voice signal preprocessing
US5666466A (en) Method and apparatus for speaker recognition using selected spectral information
Prabakaran et al. A review on performance of voice feature extraction techniques
JP2002514318A (en) System and method for detecting recorded speech
Charisma et al. Speaker recognition using mel-frequency cepstrum coefficients and sum square error
Kekre et al. Speaker recognition using Vector Quantization by MFCC and KMCG clustering algorithm
Goh et al. Robust computer voice recognition using improved MFCC algorithm
De Lara A method of automatic speaker recognition using cepstral features and vectorial quantization
Khanna et al. Application of vector quantization in emotion recognition from human speech
Sukor et al. Speaker identification system using MFCC procedure and noise reduction method
Londhe et al. Extracting Behavior Identification Features for Monitoring and Managing Speech-Dependent Smart Mental Illness Healthcare Systems
Hizlisoy et al. Text independent speaker recognition based on MFCC and machine learning
Jain et al. Speech features analysis and biometric person identification in multilingual environment
Bansal et al. Medwell Journals, 2007 Automatic Speaker Identification Using Vector Quantization
Aliyu et al. Development of a text-dependent speaker recognition system
Tsuge et al. Bone-and air-conduction speech combination method for speaker recognition
Chakraborty et al. An improved approach to open set text-independent speaker identification (OSTI-SI)
Higgins et al. A multi-spectral data-fusion approach to speaker recognition
Revathi et al. Text independent composite speaker identification/verification using multiple features
Cohen Forensic Applications of Automatic Speaker Verification
Yee et al. Classification of language speech recognition system
Iliadi Bio-inspired voice recognition for speaker identification

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase in:

Ref country code: GB

Ref document number: 200200735

Kind code of ref document: A

Format of ref document f/p: F

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

NENP Non-entry into the national phase in:

Ref country code: JP