WO1997009712A2 - Method and system for processing auditory signals - Google Patents
- Publication number
- WO1997009712A2 (PCT/DK1996/000370)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- time
- leading edge
- maximum
- signal
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
Definitions
- The present invention relates to a method and system for signal processing, by which method and system features representing distinct sound pictures in auditory signals are extracted from transients in those signals.
- The result of the processing may be used for identification of sound or of speech signals, for quality measurement of audio products or systems such as loudspeakers, hearing instruments or hearing aids, and telecommunication systems, or for quality measurement of acoustic conditions.
- The method of the present invention may also be used in connection with speech compression and decompression in narrow band telecommunication or speech storage systems.
- The human ear has the ability to catch fast sound signals, detect sound frequency with great accuracy and differentiate between sound signals in complicated sound environments. For instance, it is possible to understand what a singer is singing over an accompaniment of musical instruments.
- A transient component in an auditory signal may in this invention be interpreted as a fast change of the energy in an auditory signal, where the rise time of the energy change is at most 3 ms, and a slower change of the energy level may be interpreted as a change of the quasi steady state component of an auditory signal.
- The transient and the quasi steady state components in an auditory signal may be defined as follows:
- The transient component in an auditory signal is the fast energy changes that may be detected by means of envelope detection using a lowpass filter with a relatively high cutoff frequency in the range 50-1500 Hz, and preferably in the range 300-1500 Hz.
- The quasi steady state component in an auditory signal is the energy level that may be detected by means of envelope detection using a lowpass filter with a relatively low cutoff frequency in the range below 400 Hz, and preferably below 150 Hz.
- The fast energy changes in the auditory signal may also be detected without the use of envelope detection or without the use of a lowpass filter.
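The two definitions above can be sketched as follows. This is only an illustration under stated assumptions: the envelope detector is taken to be one-way rectification followed by a first-order recursive lowpass filter, and the function names, the 16 kHz sample rate, the test tone and the cutoffs of 700 Hz and 100 Hz are all hypothetical choices made to fall inside the stated ranges.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """First-order recursive lowpass filter (an assumed, deliberately simple choice)."""
    # Smoothing coefficient of a one-pole filter with the requested cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def envelope(samples, cutoff_hz, sample_rate_hz):
    """One-way (half-wave) rectification followed by lowpass filtering."""
    rectified = [max(x, 0.0) for x in samples]
    return one_pole_lowpass(rectified, cutoff_hz, sample_rate_hz)

# Hypothetical test signal: 10 ms of silence, then a 2 kHz tone switched on abruptly.
FS = 16000
tone_burst = [0.0] * 160 + [math.sin(2 * math.pi * 2000 * n / FS) for n in range(640)]

transient_env = envelope(tone_burst, 700.0, FS)     # cutoff inside 300-1500 Hz
quasi_steady_env = envelope(tone_burst, 100.0, FS)  # cutoff below 150 Hz
```

With the higher cutoff the envelope follows the abrupt onset within a fraction of a millisecond, whereas the lower cutoff only tracks the slowly varying energy level.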
- The nerve pulses launched from the cochlea are synchronised to the frequency of a sine tone if the frequency is less than about 1.4 kHz. If the frequency of the tone is higher than about 1.4 kHz, the pulses are launched randomly and less than once per period. Therefore the auditory perceptive faculty is tone oriented in the range up to about 1.4 kHz and transient oriented above.
- The frequency spectra of speech signals from human beings contain energy bands, called formants. These formants are carriers of outstanding transients, and if the formants are selected for transient analyses, a substantial noise suppression may be obtained.
- In WO 94/25958 it is described how the information held in the shape of pulses representing the fast energy changes in auditory signals is used for identifying distinct sound pictures, and in a preferred embodiment the shape of the leading edge of a pulse is determined by determining the pulse rise time or determining the slope variation. It is further preferred that the shape of the top part of the leading edge is determined, the top part starting at the point of the edge where the slope is maximum.
- If the rise time of a pulse provided as an input to a filter is faster than the rise time of the impulse response of the filter, then the rise time of the output of the filter generated in response to the input pulse will be substantially equal to the rise time of the impulse response of the filter.
- If the rise time of the input pulse is slower than the rise time of the impulse response of the filter, the rise time of the output of the filter generated in response to the input pulse will be substantially equal to the rise time of the input pulse.
- The signal processing of sound signals in the cochlea may be simulated by a filter bank comprising a set of bandpass filters with different centre frequencies, the bandwidths of which increase with increasing centre frequency, which in turn means that the rise times of the impulse responses of the filters become shorter with increasing centre frequency.
- The rise time of an output pulse generated by a corresponding filter of the filter bank will be substantially equal to the rise time of the impulse response of the filter when the rise time of the input pulse is faster than the rise time of the impulse response of the filter, and substantially equal to the rise time of the input pulse when the rise time of the input pulse is slower than the rise time of the impulse response of the filter.
- The rise time of the input pulse may be determined by determination of the two filters A and B of the filter bank having the narrowest bandwidths of the filters of the bank generating output pulses in response to the input pulse with substantially identical rise times, as the rise time of the input pulse must be within the rise time range between the rise time of the impulse response of the filter A, B with the narrowest bandwidth and the rise time of the impulse response of the filter with the largest bandwidth that is also lower than the bandwidths of the filters A, B.
- This rise time detection principle may be utilized by the auditory organs of living beings, and this could explain why the bandwidths of the filters simulating cochlea sound processing increase with increasing centre frequencies.
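The rise time rule and the bracketing idea above can be sketched as below. The sketch idealises "substantially equal" as "the slower of the two rise times wins" and describes each filter only by the rise time of its impulse response. The function names, the tolerance and the numeric rise times are hypothetical.

```python
def output_rise_time(input_rise_ms, filter_rise_ms):
    """Idealised rule: the output rise time equals the slower of the two rise times."""
    return max(input_rise_ms, filter_rise_ms)

def bracket_input_rise_time(filter_rises_ms, output_rises_ms, tol_ms=0.05):
    """Return (low, high) bounds in ms on the rise time of the input pulse.

    Filters whose outputs share the fastest observed rise time (within tol_ms)
    are taken to have passed the input edge unchanged; at least two such
    filters are required, mirroring the filters A and B in the text.
    """
    shared = min(output_rises_ms)
    pairs = list(zip(filter_rises_ms, output_rises_ms))
    passing = [f for f, o in pairs if abs(o - shared) <= tol_ms]
    limiting = [f for f, o in pairs if abs(o - shared) > tol_ms]
    if len(passing) < 2:
        raise ValueError("need two filters with substantially identical output rise times")
    low = max(passing)                                   # slowest filter that still passes the edge
    high = min(limiting) if limiting else float("inf")   # fastest filter that limits the edge
    return low, high

filter_rises = [0.2, 0.4, 0.8, 1.6, 3.2]   # hypothetical impulse-response rise times, ms
outputs = [output_rise_time(1.0, f) for f in filter_rises]
bounds = bracket_input_rise_time(filter_rises, outputs)
```

For a 1.0 ms input edge, the three fastest filters all report 1.0 ms, so the input rise time is bracketed between 0.8 ms and 1.6 ms.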
- Sound speech signals may be generated by modulation of pulses in filters that modulate the shape of the pulses as described above.
- Pulses to be modulated correspond to speech signals generated in the articulation channel, e.g. by the vocal cords, and the processing in the filters corresponds to the modulation performed by adjustment of the articulation channel according to the phoneme processed, whereby the filters modulate the shape of the pulses.
- The time between pulses to be modulated should be sufficiently long to ensure that there is no interference between output pulses generated in response to different input pulses.
- This object is accomplished by providing a method of processing an auditory signal to facilitate identification of abrupt energy changes within the auditory signal, which abrupt energy changes have a rise time of at most 3 ms, and which abrupt energy changes can be perceived by an animal ear such as a human ear as representing a distinct sound picture.
- The method comprises: deriving, from the auditory signal, a first signal comprising transient pulses corresponding to at least part of the abrupt energy changes; tracing or monitoring pulses in the first transient signal; determining local maxima of the transient pulses; and generating a second transient signal wherein the value of at least one determined local maximum of a pulse in the first transient signal is held at said maximum value for a predetermined period of time t_rfpr, thereby generating a corresponding pulse in the second transient signal, said predetermined period of time t_rfpr being at most 5 ms.
- If pulses in a train of two or more successive pulses in the first transient signal are subjected to the above described holding procedure, and one or more of the pulses is/are located at a distance in time from a preceding pulse which is shorter than the predetermined period of time t_rfpr and has/have a local maximum greater than the local maximum of said preceding pulse, the hold of the local maximum of said preceding pulse is maintained until the occurrence of the subsequent, greater local maximum and is replaced by said subsequent, greater local maximum.
- It is preferred that the predetermined period of time t_rfpr is shorter than or equal to 3 ms, or shorter than or equal to 2 ms. It is even more preferred that t_rfpr is shorter than or equal to 1 ms, or about 0.7 ms.
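The holding procedure can be sketched on a sampled signal as below. This is not the flow diagram of Fig. 12, which is not reproduced in this text; it is a hedged reading of the steps above. The function name, the sample-based hold counter and the choice to restart the hold when a greater maximum arrives are assumptions.

```python
def hold_local_maxima(first_signal, sample_rate_hz, t_rfpr_ms=0.7):
    """Build a second transient signal by holding local maxima for t_rfpr."""
    hold_samples = max(1, round(t_rfpr_ms * 1e-3 * sample_rate_hz))
    second = []
    held_value, remaining = 0.0, 0
    for i, x in enumerate(first_signal):
        prev_x = first_signal[i - 1] if i > 0 else x
        next_x = first_signal[i + 1] if i + 1 < len(first_signal) else x
        is_local_max = x > prev_x and x >= next_x
        # Start a hold, or replace a running hold when a greater maximum occurs.
        if is_local_max and (remaining == 0 or x > held_value):
            held_value, remaining = x, hold_samples
        if remaining > 0:
            second.append(held_value)   # output stays at the held maximum
            remaining -= 1
        else:
            second.append(x)            # no hold active: pass the signal through
    return second

# 10 kHz sampling and t_rfpr = 0.5 ms give a hold of 5 samples.
single = hold_local_maxima([0, 1, 0, 0, 0, 0, 0, 0, 0], 10000, 0.5)
greater = hold_local_maxima([0, 1, 0, 2, 0, 0, 0, 0, 0, 0], 10000, 0.5)
smaller = hold_local_maxima([0, 2, 0, 1, 0, 0, 0, 0], 10000, 0.5)
```

In the second call the maximum of 2 arrives inside the hold of the earlier maximum of 1 and takes over; in the third call the smaller maximum of 1 is absorbed by the running hold.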
- The shape of a pulse in the second transient pulse signal is an important feature for identification of the pulse.
- The shapes of pulses in the second transient pulse signal are determined or identified, and preferably one or more distinct sound pictures is/are identified from the determined shape.
- The shape of a pulse may be characterized by the pulse rise time, the form of the leading edge, the duration of the pulse, and/or the fall time or the form of the lagging edge, and it is preferred that the form of the leading edge is determined by determining rise time, slope and/or slope variation of at least part of the leading edge.
- The frequency of the auditory signal is determined from the second transient signal based on the distance in time between succeeding leading edges of pulses in the signal.
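As a small illustration of the last point, the frequency can be estimated as the reciprocal of the spacing between succeeding leading edges. Averaging the spacings is an assumption made here, and the function name is hypothetical.

```python
def frequency_from_edges(edge_times_s):
    """Estimate frequency in Hz from the times (in seconds) of succeeding leading edges."""
    if len(edge_times_s) < 2:
        raise ValueError("need at least two leading edges")
    intervals = [b - a for a, b in zip(edge_times_s, edge_times_s[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 1.0 / mean_interval

estimate_hz = frequency_from_edges([0.000, 0.005, 0.010, 0.015])
```

Edges spaced 5 ms apart give an estimate of about 200 Hz.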
- The method includes selecting pulses where the shape of the leading edge has a maximum slope greater than a predetermined minimum value, thereby discarding pulses with a rather small maximum slope, which pulses may be considered as representing noise components in the process of identification or representation of distinct sound pictures of the auditory signal.
- This object is accomplished by providing a method for selecting leading edges of transient pulses in a transient signal, said transient signal being derived from an auditory signal having abrupt energy changes with a rise time of at most 3 ms, and which abrupt energy changes can be perceived by an animal ear such as a human ear as representing a distinct sound picture.
- The method comprises: determining or measuring the maximum slope of a leading edge of a pulse in the transient signal; comparing the obtained maximum slope with a predetermined lower threshold value for maximum slopes of leading edges; and, if the obtained maximum slope is equal to or greater than the predetermined lower threshold value, selecting said leading edge as a candidate for the leading edge of a pulse.
- Several leading pulse edges being candidates for a selected leading edge may be observed within a short period of time. Thus, it is preferred that if the transient signal comprises one or more subsequent pulse or pulses, the leading edge or edges of which is/are located within a distance in time from the selected candidate, which distance in time is shorter than a predetermined period of time, t_s, of at most 4 ms, then the method further comprises: determining or measuring the maximum slope or slopes of the leading edge or edges of said subsequent pulse or pulses in the transient signal; comparing the obtained maximum slope or slopes of the subsequent leading edge or edges and the obtained maximum slope of the selected candidate with one another; determining which of said leading edges has the largest maximum slope; and selecting the leading edge with the largest maximum slope as the leading edge of a …
- The predetermined period of time t_s is shorter than or equal to 3.3 ms, or shorter than or equal to 2 ms, or even shorter than or equal to 1 ms.
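The candidate selection with a search time t_s can be sketched as follows, assuming the leading edges have already been reduced to (time, maximum slope) pairs sorted by time. The function name, the data layout and the numeric values are hypothetical.

```python
def select_leading_edge(edges, lower_threshold, t_s_ms=2.0):
    """Pick a leading edge from a list of (time_ms, max_slope) pairs sorted by time.

    The first edge whose maximum slope reaches lower_threshold becomes the
    candidate; any edge less than t_s_ms later with a larger maximum slope
    replaces it. Returns None when no edge reaches the threshold.
    """
    for i, (t0, slope0) in enumerate(edges):
        if slope0 >= lower_threshold:
            # The window contains the candidate itself plus later edges within t_s.
            window = [e for e in edges[i:] if e[0] - t0 < t_s_ms]
            return max(window, key=lambda e: e[1])
    return None

edges = [(0.0, 0.01), (1.0, 0.2), (2.5, 0.5), (4.0, 0.9)]
picked_wide = select_leading_edge(edges, lower_threshold=0.05, t_s_ms=2.0)
picked_narrow = select_leading_edge(edges, lower_threshold=0.05, t_s_ms=1.0)
```

With t_s = 2 ms the steeper edge at 2.5 ms replaces the candidate found at 1.0 ms; with t_s = 1 ms the candidate itself is kept.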
- The method for selecting the leading edge of a second pulse in the transient signal further comprises: determining or measuring the maximum slope or slopes of the leading edge or edges of a pulse or pulses in the transient signal subsequent to the selected leading edge of the first pulse within a distance in time from the leading edge of the first pulse which is shorter than a predetermined period of time, t_ep, of at most 4 ms, said time period t_ep being longer than or equal to the predetermined time period t_s; comparing the obtained maximum slope or slopes of the subsequent leading edge or edges with the obtained maximum slope of the leading edge of the first …
- The method for selecting the leading edge of the second pulse in the transient signal further comprises: determining or measuring the maximum slope or slopes of one or more leading pulse edges located at a distance in time from the leading edge of the first pulse which is longer than or equal to the predetermined period of time, t_ep; reducing the required threshold value of the maximum slope below the maximum slope of the leading edge of the first pulse; and selecting the first leading edge with a maximum slope greater than the required threshold value as the leading edge of a second pulse, which second pulse may correspond to an abrupt energy change representing a distinct sound picture.
- The required threshold value for the maximum slope is decreased as a function of time from the maximum slope of the leading edge of the first pulse down to the predetermined lower threshold value.
- The required threshold value is decreased exponentially with a predetermined time constant t_c.
- The predetermined period of time t_ep is shorter than or equal to 3.3 ms, or shorter than or equal to 2 ms, or even shorter than or equal to 1 ms.
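The time-dependent threshold for the second pulse can be sketched as below. The text gives no value for t_c, so the 2 ms used here is a placeholder, as are the function names and the example slopes. Starting the exponential decay exactly at t_ep is also an assumption.

```python
import math

def required_threshold(dt_ms, first_max_slope, lower_threshold,
                       t_ep_ms=1.5, t_c_ms=2.0):
    """Slope a later edge must exceed, dt_ms after the first selected edge."""
    if dt_ms < t_ep_ms:
        # Inside t_ep only an edge steeper than the first one is accepted.
        return first_max_slope
    decayed = first_max_slope * math.exp(-(dt_ms - t_ep_ms) / t_c_ms)
    return max(decayed, lower_threshold)   # never below the lower threshold

def select_second_edge(edges, first_edge, lower_threshold):
    """Return the first later (time_ms, max_slope) edge that beats the threshold."""
    t0, slope0 = first_edge
    for t, slope in edges:
        if t > t0 and slope > required_threshold(t - t0, slope0, lower_threshold):
            return (t, slope)
    return None

second = select_second_edge([(1.0, 0.5), (3.5, 0.25), (5.5, 0.25)],
                            first_edge=(0.0, 0.8), lower_threshold=0.02)
```

The edges at 1.0 ms and 3.5 ms are rejected because the threshold is still 0.8 and about 0.29 respectively; by 5.5 ms it has decayed to about 0.11, so the edge with slope 0.25 is accepted.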
- The shape of a selected leading edge of a pulse may represent an important feature for identification or representation of the corresponding distinct sound picture. Thus, it is preferred that the shape of the selected leading edges of pulses is determined, and/or a distinct sound picture is identified from the determined shape.
- The shape of the selected leading edge of a pulse is determined by the obtained maximum slope of the selected leading edge.
- The rise time of a selected leading edge of a pulse may also represent an important feature for identification or representation of the corresponding distinct sound picture.
- The shape of a selected leading edge of a pulse is characterised by the rise time of the edge, where the rise time is determined as the time period from t_b to t_e, or by the shape of the leading edge in the time period from t_b to t_e, where t_b is the point in time where the slope of the leading edge has reached a threshold value for the beginning of the edge, d_b, the ratio of said threshold value d_b to the obtained maximum slope being predetermined, and t_e is the point in time where the slope of the leading edge has decreased from the maximum value to a threshold value for the end of the edge, d_e, the ratio of said threshold value d_e …
- The value of d_b is in the range of 30-100% of the obtained maximum slope, and the value of d_e is in the range of 30-90% of the obtained maximum slope.
- The value of d_b may even more preferably be substantially equal to 50% or 100% of the obtained maximum slope, and the value of d_e may even more preferably be substantially equal to 70% of the obtained maximum slope.
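The definition of t_b, t_e and the rise time can be sketched on a sampled edge as follows. The slope is approximated by the first difference between neighbouring samples, which is an assumption, and the function name and sample values are hypothetical. The defaults correspond to d_b = 100% and d_e = 70% of the maximum slope.

```python
def edge_rise_time(samples, sample_rate_hz, db_ratio=1.0, de_ratio=0.7):
    """Return (t_b, t_e, rise_time) in seconds for the steepest edge in samples."""
    slopes = [b - a for a, b in zip(samples, samples[1:])]   # first difference
    i_max = max(range(len(slopes)), key=slopes.__getitem__)
    d_m = slopes[i_max]                                      # maximum slope
    # Walk back while the slope stays at or above d_b = db_ratio * d_m.
    i_b = i_max
    while i_b > 0 and slopes[i_b - 1] >= db_ratio * d_m:
        i_b -= 1
    # Walk forward until the slope has fallen to d_e = de_ratio * d_m.
    i_e = i_max
    while i_e < len(slopes) - 1 and slopes[i_e] > de_ratio * d_m:
        i_e += 1
    t_b, t_e = i_b / sample_rate_hz, i_e / sample_rate_hz
    return t_b, t_e, t_e - t_b

edge = [0, 1, 3, 7, 10, 12, 13, 13]          # slopes: 1, 2, 4, 3, 2, 1, 0
t_b, t_e, rise = edge_rise_time(edge, 1000)  # d_b = d_m, d_e = 0.7 * d_m
t_b_half, _, _ = edge_rise_time(edge, 1000, db_ratio=0.5)
```

With d_b equal to the maximum slope the edge starts at the steepest point (2 ms) and ends at 4 ms, where the slope has dropped below 70% of the maximum. Lowering d_b to 50% moves the start back to 1 ms.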
- The transient signal from which the leading edge or edges is/are selected is a transient signal generated in accordance with one of the embodiments referring to generation of the second transient signal.
- The system comprises means for deriving, from the auditory signal, a first signal comprising transient pulses corresponding to at least part of the abrupt energy changes, and means for generating a second transient signal from said first transient signal, said second signal generation means being adapted to hold the value of at least one local maximum of a pulse in the first transient signal at said maximum value for a predetermined period of time, t_rfpr, thereby generating a corresponding pulse in the second transient signal, said predetermined period of time t_rfpr being at most 5 ms.
- t_rfpr is at most 1 ms or about 0.7 ms.
- The invention also relates to a system for selecting leading edges of pulses in a transient signal, which signal represents abrupt energy changes within an auditory signal.
- The system comprises means for determining or measuring the maximum slope of a leading edge of a pulse in the transient signal, means for comparing the obtained maximum slope with a predetermined lower threshold value for maximum slopes of leading edges, and means for, based on the result of said comparison, selecting a candidate for the leading edge of a pulse.
- The means for determining or measuring the maximum slope of a leading edge of a pulse are further adapted to determine or measure the maximum slope or slopes of a leading edge or edges of one or more pulses subsequent to the selected candidate
- The comparing means are further adapted for comparing the obtained maximum slope or slopes of the subsequent leading edge or edges and the obtained maximum slope of the selected candidate with one another
- The selecting means are further adapted for, based on the result of said comparison, selecting the leading edge with the largest maximum slope.
- Any of the systems which comprises means for generating the second transient signal further comprises means for selecting leading edges of pulses in a transient signal in accordance with an embodiment of the present invention, the leading edges being selected from the second transient signal.
- Fig. 1 shows a filter bank with N bandpass filters
- Figs. 2 and 3 show transient detection signals of the speech signal "softkey" for two filters having different centre frequencies in a filter bank
- Fig. 4 shows the transient detection signals of Fig. 3 for the vowel "i" as in key
- Fig. 5 shows transient detection signals corresponding to the speech signal of Fig. 4, with the speech signal being processed according to a preferred embodiment of refractoriness period processing
- Fig. 6 shows transient detection signals corresponding to the speech signal of Fig. 4, with the speech signal being processed according to another preferred embodiment of refractoriness period processing
- Fig. 7 illustrates selection of a leading edge of a transient pulse according to a preferred embodiment of the invention
- Fig. 8 illustrates the principles of determination of maximum slope and rise time of a leading edge of a transient pulse
- Fig. 9 shows transient detection signals, including an edge signal and a measure of the pitch period, corresponding to the speech signal "softkey" pronounced by a female
- Fig. 10 shows transient detection signals, including an edge signal and a measure of the pitch period, corresponding to the vowel "i" as in key
- Fig. 11 shows the edge signal of Fig. 10 filtered by a bandpass filter
- Fig. 12 is a flow diagram illustrating a preferred embodiment of refractoriness period processing
- Fig. 13 is a flow diagram illustrating a preferred embodiment of detection of a leading edge
- Fig. 14 is a plot of the bandwidths of cochlea bandpass filters as a function of centre frequency
- Fig. 15 is a plot of rise times of input and output pulses of a bandpass filter and of the impulse response of the filter.
- The cochlea in the human ear can be regarded as an infinite number of bandpass filters, IBP, within the frequency range of the human ear.
- A filter bank may be employed for detecting formants and thereby detecting the transient conditions that hold the most well qualified information with a substantial suppression of noise.
- The bandwidth of the bandpass filters is chosen to be the same for all filters in order to obtain the same envelope.
- Another choice might be to scale the bandwidth of the filters in accordance with the Bark scale or Mel scale.
- Fig. 1 shows a filter bank with N bandpass filters, BP_1 to BP_N, followed by an envelope detection performed by use of rectification means, R_1 to R_N, and lowpass filters, LP_1 to LP_N.
- The rectification means are preferably one-way rectification means.
- The filter bank has to cover the transient oriented frequency range, and the centre frequency of the bandpass filters therefore has to be from about 1.4 kHz and upwards. To be able to detect sufficiently fast transients the bandwidth has to be about 1.4 kHz.
- In Figs. 2 and 3 the transient detection by means of a filter bank is illustrated.
- Figs. 2 and 3 show processed curves for the word "softkey" pronounced by a female and detected by means of two different bandpass filters.
- The abscissas represent a time interval of 1 s and the ordinates in Figs. 2a, 2b, 3a and 3b represent the sound pressure of the corresponding speech signal, whereas the ordinates of Figs. 2c and 3c represent the energy of the corresponding speech signal.
- The bandpass filters are Butterworth filters of 6th order with a bandwidth of 1.4 kHz.
- In Fig. 2, the centre frequency is about 1.5 kHz with a lower cutoff frequency at about 0.8 kHz and an upper cutoff frequency at about 2.2 kHz.
- In Fig. 3, the centre frequency is about 2.8 kHz with a lower cutoff frequency at about 2.1 kHz and an upper cutoff frequency at about 3.5 kHz.
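The quoted cutoff frequencies follow from the centre frequency and the 1.4 kHz bandwidth if the band is assumed to be placed symmetrically around the centre on a linear frequency axis. A short check, with a hypothetical function name:

```python
def band_edges_hz(centre_hz, bandwidth_hz=1400):
    """Lower and upper cutoff for a band placed symmetrically around centre_hz."""
    half = bandwidth_hz / 2
    return centre_hz - half, centre_hz + half

first_band = band_edges_hz(1500)    # filter with centre frequency of about 1.5 kHz
second_band = band_edges_hz(2800)   # filter with centre frequency of about 2.8 kHz
```

This reproduces the 0.8 kHz to 2.2 kHz and 2.1 kHz to 3.5 kHz bands stated above.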
- The lowpass filter is a 1st order Butterworth filter with a cutoff frequency at 700 Hz, and the pretransient signal is the output signal from the bandpass filter.
- In Fig. 2c the vowel "o" is very outstanding in the transient signal, but the other phonemes are very indistinct.
- In Fig. 3c the vowel "o" is less outstanding but the other phonemes are much more distinct.
- The conclusion may be drawn that the vowel "o" should preferably be detected from the transient signal processed by the bandpass filter with a centre frequency at 1.5 kHz, and the remaining phonemes should preferably be detected from the transient signal processed by the bandpass filter with a centre frequency at 2.8 kHz.
- Each branch can be regarded as a TSD (Transient Signal Detector).
- The number of branches in the system depends on the demands on the system, but the number should be in the range of 2-40.
- TSD1: the TSD used in connection with the results of Fig. 2, having a centre frequency at 1.5 kHz
- TSD2: the TSD used in connection with the results of Fig. 3, having a centre frequency at 2.8 kHz
- Fig. 1 then illustrates a TSD bank.
- Important features of fast energy changes of an auditory signal for identifying or representing features that can be perceived by a human ear as representing a distinct sound picture may be the shape of the leading edge and the period between the leading edges.
- This period is called the refractoriness period.
- The nerve pulses launched from the cochlea are synchronized to the frequency of a sine tone if the frequency is less than about 1.4 kHz but not above this frequency.
- This means that the refractoriness period of interest may be about 0.7 ms.
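One way to read the 0.7 ms figure, offered here as an interpretation rather than something the text states explicitly, is as the period of a tone at the 1.4 kHz synchronisation limit:

```latex
T = \frac{1}{f} = \frac{1}{1.4\ \text{kHz}} \approx 0.71\ \text{ms}
```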
- The refractoriness period may be used for simplifying the process of detecting the leading edge of a transient pulse in the transient component.
- Fig. 4 shows part of the curves of Fig. 3 processed by TSD2. The curves shown in Fig. 4 represent the signals obtained for the vowel "i" as in key.
- The transient signal of Fig. 4c is processed without a refractoriness period.
- Figs. 5a and 6a are identical to Fig. 4a, and Figs. 5b and 6b are identical to Fig. 4b.
- The transient signal of Fig. 5c, which represents the energy of the corresponding speech signal, is obtained from the bandpass filtered pretransient signal in Fig. 5b by way of a rectification and by using a refractoriness period of 1 ms.
- The signal of Fig. 5c has not been subject to a lowpass filtration. It is preferred that the implementation of the refractoriness period is performed by using a software algorithm which is described below in connection with Fig. 12.
- Fig. 6c shows a transient signal which represents the energy of the corresponding speech signal and which is obtained by performing a lowpass filtration on the signal of Fig. 5c.
- All the signals of Figs. 4 and 5 hold the speech information and may easily be perceived by a human ear, although some noise is introduced during the process of transient detection resulting in the signals of Figs. 5c and 6c.
- The abscissas represent a time interval of 50 ms.
- The refractoriness period may be about 0.5 ms or longer but preferably less than the minimum pitch period, that means less than about 3.3 ms.
- The shape of the leading edge may be one of the important features for representing a sound picture, and the maximum slope of the leading edge may be an important feature for the edge.
- The maximum slope of the leading edge may be the basis for detecting the important features for identifying or representing a distinct sound picture.
- In Fig. 7 the abscissa represents a time interval of 50 ms, and the signals of Figs. 7a, b and c correspond to the signals of Figs. 6a, b and c, whereas in Fig. 7d the differentiated signal of the signal of Fig. 7c, called differential signal, is shown.
- d_em: a predetermined minimum value
- The size of d_em may depend on how the signal is normalised.
- The signals of Figs. 2-7 are normalised to the maximum numerical value in the whole signal, and d_em is preferably selected to be 2.5% of the maximum detected slope value.
- d_em may be selected otherwise, and preferably higher.
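The normalisation and the choice of d_em can be sketched as follows. The slope is again approximated by the first difference, which is an assumption, and the function names and sample values are hypothetical.

```python
def normalise(samples):
    """Scale a signal by its maximum numerical (absolute) value."""
    peak = max(abs(x) for x in samples)
    return [x / peak for x in samples] if peak else list(samples)

def minimum_slope_threshold(samples, fraction=0.025):
    """d_em taken as a fraction (2.5% by default) of the maximum detected slope."""
    norm = normalise(samples)
    slopes = [b - a for a, b in zip(norm, norm[1:])]
    return fraction * max(slopes)

d_em = minimum_slope_threshold([0, 2, 4, 2, 0, -1])
```

Here the normalised signal has a maximum slope of 0.5 per sample, so d_em comes out as 0.0125.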
- The maximum slope may be detected by finding a maximum greater than the threshold d_em and selecting this as a candidate to be the maximum slope of a leading edge, called d_m. If there is a greater maximum slope within a given search time, t_s, then that point is chosen as having the maximum slope of a leading edge; otherwise the candidate is chosen.
- The search time t_s may be selected to be less than the minimum pitch period, which means less than about 3.3 ms, but preferably around 2 ms.
- The following leading edge may be detected as illustrated in Fig. 7d.
- When the point of maximum slope for a leading edge is detected, then for a time period, t_ep, only a maximum slope greater than the previous maximum slope will be accepted; in other words, in this time period the threshold for accepting a leading edge is equal to the previous maximum slope.
- The threshold may be decreased exponentially with a time constant t_c, which is also illustrated in Fig. 7d.
- The time period t_ep may be less than the minimum pitch period, that means less than about 3.3 ms, but preferably between 1 and 2 ms. However, t_ep should be longer than or equal to the search time t_s.
- A leading edge may be described as beginning at a point in time, t_b, where the slope has its maximum, or at a point in time before the point of maximum slope where the slope has reached a threshold value, d_b, having a predetermined ratio to the maximum slope, and ending at the point, t_e, after the point of maximum slope, where the slope has decreased to a threshold value, d_e, having a predetermined ratio to the maximum slope.
- This principle is illustrated in Fig. 8, where the amplitude of the leading edge is shown as A in Fig. 8a, and the differential of the leading edge is shown as D in Fig. 8b.
- In Figs. 9 and 10 an edge detection following the above defined edge detector principles is illustrated.
- The abscissas in Fig. 9 represent a time interval of 1 s, while a time interval of 50 ms of the signals in Fig. 9 is represented in Fig. 10, in which time interval the signals for the vowel "i" in the word key are shown.
- The transient signal of Figs. 9c and 10c has been processed in accordance with the signal presented in Fig. 6c, and a leading edge signal named edge signal, see Figs. 9d and 10d, has been obtained by determining the rise time of selected leading edges.
- A graph of the pitch period between the selected edges is shown in Figs. 9e and 10e. If the pitch period is longer than 15 ms it is set equal to 15 ms. A low resolution is obtained in the printout of Fig. 9d due to a limited printer resolution.
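The pitch period curve with its 15 ms ceiling can be sketched from the times of the selected edges. The function name and the edge times are hypothetical.

```python
def pitch_periods_ms(edge_times_ms, ceiling_ms=15.0):
    """Time between succeeding selected edges, limited to ceiling_ms."""
    return [min(b - a, ceiling_ms)
            for a, b in zip(edge_times_ms, edge_times_ms[1:])]

periods = pitch_periods_ms([0, 5, 10, 40, 45])
```

The 30 ms gap between the edges at 10 ms and 40 ms is reported as 15 ms, while the 5 ms periods pass through unchanged.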
- The transient signal detector TSD2 is used when processing the signals of Figs. 9 and 10.
- The maximum slopes of pulses in the transient signal, Figs. 9c and 10c, are determined, and for the selected leading edges the starting point in time, t_b, of the edge is set equal to the point in time where the maximum slope is detected, i.e. d_b is equal to d_m.
- t_e is equal to the point in time where the slope has decreased to 70 % of d_m, i.e. d_e is equal to 70 % of d_m.
- The part of the leading edge of a pulse in the transient signal corresponding to the time interval from t_b to t_e is represented as the leading edge of a pulse in the edge signal, Figs. 9d and 10d.
- The edge signal holds the full speech information and may easily be perceived by a human ear, although some noise may be introduced during the processing.
- The leading edge may be defined as beginning at a leading threshold value, d_b, greater than 50 % of the maximum slope, but preferably equal to the maximum slope, and ending at a lagging threshold value, d_e, greater than 50 % of the maximum slope, but preferably 70 % of the maximum slope.
- The rise time of the leading edge may be defined as the time period between t_b and t_e, and may in a preferred embodiment be used as a measure of the shape of the leading edge, thus forming the basis for identification of a distinct sound picture.
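The threshold-based definition of t_b, t_e and the rise time can be illustrated with a short sketch. It assumes a single leading edge in a sampled segment and approximates the slope by the first difference; the function name `edge_extent`, the parameter names `thr_b` and `thr_e`, and the sample data are hypothetical. The defaults mirror the preferred values in the text (d_b equal to d_m, d_e equal to 70 % of d_m).

```python
def edge_extent(signal, thr_b=1.0, thr_e=0.7):
    """Return (t_b, t_e, rise_time) in samples for one leading edge, or None.

    The slope is the first difference of the signal, d_m is its maximum,
    d_b = thr_b * d_m and d_e = thr_e * d_m.
    """
    slope = [b - a for a, b in zip(signal, signal[1:])]
    t_m = max(range(len(slope)), key=slope.__getitem__)
    d_m = slope[t_m]
    if d_m <= 0:
        return None  # no rising edge in this segment

    # Walk backwards from the maximum while the slope is still at or above d_b.
    t_b = t_m
    while t_b > 0 and slope[t_b - 1] >= thr_b * d_m:
        t_b -= 1

    # Walk forwards until the slope has decreased to d_e or below.
    t_e = t_m
    while t_e < len(slope) - 1 and slope[t_e] > thr_e * d_m:
        t_e += 1

    return t_b, t_e, t_e - t_b


# Hypothetical pulse whose slope is [1, 2, 4, 3, 2, 1, 0].
pulse = [0, 1, 3, 7, 10, 12, 13, 13]
preferred = edge_extent(pulse)             # d_b = d_m, d_e = 70 % of d_m
wider = edge_extent(pulse, thr_b=0.5)      # d_b = 50 % of d_m
```

With `thr_b=1.0` the edge starts at the sample of maximum slope, so the rise time measures only the decay of the slope to the lagging threshold; lowering `thr_b` moves t_b earlier and lengthens the measured rise time.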
- The pulses of the edge signal may also be chosen as the basis for identification of a distinct sound picture.
- The edge detector can be used as a pitch detector, but known techniques for pitch detection can also be applied.
- The shape of the leading edge of a speech signal, which signal may be a phoneme, may be considered a conclusive feature for narrow band communication. Therefore, only information about the leading edge, unvoiced or voiced, and/or pitch period, and/or loudness of the speech signal should need to be transmitted. Thus, it should not be necessary to transmit information concerning the vocal filter, thereby saving bandwidth.
- Information about a speech signal being unvoiced or voiced, and/or the pitch period and/or loudness of the speech signal may be compressed and decompressed by means of known technology, in which speech signals are framed in time periods of 20-40 ms, and only the change in the parameters needs to be transmitted.
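Transmitting only the change in the parameters from frame to frame can be sketched as a simple delta scheme. The text does not name the "known technology", so this is an assumption made for illustration; the parameter names (`voiced`, `pitch_ms`, `loudness`) and both function names are hypothetical.

```python
def encode_changes(frames):
    """Encode per-frame parameter dicts as a first frame plus per-frame changes."""
    if not frames:
        return []
    encoded = [dict(frames[0])]  # the first frame is sent in full
    previous = frames[0]
    for frame in frames[1:]:
        # Keep only the parameters whose value differs from the previous frame.
        encoded.append({k: v for k, v in frame.items() if previous.get(k) != v})
        previous = frame
    return encoded


def decode_changes(encoded):
    """Rebuild the full per-frame parameters from the encoded changes."""
    frames = []
    current = {}
    for update in encoded:
        current = {**current, **update}
        frames.append(current)
    return frames


# Hypothetical parameters for three consecutive 20-40 ms frames.
frames = [
    {"voiced": True, "pitch_ms": 8.0, "loudness": 0.5},
    {"voiced": True, "pitch_ms": 8.0, "loudness": 0.6},
    {"voiced": False, "pitch_ms": 8.0, "loudness": 0.6},
]
encoded = encode_changes(frames)
```

In this example the second and third frames each carry a single changed parameter instead of all three.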
- The leading edge may be compressed by identifying and representing the edge according to one of the embodiments of the present invention, for time frames of 20-40 ms, by means of a template identification from a library or a book.
- The speech signal may be decompressed by means of a library or book of edge templates with corresponding standard filters, which filters should be excited by the edge template. Otherwise the speech signal may be decompressed by means of a library or book with standard waveforms identified by means of the edge template identification.
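Template identification from a library can be sketched as a nearest-neighbour lookup. The text does not specify how an edge is matched to a template, so the squared-difference distance used here is an assumption, as is the premise that edges and templates have been resampled to the same length and normalised in height. The function name and the three-entry library are hypothetical.

```python
def nearest_template(edge, library):
    """Return the index of the library template closest to `edge`.

    Closeness is the sum of squared sample differences (an assumed measure).
    """
    def distance(template):
        return sum((a - b) ** 2 for a, b in zip(edge, template))

    return min(range(len(library)), key=lambda i: distance(library[i]))


# Hypothetical library of normalised edge shapes, four samples each.
library = [
    [0.0, 0.2, 0.6, 1.0],
    [0.0, 0.5, 0.9, 1.0],
    [0.0, 0.1, 0.2, 1.0],
]
observed_edge = [0.0, 0.45, 0.85, 1.0]
template_index = nearest_template(observed_edge, library)
```

Only the index needs to be sent per frame; a decoder holding the same library can look up the template and use it to excite the corresponding standard filter.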
- Fig. 11 shows the edge signal of Fig. 10d filtered with the same bandpass filter used for processing the pretransient signal, Fig. 10b, i.e. the centre frequency is about 2.8 kHz with a lower cutoff frequency of about 2.1 kHz and an upper cutoff frequency of about 3.5 kHz.
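A bandpass filter with roughly these characteristics can be sketched as a second-order (biquad) section. The text gives only the centre and cutoff frequencies, so the filter order, the coefficient formulas (the widely used constant-peak-gain biquad bandpass design) and the 16 kHz sampling rate are all assumptions made for illustration, and the resulting cutoffs are only approximately 2.1 kHz and 3.5 kHz.

```python
import cmath
import math


def bandpass_biquad(f_center, f_low, f_high, fs):
    """Second-order bandpass coefficients (b, a), normalised so a[0] == 1."""
    q = f_center / (f_high - f_low)          # quality factor from the bandwidth
    w0 = 2.0 * math.pi * f_center / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [alpha / a0, 0.0, -alpha / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def apply_filter(b, a, x):
    """Filter the sequence x with the difference equation defined by (b, a)."""
    y = []
    for n in range(len(x)):
        acc = b[0] * x[n]
        if n >= 1:
            acc += b[1] * x[n - 1] - a[1] * y[n - 1]
        if n >= 2:
            acc += b[2] * x[n - 2] - a[2] * y[n - 2]
        y.append(acc)
    return y


def gain(b, a, f, fs):
    """Magnitude of the frequency response at frequency f."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * f / fs)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return abs(num / den)


FS = 16000.0  # assumed sampling rate in Hz, not stated in the text
b, a = bandpass_biquad(2800.0, 2100.0, 3500.0, FS)
impulse_response = apply_filter(b, a, [1.0] + [0.0] * 31)
```

The design has unit gain at the centre frequency and attenuates components well below or above the passband, which is the qualitative behaviour the comparison between Fig. 10d and Fig. 11 relies on.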
- The sound quality of the signal represented in Fig. 11 is improved when compared to the signal of Fig. 10d.
- The signal of Fig. 11 may be compared with the pretransient signal of Fig. 10b.
- The edge signal may be processed by means of a filter with another filter characteristic or by means of waveform decoding.
- Fig. 12 shows a preferred embodiment of implementation of the refractoriness period.
- The definitions of the flow chart variables of the process of Fig. 12 are given as follows:
- PrvSi: Value of the previous input sample (Si(n-1), n > 0).
- LeadingEdge: A Boolean variable; it is true if the sample is in a leading edge or in a refractoriness period, else it is false.
- Fig. 13a shows a preferred embodiment of implementation of the edge detection principle.
- d: Differentiated transient signal (differential signal).
- n: Index for samples of the differential signal.
- d_prv: A help variable, mostly the previous sample of the differential signal.
- d_em: Relative minimum threshold for the differential signal.
- d_m: Maximum slope for the edge.
- t_s: Search time in samples for the greatest local maximum of the slope greater than d_m.
- t_m: Sample no. for the detected maximum slope.
- k: Index for the detected edge.
- thr: Predetermined ratio of the threshold value for the slope at the beginning of the edge, d_b, to the maximum slope d_m.
- thr_c: Predetermined ratio of the threshold value for the slope at the end of the edge, d_e, to the maximum slope d_m.
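How these variables might interact can be sketched as a search for the greatest local maximum of the slope within a search time t_s. This is not a transcription of the flow chart of Fig. 13a, which is not reproduced here. It is a guess at one plausible reading, in which d_em is treated as a plain minimum value for a candidate maximum and every greater local maximum found within t_s samples restarts the search. The function name and sample data are hypothetical.

```python
def find_max_slopes(d, d_em, t_s):
    """Return the sample numbers t_m of the detected maximum slopes.

    d    : differential signal (list of slope values)
    d_em : minimum value a local maximum must exceed (assumed interpretation)
    t_s  : search time in samples after a candidate maximum
    The position in the returned list plays the role of the edge index k.
    """
    detected = []
    d_m, t_m = None, None  # current candidate maximum slope and its sample no.
    for n in range(1, len(d) - 1):
        if d_m is not None and n - t_m > t_s:
            detected.append(t_m)  # search time expired: accept the candidate
            d_m, t_m = None, None
        is_local_max = d[n - 1] < d[n] >= d[n + 1]
        if is_local_max and d[n] > d_em and (d_m is None or d[n] > d_m):
            d_m, t_m = d[n], n    # greater maximum found: restart the search
    if d_m is not None:
        detected.append(t_m)
    return detected


# Hypothetical differential signal with a small and a large maximum close
# together, followed by a separate maximum later on.
d = [0, 1, 0.5, 3, 1, 0, 0, 0, 0, 2, 0]
max_slope_samples = find_max_slopes(d, d_em=0.8, t_s=3)
```

In the example the maximum at sample 1 is superseded by the greater maximum at sample 3 within the search time, so only samples 3 and 9 are reported as edges.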
- Fig. 15 illustrates that if the rise time of a pulse provided as an input to a filter is slower than the rise time of the impulse response of the filter, then the rise time of the output of the filter generated in response to the input pulse will be substantially equal to the rise time of the input pulse.
- Signal processing of sound signals in the cochlea may be simulated by a filter bank comprising a set of bandpass filters with different centre frequencies, wherein the bandwidths of these filters increase with increasing centre frequencies, which in turn means that the rise times of the impulse responses of the filters decrease with increasing centre frequencies.
- The rise time of an output pulse generated by a corresponding filter of the filter bank will be substantially equal to the rise time of the impulse response of the filter when the rise time of the input pulse is faster than the rise time of the impulse response of the filter, and substantially equal to the rise time of the input pulse when the rise time of the input pulse is slower than the rise time of the impulse response of the filter.
- The rise time of the input pulse may be determined by identifying the two filters A and B of the filter bank that have the narrowest bandwidths among the filters generating output pulses, in response to the input pulse, with substantially identical rise times. The rise time of the input pulse must then lie within the range between the rise time of the impulse response of whichever of the filters A, B has the narrowest bandwidth and the rise time of the impulse response of the filter with the largest bandwidth that is also lower than the bandwidths of the filters A, B.
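The bracketing argument above can be sketched numerically. The sketch idealises the rule from the text as "the slower of the two rise times determines the output rise time" and assumes that the impulse-response rise time of every filter is known; both function names, the tolerance and the sample values are hypothetical.

```python
def output_rise_time(input_rise, filter_rise):
    """Idealised rule from the text: the slower rise time dominates."""
    return max(input_rise, filter_rise)


def bracket_input_rise_time(filter_rises, output_rises, tolerance=0.05):
    """Bracket the input rise time from filter-bank observations.

    filter_rises : impulse-response rise time of each filter
    output_rises : measured rise time of each filter's output pulse
    Returns (lower, upper); either bound is None if it cannot be determined.
    """
    passing = []   # filters faster than the input: output follows the input
    limited = []   # filters slower than the input: output follows the filter
    for filter_rise, out_rise in zip(filter_rises, output_rises):
        if out_rise > filter_rise * (1.0 + tolerance):
            passing.append(filter_rise)
        else:
            limited.append(filter_rise)
    lower = max(passing) if passing else None   # slowest filter that still passes
    upper = min(limited) if limited else None   # fastest filter that limits
    return lower, upper


# Hypothetical filter bank (rise times in ms) and an input pulse of 0.6 ms.
bank = [0.2, 0.4, 0.8, 1.6]
outputs = [output_rise_time(0.6, f) for f in bank]
bounds = bracket_input_rise_time(bank, outputs)
```

Here the two fastest filters both report 0.6 ms, the identical output rise times mentioned in the text, so the input rise time is bracketed between 0.4 ms and 0.8 ms. A denser filter bank narrows the bracket.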
- Speech signals may be generated by modulation of pulses in a filter that modulates the shape of the pulses as described above.
- Pulses to be modulated correspond to sound signals generated in the articulation channel, e.g. by the vocal cords, and the processing in the filters corresponds to the modulation performed by adjustment of the articulation channel according to the phoneme processed, whereby the filters modulate the shape of the pulses.
- The time between pulses to be modulated should be sufficiently long to ensure that there is no interference between output pulses generated in response to different input pulses.
- The shape of the leading edge and the rise time may both be conclusive features.
- The leading edge may be detected as described above, and in a preferred embodiment the edge detection is based on a transient signal processed with a refractoriness period, either without lowpass filtering as shown in Fig. 5, or with a lowpass filter as shown in Fig. 6.
- A phoneme may be identified by means of features such as a classification of the shape of the leading edges, mean pitch period, variation of pitch periods, and/or dynamic trend of the edge height in a time frame of 10-100 ms.
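Three of the listed features can be computed directly from the detected edges in one time frame. The following is a sketch under stated assumptions: the variation of pitch periods is taken as their population standard deviation, and the dynamic trend of the edge height as the least-squares slope of height against time. Neither choice is specified in the text. The function name and sample data are hypothetical, and the shape classification is left out.

```python
import statistics


def frame_features(edge_times_ms, edge_heights):
    """Compute pitch and edge-height features for the edges in one frame.

    Requires at least two edges with distinct times.
    """
    periods = [b - a for a, b in zip(edge_times_ms, edge_times_ms[1:])]

    # Least-squares slope of edge height against time (height units per ms).
    t_mean = statistics.mean(edge_times_ms)
    h_mean = statistics.mean(edge_heights)
    numerator = sum(
        (t - t_mean) * (h - h_mean) for t, h in zip(edge_times_ms, edge_heights)
    )
    denominator = sum((t - t_mean) ** 2 for t in edge_times_ms)

    return {
        "mean_pitch_period_ms": statistics.mean(periods),
        "pitch_period_variation_ms": statistics.pstdev(periods),
        "edge_height_trend_per_ms": numerator / denominator,
    }


# Hypothetical frame: four edges 8 ms apart with steadily growing height.
features = frame_features([0.0, 8.0, 16.0, 24.0], [1.0, 1.2, 1.4, 1.6])
```

A classifier for phonemes would combine such values with a shape class for the leading edges.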
- The present invention is preferably implemented utilizing a programmed processor such as a microcomputer for real time applications, but this is not to be limiting.
- The present invention may also be implemented using a dedicated hardware processor, if desired, or by a more powerful mainframe computer without departing from the present invention.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU67856/96A AU6785696A (en) | 1995-09-05 | 1996-09-04 | Method and system for processing auditory signals |
EP96928357A EP0850472A2 (fr) | 1995-09-05 | 1996-09-04 | Procede et systeme de traitement de signaux auditifs |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DK0974/95 | 1995-09-05 | ||
DK97495 | 1995-09-05 |
Publications (3)
Publication Number | Publication Date |
---|---|
WO1997009712A2 (fr) | 1997-03-13 |
WO1997009712A3 (fr) | 1997-04-10 |
WO1997009712B1 (fr) | 1997-05-22 |
Family
ID=8099600
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/DK1996/000370 WO1997009712A2 (fr) | 1995-09-05 | 1996-09-04 | Procede et systeme de traitement de signaux auditifs |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP0850472A2 (fr) |
AU (1) | AU6785696A (fr) |
WO (1) | WO1997009712A2 (fr) |
-
1996
- 1996-09-04 EP EP96928357A patent/EP0850472A2/fr not_active Withdrawn
- 1996-09-04 AU AU67856/96A patent/AU6785696A/en not_active Abandoned
- 1996-09-04 WO PCT/DK1996/000370 patent/WO1997009712A2/fr not_active Application Discontinuation
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3564142A (en) * | 1967-08-03 | 1971-02-16 | Ibm | Method of multiplex speech synthesis |
FR2274101A1 (fr) * | 1974-06-04 | 1976-01-02 | Fuji Xerox Co Ltd | Procede de reconnaissance de la voix et dispositif mettant en application ce procede |
US4382164A (en) * | 1980-01-25 | 1983-05-03 | Bell Telephone Laboratories, Incorporated | Signal stretcher for envelope generator |
WO1994025958A2 (fr) * | 1993-04-22 | 1994-11-10 | Frank Uldall Leonhard | Procede et systeme de detection et de production de phenomenes transitoire dans des signaux sonores |
Non-Patent Citations (4)
Title |
---|
DATABASE INSPEC INSTITUTE OF ELECTRICAL ENGINEERS, STEVENAGE, GB Inspec No. 5087386, KUMAR ET AL.: "Level crossing time interval circuit for micro-power analog VLSI auditory processing" XP002021408 & PROCEEDINGS OF 1995 IEEE WORKSHOP ON NEURAL NETWORKS FOR SIGNAL PROCESSING, 1 August 1995 - 2 September 1995, CAMBRIDGE, MA, US, pages 581-590, * |
DATABASE INSPEC INSTITUTE OF ELECTRICAL ENGINEERS, STEVENAGE, GB Inspec No. 5230761, VILLA ET AL.: "New perspectives in auditory coding: bases for a new cochlear behavioural model" XP002021409 & INTERNATIONAL WORKSHOP ON ARTIFICIAL NEURAL NETWORKS, PROCEEDINGS OF IWANN 95, 7 June 1996 - 9 June 1995, MALAGA-TORREMOLINOS, ES, pages 121-129, * |
IBM TECHNICAL DISCLOSURE BULLETIN, vol. 6, no. 7, December 1963, NEW YORK, US, pages 83-84, XP002021407 ANONYMOUS: "Speech Recognition System Using Formant Transient Detection. December 1963." * |
PROCEEDINGS OF THE NATIONAL AEROSPACE AND ELECTRONICS CONFERENCE (NAECON), vol. 1, 21 - 25 May 1990, DAYTON, OH, US, pages 57-63, XP000301963 AHN ET AL.: "Cochlear modeling using a general purpose digital signal processor" * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1293961A1 (fr) * | 1998-03-13 | 2003-03-19 | LEONHARD, Frank Uldall | Procédé de traitement de signaux pour l'analyse des transitoires de signaux vocaux |
WO2002025997A1 (fr) * | 2000-09-20 | 2002-03-28 | Leonhard Research A/S | Controle de qualite de transducteurs electroacoustiques |
WO2002080618A1 (fr) * | 2001-03-30 | 2002-10-10 | Leonhard Research A/S | Suppression du bruit dans la mesure d'un signal repetitif |
WO2010086194A3 (fr) * | 2009-01-30 | 2011-09-29 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Appareil, procédé et programme informatique pour la manipulation d'un signal audio comprenant un événement transitoire |
US9230557B2 (en) | 2009-01-30 | 2016-01-05 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus, method and computer program for manipulating an audio signal comprising a transient event |
Also Published As
Publication number | Publication date |
---|---|
EP0850472A2 (fr) | 1998-07-01 |
AU6785696A (en) | 1997-03-27 |
WO1997009712A3 (fr) | 1997-04-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5884260A (en) | Method and system for detecting and generating transient conditions in auditory signals | |
US3855416A (en) | Method and apparatus for phonation analysis leading to valid truth/lie decisions by fundamental speech-energy weighted vibratto component assessment | |
US8488800B2 (en) | Segmenting audio signals into auditory events | |
CA2448182C (fr) | Segmentation de signaux audio en evenements auditifs | |
EP2549475A1 (fr) | Segmentation de signaux audio en évenements auditifs | |
AU2002252143A1 (en) | Segmenting audio signals into auditory events | |
EP0182989B1 (fr) | Normalisation de signaux de parole | |
WO1990011593A1 (fr) | Procede et appareil d'analyse de paroles | |
US5960373A (en) | Frequency analyzing method and apparatus and plural pitch frequencies detecting method and apparatus using the same | |
JPH0431898A (ja) | 音声雑音分離装置 | |
US5483617A (en) | Elimination of feature distortions caused by analysis of waveforms | |
Smith | A phoneme detector | |
WO1997009712A2 (fr) | Procede et systeme de traitement de signaux auditifs | |
EP1293961B1 (fr) | Procédé de traitement de signaux pour l'analyse des transitoires de signaux vocaux | |
US4982433A (en) | Speech analysis method | |
Kajita et al. | Subband-autocorrelation analysis and its application for speech recognition | |
RU2174714C2 (ru) | Способ выделения основного тона | |
KR100359988B1 (ko) | 실시간 화속 변환 장치 | |
WO1997009712B1 (fr) | Procede et systeme de traitement de signaux auditifs | |
Kiukaanniemi et al. | Long-term speech spectra: A computerized method of measurement and a comparative study of Finnish and English data | |
WO1993009531A1 (fr) | Traitement de signaux electriques et sonores | |
JPS61126600A (ja) | 音響波入力処理方法 | |
SU1111199A1 (ru) | Способ спектрального представлени вокализованного речевого сигнала | |
David et al. | Technique for Coding Speech Signals for Transmission over a Reduced Capacity Digital Channel | |
JPS61273599A (ja) | 音声認識装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AK | Designated states | Kind code of ref document: A2 Designated state(s): AL AM AT AT AU AZ BB BG BR BY CA CH CN CU CZ CZ DE DE DK DK EE EE ES FI FI GB GE HU IL IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SK TJ TM TR TT UA UG US UZ VN AM AZ BY KG KZ MD RU TJ TM |
| AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT |
| AK | Designated states | Kind code of ref document: A3 Designated state(s): AL AM AT AT AU AZ BB BG BR BY CA CH CN CU CZ CZ DE DE DK DK EE EE ES FI FI GB GE HU IL IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SK TJ TM TR TT UA UG US UZ VN AM AZ BY KG KZ MD RU TJ TM |
| AL | Designated countries for regional patents | Kind code of ref document: A3 Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT |
| DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| WWE | Wipo information: entry into national phase | Ref document number: 1996928357 Country of ref document: EP |
| WWP | Wipo information: published in national office | Ref document number: 1996928357 Country of ref document: EP |
| REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
| NENP | Non-entry into the national phase in: | Ref country code: CA |
| WWW | Wipo information: withdrawn in national office | Ref document number: 1996928357 Country of ref document: EP |