US20060100828A1 - Impulse event separating apparatus and method - Google Patents
- Publication number
- US20060100828A1 (US 11/270,622)
- Authority
- US
- United States
- Prior art keywords
- power
- event
- variation
- frequency sub
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01H—MEASUREMENT OF MECHANICAL VIBRATIONS OR ULTRASONIC, SONIC OR INFRASONIC WAVES
- G01H3/00—Measuring characteristics of vibrations by using a detector in a fluid
- G01H3/04—Frequency
- G01H3/08—Analysing frequencies present in complex vibrations, e.g. comparing harmonics present
Definitions
- the present invention relates to an impulse event separating apparatus and method, and, more particularly, to a method of separating an impulse event from a successive sound, and an apparatus to perform the method.
- An impulse event, that is, an impact sound, is generated by mechanical interaction between objects, and has a short duration and a high intensity.
- the impact sound occurs suddenly in background sounds which are relatively stable and can be estimated.
- the impact sound can be modeled as a zero-state impulse response of a linear system.
- Examples of impact sounds include a simplex sound, such as the sound made by striking glass with a rod, and a complex sound, such as an explosive sound or the sound made when a coin falls to the floor.
- the impact sound generally has an onset stage and an attenuating stage.
- the physical event making the impact sound has a short duration and a high intensity. If the onset is detected, the start of the impact sound can be determined.
- an ideal impulse signal is linearly attenuated in the attenuating stage. That is, the log of the energy decays with a substantially linear slope. According to this property, the event can be tracked, and the energy distribution of the impact sound can be calculated.
- Since the successive sounds in which the impact sound and the non-impact sound are mixed generally share frequency bands and overlap each other in the time domain, the impact sound must be distinguished from these successive sounds.
- the present invention provides an impulse event separating method, and an apparatus to perform the method, of detecting an onset from an input audio signal in each frequency band, detecting an event using the onset, and determining whether the event is an impulse event.
- an impulse event separating apparatus comprising a preprocessing unit which divides an input signal into frame units; an event detecting unit which divides the frame into a plurality of frequency sub-bands, obtains power variations and phase variations of the signals of each of the sub-bands to detect a plurality of onsets, and detects a plurality of events using the detected onsets; an event buffer which stores the detected events; and an impulse event determining unit which determines whether the detected events comprise an impulse event with reference to an impulse event property.
- an impulse event separating method comprising dividing an input signal into frame units and dividing each frame into a plurality of frequency sub-bands; obtaining a power variation and phase variation of the signal of each of the frequency sub-bands, and detecting a plurality of local onsets using the power variation and the phase variation; obtaining a global onset from the local onsets and triggering a plurality of event components using the local onsets and the global onset; tracking and combining the event components in each of the frequency sub-bands to form events; and determining whether the events comprise an impulse event with reference to an impulse event property.
- FIG. 1 is a block diagram illustrating an impulse event separating apparatus according to an embodiment of the present invention;
- FIG. 2 is a block diagram illustrating an event detecting unit shown in FIG. 1;
- FIG. 3 is a block diagram illustrating a scout shown in FIG. 2;
- FIG. 4 is a block diagram illustrating a local onset detecting unit shown in FIG. 3;
- FIG. 5 defines an external and internal domain so as to combine a plurality of local onsets;
- FIG. 6 illustrates an example of tracking ECs by a frequency sub-band; and
- FIGS. 7A through 7D illustrate the result of approximating a log power signal of an input signal.
- FIG. 1 is a block diagram illustrating an impulse event separating apparatus according to an embodiment of the present invention.
- the impulse event separating apparatus includes a preprocessing unit 10 , an event detecting unit 11 , an event buffer 12 , and an impulse event determining unit 13 .
- the preprocessing unit 10 divides an input audio signal into frame units, extracts a frequency band corresponding to an impulse event from each frame, and samples and converts the frequency band into a digital signal.
- the event detecting unit 11 detects an event from the digital signal, and the event buffer 12 buffers the event detected in the event detecting unit 11 .
- the impulse event determining unit 13 determines whether the event stored in the event buffer 12 is an impulse event, and separates the impulse event therefrom.
- FIG. 2 is a block diagram of the event detecting unit 11 in FIG. 1 .
- the event detecting unit 11 includes a controlling unit 20 , a plurality of scouts 21 a, 21 b, . . . 21 k, a plurality of event component (EC) pools 22 a, 22 b, . . . 22 k, and an event forming unit 23 .
- the controlling unit 20 divides a frame output from the preprocessing unit 10 into a plurality of sub-bands and outputs them to the scouts 21 a, 21 b . . . 21 k.
- the scouts 21 a, 21 b, . . . 21 k detect local onsets from the corresponding sub-bands and output the local onsets to the controlling unit 20 .
- the controlling unit 20 combines the local onsets detected in the scouts 21 a, 21 b, . . . 21 k to form a global onset, and feeds the global onset back to the scouts 21 a, 21 b . . . 21 k.
- each sub-band may be uniformly divided from the frequency band of the corresponding frame, or may be divided according to the output of a cochlear filter.
- the impulse response of the cochlear filter can be approximated through a Gammatone filter function expressed by Equation 1.
- f 0 is the center frequency of the cochlear filter
- n is a degree
- φ is a phase difference
- b is a constant.
- the controlling unit 20 may include a cochlear filter bank having the impulse response as shown by Equation 1 for the center frequency of each sub-band, and can provide the output thereof to each of the scouts 21 a, 21 b . . . 21 k.
- the controlling unit 20 may further include a synchronizing unit so as to simultaneously drive the scouts 21 a, 21 b, . . . 21 k.
- the EC pools 22 a, 22 b, . . . 22 k include a plurality of ECs which are triggered using the local onsets detected in the scouts 21 a, 21 b, . . . 21 k.
- Each EC is triggered in response to the power suddenly being increased in the corresponding sub-band, and is stopped in response to the power falling below a zero event component level.
- the zero event component level refers to the power of an acoustical background which exists when no EC exists in the corresponding sub-band.
- the event forming unit 23 combines the ECs triggered in the EC pools 22 a, 22 b, . . . 22 k to form the event. Also, the event forming unit 23 subtracts the event from the signal output from the preprocessing unit 10 and outputs a zero event, that is, a whole background sound.
- FIG. 3 is a detailed block diagram illustrating one of the scouts 21 a, 21 b, . . . 21 k in FIG. 2 .
- the scout includes a local onset detecting unit 30 , a local estimating unit 31 , and a trigger unit 32 .
- the impulse event starts at the onset. That is, the ECs of the EC pools 22 a, 22 b, . . . 22 k start at the onset. Accordingly, by detecting every onset, the start of every event can be detected.
- the local onset detecting unit 30 detects the local onset from an amplitude spectrum and a phase spectrum of the signal input from the controlling unit 20 .
- FIG. 4 is a detailed block diagram of the local onset detecting unit 30 .
- the local onset detecting unit 30 includes an instant power measuring unit 40 , a delta power calculating unit 41 , a log power measuring unit 42 , a delta log power calculating unit 43 , a phase span unit 44 , a matched filter 45 , an onset filter unit 46 , and a multiplier 47 .
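- the instant power measuring unit 40, the delta power calculating unit 41, the log power measuring unit 42, and the delta log power calculating unit 43 can respectively obtain the power, the delta power, the log power, and the delta log power, expressed by Equation 2, from the amplitude spectrum.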
- the instant power and the log power represent the trace of the absolute value of the energy, and the delta power and the delta log power include the variation of the energy between frames. These values increase rapidly in the onset, with the delta log power increasing particularly rapidly.
- the phase span unit 44 measures the phase variation of the linear phase component in the sub-band frequency domain.
- the signal is expressed by the amplitude spectrum and the phase spectrum.
- the amplitude encodes the frequency content of the signal, and the phase represents a temporal or spatial structure. Accordingly, the temporal location of the onset can be expressed by the slope of the linear phase component. If an unwrapped phase spectrum adjacent to the frame (t) is {φ(t,0), . . . , φ(t, N/2)}, the unwrapped phase spectrum can be approximated by the linear function as shown by Equation 3.
- α(t) is the slope of the linear phase component.
- the phase span of the frame (t) is calculated by Equation 4.
- $$\mathrm{PhaseSpan}(t) = \alpha(t)\,N/2 \cong \phi(t,N/2) - \phi(t,0) \qquad (4)$$
- The output of the matched filter for the phase span result of Equation 5 is expressed by the conjugate of Equations 5 and 6 as shown by Equation 7.
- the onset filter unit 46 emphasizes the variation degree of the input signal, and includes a plurality of secondary filters to which primary filters having a delay-add filter shape are connected.
- the onset filters respectively filter the outputs of the instant power measuring unit 40 , the delta power calculating unit 41 , the log power measuring unit 42 , and the delta log power calculating unit 43 .
- the onset filter having the impulse response shown by Equation 8 is sensitive to the input which varies relatively rapidly.
- the multiplier 47 multiplies a plurality of filter outputs of the onset filter unit 46 by the output of the matched filter 45 to output the local onset for the corresponding sub-band.
- the controlling unit 20 detects the global onset from the plurality of local onsets detected by the scouts 21 a, 21 b, . . . 21 k.
- FIG. 5 defines an external and internal domain so as to combine the plurality of local onsets.
- f(t) represents the output of the onset filter which is applied to the output of the log power measuring unit 42 .
- the points t 0 and t 3 represent the zero points of f(t) closest to the main peak, and t 1 and t 2 represent the zero points of f(t) − z closest to the main peak.
- z is a constant which is selected experimentally.
- the section (t 0 , t 3 ) is defined as the external domain, and the section (t 1 , t 2 ) is defined as the internal domain.
- the controlling unit 20 combines two local onsets to make one global onset when the external domain of one local onset overlaps with the internal domain of the other local onset.
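The combination rule can be sketched in Python as an interval-overlap test. This is only an illustration under stated assumptions: the dictionary layout, the function names `domains_overlap` and `should_merge`, and the example times are hypothetical and do not appear in this description; the domains are treated as open intervals.

```python
def domains_overlap(a, b):
    """Two open intervals (start, end) overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def should_merge(onset_a, onset_b):
    """Decide whether two local onsets belong to one global onset.

    Each onset is a dict with an 'external' domain (t0, t3) and an
    'internal' domain (t1, t2). The rule is checked in both directions:
    external of one against internal of the other.
    """
    return (domains_overlap(onset_a["external"], onset_b["internal"])
            or domains_overlap(onset_b["external"], onset_a["internal"]))


# Hypothetical local onsets from three sub-bands (times in frames).
onset_1 = {"external": (0.0, 10.0), "internal": (3.0, 6.0)}
onset_2 = {"external": (5.0, 14.0), "internal": (8.0, 11.0)}
onset_3 = {"external": (20.0, 30.0), "internal": (23.0, 26.0)}

merge_12 = should_merge(onset_1, onset_2)  # external (0, 10) reaches internal (8, 11)
merge_13 = should_merge(onset_1, onset_3)  # the domains are far apart
```

Here the first two onsets would be combined into one global onset, while the third would start a separate one.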
- the controlling unit 20 notifies the local estimating unit 31 of each scout which did not detect a local onset that the global onset has been made.
- the local estimating unit 31 receives the notice and detects the power of the corresponding sub-band at the global onset time. If the power is greater than an estimate, a notice-triggered EC is triggered by the trigger unit 32.
- the local estimating unit 31 estimates the recent power before the global onset time.
- the trigger unit 32 triggers the EC according to the notice output from the local estimating unit 31 or the local onset output from the local onset detecting unit 30 .
- the EC pools 22 a, 22 b, . . . 22 k include the plurality of ECs triggered by the trigger unit 32 .
- the duration and the power during the duration of each EC are estimated.
- Each EC is in either a masking state or a masked state, and one EC in the masking state exists in each sub-band. At this time, any ECs other than the masking EC are in the masked state. If a new EC is triggered by the trigger unit 32 , it enters the masking state.
- the EC pools 22 a, 22 b, . . . 22 k also include a zero EC.
- the zero EC sets a zero event component level for each sub-band and represents the acoustic background in that sub-band.
- the zero EC becomes the masking state if it is the only EC in the sub-band, and otherwise becomes masked by the other ECs. If the zero EC is in the masking state, the local estimated value rapidly converges to the acoustic background of the corresponding sub-band.
- the power of the zero EC is the zero event component level, and the other ECs disappear when their power falls below the zero event component level.
- the instant power of the masked EC is estimated in the local estimating unit 31 at the corresponding instant, and the instant power of the masking EC is the value obtained by subtracting the sum of the powers of the masked ECs from the total power of that frequency band.
- the event forming unit 23 tracks the ECs included in the EC pools 22 a, 22 b, . . . 22 k and estimates the power of the EC at every instant and the end point of each EC to obtain the power function of each EC.
- FIG. 6 illustrates an example of tracking ECs by a frequency sub-band.
- FIG. 6(a) illustrates two impulse event components (A, B), and B overlaps with the middle of A.
- In FIG. 6(b), the solid line indicates the local data of the corresponding sub-band, and the dotted line indicates the estimated power of each EC.
- FIG. 6(c) illustrates the result of the event forming unit 23 separating the data of FIG. 6(a) into three ECs, that is, the zero EC 60, the EC (A) 61, and the EC (B) 62.
- In section 1, the impulse EC does not exist, and thus the zero EC becomes the masking EC, and the power of the zero EC becomes the zero event component level.
- In section 2, the EC (A) occurs, and the zero EC is masked.
- In section 3, the EC (B) occurs and becomes the masking state, and thus the EC (A) and the zero EC are masked. Since the power of the EC (B) becomes lower than that of the EC (A) in section 4, the EC (B) disappears.
- In section 5, the EC (A) becomes the masking EC, until the power of the EC (A) becomes lower than the zero event component level at the end of section 5, and thus the EC (A) disappears.
- In section 6, the zero EC becomes the masking state again.
- the tracking of the event component is accomplished according to the variation of the power of the masking ECs at every instant.
- the event forming unit 23 determines the duration with reference to the start point and the end point of each EC, and forms the event when the above-mentioned event tracking process is completed. That is, referring to FIGS. 6(a) through 6(c), the time at which the power of the masking EC becomes greater than the zero event component level is the start point of the event, and the time at which the power of the masking EC becomes less than the zero event component level is the end point of the event.
- the event buffer 12 temporarily stores the events formed in the event forming unit 23 .
- the impulse event determining unit 13 determines whether the events stored in the event buffer 12 are impulse events or not, with reference to a common property of the impulse events.
- the log power of the section during which the signal is attenuated is substantially linear from the peak to a noise level. This pattern is equal to the attenuation pattern of the single mode damped oscillation.
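The linear-decay property can be illustrated with an ordinary least-squares line fit to the log power after the peak. This Python sketch is a stand-in for illustration only: it is not Equation 9 or Equation 10, which are not reproduced in this text, and the function name `decay_linearity`, the use of the R² statistic, and the test signals are assumptions.

```python
import math


def decay_linearity(power):
    """Fit a straight line to log(power) from the peak onward.

    Returns (slope, r_squared). `power` must contain positive values and
    at least three samples from the peak to the end.
    """
    peak = max(range(len(power)), key=lambda i: power[i])
    tail = [math.log(p) for p in power[peak:]]
    m = len(tail)
    if m < 3:
        raise ValueError("need at least three samples after the peak")
    x_mean = (m - 1) / 2.0
    y_mean = sum(tail) / m
    sxx = sum((i - x_mean) ** 2 for i in range(m))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(tail))
    slope = sxy / sxx
    ss_tot = sum((y - y_mean) ** 2 for y in tail)
    ss_res = sum((y - (y_mean + slope * (i - x_mean))) ** 2
                 for i, y in enumerate(tail))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, r_squared


# Single-mode damped oscillation: power decays exponentially, so log power is a line.
damped = [math.exp(-0.2 * k) for k in range(30)]
slope_damped, r2_damped = decay_linearity(damped)

# Sustained sound that stops abruptly: log power is a step, not a line.
sustained = [1.0] * 15 + [0.01] * 15
slope_step, r2_step = decay_linearity(sustained)
```

A high R² with a negative slope is consistent with the damped-oscillation pattern; the sustained signal fits a line noticeably worse.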
- Equation 9 can be quantitated using the power function expressed by Equation 10.
- Equation 10 satisfies the inequality of Equation 9 when ⁇ is a value between 0 and 1. If ⁇ >>1, it is difficult to be considered as an impulse event. An ideal ⁇ approaches 1, and most impulse events are not greater than 3.
- FIGS. 7A through 7D illustrate the result of approximating a log power signal of an input signal according to Equation 10.
- Reference numerals 90 - 1 , 90 - 2 , 90 - 3 , and 90 - 4 indicate original signals, and 91 - 1 , 91 - 2 , 91 - 3 , and 91 - 4 respectively illustrate the log powers of the original signals.
- 92 - 1 , 92 - 2 , 92 - 3 , and 92 - 4 illustrate the results of approximating the log power signals.
- FIG. 7B illustrates an ideal impulse event in which ⁇ approaches 1.
- FIGS. 7A and 7C illustrate the impulse events.
- the power level increases rapidly.
- FIG. 7D is difficult to be considered as an impulse event, because ⁇ >>1.
- FIG. 7D illustrates a speech signal, not an impulse event.
- the invention can also be embodied as computer readable code on a computer readable recording medium.
- the computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the internet).
- the computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. Also, functional programs, code, and code segments for accomplishing the present invention can be easily construed by programmers skilled in the art to which the present invention pertains.
- an impulse event can be separated by separating the successive audio stream into frequency bands to detect local onsets, forming the events using the detected onsets, and examining the log powers of the events. Since the present invention determines an impulse event, for example, a glass-breaking sound, a gunshot, or footsteps, from the sound generated in surroundings, it can be applied to a security system and can diagnose a defect of a structure through acoustic diagnosis.
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Abstract
Description
- This application claims the benefit of Korean Patent Application No. 10-2004-0091451, filed on Nov. 10, 2004, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference.
- 1. Field of the Invention
- The present invention relates to an impulse event separating apparatus and method, and, more particularly, to a method of separating an impulse event from a successive sound, and an apparatus to perform the method.
- 2. Description of the Related Art
- An impulse event, that is, an impact sound, is generated by mechanical interaction between objects, and has a short duration and a high intensity. The impact sound occurs suddenly in background sounds which are relatively stable and can be estimated. According to signal processing theory, the impact sound can be modeled as a zero-state impulse response of a linear system.
- Examples of impact sounds include a simplex sound, such as the sound made by striking glass with a rod, and a complex sound, such as an explosive sound or the sound made when a coin falls to the floor.
- The impact sound generally has an onset stage and an attenuating stage. In the onset stage, the physical event making the impact sound has a short duration and a high intensity. If the onset is detected, the start of the impact sound can be determined.
- Generally, an ideal impulse signal is linearly attenuated in the attenuating stage. That is, the log of the energy decays with a substantially linear slope. According to this property, the event can be tracked, and the energy distribution of the impact sound can be calculated.
- Since the successive sounds in which the impact sound and the non-impact sound are mixed generally share frequency bands and overlap each other in the time domain, the impact sound must be distinguished from these successive sounds.
- Conventional techniques for separating the impact sound include U.S. Pat. No. 6,249,749, U.S. Pat. No. 6,182,018 and U.S. Pat. No. 5,831,936.
- The present invention provides an impulse event separating method, and an apparatus to perform the method, of detecting an onset from an input audio signal in each frequency band, detecting an event using the onset, and determining whether the event is an impulse event.
- Additional aspects and/or advantages of the invention will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the invention.
- According to an aspect of the present invention, there is provided an impulse event separating apparatus comprising a preprocessing unit which divides an input signal into frame units; an event detecting unit which divides the frame into a plurality of frequency sub-bands, obtains power variations and phase variations of the signals of each of the sub-bands to detect a plurality of onsets, and detects a plurality of events using the detected onsets; an event buffer which stores the detected events; and an impulse event determining unit which determines whether the detected events comprise an impulse event with reference to an impulse event property.
- According to another aspect of the present invention, there is provided an impulse event separating method comprising dividing an input signal into frame units and dividing each frame into a plurality of frequency sub-bands; obtaining a power variation and phase variation of the signal of each of the frequency sub-bands, and detecting a plurality of local onsets using the power variation and the phase variation; obtaining a global onset from the local onsets and triggering a plurality of event components using the local onsets and the global onset; tracking and combining the event components in each of the frequency sub-bands to form events; and determining whether the events comprise an impulse event with reference to an impulse event property.
- These and/or other aspects and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
- FIG. 1 is a block diagram illustrating an impulse event separating apparatus according to an embodiment of the present invention;
- FIG. 2 is a block diagram illustrating an event detecting unit shown in FIG. 1;
- FIG. 3 is a block diagram illustrating a scout shown in FIG. 2;
- FIG. 4 is a block diagram illustrating a local onset detecting unit shown in FIG. 3;
- FIG. 5 defines an external and internal domain so as to combine a plurality of local onsets;
- FIG. 6 illustrates an example of tracking ECs by a frequency sub-band; and
- FIGS. 7A through 7D illustrate the result of approximating a log power signal of an input signal.
- Reference will now be made in detail to the embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. The embodiments are described below to explain the present invention by referring to the figures.
- FIG. 1 is a block diagram illustrating an impulse event separating apparatus according to an embodiment of the present invention. The impulse event separating apparatus includes a preprocessing unit 10, an event detecting unit 11, an event buffer 12, and an impulse event determining unit 13.
- The preprocessing unit 10 divides an input audio signal into frame units, extracts a frequency band corresponding to an impulse event from each frame, and samples and converts the frequency band into a digital signal.
- The event detecting unit 11 detects an event from the digital signal, and the event buffer 12 buffers the event detected in the event detecting unit 11. The impulse event determining unit 13 determines whether the event stored in the event buffer 12 is an impulse event, and separates the impulse event therefrom.
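- FIG. 2 is a block diagram of the event detecting unit 11 in FIG. 1. The event detecting unit 11 includes a controlling unit 20, a plurality of scouts 21a, 21b, . . . 21k, a plurality of event component (EC) pools 22a, 22b, . . . 22k, and an event forming unit 23.
- The controlling unit 20 divides a frame output from the preprocessing unit 10 into a plurality of sub-bands and outputs them to the scouts 21a, 21b, . . . 21k. The scouts 21a, 21b, . . . 21k detect local onsets from the corresponding sub-bands and output the local onsets to the controlling unit 20. At this time, the controlling unit 20 combines the local onsets detected in the scouts 21a, 21b, . . . 21k to form a global onset, and feeds the global onset back to the scouts 21a, 21b, . . . 21k.
- Here, each sub-band may be uniformly divided from the frequency band of the corresponding frame, or may be divided according to the output of a cochlear filter. The impulse response of the cochlear filter can be approximated through a Gammatone filter function expressed by Equation 1.
$$g(t) = t^{n-1} \exp(-2\pi b t) \cos(2\pi f_0 t + \phi) \qquad (1)$$
Wherein f_0 is the center frequency of the cochlear filter, n is a degree, φ is a phase difference, and b is a constant.
- The controlling unit 20 may include a cochlear filter bank having the impulse response as shown by Equation 1 for the center frequency of each sub-band, and can provide the output thereof to each of the scouts 21a, 21b, . . . 21k. The controlling unit 20 may further include a synchronizing unit so as to simultaneously drive the scouts 21a, 21b, . . . 21k.
- The EC pools 22a, 22b, . . . 22k include a plurality of ECs which are triggered using the local onsets detected in the scouts 21a, 21b, . . . 21k. Each EC is triggered in response to the power suddenly being increased in the corresponding sub-band, and is stopped in response to the power falling below a zero event component level. The zero event component level refers to the power of an acoustical background which exists when no EC exists in the corresponding sub-band.
- The event forming unit 23 combines the ECs triggered in the EC pools 22a, 22b, . . . 22k to form the event. Also, the event forming unit 23 subtracts the event from the signal output from the preprocessing unit 10 and outputs a zero event, that is, a whole background sound.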
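As a rough illustration of the Gammatone impulse response of Equation 1, the following Python sketch samples g(t) for a few center frequencies, as a cochlear filter bank might. The sample rate `fs`, the degree `n = 4`, the constant `b`, the duration, and the function names are illustrative assumptions; none of these values is given in this description.

```python
import math


def gammatone_ir(f0, n=4, b=125.0, phi=0.0, fs=16000, duration=0.025):
    """Sample g(t) = t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*f0*t + phi).

    f0: center frequency in Hz, n: degree, b: constant, phi: phase difference.
    fs and duration are assumed values chosen only for this example.
    """
    num_samples = int(round(duration * fs))
    response = []
    for k in range(num_samples):
        t = k / fs
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * b * t)
        response.append(envelope * math.cos(2.0 * math.pi * f0 * t + phi))
    return response


def gammatone_bank(center_freqs, **kwargs):
    """One impulse response per sub-band center frequency."""
    return {f0: gammatone_ir(f0, **kwargs) for f0 in center_freqs}


# Hypothetical sub-band center frequencies in Hz.
bank = gammatone_bank([500.0, 1000.0, 2000.0])
g_1k = bank[1000.0]
peak_index = max(range(len(g_1k)), key=lambda k: abs(g_1k[k]))
```

Convolving a frame with each response in `bank` would give one sub-band signal per scout; the envelope t^(n-1) exp(-2πbt) peaks at t = (n-1)/(2πb), about 3.8 ms for the assumed values.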
- FIG. 3 is a detailed block diagram illustrating one of the scouts 21a, 21b, . . . 21k in FIG. 2. The scout includes a local onset detecting unit 30, a local estimating unit 31, and a trigger unit 32. The impulse event starts at the onset. That is, the ECs of the EC pools 22a, 22b, . . . 22k start at the onset. Accordingly, by detecting every onset, the start of every event can be detected.
- The local onset detecting unit 30 detects the local onset from an amplitude spectrum and a phase spectrum of the signal input from the controlling unit 20. FIG. 4 is a detailed block diagram of the local onset detecting unit 30. The local onset detecting unit 30 includes an instant power measuring unit 40, a delta power calculating unit 41, a log power measuring unit 42, a delta log power calculating unit 43, a phase span unit 44, a matched filter 45, an onset filter unit 46, and a multiplier 47.
- If the amplitude spectrum of the input signal of the frame (t) is {Y(t,1), . . . , Y(t,N)}, the instant power measuring unit 40, the delta power calculating unit 41, the log power measuring unit 42, and the delta log power calculating unit 43 can respectively obtain the power, the delta power, the log power, and the delta log power, expressed by Equation 2, from the amplitude spectrum.
Wherein power(t) is the instant power, DPower(t) is the delta power, Logpower(t) is the log power, and DlogPower(t) is the delta log power.
- The instant power and the log power represent the trace of the absolute value of the energy, and the delta power and the delta log power include the variation of the energy between frames. These values increase rapidly in the onset, with the delta log power increasing particularly rapidly.
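The four quantities can be sketched in Python as follows. Because Equation 2 is not reproduced in this text, the definitions used here are assumptions: the instant power is taken as the sum of squared amplitudes of the frame, the log power as its natural logarithm, and the delta values as differences between consecutive frames. The constant `EPS` and the function name are also hypothetical.

```python
import math

EPS = 1e-12  # assumed floor that keeps log() finite on silent frames


def power_features(frames):
    """Compute per-frame power features from amplitude spectra.

    frames: list of amplitude spectra, one list [Y(t,1), ..., Y(t,N)] per frame.
    Returns one dict per frame; the delta values of the first frame are 0.
    """
    features = []
    prev_power = None
    prev_log_power = None
    for spectrum in frames:
        power = sum(y * y for y in spectrum)       # assumed form of power(t)
        log_power = math.log(power + EPS)          # assumed form of Logpower(t)
        d_power = 0.0 if prev_power is None else power - prev_power
        d_log_power = 0.0 if prev_log_power is None else log_power - prev_log_power
        features.append({"power": power, "dpower": d_power,
                         "logpower": log_power, "dlogpower": d_log_power})
        prev_power, prev_log_power = power, log_power
    return features


# Quiet background for three frames, then a sudden loud frame that starts to decay.
frames = [[0.01, 0.01]] * 3 + [[1.0, 1.0]] + [[0.5, 0.5]]
feats = power_features(frames)
onset_frame = max(range(len(feats)), key=lambda t: feats[t]["dlogpower"])
```

In this toy input the delta log power jumps by several units at the loud frame while the background frames stay near zero, which matches the remark that the delta log power reacts most strongly at an onset.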
- The phase span unit 44 measures the phase variation of the linear phase component in the sub-band frequency domain. According to Fourier analysis theory, the signal is expressed by the amplitude spectrum and the phase spectrum. The amplitude encodes the frequency content of the signal, and the phase represents a temporal or spatial structure. Accordingly, the temporal location of the onset can be expressed by the slope of the linear phase component. If an unwrapped phase spectrum adjacent to the frame (t) is {φ(t,0), . . . , φ(t, N/2)}, the unwrapped phase spectrum can be approximated by the linear function as shown by Equation 3.
$$\hat{\phi}(t,n) = \alpha(t)\cdot n + \hat{\phi}(t,0), \quad n = 0, \ldots, N/2 \qquad (3)$$
Wherein α(t) is the slope of the linear phase component.
- According to Equation 3, the phase span of the frame (t) is calculated by Equation 4.
$$\mathrm{PhaseSpan}(t) = \alpha(t)\,N/2 \cong \phi(t,N/2) - \phi(t,0) \qquad (4)$$
- Since the general phase span of the onset is linear, it can be expressed by Equation 5.
- Since the matched filter 45 is used for matching the pattern, it has the impulse response expressed by Equation 6.
- The output of the matched filter for the phase span result of Equation 5 is expressed by the conjugate of Equations 5 and 6 as shown by Equation 7.
Wherein c is a constant.
- The constant (c) has a value of c=24/(N-2)(N-1)/Nπ², so that the maximum of the result of Equation 7 becomes 1.
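Equations 3 and 4 can be illustrated with a short Python sketch that unwraps the phase of bins 0 to N/2, fits the slope α(t) by least squares, and multiplies it by N/2. The least-squares fit and the helper names are assumptions made for illustration; the description does not state how the linear approximation is obtained.

```python
import math


def unwrap(phases):
    """Remove 2*pi jumps so that consecutive phases differ by at most pi."""
    out = [phases[0]]
    for p in phases[1:]:
        diff = p - out[-1]
        diff -= 2.0 * math.pi * round(diff / (2.0 * math.pi))
        out.append(out[-1] + diff)
    return out


def phase_span(wrapped_phase):
    """PhaseSpan(t) = alpha(t) * N/2, with alpha fitted by least squares.

    wrapped_phase: phase values of bins n = 0, ..., N/2 of one frame.
    """
    phi = unwrap(wrapped_phase)
    m = len(phi)                      # m = N/2 + 1 bins
    n_mean = (m - 1) / 2.0
    phi_mean = sum(phi) / m
    sxy = sum((n - n_mean) * (p - phi_mean) for n, p in enumerate(phi))
    sxx = sum((n - n_mean) ** 2 for n in range(m))
    alpha = sxy / sxx                 # slope of the linear phase component
    return alpha * (m - 1)            # alpha * N/2


# A unit impulse delayed by d samples in an N-sample frame has phase -2*pi*d*n/N.
N, d = 64, 5
wrapped = []
for n in range(N // 2 + 1):
    x = -2.0 * math.pi * d * n / N
    wrapped.append(math.atan2(math.sin(x), math.cos(x)))
span = phase_span(wrapped)
```

For the delayed impulse the fitted span is proportional to the delay (here -π·d), which illustrates why the slope of the linear phase component carries the temporal location of an onset.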
- The onset filter unit 46 emphasizes the variation degree of the input signal, and includes a plurality of secondary filters to which primary filters having a delay-add filter shape are connected. The onset filters respectively filter the outputs of the instant power measuring unit 40, the delta power calculating unit 41, the log power measuring unit 42, and the delta log power calculating unit 43. Each onset filter has the impulse response expressed by Equation 8.
$$h_{of}(t) = A e^{t/T_1} - B e^{t/T_2} \qquad (8)$$
Wherein A = 1 − e^{−1/T_1}, B = 1 − e^{−1/T_2}, and T_1 < T_2.
- The onset filter having the impulse response shown by Equation 8 is sensitive to the input which varies relatively rapidly.
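The behavior of such an onset filter can be sketched in Python. One assumption is made loudly here: the sketch uses decaying exponentials, e^{-t/T}, for t = 0, 1, 2, ..., because a causal filter with growing exponentials would diverge and because A and B then normalize each branch to unit sum. The sign of the exponent, the time constants, the truncation length, and the function names are assumptions and are not taken from Equation 8 as printed.

```python
import math


def onset_filter_ir(t1, t2, length):
    """Truncated impulse response A*exp(-t/t1) - B*exp(-t/t2), with t1 < t2.

    Assumes decaying exponents (see the note above). With
    A = 1 - exp(-1/t1) and B = 1 - exp(-1/t2), each branch sums to about 1,
    so the whole response sums to about 0 and ignores constant input.
    """
    a = 1.0 - math.exp(-1.0 / t1)
    b = 1.0 - math.exp(-1.0 / t2)
    return [a * math.exp(-t / t1) - b * math.exp(-t / t2) for t in range(length)]


def convolve(x, h):
    """Causal convolution, output truncated to the length of x."""
    y = []
    for i in range(len(x)):
        acc = 0.0
        for k in range(min(i + 1, len(h))):
            acc += h[k] * x[i - k]
        y.append(acc)
    return y


h = onset_filter_ir(t1=2.0, t2=10.0, length=200)

# A step in log power at frame 20: flat before, flat after.
step = [0.0] * 20 + [1.0] * 80
y = convolve(step, h)
peak = max(range(len(y)), key=lambda i: y[i])
```

The output is zero on the flat part, peaks a few frames after the step, and decays back toward zero, so only the rapid change is emphasized.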
- The
multiplier 47 multiplies the plurality of filter outputs of the onset filter unit 46 by the output of the matched filter 45 to output the local onset for the corresponding sub-band.
- The controlling unit 20 detects the global onset from the plurality of local onsets detected by the scouts. FIG. 5 defines an external domain and an internal domain used to combine the plurality of local onsets.
- Referring to FIG. 5, f(t) represents the output of the onset filter applied to the output of the log power calculating unit 42. The points t0 and t3 are the zero points of f(t) closest to the main peak, and t1 and t2 are the zero points of f(t) − z closest to the main peak. Here, z is a constant which is selected experimentally. The section (t0, t3) is defined as the external domain, and the section (t1, t2) is defined as the internal domain. The controlling unit 20 combines two local onsets into one global onset when the external domain of one local onset overlaps the internal domain of the other local onset.
- If the global onset is made, the controlling unit 20 notifies the local estimating unit 31 of each scout that did not detect a local onset that the global onset has been made. The local estimating unit 31 receives the notice and detects the power of the corresponding sub-band at the global onset time. If the power is greater than an estimate, the trigger unit 32 triggers an EC in response to the notice. The local estimating unit 31 estimates the recent power before the global onset time.
- The trigger unit 32 triggers the EC according to the notice output from the local estimating unit 31 or the local onset output from the local onset detecting unit 30.
- The EC pools 22a, 22b, . . . 22k include the plurality of ECs triggered by the trigger unit 32. The duration of each EC and its power during that duration are estimated. Each EC is in either a masking state or a masked state, and one EC in the masking state exists in each sub-band. All ECs other than the masking EC are in the masked state. If a new EC is triggered by the trigger unit 32, it enters the masking state.
- The EC pools 22a, 22b, . . . 22k also include a zero EC. The zero EC sets a zero event component level for each sub-band and represents the acoustic background in that sub-band. The zero EC is in the masking state if it is the only EC in the sub-band, and is otherwise masked by the other ECs. If the zero EC is in the masking state, the local estimated value rapidly converges to the acoustic background of the corresponding sub-band. The power of the zero EC is the zero event component level, and the other ECs disappear when their power falls below the zero event component level. The instant power of a masked EC is estimated in the local estimating unit 31 at the corresponding instant, and the instant power of the masking EC is the value obtained by subtracting the sum of the powers of the masked ECs from the total power of that frequency band.
- The
event forming unit 23 tracks the ECs included in the EC pools 22a, 22b, . . . 22k and estimates the power of each EC at every instant and the end point of each EC to obtain the power function of each EC.
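The masking and masked bookkeeping described for the EC pools can be sketched for a single sub-band as follows. This is a minimal illustration, not the apparatus itself: the `masking_power` and `prune_ecs` helpers, the dictionary layout, and the numeric values are all hypothetical, and the masked-EC powers are simply passed in, whereas the description has the local estimating unit 31 estimate them.

```python
def masking_power(total_power, masked_powers):
    """Instant power of the masking EC: total sub-band power minus the masked ECs' powers."""
    return total_power - sum(masked_powers)

def prune_ecs(ec_powers, zero_level):
    """Drop every non-zero EC whose power has fallen below the zero event component level."""
    return {name: p for name, p in ec_powers.items()
            if name == "zero" or p >= zero_level}

zero_level = 1.0  # hypothetical zero event component level (acoustic background)

# Hypothetical instant: EC "B" is masking, while EC "A" and the zero EC are masked.
total = 12.0
masked = {"zero": zero_level, "A": 4.0}
powers = dict(masked, B=masking_power(total, masked.values()))

# Hypothetical later instant: "B" has decayed below the zero level and disappears.
later = prune_ecs({"zero": 1.0, "A": 3.0, "B": 0.5}, zero_level)
```

Here `powers["B"]` is the total power of 12.0 minus the masked powers 1.0 and 4.0, and `later` keeps only the zero EC and "A". The zero EC is never pruned because it represents the background of the sub-band.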
FIG. 6 illustrates an example of tracking ECs in a frequency sub-band. FIG. 6(a) illustrates two impulse event components (A, B), where B overlaps the middle of A. In FIG. 6(b), the solid line indicates the local data of the corresponding sub-band, and the dotted line indicates the estimated power of each EC. FIG. 6(c) illustrates the result of the event forming unit 23 separating the data of FIG. 6(a) into three ECs, that is, the zero EC 60, the EC (A) 61, and the EC (B) 62. In section 1, no impulse EC exists, and thus the zero EC is the masking EC, and the power of the zero EC is the zero event component level. In section 2, the EC (A) occurs, and the zero EC is masked. In section 3, the EC (B) occurs and enters the masking state, and thus the EC (A) and the zero EC are masked. Since the power of the EC (B) becomes lower than that of the EC (A) in section 4, the EC (B) disappears. In section 5, the EC (A) is the masking EC until its power falls below the zero event component level at the end of section 5, at which point the EC (A) disappears. In section 6, the zero EC enters the masking state again.
- Accordingly, the tracking of the event component is accomplished according to the variation of the power of the masking ECs at every instant. The event forming unit 23 determines the duration with reference to the start point and the end point of each EC, and forms the event when the above-mentioned event tracking process is completed. That is, referring to FIGS. 6(a) through 6(c), the time at which the power of the masking EC becomes greater than the zero event component level is the start point of the event, and the time at which the power of the masking EC becomes less than the zero event component level is the end point of the event.
- The event buffer 12 temporarily stores the events formed in the event forming unit 23.
- The impulse event determining unit 13 determines whether the events stored in the event buffer 12 are impulse events, with reference to a common property of impulse events.
- In order to identify impulse events, two examining processes are needed. The first determines whether the power at the detected onset increases rapidly. This is performed in the local onset detecting unit 30, which searches for the start points of as many impulse events as possible. Three tests are used to identify impulse events in a given time period [a, b]: first, whether the instant power function of the signal between the onset and the power peak point reaches a sufficiently large value at time b; second, whether the instant power function has largely increased during the time period [a, b]; and third, whether the time period [a, b] is sufficiently small.
- Here, determining whether the instant power function has largely increased must satisfy the following requirement for damped oscillation.
- The log power of the section during which the signal is attenuated is substantially linear from the peak to a noise level. This pattern matches the attenuation pattern of a single-mode damped oscillation. The attenuation pattern of the damped oscillation can be expressed by Equation 9.
- If the power peak time is tp, the noise level is n1, and the time when the power falls below the noise level is te, then the inequality of Equation 9 can be quantified using the power function expressed by Equation 10.
z(t) = c (1 − t)^λ   (10)
- Here, c is a constant determined by z(tp) = Power(tp) and z(te) = Power(te), λ is a value representing the impulsiveness of the sound, and z(t) is the instant power.
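One way to estimate λ from a measured decay is sketched below. It rests on several assumptions that this description does not state: time is normalized to τ = (t − tp)/(te − tp) so that the peak sits at τ = 0 and the end of the decay at τ = 1, z is taken as a positive quantity measured above the noise level, c is taken as the value at the peak, and λ is obtained by a least-squares fit through the origin of ln(z/c) against ln(1 − τ). The `fit_lambda` name and the synthetic data are hypothetical.

```python
import math

def fit_lambda(times, z, t_peak, t_end):
    """Least-squares estimate of lambda in z = c * (1 - tau)**lambda,
    with tau = (t - t_peak) / (t_end - t_peak) (assumed normalization)."""
    c = z[times.index(t_peak)]  # value at the peak, used as the constant c
    num = den = 0.0
    for t, value in zip(times, z):
        tau = (t - t_peak) / (t_end - t_peak)
        if not 0.0 < tau < 1.0 or value <= 0.0:
            continue  # skip the endpoints and non-positive samples, where the logs carry no information
        x = math.log(1.0 - tau)
        y = math.log(value / c)
        num += x * y
        den += x * x
    return num / den

# Synthetic decay generated with a known exponent of 1.5 (hypothetical data).
times = list(range(0, 11))
z = [8.0 * (1.0 - t / 10.0) ** 1.5 for t in times]
lam = fit_lambda(times, z, t_peak=0, t_end=10)
```

On this synthetic decay the fit recovers the exponent used to generate the data. With real recordings the samples near the noise level would be noisy, so the usable range of τ would need to be chosen with care.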
- The function of Equation 10 satisfies the inequality of Equation 9 when λ is a value between 0 and 1. If λ >> 1, the event is unlikely to be an impulse event. An ideal λ approaches 1, and λ is not greater than 3 for most impulse events.
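The λ ranges above can be turned into a simple screening rule, sketched below. Treating 3 as a hard cutoff is an assumption made for illustration, since the description only says that most impulse events do not exceed that value. The `looks_impulsive` name and the sample values are hypothetical.

```python
def looks_impulsive(lam, max_lambda=3.0):
    """Heuristic check: a positive lambda no greater than max_lambda is treated as impulse-like."""
    return 0.0 < lam <= max_lambda

# Hypothetical lambda estimates for four events.
samples = [0.5, 1.0, 1.4, 40.0]
flags = [looks_impulsive(value) for value in samples]
```

With these inputs the first three events pass the check and the last one, whose λ is far greater than 1, does not.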
FIGS. 7A through 7D illustrate the result of approximating a log power signal of an input signal according to Equation 10. Reference numerals 90-1, 90-2, 90-3, and 90-4 indicate original signals, and 91-1, 91-2, 91-3, and 91-4 respectively illustrate the log powers of the original signals. 92-1, 92-2, 92-3, and 92-4 illustrate the results of approximating the log power signals. Note that the approximated log power signal is 0 where the corresponding log power lies between the noise level set for each signal and a threshold value higher than the noise level, and that λ is approximated as 0.520, 0.959, 1.435, and 37.59, respectively, for the log power which is attenuated from the onset to the threshold value. FIG. 7B illustrates an ideal impulse event in which λ approaches 1. FIGS. 7A and 7C also illustrate impulse events. In FIG. 7D, the power level increases rapidly. However, FIG. 7D is unlikely to be an impulse event, because λ >> 1. In fact, FIG. 7D illustrates a speech signal, not an impulse event.
- The invention can also be embodied as computer readable code on a computer readable recording medium. The computer readable recording medium is any data storage device that can store data which can be thereafter read by a computer system. Examples of the computer readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves (such as data transmission through the internet). The computer readable recording medium can also be distributed over network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion. Also, functional programs, code, and code segments for accomplishing the present invention can be easily construed by programmers skilled in the art to which the present invention pertains.
- According to the present invention, an impulse event can be separated by separating the successive audio stream into frequency bands to detect local onsets, forming the events using the detected onsets, and examining the log powers of the events. Since the present invention determines an impulse event, for example, a glass-breaking sound, a gunshot, or footsteps, from the sound generated in surroundings, it can be applied to a security system and can diagnose a defect of a structure through acoustic diagnosis.
- Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.
Claims (29)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2004-0091451 | 2004-11-10 | ||
KR1020040091451A KR100612870B1 (en) | 2004-11-10 | 2004-11-10 | Appratus and method for seperating implusive event |
Publications (2)
Publication Number | Publication Date |
---|---|
US20060100828A1 (en) | 2006-05-11 |
US7444853B2 (en) | 2008-11-04 |
Family
ID=36317417
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/270,622 Active 2026-11-05 US7444853B2 (en) | 2004-11-10 | 2005-11-10 | Impulse event separating apparatus and method |
Country Status (2)
Country | Link |
---|---|
US (1) | US7444853B2 (en) |
KR (1) | KR100612870B1 (en) |
Also Published As
Publication number | Publication date |
---|---|
KR100612870B1 (en) | 2006-08-14 |
KR20060042700A (en) | 2006-05-15 |
US7444853B2 (en) | 2008-11-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: SAMSUNG ELECTRONICS CO., LTD., KOREA, REPUBLIC OF. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: LEE, YONGBEOM; ZHU, XUAN; LEE, JAEWON; REEL/FRAME: 017243/0185. Effective date: 20051110 |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| CC | Certificate of correction | |
| FEPP | Fee payment procedure | Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| FPAY | Fee payment | Year of fee payment: 4 |
| FPAY | Fee payment | Year of fee payment: 8 |
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 12 |