CN105027541B - Content-based noise suppression - Google Patents


Info

Publication number
CN105027541B
Authority
CN
China
Prior art keywords
signal
audio signal
content
noise
noise signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201480010777.4A
Other languages
Chinese (zh)
Other versions
CN105027541A (en
Inventor
金莱轩
阿西夫·伊克巴勒·穆罕默德
埃里克·维瑟
辛钟元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Publication of CN105027541A
Application granted
Publication of CN105027541B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Arrangements for interconnection not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

Apparatus and methods for audio noise attenuation are disclosed. An audio signal analyzer can determine whether an input audio signal received from a microphone device includes a noise signal with identifiable content. If there is a noise signal with identifiable content, a content source is accessed to obtain a copy of the noise signal. An audio canceller can generate a processed audio signal with an attenuated noise signal based on the copy of the noise signal and the input audio signal. Additionally or alternatively, data can be communicated over a communication channel to a separate media device in order to receive at least a portion of the copy of the noise signal from the separate media device, or to receive content identification data corresponding to the content source.

Description

Content-based noise suppression
Technical field
The following description is directed to audio signal processing. In particular, the description is directed to audio noise suppression.
Background
Due in part to advances in battery, processing, and communication technology, personal devices are becoming increasingly mobile, powerful, and connected. With these advances, users have greater flexibility in how they can use and interact with their devices. In particular, mobile devices can use voice recognition to allow users to control them with voice commands. In addition, for both voice recognition and voice telephony, users want mobile devices to operate normally in a variety of environments, including acoustically harsh environments.
Various noise suppression schemes are used to reduce or mitigate the adverse effects of ambient noise while a user is interacting with a mobile device. Frequency-selective filtering, for example, can be used to suppress noise associated with certain frequency bands. Other noise suppression schemes use statistical models to suppress aspects of the captured audio signal that are statistically related to noise or statistically unrelated to the intended audio signal. Still other noise suppression schemes use internal signals to cancel noise (for example, echo noise) caused by sound that is generated by the mobile device and then sensed again.
Summary of the invention
The systems, methods, and devices of the invention each have several aspects, no single one of which is solely responsible for its desirable attributes. Without limiting the scope of the invention as expressed by the claims that follow, some features will now be discussed briefly. After considering this discussion, and particularly after reading the section entitled "Embodiment", one will understand how the features of the invention provide advantages that include reducing ambient noise to improve audio processing.
One embodiment is a device for attenuating audio noise. The device can include a microphone configured to receive an input audio signal. The device can also include an audio signal analyzer configured to determine whether the input audio signal includes a noise signal with identifiable content. If there is a noise signal with identifiable content, the audio signal analyzer can access a content source to obtain a copy of the noise signal. The device can also include an audio canceller configured to generate a processed audio signal with an attenuated noise signal based on the copy of the noise signal and the input audio signal.
Another embodiment is a method of attenuating audio noise. The method can include receiving an input audio signal. The method can also include determining whether the input audio signal includes a noise signal with identifiable content. If there is a noise signal with identifiable content, the method can include accessing a content source to obtain a copy of the noise signal. The method can further include generating a processed audio signal with an attenuated noise signal based on the copy of the noise signal and the input audio signal.
Another embodiment is a non-transitory computer-readable medium storing instructions that, when executed, cause a computing device to perform a method. The method includes receiving an input audio signal and determining whether the input audio signal includes a noise signal with identifiable content. If there is a noise signal with identifiable content, the method includes accessing a content source to obtain a copy of the noise signal. The method further includes generating a processed audio signal with an attenuated noise signal based on the copy of the noise signal and the input audio signal.
Brief description of the drawings
Fig. 1 is a block diagram of an audio system that includes a mobile phone that suppresses noise from one or more media devices, according to one embodiment.
Fig. 2 is a block diagram of an illustrative embodiment of an audio processing device that suppresses audio noise.
Fig. 3 is a block diagram of a particular illustrative embodiment of an audio signal analyzer implemented by the audio processing device of Fig. 2.
Fig. 4 is a block diagram of a particular illustrative embodiment of an audio canceller system implemented by the audio processing device of Fig. 2.
Fig. 5 is a block diagram of a particular illustrative embodiment of an audio canceller system implemented by the audio processing device of Fig. 2.
Fig. 6 is a flowchart of a method of audio noise suppression according to various embodiments.
Fig. 7A is an illustrative flowchart of an example method for determining whether an input audio signal includes noise with identifiable content, according to one embodiment.
Fig. 7B is an illustrative flowchart of an example method for accessing a content source to obtain a source signal, according to one embodiment.
Fig. 8 is an illustrative flowchart of an example method for attenuating audio noise, according to an embodiment.
Fig. 9 is a flowchart of a particular illustrative method of audio noise suppression, according to an embodiment.
Embodiment
Embodiments relate to systems and methods for suppressing unwanted audio noise in an audio signal received by an electronic device. In one embodiment, the system suppresses audio noise that represents identifiable media content (for example, a popular song playing in the background). The system can obtain a copy of the media content, generate a copy of the unwanted audio noise from the copy of the media content, and remove the unwanted audio noise from the audio signal. For example, in operation, the system determines an acoustic pattern or fingerprint of the unwanted audio noise and uses the pattern to identify the media content represented by the audio noise (for example, a particular song). The identity can be used to search for a media content source, for example, a digital recording of the identified song. Once the song is identified, a copy of the song can be downloaded to the electronic device and then used to subtract the song from the audio signal. In one embodiment, the system can be implemented by a portable computing device (for example, a cellular phone). For example, the cellular phone can suppress a song or other media content playing in the background during a telephone conversation.
In one particular example, the system can be implemented by a cellular phone that has a microphone and a digital music library stored in the phone's memory. When a person is talking on the phone near a radio that is playing a particular song, the system can extract audio features from the microphone's audio signal to develop an acoustic pattern or fingerprint of the song. The developed pattern can then be used to search a database of song identities, indexed by such acoustic patterns, to find the song being played by the radio. If a song identity matches the pattern, the phone can then search its music library for a copy of the identified song. Alternatively, the phone can request a copy of the identified song from a server over a network connection. Once the copy is accessed, it can be synchronized to the time position of the song as it plays on the radio in order to suppress the song from the received audio signal. A phone with this system would allow a user to hold a telephone conversation in an area that would otherwise be too acoustically harsh to operate a phone (for example, an outdoor music venue or a concert hall).
In another particular example, the noise suppression system can be implemented by a voice-controlled remote control that controls a separate media device with wireless communication capability (for example, a television (TV)). The remote control can receive content information directly from the TV. For example, the TV can communicate to the remote control which channel is currently being displayed, and the remote control can use that information to access the channel's audio over an Internet connection. Alternatively, the TV can send a copy of the broadcast to the remote control. The remote control can in turn use the copy of the broadcast to cancel the audio produced by the TV. This allows a voice-controlled electronic device to work near a media device that is producing audio.
The disclosed methods, devices, and systems can be used to improve existing noise reduction techniques. In particular, in some situations the audio noise can be found to be substantially deterministic once its content has been estimated and/or identified. By way of illustration, one such situation is where a pre-recorded song is the noise source. In this case, the song can be substantially deterministic if, for example, it is known that a song is playing, which particular song it is, and the specific timing of the song. If this information is known or identifiable, a copy of the song or audio signal can be used to attenuate or cancel the component of the audio signal that corresponds to the song. Suppressing the song in this way can improve the quality of voice recognition or voice communication via a mobile device.
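The deterministic case can be illustrated with a short sketch. It is only a toy model, not the patented implementation: it assumes the song reaches the microphone as a delayed, scaled copy of the source with a known delay and gain, and the names `subtract_known_noise`, `delay`, and `gain` are hypothetical.

```python
def subtract_known_noise(mic, source, delay, gain):
    """Subtract a delayed, scaled copy of a known source from the microphone signal.

    Assumption (illustration only): the noise arrives as gain * source[n - delay].
    A real acoustic space would also color the sound.
    """
    cleaned = list(mic)
    for n in range(len(mic)):
        k = n - delay
        if 0 <= k < len(source):
            cleaned[n] -= gain * source[k]
    return cleaned


# Toy data: a "voice" signal plus a known "song" arriving 2 samples late.
speech = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
song = [0.5, -0.5, 0.25, -0.25, 0.5, -0.5]
delay, gain = 2, 0.8
mic = [s + (gain * song[n - delay] if n >= delay else 0.0)
       for n, s in enumerate(speech)]
cleaned = subtract_known_noise(mic, song, delay, gain)
```

With the identity and timing of the song known, the residual in `cleaned` is the voice alone, up to floating-point rounding.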
Examples of media devices include TVs, radios, laptop/netbook computers, tablet computers, desktop computers, and similar electronic devices configured to play media content, including audio media content. Examples of audio media content include data or signals representing music, video, and other similar media that have audio.
To illustrate further, Fig. 1 shows a block diagram of a particular audio configuration 100 that includes a mobile phone 102 configured to suppress noise from one or more media devices. Specifically, the mobile phone 102 has a microphone 104 and an antenna 106. The mobile phone 102 can communicate voice and data signals to a network 108 or to other electronic devices. The network 108 can be a wired or wireless network, and can provide access to one or more content databases 110 that store various content sources (for example, music and audiovisual data files). In one embodiment, the network is the Internet.
In operation, a user 112 speaks into the microphone 104 of the mobile phone 102, for example for voice communication and/or voice recognition, in order to control the mobile phone 102 or to control other electronic devices communicatively coupled to the mobile phone 102. The microphone 104 of the mobile phone 102 captures the user's voice command 114 to generate an input audio signal. In some cases, the mobile phone 102 can be in close proximity to separate media devices, for example, a network-enabled television (TV) 116 or a radio 118. These devices can produce background sounds 120, 122 that act as unwanted background audio noise with respect to the operation of the mobile phone 102.
For example, the network-enabled TV 116 or the radio 118 may be playing stored or streamed music. The microphone 104 can capture the voice command 114 from the user 112 while background sound is being produced by the network-enabled TV 116 or the radio 118. In these situations, sound from the network-enabled TV 116 or the radio 118 can significantly interfere with the user's voice command 114 and make conversation or voice recognition difficult for the user. Various embodiments relate to suppressing the noise component of the input audio signal.
The mobile phone 102 can suppress the noise signal, in particular when the content of the noise signal is identifiable. In one embodiment, the mobile phone 102 analyzes the input audio signal to determine whether it has identifiable content, for example, a particular song or the audio of a television broadcast. For example, one embodiment determines content identification information (for example, a song title, album name, artist name, or the like) by extracting features of the input audio signal, and then searches for, downloads, streams, or otherwise accesses a content source. For example, referring to Fig. 1, the mobile phone 102 can search the content database 110 to access a content source, where the content source is determined by matching sources against the identification information. Once the mobile phone 102 has accessed the content source, it can obtain a copy of the audio noise (the "source signal"), which can be used to specifically attenuate or suppress the audio noise corresponding to the sound produced by the media device.
Additionally or alternatively, the mobile phone 102 can communicate with the network-enabled TV 116 and/or the radio 118, directly or via the network 108, to identify the content source. For example, the mobile phone 102 may be able to request channel information from the network-enabled TV 116, where the network-enabled TV 116 can communicate using its communication antenna 124. Based on the received channel information, the mobile phone 102 can access the content source from the content database 110. As another example, the mobile phone 102 can access the content source from a device (not shown) that broadcasts the media content to the network-enabled TV 116, for example, by tuning to the identified channel. As yet another example, the mobile phone 102 can access the content source from the network-enabled TV 116. In other words, the network-enabled TV 116 can transmit or relay the content source directly to the mobile phone 102.
Turning now to Fig. 2, a block diagram shows an illustrative embodiment of an audio processing device 202 configured to suppress unwanted audio noise. The audio processing device 202 includes a processor 204, a microphone 206, a communication interface 208, a data storage device 210, and a memory 212, interconnected by a bus 214. In addition, the memory 212 can include an audio signal analyzer module 216, an audio canceller module 218, and a communication module 220. Examples of the audio processing device 202 include any applicable electronic device, for example, a mobile computing device, a cellular phone, a general-purpose computer, and the like.
The processor 204 includes circuitry (for example, a microprocessor or microcontroller) configured to execute instructions from the memory 212 and to control and operate the microphone 206, the communication interface 208, the data storage device 210, the memory 212, and the bus 214. Specifically, the processor 204 can be a general-purpose single-chip or multi-chip microprocessor (for example, an ARM), a special-purpose microprocessor (for example, a digital signal processor (DSP)), a microcontroller, a programmable gate array, and so on. Although only a single processor is shown in the audio processing device 202, in alternative configurations a combination of processors (for example, an ARM and a DSP) can be used.
The microphone 206 is configured to capture acoustic sound and, in response, to generate an input audio signal, under the control of the processor 204 executing specific instructions from the memory 212. Examples of the microphone 206 include any applicable sensor or transducer for converting sound into an electrical audio signal, for example, condenser microphones, dynamic microphones, piezoelectric microphones, and the like. In some embodiments, the microphone 206 is optional, and the input audio signal is, for example, generated from data from the data storage device 210 or the memory 212, or received from the communication interface 208, as will be discussed below with reference to Fig. 3.
The communication interface 208 includes electronics configured to allow the audio processing device 202 to transmit and receive data (for example, data for identifying, retrieving, or accessing a content source). The communication interface 208 can be communicatively coupled to a wireless antenna, WLAN/LAN and other types of routers, and similar communication devices.
The data storage device 210 and the memory 212 include mechanisms configured to store information by chemical, magnetic, electrical, optical, or similar means. For example, the data storage device 210 and the memory 212 can each be a non-volatile memory device (for example, flash memory or a hard disk drive) or a volatile memory device (for example, dynamic random access memory (DRAM) or static random access memory (SRAM)). In some embodiments, the processor 204 can access a content source by accessing a content source database on the data storage device 210. Fig. 2 shows the data storage device 210 as part of the audio processing device 202. In other embodiments, the data storage device 210 can be located on a separate device and accessed over a communication channel, for example, via a network. The audio signal analyzer module 216 will be discussed in further detail with respect to Fig. 3.
Within the memory 212 is the audio signal analyzer module 216, which includes instructions that configure the processor 204 to initiate identification of the content of the input audio signal in order to provide access to a corresponding content source and/or to receive an identified source signal. As will be discussed in further detail with respect to Fig. 3, in some embodiments features are extracted from the input audio signal. The extracted features can be used to determine the content identity of the media content represented by the input audio signal, and the content identity can be used to access the content source associated with that content identity.
Within the memory 212 is the audio canceller module 218, which includes instructions that configure the processor 204 to process the input audio signal with the identified source signal in order to attenuate the audio noise. Specifically, the input audio signal is compared with the identified source signal. In one embodiment, the identified source signal is filtered to account for room acoustics. One reason for doing so is that the sound produced by the media device can differ from the identified source signal, due in part to the acoustic effects of the acoustic space in which the electronic device is located. The acoustic effects can include acoustic damping and echoes. In another embodiment, the input audio signal and the identified source signal are synchronized to account for various delays caused by computational, communication, and acoustic factors. The audio canceller module 218 will be discussed in further detail with respect to Figs. 4 and 5.
Within the memory 212 is the communication module 220, which includes instructions that configure the processor 204 to control the communication interface 208 to transmit or receive data. In some embodiments, communication is initiated between the audio processing device 202 and a separate media device (for example, the network-enabled TV 116 of Fig. 1), as discussed in further detail below.
In operation, the processor 204 can execute instructions from the memory 212 to receive an input audio signal captured by the microphone 206. The input audio signal can contain a voice signal and an audio noise signal. For example, the voice signal can represent the user's voice, and the audio noise signal can represent sound produced by a nearby media device. The processor 204 can execute instructions from the audio signal analyzer module 216 to identify the content of the audio noise signal. The processor 204 can then search the data storage device 210 for a content source associated with the identified content. Additionally or alternatively, the processor 204 can execute instructions from the audio signal analyzer module 216 and/or the communication module 220 to search a database on a network via the communication interface 208. Once the audio processing device 202 has accessed the content source and has the corresponding identified source signal, the processor 204 can execute instructions from the audio canceller module 218 to suppress or at least partially attenuate the audio noise signal by comparing the copy of the noise signal (for example, the filtered or unfiltered identified source signal) with the input audio signal.
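The flow of this paragraph (identify the content, access a source, cancel) can be sketched as a small driver function. This is a hedged outline only: the function `suppress_content_noise` and its three callable parameters are hypothetical stand-ins for the analyzer, source lookup, and canceller stages, and the toy stages below do no real signal processing.

```python
def suppress_content_noise(input_audio, identify, find_source, cancel):
    """Run the identify -> access source -> cancel flow on one input signal.

    All three callables are hypothetical stand-ins:
      identify(audio)         -> content identity, or None if not identifiable
      find_source(content_id) -> copy of the noise (source signal), or None
      cancel(audio, source)   -> processed audio with the noise attenuated
    """
    content_id = identify(input_audio)
    if content_id is None:
        return input_audio          # no identifiable content: pass through
    source = find_source(content_id)
    if source is None:
        return input_audio          # content identified but no copy available
    return cancel(input_audio, source)


# Toy usage with trivial stand-ins for each stage.
library = {"song-1": [1.0, -1.0, 1.0]}
mic = [1.5, -1.0, 1.25]
out = suppress_content_noise(
    mic,
    identify=lambda audio: "song-1",
    find_source=library.get,
    cancel=lambda audio, src: [a - s for a, s in zip(audio, src)],
)
```

Passing the input through unchanged when nothing is identified mirrors the conditional wording of the text ("if there is a noise signal with identifiable content").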
Referring to Fig. 3, a block diagram shows a particular illustrative embodiment of an audio signal analyzer 300 implemented by the audio processing device 202 of Fig. 2. The audio signal analyzer 300 can be implemented with computer-executable instructions executed by the processor 204 (for example, the instructions of the audio signal analyzer module 216). The audio signal analyzer 300 of Fig. 3 includes an identifier generator 302, which is configured to receive the input audio signal and generate content identification information. The content identification information can include one or more of the following: an artist name, a content title (of a song, film, audiobook, and so on), an identifier, and similar identifying tags. The audio signal analyzer 300 also has a source matcher 304, which is configured to receive the content identification information and generate the identified source signal.
The identifier generator 302 of Fig. 3 has a feature extractor 306, a content identifier 308, and a content identity database 310. The feature extractor 306 can be implemented by a module that includes instructions that configure the processor 204 to determine characteristic information of the input audio signal for use in determining the content. For example, in operation, the feature extractor 306 can analyze the input audio signal to determine an acoustic pattern or fingerprint that identifies or characterizes the input audio signal. In one embodiment, the acoustic pattern or fingerprint can be based on a spectrogram (for example, time-frequency) analysis. It will be appreciated that other applicable methods and systems for feature extraction can alternatively be selected, for example, audio processing techniques based on mel-frequency cepstral coefficients and/or perceptual linear prediction (for example, relative spectral transform-perceptual linear prediction). A specific, non-limiting example of a feature extraction system for content identification can be found in, for example, Wang, "An industrial strength audio search algorithm" (Proceedings of the International Conference on Music Information Retrieval (ISMIR), vol. 3, 2003). For example, the system described by Wang uses patterns of local peaks in the spectrogram to improve robustness to ambient noise.
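A much-simplified sketch of a spectrogram-peak fingerprint is given below. It is not the algorithm of the cited paper and not the patent's feature extractor: it merely records the strongest frequency bin of each frame, and the names `frame_magnitudes`, `peak_fingerprint`, and `frame_len` are hypothetical. The toy data shows why peak-based features tolerate weaker background sound.

```python
import cmath
import math


def frame_magnitudes(frame):
    """Magnitude of the DFT of one frame (bins 0 .. len(frame)//2 - 1)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def peak_fingerprint(signal, frame_len=64):
    """Return the strongest non-DC frequency bin of each non-overlapping frame."""
    peaks = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        mags = frame_magnitudes(signal[start:start + frame_len])
        peaks.append(max(range(1, len(mags)), key=mags.__getitem__))
    return tuple(peaks)


def tone(k, n=64):
    """One frame of a sine wave centered exactly on DFT bin k."""
    return [math.sin(2 * math.pi * k * t / n) for t in range(n)]


# Two frames: a tone at bin 5 followed by a tone at bin 9.
clean = tone(5) + tone(9)
# The same audio with a weaker interfering tone (bin 3) added on top.
noisy = [x + 0.1 * math.sin(2 * math.pi * 3 * i / 64) for i, x in enumerate(clean)]
```

Both `peak_fingerprint(clean)` and `peak_fingerprint(noisy)` yield the same tuple here, because the interfering tone never becomes the strongest bin.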
The content identifier 308 can be implemented by a module that includes instructions that configure the processor 204 to use the acoustic pattern or fingerprint to search the content identity database 310 for the content identity of that acoustic pattern or fingerprint. For example, the processor 204 can search the content identity database 310 for content identification information that corresponds to or substantially matches the acoustic pattern or fingerprint. The identifier generator 302 provides the content identification information to the source matcher 304.
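One way to picture a "substantially matching" search is a scored lookup with a threshold. The sketch below is an assumption-laden illustration: the in-memory dictionary standing in for the content identity database, the position-wise match score, and the `min_score` threshold are all hypothetical choices, not details given in the source.

```python
def identify_content(fingerprint, identity_db, min_score=0.75):
    """Return the content identity whose reference fingerprint matches best.

    identity_db maps a content identity to a reference fingerprint (a tuple).
    The score is the fraction of positions that agree; identities scoring
    below min_score are treated as "not identifiable" (None is returned).
    """
    best_id, best_score = None, 0.0
    for content_id, reference in identity_db.items():
        n = min(len(fingerprint), len(reference))
        if n == 0:
            continue
        score = sum(a == b for a, b in zip(fingerprint, reference)) / n
        if score > best_score:
            best_id, best_score = content_id, score
    return best_id if best_score >= min_score else None


identity_db = {
    "song-a": (5, 9, 5, 9),
    "song-b": (7, 7, 2, 2),
}
```

For example, `identify_content((5, 9, 5, 8), identity_db)` selects "song-a" (three of four positions agree), while a fingerprint that agrees with nothing returns `None`.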
The source matcher 304 of Fig. 3 includes a source searcher 312, a source database 314, and a source transmitter 316. The source searcher 312 can be implemented by a module that includes instructions that configure the processor 204 to use the content identification information to search the source database 314 for a content source. For example, the processor 204 can search the source database 314, which is stored on the data storage device 210 (or stored externally and accessed by means of the communication interface 208), for a content source (for example, an MP3 file of a song) that corresponds to or substantially matches the content identification information.
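The local-or-external lookup can be sketched as a two-step search. The function name `find_source`, the dictionaries, and the `fetch_remote` callback are hypothetical; in a real device the remote step would be a request made through the communication interface rather than a dictionary lookup.

```python
def find_source(content_id, local_sources, fetch_remote=None):
    """Look for a content source locally first, then through an optional remote fetch."""
    if content_id in local_sources:
        return local_sources[content_id]
    if fetch_remote is not None:
        # Stand-in for a request to an external source database.
        return fetch_remote(content_id)
    return None


local_sources = {"song-a": "song-a.mp3"}    # e.g. files on the data storage device
remote_sources = {"song-b": "song-b.mp3"}   # e.g. files held by a server
```

Calling `find_source("song-b", local_sources, remote_sources.get)` falls back to the remote lookup, while omitting `fetch_remote` restricts the search to local storage.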
The source transmitter 316 can access the content source identified by the source searcher 312 and can generate the identified source signal. The source signal can be transmitted as pulse code modulation (PCM) audio samples, data packets (including compressed or coded data), or a similar data format. Accordingly, the source transmitter 316 optionally includes a vocoder/encoder 318 to generate coded audio packets to be transmitted to the audio processing device 202. In other words, the source transmitter 316 can be located at a server computing device and can send the source signal to the audio processing device 202 (for example, the mobile phone 102 of Fig. 1) over a data path or a voice path.
It will be appreciated that each of the functions of the audio signal analyzer 300 can be performed by the audio processing device 202 of Fig. 2. In other embodiments, one or more of the functions are performed by one or more server computing devices (for example, the content database 110 and other devices connected to the network). For example, the audio processing device 202 can use the communication interface 208 to communicate with a server computer via the network. The identified source signal can be provided over the network in a streaming-like manner or, in a download-like manner, all at once as a block of data. Accordingly, the audio processing device 202 can receive portions of the identified source signal before they are needed for cancellation. Likewise, each of the content identity database 310 and the source database 314 can be stored electronically on the data storage device 210 or the memory 212 of the audio processing device 202, or can be stored external to the audio processing device 202 and accessed via the network.
Fig. 4 shows a block diagram of a particular illustrative embodiment of an audio canceller system 400 implemented by the audio processing device 202 of Fig. 2. As shown, the audio canceller system 400 can be used to suppress multiple audio noise sources. For example, the audio canceller system 400 has n sync blocks 402(1) to 402(n) (also referred to as "signal synchronizers"), n corresponding audio cancellers 404(1) to 404(n), and an optional post-processing block 406, vocoder block 408, and voice recognition block 410. The audio canceller system 400 can be implemented with computer-executable instructions, for example, the instructions of the audio canceller module 218 executed by the processor 204.
In operation, the audio canceller system 400 receives the input audio signal and n identified source signals, one identified source signal for each of the n possible audio noises to be attenuated. For example, referring to Fig. 1, audio noise 1 can correspond to the audio 120 from the network-enabled TV 116, and audio noise 2 can correspond to the audio 122 from the radio 118. In addition, each identified source signal can correspond to a source signal generated by, for example, the audio signal analyzer 300 of Fig. 3. The n pairs of sync blocks 402(1) to 402(n) and audio canceller blocks 404(1) to 404(n) are arranged in series, so that audio noise 1 is suppressed first and the resulting processed input audio signal is fed to audio canceller 2 to suppress audio noise 2, and so on. It will be appreciated that other applicable configurations can be selected, for example, n audio cancellers 404(1) to 404(n) in parallel.
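The series arrangement can be sketched as a loop that feeds each stage's output to the next. The helper names `make_canceller` and `cascade` are hypothetical, and each toy stage simply subtracts an already aligned, scaled copy of its source, which assumes synchronization and gain are known in advance.

```python
def make_canceller(source, gain):
    """Build a toy canceller that subtracts gain * source, sample by sample."""
    def cancel(signal):
        return [x - gain * s for x, s in zip(signal, source)]
    return cancel


def cascade(input_signal, cancellers):
    """Feed the output of each canceller into the next one, in order."""
    signal = list(input_signal)
    for cancel in cancellers:
        signal = cancel(signal)
    return signal


voice = [1.0, 0.0, -1.0, 0.0]
tv = [0.5, 0.5, -0.5, -0.5]          # audio noise 1
radio = [0.25, -0.25, 0.25, -0.25]   # audio noise 2
mic = [v + 1.0 * t + 2.0 * r for v, t, r in zip(voice, tv, radio)]
processed = cascade(mic, [make_canceller(tv, 1.0), make_canceller(radio, 2.0)])
```

In this toy case the order of the stages does not matter because each stage is a plain subtraction; with adaptive stages the order could affect how each filter converges.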
As stated, the n identified source signals can be provided by n separate source transmitters (for example, the source transmitter shown in Fig. 3). Additionally or alternatively, the n identified source signals can be produced by the separate media devices that generate the audio noise. Each of the n identified source signals (and the input audio signal) can be provided as PCM audio samples or as packets. For example, in one embodiment, the n identified source signals can be transmitted as encoded speech packets, and the audio canceller system 400 includes an optional vocoder/decoder (not shown) for decoding the signals before they are provided to the synchronization blocks 402(1) to 402(n).
As shown in Fig. 4, each of the audio cancellers 404(1) to 404(n) is associated with a respective one of the synchronization blocks 402(1) to 402(n). Each of the synchronization blocks 402(1) to 402(n) can synchronize the input audio signal (or the output of the preceding audio canceller) with the corresponding identified source signal. The synchronization blocks 402(1) to 402(n) can compensate for timing differences attributable to processing, communication, and similar sources of delay. In addition, the synchronization blocks 402(1) to 402(n) can be used to compensate for errors in determining or estimating the current time position in the source being played by the media device. Each of the synchronization blocks 402(1) to 402(n) can have a respective data buffer 416(1) to 416(n) that provides the delay used for synchronization. In some embodiments, the delay is tunable. In operation, the tunable delay can be determined by performing a calibration process. A non-limiting example of a process for calibrating and tuning the delay can be found in U.S. Provisional Patent Application No. 61/681,474, filed August 9, 2012.
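As a minimal sketch of how a data buffer might provide a tunable synchronization delay, the following Python class delays a sample stream by a configurable number of samples. The class name `DelayLine` and its methods are hypothetical, and the sketch assumes the delay is expressed as a whole number of samples; it is not asserted to be the buffer implementation of the described embodiments.

```python
from collections import deque


class DelayLine:
    """Delays a sample stream by a tunable number of samples."""

    def __init__(self, delay_samples: int) -> None:
        if delay_samples < 0:
            raise ValueError("delay must be non-negative")
        # Pre-fill with silence so the first outputs are zeros.
        self._buffer = deque([0.0] * delay_samples)

    def push(self, sample: float) -> float:
        """Store the newest sample and return the one from `delay` samples ago."""
        self._buffer.append(sample)
        return self._buffer.popleft()

    def set_delay(self, delay_samples: int) -> None:
        """Retune the delay, for example after a calibration pass."""
        if delay_samples < 0:
            raise ValueError("delay must be non-negative")
        while len(self._buffer) > delay_samples:
            self._buffer.popleft()        # shorten: drop the oldest samples
        while len(self._buffer) < delay_samples:
            self._buffer.appendleft(0.0)  # lengthen: pad with silence


line = DelayLine(2)
delayed = [line.push(x) for x in [1.0, 2.0, 3.0, 4.0]]
```

In this sketch the earlier-arriving signal would be passed through the delay line so that it lines up with the later-arriving one before cancellation.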
Each of the n audio cancellers 404(1) to 404(n) can have one or more respective adaptive filters 412(1) to 412(n) configured to filter the corresponding source signal. The filtering can be used to account for differences between the captured audio noise and the source signal. That is, the audio noise captured by the microphone 206 can differ from the source signal because of many factors, including the dynamics of the acoustic space (for example, echoes and acoustic damping, which can vary with the positions of the microphone 206 and the media device), the dynamics of the loudspeaker and microphone, variations in the content source (for example, different recording quality), and the like.
To compensate for these variations, each of the adaptive filters 412(1) to 412(n) can have one or more tunable filter parameters. In some embodiments, the filter parameters can be tuned online to model these variations based on the input audio signal and the source signal. For example, when the input audio signal consists largely of sound produced by media device 1, the error between the output of the adaptive filter 412(1) (the "filtered source signal") and the input audio signal can be used to tune the filter parameters in a way that reduces that error. A small error may indicate that the adaptive filter 412(1) is closely modeling the acoustic effects that modify the audio noise signal, whereas a large error indicates that the adaptive filter 412(1) is not modeling those acoustic effects. A variety of methods, referred to as "adaptation laws" or "update rules," can be used to adjust the filter coefficients. Examples include adaptation laws based on gradient methods (for example, laws that reduce an instantaneous or integral cost), which adjust the tunable filter parameters to reduce the error between the filtered source signal and the input audio signal. Other examples include least-squares methods, Lyapunov/stability methods, and stochastic methods. It will be appreciated, however, that any suitable recursive, non-recursive, or batch adaptation law can be used to adjust the tunable filter parameters.
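One well-known gradient-based update rule is the normalized least-mean-squares (NLMS) algorithm. The following Python sketch shows how such a rule could tune a finite impulse response filter and subtract the filtered source signal from the captured signal. It is offered only as an illustration under stated assumptions: the text above does not name NLMS specifically, and the function name `nlms_cancel` and the parameters `num_taps`, `step`, and `eps` are hypothetical.

```python
from typing import List, Tuple


def nlms_cancel(captured: List[float], source: List[float],
                num_taps: int = 4, step: float = 0.5,
                eps: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Return (residual, weights) after adaptively cancelling `source`.

    captured: synchronized input audio samples (noise plus anything else)
    source:   synchronized identified source samples
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps          # most recent source sample first
    residual = []
    for d, x in zip(captured, source):
        history = [x] + history[:-1]
        # Filtered source signal: the filter's estimate of the captured noise.
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate            # also the noise-suppressed output
        # Normalized gradient step that reduces the instantaneous squared error.
        norm = sum(h * h for h in history) + eps
        weights = [w + step * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights
```

The error sample serves two purposes in this sketch: it is the output with the noise suppressed, and it drives the parameter update, which matches the role of the error described above.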
In operation, the audio canceller 404(1) receives synchronized copies of the input audio signal and identified source signal 1. As stated, identified source signal 1 can be similar to the audio signal driving the loudspeaker that is producing the audio noise. The adaptive filter 412(1) can filter the identified source signal to account for the acoustic dynamics of the acoustic space, thereby producing a filtered source signal 1 that resembles the audio noise 1 captured by the microphone 206. The audio canceller 404(1) compares the synchronized input audio signal with filtered source signal 1 to attenuate or suppress audio noise 1. As shown, the audio canceller 404(1) subtracts the filtered source signal from the input audio signal. The audio signal with suppressed noise 1 is then fed to the second synchronization block to suppress audio noise 2, and so on, until the n audio noises have been suppressed from the input audio signal.
In addition, each of the adaptive filters 412(1) to 412(n) optionally has a respective double-talk detector ("DTD") 414(1) to 414(n) to halt or enable adaptation of its filter parameters under certain conditions. When the input audio signal (or the output of the preceding audio canceller) contains other near-end signals (for example, the user's voice or other media noise) in addition to the corresponding audio noise, the corresponding adaptive filter 412(n) may adapt improperly. Because the adaptive filter 412(n) may adapt while additional near-end signals other than the audio noise are present, those additional near-end signals can act as strong uncorrelated noise with respect to the adaptation law. The presence of additional near-end signals can therefore cause the adaptive filter 412(n) to diverge and let audio noise pass unsuppressed. Accordingly, each of the DTDs 414(1) to 414(n) can be used to monitor the input of the corresponding adaptive filter 412(1) to 412(n) and to halt or enable adaptation based on the detection of additional near-end signals.
One approach for the DTDs 414(1) to 414(n) involves computing a double-talk detection statistic to determine when the adaptive filter input signal contains additional near-end signals. One example double-talk detection statistic is given by the ratio of the source signal power to the corresponding adaptive filter input signal. Other applicable double-talk detection statistics may be selected. In addition, the double-talk detection statistic can be computed in the time domain or in the frequency domain.
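A time-domain version of such a ratio statistic might look like the following Python sketch. The sketch assumes that the denominator is the mean power of the adaptive filter input frame and that double talk is declared when the ratio falls below a threshold; the function names and the `threshold` value are hypothetical, and in practice the threshold would depend on the gain of the acoustic path.

```python
from typing import Sequence


def mean_power(frame: Sequence[float]) -> float:
    """Average squared amplitude of one frame of samples."""
    if len(frame) == 0:
        raise ValueError("frame must not be empty")
    return sum(s * s for s in frame) / len(frame)


def double_talk_statistic(source_frame: Sequence[float],
                          input_frame: Sequence[float],
                          eps: float = 1e-12) -> float:
    """Ratio of source power to adaptive-filter input power (assumed form)."""
    return mean_power(source_frame) / (mean_power(input_frame) + eps)


def is_double_talk(source_frame: Sequence[float],
                   input_frame: Sequence[float],
                   threshold: float = 0.5) -> bool:
    """Extra near-end energy raises the input power and lowers the ratio."""
    return double_talk_statistic(source_frame, input_frame) < threshold


noise_only = [1.0, -1.0, 1.0, -1.0]
with_speech = [n + 2.0 for n in noise_only]   # noise plus a near-end signal
```

When `is_double_talk` returns true, an adaptive filter such as the one described above would hold its parameters fixed for that frame instead of updating them.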
As shown in Fig. 4, an optional nonlinear post-processing block 406 can be included to perform certain types of processing on the signal provided by the audio canceller 404(n). For example, the nonlinear post-processing block 406 can remove residual noise (for example, nonlinear components of the audio noise signal) from the signal leaving the audio canceller 404(n). In some embodiments, the nonlinear noise components can be removed or attenuated by estimating the nonlinear components of the input audio signal and then subtracting the estimate from the input audio signal (for example, by using a spectral subtraction technique). The nonlinear post-processing block can operate based on the double-talk decisions from the DTDs 414(1) to 414(n). The double-talk decision thus helps the nonlinear post-processing block 406 distinguish between near-end signals and residual audio noise before it clips or completely removes a signal.
The audio signal with suppressed noises 1, ..., n can be provided to the vocoder 408 to encode the audio signal into speech packets. Additionally or alternatively, the audio signal with suppressed noise can be provided to the speech recognition block 410 for further audio signal processing.
The number n of audio cancellers 404(1) to 404(n) can be selected based on various considerations such as the expected noise environment, computing power, real-time constraints, memory, performance, and/or similar considerations. It will be appreciated, however, that other applicable factors can be considered. Likewise, it will be understood that the audio canceller system can include any applicable number of synchronization blocks. In some embodiments, the number of these components can vary dynamically with the number of identified noise components, as discussed below with respect to Fig. 5.
Fig. 5 shows a block diagram of another particular illustrative embodiment of an audio canceller system 500 implemented by the audio processing device 202 of Fig. 2. Elements common to the systems 400 and 500 of Figs. 4 and 5 share common reference numerals, and for brevity only the differences between the systems 400 and 500 are described here.
The audio canceller system 500 has n synchronization blocks 402(1) to 402(n), n audio canceller blocks 404(1) to 404(n), a source identifier detector 502, and a reconfigurable canceller enabler 504. The source identifier detector 502 receives the n identified source signals to determine which of the identified source signal paths are active. For example, the source identifier detector 502 can determine the active source signal paths based on the presence of a signal or on the energy level of the signal on a given path. The reconfigurable canceller enabler 504 then activates the audio canceller blocks 404(1) to 404(n) that correspond to active identified source signal paths. Each active audio canceller block among the blocks 404(1) to 404(n) can operate as described above with respect to Fig. 4. Each inactive audio canceller block among the blocks 404(1) to 404(n) can be configured, for example, as a pass-through filter.
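The following Python sketch illustrates one way the energy-based activity decision and the pass-through behavior could fit together. The function names, the `energy_threshold` value, and the use of `None` to represent a path on which no signal is present are all assumptions made for illustration only.

```python
from typing import Callable, List, Optional, Sequence

Signal = List[float]


def is_active(frame: Optional[Signal], energy_threshold: float = 1e-4) -> bool:
    """A path counts as active if a frame is present and carries enough energy."""
    if not frame:                        # None or empty: no signal on this path
        return False
    return sum(s * s for s in frame) / len(frame) > energy_threshold


def run_enabled_cancellers(audio: Signal,
                           source_frames: Sequence[Optional[Signal]],
                           canceller: Callable[[Signal, Signal], Signal],
                           energy_threshold: float = 1e-4) -> Signal:
    """Apply the canceller only on active paths; inactive paths pass through."""
    for frame in source_frames:
        if is_active(frame, energy_threshold):
            audio = canceller(audio, frame)
        # Inactive path: the audio is forwarded unchanged (pass-through).
    return audio


def subtract(audio: Signal, source: Signal) -> Signal:
    """Placeholder canceller that assumes an already filtered source."""
    return [a - s for a, s in zip(audio, source)]


result = run_enabled_cancellers([1.5, 2.5],
                                [[0.5, 0.5], [0.0, 0.0], None],
                                subtract)
```

Skipping inactive paths avoids spending computation on cancellers that have nothing to cancel, which is consistent with varying the number of active components dynamically.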
Fig. 6 shows a flowchart of a method 600 of audio noise suppression according to one embodiment. Although the following description of the method focuses on an implementation on a personal audio processing device 202 (for example, a mobile phone or personal audio player), other devices can be configured to perform the method or variations of it. The method can be implemented as a software module or as a collection of modules residing in a non-transitory computer storage device (for example, RAM, ROM, a hard drive, or the like) associated with the computing device of the audio processing device 202. One or more processors of the computing device can execute the software modules.
At block 602, the method 600 includes receiving an input audio signal. For example, the audio processing device 202 can receive the input audio signal from the microphone 206 of the audio processing device 202, from the data storage device 210 or the memory 212, or at the communication interface 208.
After the input audio signal is received at block 602, the method 600 moves to block 604, where it is determined whether the audio input signal contains noise with identifiable content. For example, in one embodiment, the audio processing device 202 can execute instructions from the audio signal analyzer module 216 to determine characteristic information of the audio input signal that can be used to identify the content of the audio noise. The characteristic information can be used by the content identifier 308 to determine content identification information. In one embodiment, the audio processing device 202 can send the characteristic information to a server via the network for further processing and then receive the content identification information via the network. In another embodiment, the audio processing device 202 can perform one or more of the functions of the content identifier 308 and the source searcher block 312 to determine the content identification information. One embodiment of a method for implementing the operation of block 604 is described below with respect to Fig. 7A.
In another embodiment, the operation of block 604 is performed by executing instructions from the audio signal analyzer module 216 to communicate with a separate media device to determine whether the audio input signal has identifiable content. For example, the audio processing device 202 can request from the separate media device information about whether audio media is being played on the media device and, if so, the content identification information. In response, the audio processing device 202 can receive the content identification information.
Once it has been determined that the audio input signal contains ambient noise with identifiable content, the method 600 moves to block 606 to access a content source of the identifiable content and obtain a source signal. For example, in one embodiment, the audio processing device 202 can access the content source or the content source signal via the communication interface 208 or via the memory 212 or the data storage device 210. For example, the content identification information obtained at block 604 can be used to locate and access the content source. The content source can be used to generate the source signal. One embodiment of a method for implementing the operation of block 606 is described below with respect to Fig. 7B.
After at least a portion of the source signal is available, the method 600 proceeds to block 608, where the noise is attenuated based on a comparison of the source signal with the input audio signal. For example, in one embodiment, the audio processing device 202 executes the instructions of the audio canceller module 218 in the memory 212 to attenuate the audio noise according to the audio canceller system shown in Fig. 4 or Fig. 5.
Turning now to Fig. 7A, an illustrative flowchart is shown of an example of the steps performed at block 604, according to one embodiment, to determine whether the audio input signal contains noise with identifiable content. At block 702, the method 604 determines characteristic information of the input audio signal. For example, in one embodiment, the audio processing device 202 executes the instructions of the audio signal analyzer module 216 in the memory 212 to extract features in accordance with the feature extractor 306 shown in Fig. 3. Once sufficient characteristic information has been determined, the method 604 moves to block 704 to provide the characteristic information for identifying the content source. For example, in one embodiment, the audio processing device 202 executes the instructions of the audio signal analyzer module 216 in the memory 212 to cause the communication interface 208 to transmit the characteristic information via the network to a server device for processing by the server device.
After the characteristic information is provided, the method 604 proceeds to block 706 to obtain content identification information. For example, the audio processing device 202 can receive the content identification information from the server device that received the characteristic information during block 704. Alternatively or additionally, in some embodiments, the audio processing device 202 produces the content identification information by performing the necessary steps on the audio processing device 202 itself rather than by communicating with a server device. For example, the processor 204 of the audio processing device 202 can execute the instructions of the audio signal analyzer module 216 in the memory 212 to implement the audio signal analyzer 300 of Fig. 3.
Fig. 7B is an illustrative flowchart of an example method 606 for accessing a content source to obtain a source signal, according to one embodiment. At block 708, the method 606 includes searching for a content source associated with the received content identification information. For example, in one embodiment, the audio processing device 202 executes the instructions of the audio signal analyzer module 216 in the memory 212 to search a media library stored in the data storage device 210. After the search, the method 606 proceeds to block 710 to produce or receive the source signal based on the search result. For example, if the content source is found locally on the audio processing device 202, the processor 204 executes instructions to produce the source signal from the content source. If the content source is not found locally on the device, then in one embodiment the audio processing device 202 executes the instructions of the audio signal analyzer module 216 in the memory 212 to request and receive, via the network 108, the identified source signal from the content database 110.
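The local-first lookup with a network fallback described above can be sketched in a few lines of Python. Everything named here is hypothetical: the local media library is modeled as a dictionary keyed by a content identifier, and `fetch_remote` stands in for whatever request the device would issue over the network.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]


def get_source_signal(content_id: str,
                      local_library: Dict[str, Signal],
                      fetch_remote: Callable[[str], Signal]) -> Tuple[Signal, str]:
    """Return (source_signal, origin), preferring the local media library."""
    if content_id in local_library:
        # Found locally: produce the source signal from the stored content.
        return local_library[content_id], "local"
    # Not found locally: request the identified source signal remotely.
    return fetch_remote(content_id), "remote"


# Illustrative stand-ins for the stored media library and the remote database.
library = {"song-a": [0.1, 0.2, 0.3]}


def fake_remote(content_id: str) -> Signal:
    return [0.0, 0.0]


local_hit = get_source_signal("song-a", library, fake_remote)
remote_hit = get_source_signal("show-b", library, fake_remote)
```

Returning the origin alongside the signal is a design convenience in this sketch, since a remotely fetched signal may arrive in portions and with additional delay that the synchronization stage has to absorb.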
Turning now to Fig. 8, an illustrative flowchart is shown of an example of the steps performed at block 608, according to an embodiment, to attenuate audio noise. At block 810, the input audio signal and the source signal are synchronized to compensate for the time delay between the two signals. In operation, the signals can become unsynchronized for many reasons, including the various delays introduced by different signal and computation paths. To synchronize the signals, each signal can be stored in a variable-length data buffer (for example, a circular buffer) to control the timing of each signal. For example, referring to Fig. 4, the input audio signal and the identified source signal can be received at the first synchronization block 402(1). The synchronization block 402(1) can store the signals in corresponding circular buffer data structures held in the buffer block 416(1), where the length of each circular buffer can be a function of the delay desired for synchronizing the signals. In some embodiments, the desired delay is computed or estimated, for example, during a calibration mode. As the input audio signal is processed by the n audio cancellers 404(1) to 404(n), additional delay can arise in the audio signal at each of the n audio cancellers 404(1) to 404(n). For example, the filtering performed by the n adaptive filters 412(1) to 412(n) can introduce delay. In addition, the n identified source signals can experience various delays attributable to, for example, the time taken to identify and finally receive the identified source signals. Accordingly, the n synchronization blocks 402(1) to 402(n) can be used to compensate for these various delays and keep the audio signal and the identified source signals synchronized during audio cancellation processing.
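The text above leaves open how the desired delay is computed or estimated. One common technique, shown here purely as an assumption and not as the calibration process of the described embodiments, is to pick the lag that maximizes the cross-correlation between the captured signal and the source signal. The function name `estimate_delay` and the `max_delay` search bound are hypothetical.

```python
from typing import Sequence


def estimate_delay(captured: Sequence[float], source: Sequence[float],
                   max_delay: int) -> int:
    """Return the lag (in samples) by which `captured` trails `source`.

    Searches lags 0..max_delay and returns the one with the largest
    (unnormalized) cross-correlation.
    """
    length = min(len(captured), len(source))
    best_lag, best_score = 0, float("-inf")
    for lag in range(min(max_delay, length - 1) + 1):
        # Correlate captured[n] with the source sample from `lag` samples earlier.
        score = sum(captured[n] * source[n - lag] for n in range(lag, length))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

The estimated lag could then set the length of the buffer that delays the source signal, so that both signals reach the canceller aligned in time.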
After the audio input signal and the identified source signal have been synchronized, the method 608 proceeds to block 820 to filter the identified source signal to account for the acoustic effects that influence the audio noise, for example, the acoustic dynamics of the space, the dynamics of the loudspeaker and microphone, and the like. Filtering is performed because the identified source signal may not accurately represent the audio noise captured by the microphone 206. If the identified source signal differs substantially from the audio noise, audio suppression may not be effective. To improve noise suppression, the effects of these factors can be estimated online to shape the identified source signal so that it closely matches or replicates the audio noise. For example, referring now to Fig. 4, the synchronized audio input signal and the synchronized identified source signal 1 are passed to the audio noise canceller 404(1) and to the adaptive filter 412(1). The adaptive filter 412(1) can then filter and shape identified source signal 1 to produce a reference signal that substantially replicates the audio noise. The adaptive filter 412(1) can have one or more filter parameters that affect how the filter shapes the identified source signal (for example, one or more filter parameters of an infinite impulse response filter or a finite impulse response filter). Some embodiments include tunable parameters to account for a broad range of acoustic effects.
After the identified source signal has been synchronized and filtered, the method 608 can proceed to block 830 to produce a processed audio signal by comparing the synchronized audio input with the filtered source signal. In one embodiment, the filtered source signal is subtracted from the synchronized audio input signal. To illustrate, Fig. 4 shows the output of the adaptive filter 412(1) being subtracted from the audio input signal to produce an audio signal with suppressed noise 1. In one embodiment, the audio signal with suppressed noise 1 can be processed for communication or speech recognition applications. In another embodiment, the audio signal with suppressed noise 1 can be processed for further noise suppression. For example, Fig. 4 shows that the processed audio signal with suppressed noise 1 can be provided to the synchronization blocks 402(2) to 402(n) and the audio cancellers 404(2) to 404(n) to suppress additional noises 2 to n having identified source signals 2 to n.
Optionally, after block 820 is performed, the method 608 can proceed to block 840 to adjust the tunable filter parameters of the adaptive filter 412(1) in order to improve noise suppression over a broad range of acoustic effects. In one embodiment, the adjustment of the tunable filter parameters is governed by an adaptation law or update rule. For example, referring to Fig. 4, the adaptive filter 412(1) receives both the synchronized audio input signal and the identified source signal. The adaptive filter 412(1) can produce a filtered source signal. An "error signal" or "suppression factor signal" can be produced by comparing the audio input signal with the filtered source signal. The error signal can indicate how closely the adaptive filter is replicating the audio noise. For example, if the audio input consists substantially of audio noise, the difference between the synchronized audio input signal and the filtered source signal indicates the amount of mismatch between the filtered identified source signal and the audio noise. That is, a small error indicates that the adaptive filter is closely modeling the actual acoustic dynamics of the room. An adaptation law (for example, one based on gradient methods, recursive least squares, or a similar method) can be selected to adjust the tunable filter parameters of the adaptive filter 412(1) in a way that reduces the error signal.
However, when the audio input signal does not consist substantially of the audio noise corresponding to identified source signal 1, the adaptive filter 412(1) may adapt its tunable parameters improperly. For example, the audio signal may contain a voice command from the user or audio noise from a second source. In that case, the error signal may not provide a meaningful indication of how closely the adaptive filter matches, for example, the room acoustics with respect to audio noise 1. Therefore, when the DTD 414(1) block detects this condition, it can switch off adaptation of the adaptive filter, as stated above with respect to Fig. 4.
As stated, the synchronizing and filtering steps can be performed with n identified source signals to cancel n audio noises. In Fig. 4 specifically, the audio noises are cancelled sequentially. In some embodiments, however, the noises can be cancelled in parallel.
Fig. 9 is a flowchart of a particular illustrative method 900 of audio noise suppression according to an embodiment. At block 902, the method 900 includes receiving an input audio signal. Block 902 can be performed as described with respect to Fig. 6. After at least a portion of the audio input signal is received, the method 900 proceeds to block 904 to receive information related to noise produced by a separate media device. For example, the audio processing device 202 can communicate with the separate media device by executing instructions from the audio signal analyzer module 216 and the communication module 220, as discussed with respect to Fig. 2. The separate media device can provide the audio processing device 202 with an indication of whether the separate media device is producing noise with media content. Additionally or alternatively, the separate media device can transmit content identification information, which the audio processing device 202 can use to search for the content source. Examples of content identification information include a television channel, a radio frequency, and similar media broadcast selection information. In one embodiment, the separate media device can send the source signal to the audio processing device 202.
After the audio processing device 202 receives the information related to the noise, the method 900 proceeds to block 906 to receive a source signal based on the received information related to the noise produced by the separate media device. For example, if the audio processing device receives an indication from the separate media device that the media device is producing noise, or if the audio processing device 202 receives content identification information, the audio processing device 202 can receive the source signal by performing the methods 604 and 606 of Figs. 6, 7A, and 7B described above. In some embodiments, the audio processing device 202 receives the source signal from the separate media device. For example, the separate media device can transmit a copy of the media that it is playing.
After the source signal is received, the method 900 can proceed to block 908 to attenuate the noise based on a comparison of the source signal with the input audio signal. For example, the audio processing device 202 attenuates the audio noise by performing the method 608 of Figs. 6 and 8 described above.
The technology is operational with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, handheld or laptop devices, multiprocessor systems, processor-based systems, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
As used herein, instructions refer to computer-implemented steps for processing information in the system. Instructions can be implemented in software, firmware, or hardware and include any type of programmed step undertaken by components of the system.
The processor can be any conventional general-purpose single-chip or multi-chip processor, for example an Intel processor. In addition, the processor can be any conventional special-purpose processor, including an OMAP processor, a digital signal processor, or a graphics processor. The processor typically has conventional address lines, conventional data lines, and one or more conventional control lines.
The system comprises various modules, as discussed in detail. As can be appreciated by those skilled in the art, each of the modules comprises various subroutines, procedures, definitional statements, and macros. Each of the modules is typically compiled separately and linked into a single executable program. Therefore, the description of each of the modules is used for convenience to describe the functionality of the preferred system. Thus, the processes undertaken by each of the modules can be arbitrarily redistributed to one of the other modules, combined together in a single module, or made available in, for example, a shareable dynamic link library.
The system can be written in any conventional programming language, such as C#, C, C++, BASIC, Pascal, or Java, and run under a conventional operating system. C#, C, C++, BASIC, Pascal, Java, and FORTRAN are industry-standard programming languages for which many commercial compilers can be used to create executable code. The system can also be written using interpreted languages such as Perl, Python, or Ruby.
Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the overall system. Those skilled in the art can implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein can be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor can be a microprocessor, but in the alternative, the processor can be any conventional processor, controller, microcontroller, or state machine. A processor can also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
In one or more example embodiments, the functions and methods described can be implemented in hardware, software, or firmware executed on a processor, or any combination thereof. If implemented in software, the functions can be stored on or transmitted over a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium can be any available medium that can be accessed by a computer. By way of example and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies (for example, infrared, radio, and microwave), then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies (for example, infrared, radio, and microwave) are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
The foregoing description details certain embodiments of the systems, devices, and methods disclosed herein. It will be appreciated, however, that no matter how detailed the foregoing appears in text, the systems, devices, and methods can be practiced in many ways. As also stated above, it should be noted that the use of particular terminology when describing certain features or aspects of the invention should not be taken to imply that the terminology is being redefined herein to be restricted to including any specific characteristics of the features or aspects of the technology with which that terminology is associated.
It will be appreciated by those skilled in the art that various modifications and changes may be made without departing from the scope of the described technology. Such modifications and changes are intended to fall within the scope of the embodiments. It will also be appreciated by those skilled in the art that parts included in one embodiment are interchangeable with other embodiments; one or more parts from a described embodiment can be included with other described embodiments in any combination. For example, any of the various components described herein and/or depicted in the figures may be combined, interchanged, or excluded from other embodiments.
With respect to the use of substantially any plural and/or singular terms herein, those skilled in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for the sake of clarity.
It will be understood by those skilled in the art that, in general, terms used herein are generally intended as "open" terms (for example, the term "including" should be interpreted as "including but not limited to," the term "having" should be interpreted as "having at least," the term "includes" should be interpreted as "includes but is not limited to," etc.). It will be further understood by those skilled in the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims contain usage of the introductory phrases "at least one" and "one or more" to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite article "a" or "an" limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases "one or more" or "at least one" and indefinite articles such as "a" or "an" (for example, "a" and/or "an" should typically be interpreted to mean "at least one" or "one or more"); the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should typically be interpreted to mean at least the recited number (for example, the bare recitation of "two recitations," without other modifiers, typically means at least two recitations, or two or more recitations). Furthermore, in those instances where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense in which one skilled in the art would understand the convention (for example, "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). In those instances where a convention analogous to "at least one of A, B, or C, etc." is used, in general such a construction is intended in the sense in which one skilled in the art would understand the convention (for example, "a system having at least one of A, B, or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those skilled in the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" will be understood to include the possibilities of "A" or "B" or "A and B."
While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting.

Claims (18)

1. An apparatus for attenuating audio noise, the apparatus comprising:
a microphone configured to receive an input audio signal;
an audio signal analyzer configured to determine whether the input audio signal includes at least an audio signal and a noise signal having identifiable content, wherein, if the noise signal has identifiable content, the audio signal analyzer is further configured to access a content source to obtain a copy of the noise signal; and
an audio canceller configured to (i) compare the obtained copy of the noise signal with the input audio signal to generate at least one filter and (ii) apply the at least one filter to the input audio signal to attenuate the audio signal to produce a processed audio signal.
2. The apparatus of claim 1, wherein the audio signal analyzer is configured to perform, for a noise signal produced by a separate media device, the determination of whether the input audio signal includes the noise signal having identifiable content.
3. The apparatus of claim 1, further comprising a communication interface, wherein the audio signal analyzer is further configured to:
determine feature information of the input audio signal;
transmit the feature information using the communication interface; and
receive, using the communication interface, the copy of the noise signal based at least on transmitting the feature information.
4. The apparatus of claim 3, wherein the audio signal analyzer is further configured to:
receive, via the communication interface, content identification information in response to providing the feature information;
search the apparatus for the content source based on matching the received content identification information with the content source; and
if the search results in a match with the content source, generate the copy of the noise signal from the content source.
5. The apparatus of claim 1, wherein the audio signal analyzer comprises:
a feature extractor configured to determine feature information of the input audio signal;
a content identifier configured to determine content identification information associated with the feature information;
a source searcher configured to search a database for the content source based on matching the content identification information with the content source; and
a source transmitter configured to generate the copy of the noise signal from the content source if the search locates the content source.
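The four components recited in claim 5 can be pictured as a small lookup pipeline. The following Python sketch is only an illustration under stated assumptions: the function names (`extract_features`, `identify_content`, `search_source`, `noise_copy_for`), the energy-trend fingerprint, and the dictionary-based index and database are all hypothetical and are not taken from the patent. A practical system would use a fingerprint that tolerates room acoustics and overlapping speech rather than an exact match.

```python
from typing import Dict, List, Optional, Tuple

Fingerprint = Tuple[int, ...]


def extract_features(samples: List[float], frame: int = 4) -> Fingerprint:
    """Feature extractor (hypothetical): one bit per pair of adjacent
    frames, set to 1 when the frame energy rises."""
    energies = [
        sum(s * s for s in samples[i:i + frame])
        for i in range(0, len(samples) - frame + 1, frame)
    ]
    return tuple(int(b > a) for a, b in zip(energies, energies[1:]))


def identify_content(fp: Fingerprint,
                     index: Dict[Fingerprint, str]) -> Optional[str]:
    """Content identifier: map feature information to a content ID."""
    return index.get(fp)


def search_source(content_id: Optional[str],
                  database: Dict[str, List[float]]) -> Optional[List[float]]:
    """Source searcher: look up the content source matching the ID."""
    if content_id is None:
        return None
    return database.get(content_id)


def noise_copy_for(samples: List[float],
                   index: Dict[Fingerprint, str],
                   database: Dict[str, List[float]]) -> Optional[List[float]]:
    """Source transmitter: return a copy of the noise signal if the
    search located a content source, otherwise None."""
    source = search_source(identify_content(extract_features(samples), index),
                           database)
    return list(source) if source is not None else None
```

When no content source is located, the sketch returns `None`, which corresponds to the case in which no copy of the noise signal is available for the canceller.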
6. The apparatus of claim 1, further comprising a signal synchronizer configured to delay at least one of the input audio signal and the copy of the noise signal, wherein the audio canceller comprises:
an adaptive filter having adjustable filter parameters, the adaptive filter configured to produce a filtered noise signal based on the synchronized copy of the noise signal and the adjustable filter parameters, the adaptive filter configured to adjust the adjustable filter parameters based on the synchronized input audio signal and the synchronized copy of the noise signal; and
a double-talk detector of the adaptive filter, configured to disable adjustment of the adjustable filter parameters of the adaptive filter when the double-talk detector detects that the input audio signal has the audio signal in addition to the copy of the noise signal,
wherein the audio canceller compares the copy of the noise signal with the input audio signal by way of the filtered noise signal and the synchronized input audio signal.
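One common way to realize an adaptive filter whose adaptation is frozen during double talk is a normalized least-mean-squares (NLMS) update gated by a level comparison. The Python sketch below is a minimal illustration under assumptions, not the claimed implementation: the choice of NLMS, the Geigel-style detector, and all names and default values (`nlms_cancel`, `taps`, `mu`, `dt_threshold`, `demo`) are hypothetical. It assumes the two signals are already synchronized.

```python
import random
from typing import List, Sequence, Tuple


def nlms_cancel(mic: Sequence[float], ref: Sequence[float], taps: int = 4,
                mu: float = 0.5, eps: float = 1e-8,
                dt_threshold: float = 1.0) -> Tuple[List[float], List[float]]:
    """Subtract an adaptively filtered reference (noise copy) from mic.

    Returns (processed signal, final filter weights)."""
    w = [0.0] * taps
    out: List[float] = []
    for n in range(len(mic)):
        # Most recent reference samples, newest first, zero padded.
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))  # filtered noise signal
        e = mic[n] - y                            # processed output sample
        out.append(e)
        # Geigel-style double-talk check (assumed threshold): the mic is
        # louder than the recent reference peak, so something other than
        # the noise copy is likely present.
        peak = max(abs(v) for v in x)
        double_talk = abs(mic[n]) > dt_threshold * peak
        if not double_talk:
            norm = sum(v * v for v in x) + eps
            w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
    return out, w


def demo() -> Tuple[List[float], List[float]]:
    """Noise-only example: mic hears the reference through a 2-tap path."""
    rng = random.Random(0)
    ref = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
    mic = [0.5 * ref[n] + (0.25 * ref[n - 1] if n > 0 else 0.0)
           for n in range(len(ref))]
    return nlms_cancel(mic, ref)
```

In `demo`, the microphone signal contains only the noise copy passed through an assumed two-tap path, so the weights should approach that path and the residual should shrink. When the microphone level exceeds the reference peak, the weights are left untouched, which mirrors the role of the double-talk detector in the claim.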
7. The apparatus of claim 1, further comprising a communication module configured to communicate data over a communication channel between the apparatus and a separate media device, wherein the communication module receives at least a portion of the copy of the noise signal from the separate media device.
8. The apparatus of claim 1, further comprising a communication module configured to communicate data over a communication channel between the apparatus and a separate media device, wherein the communication module receives content identification data corresponding to the content source.
9. A method of attenuating audio noise, the method comprising:
receiving an input audio signal;
determining whether the input audio signal includes at least an audio signal and a noise signal having identifiable content;
if the noise signal has identifiable content, accessing a content source to obtain a copy of the noise signal; and
producing a processed audio signal by (i) comparing the obtained copy of the noise signal with the input audio signal to generate at least one filter and (ii) applying the at least one filter to the input audio signal to attenuate the audio signal.
10. The method of claim 9, wherein the step of determining whether the input audio signal includes the noise signal having identifiable content comprises determining whether the input audio signal includes the noise signal produced by a separate media device and having identifiable content.
11. The method of claim 9, wherein the step of determining whether the input audio signal includes the noise signal having identifiable content comprises:
determining feature information of the input audio signal; and
transmitting the feature information,
wherein the step of accessing the copy of the noise signal comprises receiving the copy of the noise signal based at least on transmitting the feature information.
12. The method of claim 11, wherein the step of determining whether the input audio signal includes the noise signal having identifiable content comprises:
receiving content identification information in response to providing the feature information;
searching for the content source based on matching the received content identification information with the content source; and
if the search results in a match with the content source, generating the copy of the noise signal from the content source.
13. The method of claim 9, wherein the step of determining whether the input audio signal includes the noise signal having identifiable content comprises:
determining feature information of the input audio signal;
determining content identification information associated with the feature information;
searching a database for the content source based on matching the content identification information with the content source; and
if the search locates the content source, generating the copy of the noise signal from the content source.
14. The method of claim 9, further comprising delaying at least one of the input audio signal and the copy of the noise signal to synchronize the input audio signal and the copy of the noise signal, wherein the step of producing the processed audio signal comprises:
using an adaptive filter having adjustable filter parameters to produce a filtered noise signal based on the synchronized copy of the noise signal and the adjustable filter parameters;
selectively adjusting the adjustable filter parameters based on the synchronized input audio signal and the synchronized copy of the noise signal;
determining whether the input audio signal has the audio signal in addition to the noise signal; and
disabling adjustment of the adjustable filter parameters of the adaptive filter when it is determined that the input audio signal has the audio signal in addition to the noise signal,
wherein the comparing of the copy of the noise signal with the input audio signal includes comparing the filtered noise signal with the synchronized input audio signal.
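The synchronization step recited in claim 14 requires an estimate of how far the microphone signal lags the obtained copy. A minimal Python sketch, assuming the lag is a whole number of samples and can be found by maximizing a cross-correlation, is shown below. The function names `estimate_delay` and `synchronize` and the search bound `max_lag` are hypothetical, and the sketch delays only the copy, which is one of the options the claim allows.

```python
from typing import List, Sequence, Tuple


def estimate_delay(mic: Sequence[float], ref: Sequence[float],
                   max_lag: int) -> int:
    """Return the lag in 0..max_lag that maximizes sum(ref[n] * mic[n + lag])."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(r * m for r, m in zip(ref, mic[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def synchronize(mic: Sequence[float], ref: Sequence[float],
                max_lag: int) -> Tuple[List[float], List[float]]:
    """Delay the noise copy by the estimated lag and trim both signals
    to a common length."""
    lag = estimate_delay(mic, ref, max_lag)
    delayed_ref = [0.0] * lag + list(ref)
    n = min(len(mic), len(delayed_ref))
    return list(mic[:n]), delayed_ref[:n]
```

The aligned pair returned by `synchronize` is the kind of input an adaptive filter would then operate on; fractional delays and clock drift between devices are outside the scope of this sketch.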
15. The method of claim 9, further comprising communicating data with a separate media device over a communication channel to receive at least a portion of the copy of the noise signal from the separate media device.
16. The method of claim 9, further comprising communicating data to a separate media device over a communication channel to receive content identification data corresponding to the content source.
17. An apparatus for attenuating audio noise, the apparatus comprising:
means for receiving an input audio signal;
means for determining whether the input audio signal includes at least an audio signal and a noise signal having identifiable content and, if the noise signal has identifiable content, for selectively accessing a content source to obtain a copy of the noise signal; and
means for producing a processed audio signal by (i) comparing the obtained copy of the noise signal with the input audio signal to generate at least one filter and (ii) applying the at least one filter to the input audio signal to attenuate the audio signal.
18. The apparatus of claim 17, wherein the means for determining comprises:
means for determining content identification information associated with feature information of the input audio signal;
means for searching a database for the content source based on matching the content identification information with the content source; and
means for generating the copy of the noise signal from the content source if the search locates the content source.
CN201480010777.4A 2013-03-06 2014-02-20 Content based noise suppression Active CN105027541B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/787,605 2013-03-06
US13/787,605 US9275625B2 (en) 2013-03-06 2013-03-06 Content based noise suppression
PCT/US2014/017381 WO2014137612A1 (en) 2013-03-06 2014-02-20 Content based noise suppression

Publications (2)

Publication Number Publication Date
CN105027541A CN105027541A (en) 2015-11-04
CN105027541B true CN105027541B (en) 2018-01-16

Family

ID=50236325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480010777.4A Active CN105027541B (en) 2013-03-06 2014-02-20 Content based noise suppression

Country Status (6)

Country Link
US (1) US9275625B2 (en)
EP (1) EP2965496B1 (en)
JP (1) JP2016513816A (en)
KR (1) KR20150123902A (en)
CN (1) CN105027541B (en)
WO (1) WO2014137612A1 (en)

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9947333B1 (en) * 2012-02-10 2018-04-17 Amazon Technologies, Inc. Voice interaction architecture with intelligent background noise cancellation
US9570087B2 (en) * 2013-03-15 2017-02-14 Broadcom Corporation Single channel suppression of interfering sources
WO2015009293A1 (en) * 2013-07-17 2015-01-22 Empire Technology Development Llc Background noise reduction in voice communication
US10149047B2 (en) * 2014-06-18 2018-12-04 Cirrus Logic Inc. Multi-aural MMSE analysis techniques for clarifying audio signals
US10325591B1 (en) * 2014-09-05 2019-06-18 Amazon Technologies, Inc. Identifying and suppressing interfering audio content
US9952095B1 (en) 2014-09-29 2018-04-24 Apple Inc. Methods and systems for modulation and demodulation of optical signals
US9979955B1 (en) * 2014-09-30 2018-05-22 Apple Inc. Calibration methods for near-field acoustic imaging systems
US9747488B2 (en) 2014-09-30 2017-08-29 Apple Inc. Active sensing element for acoustic imaging systems
JP6322125B2 (en) * 2014-11-28 2018-05-09 日本電信電話株式会社 Speech recognition apparatus, speech recognition method, and speech recognition program
CN105988049B (en) * 2015-02-28 2019-02-19 惠州市德赛西威汽车电子股份有限公司 Noise suppression adjustment method
US9672821B2 (en) * 2015-06-05 2017-06-06 Apple Inc. Robust speech recognition in the presence of echo and noise using multiple signals for discrimination
CN108028048B (en) 2015-06-30 2022-06-21 弗劳恩霍夫应用研究促进协会 Method and apparatus for correlating noise and for analysis
US11048902B2 (en) 2015-08-20 2021-06-29 Apple Inc. Acoustic imaging system architecture
CN105489215B (en) * 2015-11-18 2019-03-12 珠海格力电器股份有限公司 Noise source identification method and system
CN108353228B (en) * 2015-11-19 2021-04-16 香港科技大学 Signal separation method, system and storage medium
WO2017191249A1 (en) * 2016-05-06 2017-11-09 Robert Bosch Gmbh Speech enhancement and audio event detection for an environment with non-stationary noise
US9838815B1 (en) * 2016-06-01 2017-12-05 Qualcomm Incorporated Suppressing or reducing effects of wind turbulence
KR102515996B1 (en) * 2016-08-26 2023-03-31 삼성전자주식회사 Electronic Apparatus for Speech Recognition and Controlling Method thereof
US10554822B1 (en) * 2017-02-28 2020-02-04 SoliCall Ltd. Noise removal in call centers
JP6545419B2 (en) * 2017-03-08 2019-07-17 三菱電機株式会社 Acoustic signal processing device, acoustic signal processing method, and hands-free communication device
JP7020799B2 (en) * 2017-05-16 2022-02-16 ソニーグループ株式会社 Information processing equipment and information processing method
US10475449B2 (en) * 2017-08-07 2019-11-12 Sonos, Inc. Wake-word detection suppression
US10313218B2 (en) * 2017-08-11 2019-06-04 2236008 Ontario Inc. Measuring and compensating for jitter on systems running latency-sensitive audio signal processing
EP3692530B1 (en) 2017-10-02 2021-09-08 Dolby Laboratories Licensing Corporation Audio de-esser independent of absolute signal level
US10847178B2 (en) * 2018-05-18 2020-11-24 Sonos, Inc. Linear filtering for noise-suppressed speech detection
US10297266B1 (en) 2018-06-15 2019-05-21 Cisco Technology, Inc. Adaptive noise cancellation for multiple audio endpoints in a shared space
US10867615B2 (en) * 2019-01-25 2020-12-15 Comcast Cable Communications, Llc Voice recognition with timing information for noise cancellation
EP3709194A1 (en) 2019-03-15 2020-09-16 Spotify AB Ensemble-based data comparison
CN110047497B (en) * 2019-05-14 2021-06-11 腾讯科技(深圳)有限公司 Background audio signal filtering method and device and storage medium
US11017792B2 (en) * 2019-06-17 2021-05-25 Bose Corporation Modular echo cancellation unit
US11094319B2 (en) * 2019-08-30 2021-08-17 Spotify Ab Systems and methods for generating a cleaned version of ambient sound
US11308959B2 (en) 2020-02-11 2022-04-19 Spotify Ab Dynamic adjustment of wake word acceptance tolerance thresholds in voice-controlled devices
US11328722B2 (en) 2020-02-11 2022-05-10 Spotify Ab Systems and methods for generating a singular voice audio stream
US11950512B2 (en) 2020-03-23 2024-04-02 Apple Inc. Thin-film acoustic imaging system for imaging through an exterior surface of an electronic device housing

Citations (3)

Publication number Priority date Publication date Assignee Title
WO2007007916A1 (en) * 2005-07-14 2007-01-18 Matsushita Electric Industrial Co., Ltd. Transmitting apparatus and method capable of generating a warning depending on sound types
CN1996307A (en) * 2000-07-31 2007-07-11 兰德马克数字服务公司 A method for recognizing a media entity in a media sample
CN101355600A (en) * 2007-07-26 2009-01-28 株式会社卡西欧日立移动通信 Noise suppression system, sound acquisition apparatus, sound output apparatus and computer-readable medium

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
US7747002B1 (en) * 2000-03-15 2010-06-29 Broadcom Corporation Method and system for stereo echo cancellation for VoIP communication systems
US6990453B2 (en) * 2000-07-31 2006-01-24 Landmark Digital Services Llc System and methods for recognizing sound and music signals in high noise and distortion
EP1457889A1 (en) * 2003-03-13 2004-09-15 Koninklijke Philips Electronics N.V. Improved fingerprint matching method and system
JP2005077865A (en) * 2003-09-02 2005-03-24 Sony Corp Music retrieval system and method, information processor and method, program, and recording medium
US20090012786A1 (en) 2007-07-06 2009-01-08 Texas Instruments Incorporated Adaptive Noise Cancellation
US20090034750A1 (en) 2007-07-31 2009-02-05 Motorola, Inc. System and method to evaluate an audio configuration
US9112989B2 (en) 2010-04-08 2015-08-18 Qualcomm Incorporated System and method of smart audio logging for mobile devices
US9280598B2 (en) 2010-05-04 2016-03-08 Soundhound, Inc. Systems and methods for sound recognition
US8812014B2 (en) 2010-08-30 2014-08-19 Qualcomm Incorporated Audio-based environment awareness
JP5561195B2 (en) * 2011-02-07 2014-07-30 株式会社Jvcケンウッド Noise removing apparatus and noise removing method
US9384726B2 (en) 2012-01-06 2016-07-05 Texas Instruments Incorporated Feedback microphones encoder modulators, signal generators, mixers, amplifiers, summing nodes

Also Published As

Publication number Publication date
WO2014137612A1 (en) 2014-09-12
EP2965496A1 (en) 2016-01-13
EP2965496B1 (en) 2018-01-03
JP2016513816A (en) 2016-05-16
US9275625B2 (en) 2016-03-01
CN105027541A (en) 2015-11-04
KR20150123902A (en) 2015-11-04
US20140254816A1 (en) 2014-09-11

Similar Documents

Publication Publication Date Title
CN105027541B (en) Content based noise suppression
US11017799B2 (en) Method for processing voice in interior environment of vehicle and electronic device using noise data based on input signal to noise ratio
JP6637014B2 (en) Apparatus and method for multi-channel direct and environmental decomposition for audio signal processing
TWI520127B (en) Controller for audio device and associated operation method
RU2467406C2 (en) Method and apparatus for supporting speech perceptibility in multichannel ambient sound with minimum effect on surround sound system
CN111885275B (en) Echo cancellation method and device for voice signal, storage medium and electronic device
KR20180056752A (en) Adaptive Noise Suppression for UWB Music
CN109003617A (en) System and method for optimizing loudness and dynamic range between different playback apparatus
MX2007015446A (en) Multi-sensory speech enhancement using a speech-state model.
BRPI0812029B1 (en) Method of recovering hidden data, telecommunication device, data hiding device, data hiding method and set-top box
CN104157292B (en) Anti-howling audio signal processing method and device
US11961504B2 (en) System and method for data augmentation of feature-based voice data
CN111640411B (en) Audio synthesis method, device and computer readable storage medium
US20190172477A1 (en) Systems and methods for removing reverberation from audio signals
US9832299B2 (en) Background noise reduction in voice communication
WO2014000658A1 (en) Method and device for eliminating noise, and mobile terminal
CN114792524A (en) Audio data processing method, apparatus, program product, computer device and medium
GB2516208B (en) Noise reduction in voice communications
JP2002064617A (en) Echo suppression method and echo suppression equipment
CN113516995B (en) Sound processing method and device
US20220406317A1 (en) Conference terminal and embedding method of audio watermarks
JP2011211547A (en) Sound pickup apparatus and sound pickup system
CN114724572A (en) Method and device for determining echo delay
CN115376538A (en) Voice noise reduction method, system, electronic device and storage medium for interaction

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant